1. Introduction
Over the past century, the ability of education systems to equip graduates with the professional and career skills needed for the 21st century has been questioned [
1,
2]. Consequently, the need for an effective education system that focuses on the potential and actual abilities of the graduate has become more crucial. In response to this need, several attempts to reform traditional education systems have been made since 1950, among which Outcome-Based Education (OBE) is the most prominent. According to Spady [
3], the prime mover of OBE, developing an OBE system requires identifying a clear set of learning outcomes, around which all the system’s activities are centered, and establishing the conditions and opportunities within the system that enable and encourage all students to achieve those essential outcomes. Currently, the OBE approach is becoming prevalent in higher education academic programs. It is realized through identifying three types of outcomes: Program Educational Objectives (PEOs), Program Learning Outcomes (PLOs), and Course Outcomes (COs) [
4]. Whereas PEOs describe, in broad statements, the career and professional accomplishments that the program is preparing its graduates to achieve, PLOs describe, in narrower statements, the knowledge, skills, and behaviors that students are expected to attain by the time of graduation [
5]. Similarly, COs are statements that describe the knowledge, skills, and behaviors students are expected to attain as a result of taking a course.
Figure 1 depicts the PEOs, PLOs, COs, and their correlations [
6].
Conceptually speaking, PEOs collectively represent a broad vision of the program that informs all its activities [
7]. They serve as an important nexus point for assessing the program: the point at which curriculum, faculty, facilities, and other programmatic components are viewed within the larger context of the needs of the program’s stakeholders and the mission of the institution [
8]. Broadly speaking, PEOs play a key role in a program’s continuous improvement and provide a means for academicians to define what continuous improvement means for the program [
9]. Practically speaking, PEOs are developed by program constituencies and expressed as short statements that describe graduates’ attributes and accomplishments within a few years of graduation. Typically, these attributes fall within the following categories [
9]: technical skills, professionalism, ethics, communication, management and leadership, lifelong learning and continuing education, and the pursuit of advanced and graduate studies. In addition, PEOs must be mapped into a predefined set of PLOs, which are developed by educational authorities or accreditation agencies [
6,
10]. The PEOs must be assessed periodically to continuously improve the program [
7,
11].
Given the hierarchical structure of an academic program and the key role of the PEOs in this structure, it is hypothesized that a profound understanding of PEOs at both the conceptual and the practical level is essential for the successful design and implementation of program processes [
12]. A particular aspect of program structure that merits thorough investigation is the internal structure of PEOs in terms of the correlations among them. It is expected that the outcome of this investigation will contribute to a better understanding of the correlations among PEOs, and ultimately lead to more informed systemization and optimization of the different processes of academic programs. Unfortunately, as noted in [
7,
9], the literature has paid very little attention to the study of PEOs. Even accreditation bodies, such as the Accreditation Board for Engineering and Technology (ABET), provide little in the way of concrete guidelines on what should be included in a PEO or on the processes of generating and assessing them.
Recently, data analytics approaches, particularly learning analytics (LA), have been actively used for a wide range of purposes in tertiary education: to enhance the learning process, evaluate efficiency, improve feedback, enrich the learning experience and support decision making [
13]. In this paper, the power of learning analytics is leveraged to deepen the understanding of PEOs by analyzing the correlations among them. More specifically, this paper applies data similarity methods to analyze the correlations among the PEOs of engineering programs. To do so, a dataset of PEOs of ABET-accredited engineering programs has been collected. The PEOs, which are mapped into a set of PLOs developed by ABET, are processed and represented as vector space models, in terms of ABET PLOs, to measure the correlations among them. Besides the actionable insights that can be obtained from this investigation, computing the similarity among PEOs is an essential step in developing practical PEO-based applications, such as clustering, recommendation and visualization.
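To make the vector space representation concrete, the following minimal Python sketch builds a vector for one PEO over the eleven ABET PLOs (a) to (k). It rests on an assumption: each component is taken to be the fraction of programs that map the PEO to the corresponding PLO. The weighting scheme actually used in this work may differ, and the function name `peo_vector` and the sample mappings are hypothetical.

```python
from collections import Counter

# ABET PLO labels (a) through (k).
PLOS = list("abcdefghijk")


def peo_vector(mappings):
    """Build a relative-frequency vector for one PEO.

    `mappings` holds one set of PLO labels per program. Each component of
    the result is the fraction of programs that map the PEO to that PLO.
    This weighting is an assumption made for illustration only.
    """
    counts = Counter(plo for plos in mappings for plo in plos)
    n_programs = len(mappings)
    return [counts[p] / n_programs for p in PLOS]


# Hypothetical data: three programs mapping a teamwork-related PEO to PLOs.
teamwork_mappings = [{"d", "g"}, {"d"}, {"d", "f", "g"}]
teamwork_vector = peo_vector(teamwork_mappings)
print(dict(zip(PLOS, teamwork_vector)))
```

Under this assumed scheme every component lies between 0 and 1, so two PEOs can be compared directly with any vector distance or similarity measure.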
The remaining sections of this paper review the relevant works, describe the general methodology of computing data similarity, describe the specific methodology of computing similarities among PEOs, present and discuss the obtained results, and finally conclude this paper.
2. Related Works
The vast amount of data available in digital repositories has provoked the emergence of data mining, as analytical tools can be used to extract meaningful knowledge from such data. Data mining has already been successfully applied to many domains, including medicine, business, robotics and computer vision, to name just a few [
14,
15]. Likewise, the constant upsurge of data in educational institutions has given rise to the emergence of educational data mining and LA, with a focus on developing, researching and applying computer-based methods to discover patterns in large educational data collections that would otherwise be difficult or impossible to analyze [
13,
16]. The increasing interest in the two fields is demonstrated by the growing body of research that applies data mining methods to data from a variety of educational repositories.
In many applications, computing data similarity is required and normally used by Machine Learning (ML) algorithms, particularly those that deal with clustering, recommendation and dimensionality reduction [
14]. In educational data mining, data similarity methods have many applications, particularly in adaptive learning systems and recommendation systems. For example, in automatic recommendation systems, similarity measures are extensively used for clustering educational items or users [
17]. In addition, data similarity measures are found to be useful for a better understanding of educational processes and providing decision-makers with actionable insights [
18].
The applications of data similarity to educational data have been reported in various contexts. In the programming domain, data similarity approaches are explored to select a set of the most relevant remedial programming items and worked-out examples to support students who have trouble solving a code comprehension problem in the Java language [
19]. In the same domain, a content-based similarity approach is applied to provide personalized access to a repository of programming examples through adaptive visualization [
20]. An additional example of data similarity application in the programming domain is reported in [
21], which focuses on similarity among non-graded items such as explanatory texts and videos. For educational items involving both text and images, a similarity measure is proposed in [
22] that is based on a representation computed by a neural network. The proposed measure is suitable for the mathematical domain, where items containing both text and images are commonly used. In the same domain, the similarity of word problems is specifically studied in [
23]. Moreover, the similarity and clustering of users in a mathematics learning system are studied in [
24], where the whole processing pipeline for computing similarity is described in detail.
Similarly, data similarity measures are also used to analyze the similarity of educational items (questions, problems) for many purposes: to be used as input with clustering or visualization techniques [
17]; to detect plagiarism in online exams, particularly cheating in essay questions, multiple-choice questions, and fill-in-the-blank questions [
25]; to measure the degree of similarity for Indonesian essay assessment [
26]; to group documents or contributions to identify the sub-topics and topic evolutions in the graduate discussion forums [
27]; and to compare students’ navigation behavior in different dimensions [
28].
The review of the above works reveals the wide variety of contexts in which data similarity methods can be employed, as well as the variety of data similarity measures that can be applied. A further choice in applying data similarity methods is whether to compute data similarity via features or directly from the data. When similarity is computed from features, a transformation of the data is performed first. Moreover, the review indicates that in most domains, measuring item similarity is not a clearly defined problem, and therefore there is no single correct measure of data similarity; a common practice is to use multiple measures. In these settings, it is hard to answer a general question such as which measure is better or worse. Nevertheless, questions such as “Which choices in the similarity computation are the most important?”, “Which measures are highly correlated (and thus it is not necessary to consider both of them)?” and “How much data do we need for similarity measures to be stable?” can be explored [
18].
Building on the findings drawn from these previous applications of data similarity measures, this paper explores the application of data similarity measures in a new educational context, namely the correlations among the educational objectives of academic programs. The outcomes of this application are expected to contribute to a better understanding of PEOs and to provide decision-makers with actionable insights for better planning and implementation of academic programs.
5. Results and Discussion
This section presents and discusses the results of applying the three data similarity measures to compute the similarity between PEOs.
Table 3,
Table 4 and
Table 5 show the similarity of PEOs based on the Euclidean distance (ED), Manhattan distance (MD) and cosine similarity (CS) measures, respectively. For the sake of illustration, a heatmap visualization is used to represent the similarity values in different colors. As mentioned above, the ED measures the straight-line distance between two points; hence, theoretically, its values fall in the range between zero and infinity. In
Table 3, the ED values range between 0 (identical pair of PEOs) and 0.43 (the most dissimilar pair of PEOs). The MD measures the distance between two points as the number of horizontal and vertical units between them, that is, the sum of the absolute differences of their coordinates; hence, theoretically, its values also fall in the range between zero and infinity. In
Table 4, the MD values between the PEO vectors range between 0 (identical PEOs) and 1.35 (the most dissimilar pair of PEOs). Finally, the CS measures the cosine of the angle between two vectors; hence, theoretically, its values fall in the range between -1 and 1, and between 0 and 1 when all vector components are non-negative. In
Table 5, the CS values of the PEO vector space models range from 0.38 (the most dissimilar PEOs) to 1 (identical pair of PEOs).
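For reference, the three measures can be written in a few lines of Python using only the standard library. This is a minimal sketch of the textbook definitions; the two example vectors are hypothetical and are not taken from the dataset.

```python
import math


def euclidean(u, v):
    """Straight-line (L2) distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def manhattan(u, v):
    """Sum of absolute coordinate differences (L1 distance)."""
    return sum(abs(a - b) for a, b in zip(u, v))


def cosine(u, v):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


# Hypothetical PEO vectors over three PLOs (illustrative values only).
peo_1 = [0.6, 0.0, 0.8]
peo_2 = [0.0, 0.6, 0.8]

print(euclidean(peo_1, peo_2), manhattan(peo_1, peo_2), cosine(peo_1, peo_2))
```

Note that ED and MD are distances (0 means identical), whereas CS is a similarity (1 means the same direction), which is why the three scales cannot be compared directly.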
Despite the different mechanisms of measuring similarity and the different measurement scales, the heatmap visualizations of the three measures reflect a degree of consistency between them. Visually, the three measures appear consistent in their evaluation of the similarity between PEOs. However, to quantitatively evaluate the degree of consistency between the three measures, an agreement analysis can be employed. Basically, there are two methods for measuring the agreement between similarity measures [
18]. The first method applies a simple correlation, such as Pearson’s correlation, to measure the correlation between the similarity matrices after flattening them into vectors. The second method generates ranking matrices from the similarity matrices of the measures and then computes the agreement between the similarity measures based on the ranking matrices. In this work, the ranking-based method is adopted; therefore, the ranking matrices of the three measures are computed, as shown in
Table 6,
Table 7 and
Table 8.
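The two agreement methods described above can be sketched as follows. This is a minimal illustration under stated assumptions: only the off-diagonal upper triangle is flattened (a choice made here so that the trivial diagonal does not inflate the correlation), ties in ranking are broken by item order, and the 3 x 3 matrices are hypothetical. The function names are illustrative.

```python
import math


def flatten_upper(matrix):
    """Flatten the off-diagonal upper triangle of a symmetric matrix."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y)


def ranking_matrix(matrix, higher_is_closer):
    """Rank, for each item i, all other items from closest (1) to farthest.

    Use higher_is_closer=True for similarities (e.g. CS) and False for
    distances (e.g. ED, MD). The diagonal is left at 0.
    """
    n = len(matrix)
    ranks = [[0] * n for _ in range(n)]
    for i in range(n):
        others = [j for j in range(n) if j != i]
        others.sort(key=lambda j: matrix[i][j], reverse=higher_is_closer)
        for rank, j in enumerate(others, start=1):
            ranks[i][j] = rank
    return ranks


# Hypothetical matrices for three items (illustrative values only).
cs_matrix = [[1.0, 0.9, 0.4], [0.9, 1.0, 0.5], [0.4, 0.5, 1.0]]
ed_matrix = [[0.0, 0.1, 0.4], [0.1, 0.0, 0.3], [0.4, 0.3, 0.0]]

# Method 1: correlation of flattened matrices (negative here because one
# matrix holds similarities and the other holds distances).
r = pearson(flatten_upper(cs_matrix), flatten_upper(ed_matrix))

# Method 2: ranking matrices, which are directly comparable across scales.
cs_ranks = ranking_matrix(cs_matrix, higher_is_closer=True)
ed_ranks = ranking_matrix(ed_matrix, higher_is_closer=False)
print(r, cs_ranks, ed_ranks)
```

Ranking removes the dependence on each measure's scale and direction, which is one reason a ranking-based comparison is convenient when distances and similarities are mixed.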
Based on the ranking matrices, the agreement between the matrices of the three data similarity measures is computed.
Table 9 shows the mutual agreement between every pair of measures in their measuring of the similarity between a particular PEO and the other PEOs. It also shows the average agreement (last row) between every pair of similarity measures across all PEOs. Notably, CS shows full agreement with ED for the LL PEO and with MD for the L and SC PEOs. Interestingly, the CS measure shows higher agreement with ED and MD than ED and MD show with each other.
In addition, the agreement across the three measures is computed (last column) for each PEO and across all PEOs (which is 0.49). The highest agreement between the three measures is in measuring the similarity between L and other PEOs, while the lowest agreement is in measuring the similarity between T and other PEOs.
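A table of this kind can be derived from the ranking matrices as sketched below. The agreement score used here, the fraction of positions at which the compared measures assign the same rank, is an assumption made for illustration; the exact agreement statistic used in this work may differ. The rank lists are hypothetical.

```python
from itertools import combinations


def agreement(rank_rows):
    """Fraction of positions at which all given rank rows coincide."""
    n = len(rank_rows[0])
    matches = sum(1 for ranks in zip(*rank_rows) if len(set(ranks)) == 1)
    return matches / n


def agreement_table(rankings):
    """Per-item pairwise agreement plus agreement across all measures.

    `rankings` maps measure name -> {item name -> rank list of other items}.
    """
    measures = list(rankings)
    items = list(rankings[measures[0]])
    table = {}
    for item in items:
        entry = {}
        for a, b in combinations(measures, 2):
            entry[(a, b)] = agreement([rankings[a][item], rankings[b][item]])
        entry["all"] = agreement([rankings[m][item] for m in measures])
        table[item] = entry
    return table


def column_average(table, key):
    """Average one column of the table across all items (the 'last row')."""
    return sum(entry[key] for entry in table.values()) / len(table)


# Hypothetical rankings of four other items, for two PEOs and three measures.
rankings = {
    "ED": {"L": [1, 2, 3, 4], "T": [1, 2, 3, 4]},
    "MD": {"L": [1, 2, 3, 4], "T": [2, 1, 4, 3]},
    "CS": {"L": [1, 2, 3, 4], "T": [1, 2, 4, 3]},
}

table = agreement_table(rankings)
overall = column_average(table, "all")
print(table, overall)
```

With this definition, a score of 1 corresponds to the "full agreement" case, in which two measures rank all other PEOs identically.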
Despite the differences between the similarity measures in their evaluation of the similarity among PEOs, an overall similarity matrix, which gives an overall estimation of the similarity among PEOs based on the similarity matrices of the three measures, can be computed. However, the values of the ED and MD similarity matrices need to be normalized first so that they fall in the range between 0 and 1. This can be done using the following formula:
Then the overall similarity matrix can be computed as the average over the three matrices, as shown in
Table 10.
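The normalize-then-average step can be illustrated with the sketch below. Two assumptions are made: the distances are rescaled with min-max normalization, and each normalized distance d is converted to a similarity as 1 - d so that all three matrices point in the same direction before averaging. Neither choice is confirmed as the formula used in this work, and the matrices are hypothetical.

```python
def min_max(matrix):
    """Rescale all entries of a matrix linearly into the range [0, 1]."""
    values = [x for row in matrix for x in row]
    low, high = min(values), max(values)
    if high == low:
        raise ValueError("cannot normalize a constant matrix")
    return [[(x - low) / (high - low) for x in row] for row in matrix]


def distance_to_similarity(matrix):
    """Assumed conversion: similarity = 1 - normalized distance."""
    return [[1.0 - x for x in row] for row in min_max(matrix)]


def average_matrices(matrices):
    """Element-wise mean of equally sized square matrices."""
    n = len(matrices[0])
    k = len(matrices)
    return [[sum(m[i][j] for m in matrices) / k for j in range(n)]
            for i in range(n)]


# Hypothetical matrices for three PEOs (illustrative values only).
ed_matrix = [[0.0, 0.1, 0.4], [0.1, 0.0, 0.3], [0.4, 0.3, 0.0]]
md_matrix = [[0.0, 0.3, 1.2], [0.3, 0.0, 0.9], [1.2, 0.9, 0.0]]
cs_matrix = [[1.0, 0.9, 0.4], [0.9, 1.0, 0.5], [0.4, 0.5, 1.0]]

overall = average_matrices([
    distance_to_similarity(ed_matrix),
    distance_to_similarity(md_matrix),
    cs_matrix,
])
print(overall)
```

Because min-max scaling depends on the extreme values in each matrix, a single outlying pair can compress all other normalized distances; this is a known limitation of the assumed scheme.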
As illustrated in
Table 10, the similarity is high for the pairs of PEOs (CS, GS), (KC, TC) and (TC, CS), while it is low for the pairs (EC, TC), (EC, GS) and (L, TC); here CS denotes the career success PEO rather than the cosine similarity measure. This is expected, as the PEOs in each pair of the first group depend mostly on the same set of PLOs, the hard skills PLOs, while the PEOs in each pair of the second group depend on different sets of PLOs. Additionally, the spectral visualization of the similarity of each PEO with the other PEOs is shown in
Table 11. A closer look at the spectral representation of the PEOs’ similarities discloses several interesting aspects of the correlations among PEOs. First, the similarity between the spectral representations of CS, TC, KC, and GS suggests that they form a cluster. This can be interpreted by looking at the PEOs–PLOs mapping, from which it can be observed that these PEOs are highly correlated with a set of PLOs (a, b, c, e, k) known as the hard skills PLOs. By the same token, the PEOs C, L, T, EC, P and SC show a similarity in their spectral representations. Again, this suggests that they form another cluster, which can be interpreted by looking at the PEOs–PLOs mapping, in which these PEOs are highly correlated with a set of PLOs (d, f, g, h, i, j) known as the soft or professional skills PLOs. An interesting observation about the two clusters is that the correlations among the PEOs of the first cluster are higher than those among the PEOs of the second cluster.
Finally, with regard to the LL PEO, the spectral representation shows that it is mostly correlated with the PEOs of the first cluster; however, its highest similarity is, notably, with the P PEO from the second cluster. This suggests that LL is a common PEO that is related to both clusters.
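The clusters above were identified by inspecting the spectral representations. As a complementary illustration only, the sketch below groups items automatically by linking every pair whose overall similarity reaches a threshold and taking the connected components. This thresholding technique, the threshold value and the similarity values are all assumptions and are not part of the analysis reported here.

```python
def threshold_clusters(labels, sim, threshold):
    """Group labels into connected components of the thresholded graph.

    Two items are linked when their similarity is >= threshold; a cluster
    is a set of items connected through one or more such links.
    """
    unvisited = set(range(len(labels)))
    clusters = []
    while unvisited:
        start = min(unvisited)
        unvisited.remove(start)
        component, stack = [start], [start]
        while stack:
            i = stack.pop()
            linked = [j for j in unvisited if sim[i][j] >= threshold]
            for j in linked:
                unvisited.remove(j)
                component.append(j)
                stack.append(j)
        clusters.append(sorted(labels[k] for k in component))
    return clusters


# Hypothetical overall similarity matrix for four PEOs.
labels = ["KC", "TC", "C", "T"]
sim = [
    [1.0, 0.9, 0.3, 0.3],
    [0.9, 1.0, 0.3, 0.3],
    [0.3, 0.3, 1.0, 0.8],
    [0.3, 0.3, 0.8, 1.0],
]

clusters = threshold_clusters(labels, sim, threshold=0.7)
print(clusters)
```

A PEO such as LL, which is similar to members of both groups, would merge the two components under this scheme if the threshold were set low enough, which is consistent with its reading as a common PEO.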
The spectral representation of the PEOs’ similarities can give an overall view on the correlations among them. As shown in
Table 11, some of these correlations are intuitive and self-explanatory, such as TC and CS, because technical competency is essential for career success, and KC and TC, because knowledge and technical competencies have a reciprocal influence on each other. T and CS are also correlated because teaming skills have become essential for career success, and T and C are also highly correlated because communication skills are essential for teamwork. LL and P are highly correlated because lifelong learning is indispensable to increasing professionalism. However, some correlations among PEOs require further investigation because the connection between them is not intuitively obvious, such as CS and GS.
From a practical perspective, the correlations among PEOs drawn above provide actionable insights for the systemization and optimization of various processes in academic programs, such as design, development, assessment and accreditation. The design of an academic program is a top-down process that involves drafting its PEOs, matching them to a predefined set of PLOs, and then designing the program curriculum accordingly. Given this top-down view of the academic program, the insights on the correlations among PEOs, located at the top level of the program, are useful for informing their matching to PLOs and the design of the program curriculum at the bottom level. For example, the high correlation among the KC, TC, GS and CS PEOs suggests grouping together in the curriculum the courses that focus on their related skills, and designing correlated course-level teaching and assessment activities accordingly. Another potential benefit of these correlations during the design stage is keeping the number of PEOs small when drafting them. This recommended practice can be supported by the drawn insights: highly correlated PEOs can be drafted as one objective, thus reducing the number of PEOs.
Another important academic program process in which the drawn insights on the correlations among PEOs could be used is the assessment of PEOs. It is an essential process for maintaining the quality of an academic program and obtaining academic accreditation. During the development of the PEOs assessment plan, the insights on the correlations among PEOs are a useful tool for optimizing the plan in terms of the time and effort required to implement it. For example, instead of assessing all PEOs, it is possible to obtain an approximate estimation of some PEOs based on the assessment results of others. This is particularly useful when the assessment of some PEOs is hindered by cost, data availability and so forth.
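One simple way to illustrate such an approximate estimation is a similarity-weighted average of the assessed PEOs, sketched below. This estimator is a hypothetical illustration rather than a validated procedure: it assumes that similarity of PLO mappings carries over to attainment levels, which would need to be verified empirically. All names and values are made up.

```python
def estimate_attainment(target, assessed, labels, sim):
    """Estimate the attainment of `target` from assessed PEOs.

    `assessed` maps PEO name -> measured attainment in [0, 1]. Each assessed
    PEO is weighted by its similarity to the target PEO (an assumption).
    """
    index = {name: k for k, name in enumerate(labels)}
    t = index[target]
    weights = {name: sim[t][index[name]] for name in assessed}
    total = sum(weights.values())
    if total == 0:
        raise ValueError("target has no similarity to any assessed PEO")
    weighted = sum(weights[name] * score for name, score in assessed.items())
    return weighted / total


# Hypothetical overall similarities among three PEOs.
labels = ["LL", "P", "T"]
sim = [
    [1.0, 0.8, 0.2],
    [0.8, 1.0, 0.5],
    [0.2, 0.5, 1.0],
]

# Hypothetical assessment results for P and T; LL is not assessed directly.
assessed = {"P": 0.9, "T": 0.5}
ll_estimate = estimate_attainment("LL", assessed, labels, sim)
print(ll_estimate)
```

In this example the estimate for LL is pulled toward the result for P, because P is the assessed PEO most similar to LL.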
The development process of academic programs involves reviewing the existing PEOs and introducing changes, such as adding new PEOs or modifying the existing ones. In this process, the insights on the correlations among PEOs can be used to inform the developmental decisions. For example, if the decision is to add a new PEO to the existing ones, the correlations can be used to predict the achievement of the newly added PEO based on its correlations with the existing ones. In this manner, the development process of the academic program can be systematized and optimized.
The accreditation of academic programs is another process that can be optimized and systematized using the insights on the correlations among PEOs. In this process, the main task of program evaluators is to evaluate the consistency and adequacy of the program’s activities for achieving its PEOs and the validity of the assessment process as a whole. With these insights in mind, program evaluators can make a more informed assessment of a program’s quality. For example, the strong correlation between the LL and P PEOs allows program evaluators to make assumptions about the degree of consistency and organization between their supporting curricular or extracurricular activities. Furthermore, these insights allow program evaluators to make assumptions about the levels of PEO achievement and ultimately judge the quality of the program.
Another process that could benefit from the obtained insights on the correlations among PEOs is the comparison between programs to understand the landscape of education in a particular discipline, such as engineering. Certainly, knowing the similarities among PEOs assists in evaluating the similarities/differences between programs at this level.
Finally, from a software development perspective, the obtained insights on the correlations among PEOs can inform the process of developing computer-based systems that could contribute to the development of computer-assisted academic program designs or accreditation.
6. Conclusions
In this research, the correlations among the PEOs of OBE academic programs were discovered using a learning analytics-based approach. Three data similarity measures, namely ED, MD and CS, were applied to discover the correlations among a set of 11 PEOs extracted and preprocessed from 215 engineering academic programs. The obtained results provide views of the correlations among PEOs from three different perspectives. Although the three measures are different, the analysis of the agreement among them shows a remarkable consistency in their evaluation of the correlations among PEOs. Finally, the average PEOs similarity matrix was computed, after normalizing the ED and MD values to fall in the range between 0 and 1. From the average similarity matrix, the spectral similarity vectors of PEOs were drawn, from which two clusters of PEOs were identified. The first cluster involves the PEOs that were highly mapped to hard skills PLOs, while the second cluster involves the PEOs that were highly mapped to soft skills PLOs. The analysis also identified several interesting correlations among PEOs that can be interpreted intuitively, and several others whose underlying causes need further investigation. In addition to the practical benefits of the presented approach for applications that depend on computing similarity, such as recommendation and visualization systems, the discovered insights are useful knowledge for academicians and decision-makers to better understand, design and assess their programs.
Finally, this work can be extended in several directions. First, since this research focuses on the engineering discipline, it can be replicated for other disciplines, such as science, computing and art. Secondly, the outcomes of this research, which are based on quantitative analysis, pave the way for further investigation of the causal or prerequisite relationships among PEOs. Thirdly, based on this research, a correlation analysis between the academic programs of a given discipline, such as engineering, across different regions or countries (and/or other properties) can be conducted.