1. Introduction
New competencies will be required in the years to come, and a way for future citizens to be well prepared for this is by “learning to know”. According to Jacques Delors, “learning to know” means “learning to learn” to benefit from the opportunities that education provides throughout life [
1].
In Greece, vocational education and training (VET) graduates gain a specialized degree in their field. Moreover, it has been observed that a large percentage of them seek to join the production workforce as specialized personnel. During the period 2015–2019, a highly targeted effort was made in Greece to support VET. The aim of this effort was effective institutional, educational, and pedagogical reforms in various fields [
2,
3].
In order to prevent and tackle early school leaving (ESL) in society, it is fundamental to detect the main factors causing it. The literature proposes a wide set of aspects, ranging from school-based explanations (school segregation) to individual, family, and socio-economic characteristics [
4,
5]. ESL is mainly observed among students of vocational education, where many of them are above 18 years old, students who face financial problems in their families and are forced to work in parallel with their studies, or students who come from a foreign country. The percentage of the latter can be expected to increase by 45 in the future due to the refugee crisis [
6].
ESL is one of the core issues on the European educational agenda. It is a fundamental concern, as it not only limits the life opportunities of early school leavers but also calls into question the foundations of the European educational systems as guarantors of equal opportunities and of the right to education.
In fact, far from being a phenomenon that affects all social groups equally, ESL is particularly significant among the most economically, socially, and culturally disadvantaged groups and, therefore, must be viewed as a central question in terms of educational equality and social justice [
7]. The UNESCO World Inequality Database on Education (WIDE) provides evidence of the severity of ESL: 258 million children and youths were out of school during the school period 2017–2018 [
8].
The strategic framework for European cooperation in education and training (known as ET 2020) adopted a benchmark to be achieved by 2020, namely, that the percentage of early leavers from education and training in the EU should be no more than 10%. With 9.9%, this target was only just met in 2020 [
9].
In Bulgaria, Denmark, Estonia, Greece, Spain, Croatia, Cyprus, Lithuania, Hungary, the Netherlands, Poland, Romania, and Finland, the highest proportion of early leavers was reported in rural areas. This was also the case in Iceland, Norway, and North Macedonia [
9].
In Greece, the ESL phenomenon was even more intense during the 2008–2009 economic crisis and the recession that followed [10], strongly indicating a significant relationship between socioeconomic circumstances, ESL, and absenteeism.
According to a report published by the Institute of Education Policy [
11], across the country of Greece, student outflows were lower in urban areas followed by semi-urban and rural areas but with a small difference. There were no significant differences in relation to educational levels, except for vocational education, in which the dropout rate was higher in urban areas and lower in semi-urban areas. Although the percentage of early leavers was 1.92% for upper secondary general education, it climbed to 11.02% for upper secondary VET [
11].
Our study claims that a learning analytics (LA) system could fulfill the requirements for informing all parties concerned (students, instructors, parents, school administration, etc.) in a timely manner. Although higher education has already taken advantage of the benefits of LA systems [12,13,14,15,16], especially through online learning environments, the fact that non-information-technology experts (non-IT experts) must make sense of the output of these systems forms a barrier to their potential use in secondary education. As a result, making these systems easy to use is of great importance. VET remains overlooked in learning analytics research [17].
The lack of guidelines on what to consider when implementing LA prevents its full adoption; hence, there are important issues to be considered when approaching the design of the LA experience in terms of data preparation [
18]. Undoubtedly, data preparation is a fundamental stage of data analysis in predictive learning analytics (PLA) for student dropout [
19].
Developing and using an early warning system (EWS) is a good solution for detecting students at high risk of dropout as early as possible. An EWS is any system designed to alert decision makers of potential dangers. Its purpose is to allow for prevention of the problem before it becomes an actual danger [
20,
21]. In this line, it is important to realize that identifying students at risk of dropping out by using an EWS is only the first step in truly addressing the issue of school dropout. The next step is to identify the specific needs and problems of each individual student who is in danger of dropping out and then to implement programs to provide effective and appropriate dropout prevention strategies [
21].
This study aimed to present the results of a methodology based on a prototype application that was developed to manage and visualize students’ data in support of early decision making on issues such as ESL. Through this application, the factors that are related to absenteeism and ESL are highlighted. In more detail, the application collects and analyzes students’ data concerning their school presence. Afterwards, it creates visuals that can be easily read by non-experts, who can then be informed regarding students’ behavior, spot students in danger of dropping out, and generally come to conclusions that can be used to reduce absenteeism.
2. Related Work
A significant observation is that in Greece, vulnerable social groups, such as immigrants or other ethnic minorities, present a five times higher risk of dropping out of secondary education than the general population [
22]. Other factors driving this phenomenon include cognitive ability, family composition, socioeconomic situation, or school location [
23]. Interventions should not and cannot always be aimed exclusively at a person or his/her individual family. Instead, school policies can prove effective. The same factors were acknowledged by teachers in a relevant study [
24]. Additionally, several behavioral problems, such as depression, anxiety, or attention deficit hyperactivity disorder (ADHD) syndrome, increase the risk of dropping out of school at an early stage [
25]. Health and family problems are also important reasons for absenteeism; in India, a biometric system in combination with home visits was proposed for the decrease of chronic absenteeism and, thus, of the dropout rate [
26]. Socioeconomic factors were identified as a major predictor of expected ESL in the research in [
27].
Another more recent study [
28] makes a distinction between risk factors for absenteeism and ESL. The authors mention that the factors with large effects on absenteeism include a negative attitude towards school, substance abuse, externalizing and internalizing problems of the juvenile, as well as low parent–school involvement. According to them, as far as dropout is concerned, the risks that indicated a large effect include a history of grade retention, having a low IQ, or experiencing learning difficulties and low academic achievement.
An important distinction is whether a student is absent due to an insuperable barrier (illness, serious family reasons, etc.) or by choice. Truancy is a common reason for systematic absenteeism that can lead to significant loss of knowledge, poor grades, and even school dropout. A truant has a 3.4 times higher risk of leaving school than his/her peers [
29]. Moreover, peer group effects are emerging. Students’ social behavior relates to the tendency of missing classes for no important reason. Bembich [
30] highlighted the dysfunctional relational structures that raise the risk within groups of students. Therefore, having the opportunity to detect patterns in students’ interactions when the number of their absences is becoming high for no obvious reason becomes of crucial importance.
A different approach to this problem is to investigate the level of school leavers. In the study in [
31], students were separated into four different groups (i.e., those without any qualification, those with low qualifications, apprentices, and those with full qualification). It was shown that these four groups revealed clear differences in the effects of different factors on the risk of ESL. Level-related factors were also recognized in the research in [
32]. Higher levels of absenteeism appeared to be more closely related to lower achievement orientation, active-recreational orientation, cohesion, and expressiveness.
The common denominator in the research concerning absenteeism and ESL is the recognition of the importance of the early detection of students at risk. As Lyche [
33] well stated: “Early identification enables broader, less costly measures to be set up earlier and leaves the more costly one-on-one measures for later stages of education to the remaining at-risk students that have not yet been picked up”. The idea of taking action in the shortest possible time is central in the European Union policy as well. That is, in order to cultivate knowledgeable and successful teachers and schools, by emphasizing and empowering the elimination of social and educational exclusions, it needs to be an integral part of the pursuit of progress and development of the society and the economy [
34]. Thus, predictive and prescriptive analytics should be established at every educational level to provide the necessary data for decision making. By leveraging school data, educators can act rapidly to prevent ESL. Early warning systems have been used in six European countries (i.e., Poland, Lithuania, Germany, Sweden, Ireland, and the United Kingdom) with positive results [
35].
In higher education, efforts are usually aimed at creating predictive systems that will notify tutors for on-time interventions concerning students at risk, based on the prediction of their academic achievements. Machine learning techniques provide an accurate prediction of dropping out, allowing educators prompt action to prevent students’ failure. Lykourentzou et al. [
36] compared three different techniques achieving high accuracy of prediction in e-learning course students. Márquez-Vera et al. [
21] identified significant predictive factors for students at risk, including low academic performance, substance use (alcohol and smoking), hours of work, maternal education, and the number of students in each class, using machine learning techniques. A recent study [
37] provided results of high accuracy to a time series prediction problem by proposing a model with an accuracy that can reach up to 84% in the final weeks of study, provided that the course demands students’ involvement in online learning activities. Forum participation was found to be an indicator of higher grades, thus leading to a lower dropping-out risk. Clustering methods were also used to provide evidence of cheating in written assignments, resulting in poor final performance despite high grades during the semester [
14]. Finally, Gontzis et al. [
38] presented a predictive analytic tool based on participation-related attributes that can early predict students that are in danger of dropping out.
Delis et al. [
39] presented the methodology and technological aspects of the application that is used in the Greek educational community to organize and manage students’ data.
3. Materials and Methods
This section describes the methodology followed in order to create a multiple visualization environment that is instantly updated with students’ absence status and can easily be accessed by educators. In more detail, we first present an overview of the Greek educational system, focusing on vocational schools and the geographical area of this study. Subsequently, the research questions, the methodology, and finally the tools and the implementation design are explained.
3.1. Setting and Dataset Description
Education in Greece is compulsory for all children between the ages of six and fifteen. K12 education is divided into primary and secondary education. Primary education includes the K1 to K6 grades. Secondary education is divided into lower secondary (i.e., K7, K8, and K9 grades), which is compulsory, and upper secondary (i.e., K10, K11, and K12 grades), which is optional. During upper secondary education, students can either attend a general high school and be prepared to enter higher education or attend a vocational high school and be qualified with specific hard skills by receiving vocational training.
This study focused on the vocational high school of Prosotsani in the rural area of the municipality of Drama that belongs to the administrative region of Eastern Macedonia and Thrace in Northern Greece. Drama is a nonprivileged area with limited career opportunities for young people. Eastern Macedonia and Thrace is the administrative region in Greece with the second-lowest gross domestic product (GDP) per capita and is one of the poorest regions in the EU [
40]. The difficult financial conditions along with a significantly low birth rate are affecting the quality of life negatively and add even more importance to the role of education and training. Especially for adolescents, it was found that socioeconomic status inequalities influence their literacy abilities, leading to poorer educational attainment and working memory task results [
41], proving the augmented need for quality public education in low privileged areas.
Teachers and educational stakeholders in the reference school have empirically pointed out the significance of the absenteeism problem. Therefore, since 2012, the vocational high school’s director has been using an early warning system to track the exact number of students’ absences in order to inform their legal guardians. It must be noted that this warning system was an initiative undertaken by the director, as there was no general strategy for using an early warning system in Greek education. However, tracking the number of absences is not sufficient to solve the ESL problem. Visualizing results in an interpretable way is also necessary.
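The tracking step described above can be illustrated with a minimal Python sketch. It is not the school's actual system: the absence limit, the warning level, and the record layout are hypothetical values chosen only to show how per-student totals could trigger a notification to guardians.

```python
from collections import Counter

# Hypothetical thresholds; the actual limits used by the school are not stated.
ABSENCE_LIMIT = 50   # hours after which a student would have to repeat the class
WARNING_LEVEL = 40   # hours at which guardians would be notified


def students_to_notify(absence_log, warning_level=WARNING_LEVEL):
    """Return {student_id: total_hours} for students at or above the warning level.

    absence_log is an iterable of (student_id, hours_missed) records.
    """
    totals = Counter()
    for student_id, hours in absence_log:
        totals[student_id] += hours
    return {sid: total for sid, total in totals.items() if total >= warning_level}


# Illustrative records only (not data from the study).
log = [("s01", 20), ("s02", 5), ("s01", 25), ("s03", 40)]
flagged = students_to_notify(log)
```

A count-based rule of this kind only signals who is at risk; as the text notes, interpreting the pattern behind the absences still requires visualization.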
Data over three successive years were retrieved from the vocational high school of Prosotsani. These data came from the database of the early warning system and included the absences of students along with the characteristics that affect absenteeism. This area was chosen because it reflects the challenges of attending a nonprivileged, public, provincial school with the background of the Greek economic crisis. Students from 13 small towns and villages in the wider area of Prosotsani attended courses in the reference school (
Table 1). There were a total of 258 students in the high school over the three successive years. The majority of students were boys (64%). Students from the wider area during the school period used public transportation to attend their courses.
The retrieved data also contained information concerning age, gender, personal information (address, phone number, etc.), and a detailed record of absences on an hourly basis.
3.2. Research Questions
Based on the situation at the vocational high school of Prosotsani presented above, observed since 2012 over three successive years, and on the need to visualize the data from this period, the following research questions were addressed:
- RQ1
Can we identify those factors that affect the absenteeism phenomenon so as to create a cube of important data for real-time visualization?
- RQ2
Which types of visualization can provide a clear view of students’ behavior in a way that even nonexperts in the field of learning analytics can easily draw conclusions for decision making?
- RQ3
Does the strict policy of frequent parental briefing, based on the available visualization of data followed by this school unit, discourage students from enrolling, resulting in students’ population decrease?
- RQ4
Are there any patterns in students’ absences concerning the day, the hour, or the class?
- RQ5
Are there any correlations between students’ absences and the other variables of the data warehouse?
3.3. System Implementation Design
Our team designed and developed a new educational application with the main goal of reducing ESL and school and university dropouts (
Figure 1). It is a Windows desktop application and can be described as an Extract-Transform-Load (ETL) tool.
It receives data from other educational applications, such as MySchool, Moodle, Leak, and Classter. In more detail, MySchool is an online application designed by the Greek Ministry of Education and used by all Greek secondary schools, whereas Moodle is a widely used web application for asynchronous education. Leakage is a Windows desktop application designed and developed by our team that offers telematics services to reduce ESL and educational dropout. Finally, Classter is a web application by the company Vertitech that aims to manage the educational process in the life cycle of an organization such as primary, high school tuition, and general or vocational high school.
Our application was programmed in the C# programming language in Microsoft Visual Studio 2017. It utilizes the Microsoft .NET Framework 3.5 libraries so that it can work in all versions of Microsoft Windows. The application automatically sends training data to the warehouse database from Excel, Access, and SQL Server sources. It also sends information to parents via SMS, such as information about their children’s attendance. An essential requirement of the application is that the data it sends to the cloud warehouse database must be anonymous, not only because it must comply with the community privacy requirement but also because personal data do not help us make educational decisions.
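The text does not say how anonymity is achieved before upload. As one possible illustration, the following Python sketch drops personal fields and replaces the student identifier with a keyed hash. The field names and the secret key are hypothetical, and keyed hashing is strictly pseudonymization, which is weaker than full anonymization.

```python
import hashlib
import hmac

# Hypothetical secret that would stay on the school's machine and never be uploaded.
SECRET_KEY = b"replace-with-a-locally-stored-secret"

# Assumed names of the personal fields; the real column names are not given.
PERSONAL_FIELDS = {"name", "address", "phone"}


def pseudonymize(record, key=SECRET_KEY):
    """Drop personal fields and replace student_id with a keyed SHA-256 token."""
    token = hmac.new(key, str(record["student_id"]).encode("utf-8"),
                     hashlib.sha256).hexdigest()
    cleaned = {k: v for k, v in record.items()
               if k not in PERSONAL_FIELDS and k != "student_id"}
    cleaned["student_key"] = token[:16]  # shortened token, enough to join tables
    return cleaned


# Illustrative record only.
row = {"student_id": 1042, "name": "A. Student", "phone": "000",
       "address": "x", "region": "reg.04", "absent_hours": 3}
safe_row = pseudonymize(row)
```

Because the same student always maps to the same token, attendance rows can still be linked across days without exposing who the student is.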
Our application transfers data to the cloud warehouse database, which was developed in SQL Server 2017. The warehouse database is designed with a fact table and dimension tables in a snowflake schema. Each row of the fact table holds the attendance of one student per day, and the dimension tables hold most of the characteristics that affect the student's attendance, such as environment, schoolteacher, family, etc.
From the cloud warehouse database, we isolated the data that we needed to make educational decisions. The technique we used was that of business intelligence, which companies with large amounts of data have used for many years to determine the behavior of their customers and their organization. Thus, with CUBE and roll-up techniques, we succeeded in isolating an educational phenomenon and managing it per period, per area, per age of students, etc.
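To make the CUBE and roll-up idea concrete, the following Python sketch computes subtotals of absences for every combination of the chosen dimensions, which is what a SQL `GROUP BY CUBE` produces. It is only an illustration: the dimension names and the sample rows are assumptions, not the study's data or queries.

```python
from collections import defaultdict
from itertools import combinations


def cube(rows, dimensions, measure):
    """Sum `measure` for every subset of `dimensions` (similar to GROUP BY CUBE)."""
    totals = defaultdict(int)
    for row in rows:
        for size in range(len(dimensions) + 1):
            for dims in combinations(dimensions, size):
                # The empty key () holds the grand total.
                key = tuple((d, row[d]) for d in dims)
                totals[key] += row[measure]
    return dict(totals)


# Illustrative rows only.
rows = [
    {"year": "2012-2013", "region": "reg.01", "absences": 4},
    {"year": "2012-2013", "region": "reg.04", "absences": 6},
    {"year": "2013-2014", "region": "reg.04", "absences": 3},
]
result = cube(rows, ["year", "region"], "absences")
```

Here `result[()]` is the grand total, `result[(("year", "2012-2013"),)]` is the roll-up over all regions for one school year, and the two-dimension keys hold the finest level of detail.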
With the data we isolated with CUBE techniques and with the capabilities of the R Tools for Visual Studio (RTVS), we could more easily visualize an educational phenomenon and attempt predictions.
The snowflake model is an extension of the star model in which each dimension table can extend outward into further tables that hold additional detail. The main purpose of these tables is to describe the fact table in more detail along some dimensions, which reduces the number of fact table properties and enhances the efficiency of queries [42].
As shown in
Figure 2, the design of the cloud warehouse database was in the form of a snowflake schema. The transaction fact table included the students’ absences per hour in the daily school schedule, the total students’ absences per school day, and the students’ absences due to illness. The dimension tables included project characteristics and characteristics that affected students’ attendance, such as weather, teachers, grades, and student characteristics (without including properties that reveal the student’s identity).
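A reduced version of such a schema can be sketched with Python's built-in `sqlite3` module. The table and column names below are hypothetical and far smaller than the warehouse in Figure 2; the point is only to show a fact table joined to dimension tables, with one dimension (student) referencing another (region), as in a snowflake layout.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_region  (region_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE dim_student (student_key TEXT PRIMARY KEY, grade INTEGER,
                          region_id INTEGER REFERENCES dim_region(region_id));
CREATE TABLE dim_date    (date_id INTEGER PRIMARY KEY, school_year TEXT, weekday TEXT);
CREATE TABLE fact_absence (
    student_key TEXT REFERENCES dim_student(student_key),
    date_id     INTEGER REFERENCES dim_date(date_id),
    school_hour INTEGER,  -- 1 to 7
    status      TEXT      -- e.g. 'present', 'absent', 'No classes'
);
""")

# Illustrative rows only (no real student data).
conn.execute("INSERT INTO dim_region VALUES (4, 'reg.04')")
conn.execute("INSERT INTO dim_student VALUES ('a1', 2, 4)")
conn.execute("INSERT INTO dim_date VALUES (1, '2012-2013', 'Wednesday')")
conn.executemany(
    "INSERT INTO fact_absence VALUES (?, ?, ?, ?)",
    [("a1", 1, 1, "absent"), ("a1", 1, 2, "present"), ("a1", 1, 3, "absent")],
)

# Absent hours per region and school year, reached through two dimension hops.
query = """
SELECT r.name, d.school_year, COUNT(*) AS absent_hours
FROM fact_absence f
JOIN dim_student s ON s.student_key = f.student_key
JOIN dim_region  r ON r.region_id   = s.region_id
JOIN dim_date    d ON d.date_id     = f.date_id
WHERE f.status = 'absent'
GROUP BY r.name, d.school_year
"""
rows = conn.execute(query).fetchall()
```

Keeping the region in its own table means the fact table stores only keys and measures, which is the reduction in fact table properties that the snowflake design aims for.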
In
Figure 3, there is a snapshot of the cloud warehouse database. The “Absences Fact Table” holds information regarding the presence of each student per school hour, namely, whether the student is present or absent at a specific hour. If the student is absent, it also shows whether he/she is excused with a parents’ note or with a doctor’s note. Additionally, it records when a student has missed the class due to an expulsion. In the rare case that a group of students have a free hour between courses, the corresponding field in the fact table takes the value “No classes”.
Enterprise data warehouses (EDW or simply, DW) are complex systems serving as a repository of an organization’s data. Apart from their role as enterprise data storage facilities, they include tools to manage and retrieve metadata, tools to integrate and cleanse data, and, finally, business intelligence tools for performing analytical operations. Conceptually, data warehouses are used for the timely translation of enterprise data into information useful for analytical purposes. In doing so, they have to manage the flow of data from operational systems to decision support environments.
The process of gathering, cleansing, transforming and loading data from various operational systems that perform day-to-day transaction processing (hereafter, sources or source data stores) is assigned to the ETL processes [
43]. However, when it comes to the environment of online analytical processing (OLAP), which is performed over simple but neatly organized cubes, these two tasks, along with profiling (i.e., data quality assurance), have already been completed, either by the organization’s ETL workflow or by do-it-yourself data wrangling. The rest of the high-level tasks are too few and too high for our purpose here [
44].
4. Results and Discussion
Simple and more complex graphs were created by our application (
Figure 4) to investigate the most important factors that affect students’ absenteeism. The choice of the graphs was driven by their ability to provide clear and interpretable relations of the data. The number of absences per student and in total is visualized per school hour (i.e., 1st to 7th hour), per day, per region, or per school year. Additionally, the use of maps can reveal problems related to distant or problematic regions. Possible relations between the measured variables can be examined through correlation analysis, which is accessible at a glance using correlation plots.
To begin with, certain simple graphs allow educators to draw simple conclusions regarding students’ data. The bar chart from the three-year period (
Figure 5) indicated that students from certain regions (i.e., Mikropoli, Petroussa, Volakas, Kali Vrysi, and Anthochori) tended to miss courses more frequently. These areas are far from the school, and students from these areas had many absences during the first year (2012–2013) of the three years in our research, but, subsequently, they were able to reduce them with the help of the early warning system (
Table 1). Moreover, it was obvious that during every school year (between 2012 and 2015) it was always the first hour of the daily schedule that was the most likely to be missed (
Figure 6). This is usually due to transport-related problems, bad weather, and poor sleeping habits.
Students in the second grade of the vocational high school tended to skip classes unexcused more than the other students (
Figure 7). This happened mostly in the middle of the week.
As shown in
Figure 8, the students who lived in region 4 were late for school more than their peers from other areas. Since reg.04 is not the most distant area, this might indicate problems in transport connection.
The pie chart in the left part of
Figure 9 provides information about the percentage of students’ enrollments during three successive school years. Additionally, in the right part of
Figure 9, the share of total absences for the same period is shown. That is, over the three years of our research (2012–2015), the students in the 1st year were 31.91%, in the 2nd year 31.14%, and in the 3rd year 36.6%. The percentage of absences decreased while the number of students during the same period either remained almost the same or increased. Vocational schools in Greece follow an inclusive policy, gathering many students from vulnerable groups. Practices that can be considered strict or demanding may discourage student admission by increasing the fear of failure. The eventual rise in the enrollment of students and the parallel drop in students’ absences indicate that our system was positively received rather than perceived as a restriction of freedom.
Radar charts (or spider charts) are mainly used to compare categorical attributes. Each axis represents an area, so it is easy to rank them by ascending order. This is important because it allows teachers to instantly draw conclusions regarding the progression of absenteeism compared with other schools, indicating performance and improvement. The three-fold diagram of
Figure 10 was created with Microsoft Radar Chart (ver.2.0.2.0, 2019, Microsoft Power BI, USA), and it indicated a change in the school’s rank depending on the type of absenteeism. The absences per student are compared to the frequency of skipping the first class and the absence during the last hour of school. Differences in the areas’ order between these three diagrams can be indicative of factors affecting absenteeism. The visualization is dynamic, allowing teachers to choose the school year and have a constantly updated view of the new incoming data.
To imprint the contribution of a variable in the overall flow, the flow diagram called Sankey was used. In particular, in
Figure 11 the contribution of each region presented in
Table 1 to the total number of absences during three successive school years is shown. Links are weighted by the number of total absences, so thicker lines indicate a greater contribution to the absenteeism phenomenon.
Figure 12 was compiled with our system and the ArcGIS application, which helps to visualize the data using interactive maps. The star schema storage allows us the choice of visualizing different schools or different school years. In
Figure 12, the total number of students’ absences in the area per student is shown. Different colors represent different years: the school year 2012–2013 is red, the year 2013–2014 is yellow, and the year 2014–2015 is blue. This visualization helps the teachers to identify absenteeism problems in students from a specific area and thus to relate them to relevant factors, such as transportation problems and eventually to determine if the problem was effectively managed over time.
Microsoft Power BI was used to create dynamic visualization with the same data format as in the previous graphs to develop the online map in
Figure 13. In this case, the user can select the school year and the area of interest on the left side of the map. The number at the left top of the image expresses the total absences of the school per student for the school year selected. The following graphs were developed using Microsoft Power BI’s Horizontal Bar Chart by (ver.1.5.4.0, 2019) and Map Bing. Drill Down Donut PRO by Zoom Charts(ver.1.5.19, 2019) and Chord by Microsoft (ver.2.0.4, 2017) were also used.
To investigate the dependence between multiple variables at the same time and to highlight the most correlated variables in our data table, we used the correlation plot. The Pearson correlation coefficient was used as the statistical measure, where a value of +1 (or −1) indicates a perfect correlation between two variables, with +1 indicating a positive correlation and −1 a negative (inverse) correlation; a value in the range from 0.6 to 1 (or from −0.6 to −1) indicates a strong correlation; a value between 0.4 and 0.6 (or between −0.4 and −0.6) indicates a moderate correlation; a value in the range from 0 to 0.4 (or from 0 to −0.4) indicates a weak correlation [
45]. The correlation coefficients are colored according to their value, which provides a simple and interpretable view. The students were divided into three groups according to their level of absenteeism. The first group contains students who had to repeat the class because they exceeded the limit of absences (N = 25). The second group contains students with high levels of absenteeism who successfully completed the class (N = 118), while the students in the third group had rarely been absent (N = 116). As expected, given the criterion used for dividing the students into groups, a one-way ANOVA revealed a statistically significant difference in the total number of absences between these three groups of students (F(2, 256) = 234.35, p = 0.00), where 2 and 256 are the between-groups and within-groups degrees of freedom. Apart from the correlation matrix for each group, the p-values are provided in Appendix A to evaluate the statistical significance of the results.
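As an illustration of the measure behind the correlation plots, the following Python sketch computes the Pearson coefficient from its definition and labels its strength with the ranges given above. The sample values are invented, and because the stated ranges share their endpoints, assigning 0.4 to "moderate" and 0.6 to "strong" is an assumption. In the results, the number in parentheses after r appears to be the degrees of freedom n − 2, which matches the group sizes (for example, r(23) for N = 25).

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (spread_x * spread_y)


def strength(r):
    """Label |r| using the ranges in the text (boundary handling is an assumption)."""
    size = abs(r)
    if size >= 0.6:
        return "strong"
    if size >= 0.4:
        return "moderate"
    return "weak"


# Invented example: distance to school (km) and total absences (hours).
distance = [1, 2, 3, 4, 5]
absences = [2, 4, 6, 8, 10]
r = pearson_r(distance, absences)
label = strength(r)
```

The coefficient alone does not establish significance, which is why each reported value is paired with a p-value.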
The correlation matrix in
Figure 14 shows that most of the parents did not contact the school to justify their children’s absences. There was a weak negative correlation between the total number of students’ absences and the number of absences that their parents justified. This result may seem contradictory. However, it could be indicative of a lack of interest on behalf of parents concerning their children’s schooling or an attitude of low expectations.
The number of absences due to expulsion was positively but not significantly correlated with the distance (
r(23) = 0.45,
p = 0.16) but weakly and negatively correlated with the population (
r(23) = 0.17,
p = 0.02). Additionally, there was a positive correlation between the number of expulsion absences and the total number of absences (
r(23) = 0.38,
p = 0.00), showing that the tendency to skip classes regularly often comes with behavioral difficulties. In the study by Sara et al. [
46], which was conducted in Denmark, four characteristics that influenced school dropout were highlighted: class size, school size, last month’s absences, and average income per postal code.
In the second group of students (
Figure 15), parents justified their children’s absences. There was a strong positive correlation between the number of absences justified by the parents and the total number of absences (
r(116) = 0.75,
p = 0.00). Students who lived in more privileged areas tended to skip more classes, as indicated by a very strong positive correlation between the market value of students’ residence and the total number of absences (
r(116) = 0.89,
p = 0.00). There was a moderate positive correlation between the total number of absences and the distance that students had to travel to get to school (
r(116) = 0.35,
p = 0.00).
In their study, Nascimento et al. [
47] used correlation tables to show that the level-age dispersion had the highest positive correlation with school dropout, while, on the other hand, the adequacy of teacher training had the highest negative correlation.
In the group of students who rarely skipped classes (
Figure 16), the total number of absences was also highly correlated with the market value of their residence (
r(114) = 0.063,
p = 0.00). Additionally, there was a strong positive correlation between the total number of absences and the number of absences that were justified by parents (
r(114) = 0.60,
p = 0.00). Finally, there was a moderate positive correlation between the number of absences and the distance from students’ residences to the school (
r(114) = 0.46,
p = 0.00).
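The correlations above are reported in the form r(df) = value, p = value, where the degrees of freedom equal the number of students in the group minus two. As an illustration only, the following minimal Python sketch shows how such a figure could be computed for one pair of variables. It is not the authors' implementation: the function names, the sample data, and the use of a permutation test for the p-value are all assumptions made for this example, chosen so that the sketch needs only the standard library.

```python
import math
import random

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need two sequences of equal length >= 3")
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)

def permutation_p_value(x, y, n_perm=10_000, seed=0):
    """Two-sided p-value: share of shuffles whose |r| reaches the observed |r|."""
    rng = random.Random(seed)
    observed = abs(pearson_r(x, y))
    y_shuffled = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(y_shuffled)
        # Small tolerance guards against floating-point ties.
        if abs(pearson_r(x, y_shuffled)) >= observed - 1e-12:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (n_perm + 1)

def report(x, y):
    """Format a correlation as 'r(df) = ..., p = ...' with df = n - 2."""
    r = pearson_r(x, y)
    df = len(x) - 2
    p = permutation_p_value(x, y)
    p_text = "p < 0.01" if p < 0.01 else f"p = {p:.2f}"
    return f"r({df}) = {r:.2f}, {p_text}"

# Hypothetical data for eight students (not taken from the study).
distance_km = [1.0, 2.5, 3.0, 4.5, 6.0, 7.5, 9.0, 12.0]
total_absences = [10, 14, 9, 20, 18, 25, 22, 30]
print(report(distance_km, total_absences))
```

In this sketch, a p-value below 0.01 is printed as "p < 0.01" rather than as a rounded value, which is one common reporting convention. A parametric test based on the t distribution would be an equally reasonable choice for the p-value.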
5. Conclusions
In this paper, we presented a two-fold effort to diminish absenteeism that was implemented in the vocational high school of Prosotsani. The proposed system, combined with the early warning system that the school already uses, provided educational stakeholders with simple and complex graphs that can be easily interpreted. The results showed a significant impact in gradually reducing the number of students’ absences during three successive school years. At the same time, there was a rise in students’ enrolment, indicating that this policy was well received by the local community.
Concerning RQ1, it was shown that factors such as the hour, the region, or the day could provide tutors with useful knowledge about their students, so that they can control the absenteeism phenomenon. As shown in the previous section, there were significant correlations between students’ absences and factors such as distance and the market value of the properties where they live. This information should be provided in simple graphs, such as bar and pie charts, to be straightforward and easy to read. Radar charts and maps containing rich information can also be used by nonexperts for sense-making, such as finding out in which area absenteeism has decreased over the years. In addition, elaborate graphs, such as Sankey diagrams and correlation plots, can be used to present relationships between different factors (RQ2). Additionally, simple comparisons, such as those in
Figure 9, answer RQ3, showing that the strict policy of frequent parental briefing did not discourage students from enrolling in the two following years, even when they were aware of this policy. Figures 5, 6, and 7 revealed information concerning patterns connected with the hour, the day, and the class that students attend (RQ4). There, the educators can spot which day has the most absences or whether students from a certain region tend to skip classes more often. Those patterns may vary from time to time. Thus, the proposed system’s ability to load data daily and provide updated information to educators is one of its main benefits. This actionable knowledge can be used to introduce new policies when needed. For example,
Figure 7 can lead tutors to investigate what led the students of the second class to skip classes more often than their peers and to make an appropriate intervention to reduce these absences. Finally, the correlation plots showed several statistically significant correlations, especially for students with a high number of absences (i.e., students at a high risk of dropping out) and the students who eventually had to repeat their class (RQ5).
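The kind of pattern spotting described above (which day, hour, or region accumulates the most absences) amounts to a simple frequency count over the absence records. The following Python sketch illustrates the idea under stated assumptions: the record layout, the field names, and the helper functions are hypothetical and do not reflect the data model of the system presented in this paper.

```python
from collections import Counter

def absences_by(records, key):
    """Count absence records per value of `key` (e.g. 'day', 'hour', 'region')."""
    return Counter(rec[key] for rec in records)

def peak(records, key):
    """Return (value, count) for the most frequent value of `key`, or None if empty."""
    counts = absences_by(records, key)
    return counts.most_common(1)[0] if counts else None

# Hypothetical absence records; each entry is one missed school hour.
records = [
    {"student": "s1", "day": "Mon", "hour": 1, "region": "A"},
    {"student": "s2", "day": "Mon", "hour": 7, "region": "B"},
    {"student": "s1", "day": "Fri", "hour": 7, "region": "A"},
    {"student": "s3", "day": "Mon", "hour": 7, "region": "A"},
]

for field in ("day", "hour", "region"):
    print(field, peak(records, field))
```

Re-running such a count each time new data are loaded would keep the summaries current, which is the property the paragraph above highlights.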
On a deeper level, the constantly updated visualization of students’ absences combined with related variables (school year, hour in the daily school schedule, and region) provides teachers with the opportunity to spot and probe possible reasons for systematic absences and to identify students at risk, while recognizing the social aspects of the problem. This way, valuable time can be gained, allowing teachers to act and ask for the intervention of specialists (psychologists or social workers) when needed. The potential long-term positive contribution of our system to preventing absenteeism and ESL could be a robust argument for its adoption in the common policy of every high school towards the elimination of these phenomena.