1. Introduction
In Colombia, regional inequalities and unequal social opportunities permeate people’s economic, political, and social participation. Colombia is divided politically and administratively into thirty-three regions. For instance, in 2019, three areas accounted for half of the country’s economic growth, and six areas accounted for two-thirds of it [1]. Furthermore, in the same year, according to PNUD [2], the incidence of multidimensional poverty was 12.3% of the population in urban areas and 34.5% in rural areas. There have been achievements, but the great challenge of closing the regional gaps persists.
As a result, enrollment and dropout rates reproduce these gaps in both initial and higher education. Individuals choose between programs with different fees and quality. Additionally, individuals’ initial endowments, socioeconomic background, barriers to financial aid, and academic and personal skills constrain their decisions about their studies. In sum, these aspects determine access, the type of program chosen, persistence, and graduation.
According to the OECD [3], only 9% of students from the poorest families have higher education, compared with 53% of those from advantaged families. These gaps begin in childhood: 61% of children from higher-income families enroll in private schools, in contrast to 2% from lower-income families. Additionally, gross enrollment rates vary between regions: around 20% in rural areas, 60% in most urban areas, and the highest rates in the most developed regions. There are few opportunities in rural areas; for example, the average number of years of school attendance is 5.5, compared to 9.4 in urban areas.
Although enrollment has increased worldwide over the past few decades and, with supply diversification, students can choose between enrolling in public or private institutions [4], one of the most significant challenges is reducing the dropout rate in higher education. In Colombia, only 30% of students who complete secondary studies access higher education, and between 10.4% in undergraduate and 22.2% in technical and technology programs complete their studies [3].
Dropout has also attracted growing academic interest because it harms the well-being of students and the community [5]. First, higher education institutions lose money if students do not pay fees or the government does not transfer subsidies or scholarships [6]. Second, there are high social costs because financing education implies using public resources to keep public institutions open and promote inclusion and quality [7]. Dropout rates are even included in performance and efficiency models of universities [8].
This phenomenon has been studied in Colombia by several academics. One study analyzed the effects on enrollment and dropout among low-income students who benefited from the educational loan program called ACCES (access with quality to higher education). Its findings showed that the program was effective because the number of potential students increased, academic performance improved, and dropout rates fell [9]. Another study used tree and decision analyses to study the role of social and educational integration in university dropout rates within the framework of public policies adopted by the government to reduce dropout [10]. Finally, an experimental design was used to study the education decisions of lower-income students, controlling for costs and for information about institutional quality, financial aid, and other factors [11].
In this context, our main objective is to analyze the determinants of dropout in Colombia and the differences between types of institutions, fields of study, and regions. We contribute to the literature on dropout in at least three respects. First, we provide evidence of the relation between socioeconomic background, financial aid, and personal characteristics in a context particular to Colombia. Second, we discuss the regional and inequality gaps and the decision-making between programs with differences in costs, quality, and recognition. We offer a perspective from the supply point of view (institutions and cycles). Third, we use a methodology that combines multiple correspondence analysis and a hierarchical linear model to explain the effect of variables operating at different levels. Together, these elements inform the design of public policy programs aimed at greater efficiency and equity in higher education.
This paper is structured as follows. Section 2 briefly reviews the theoretical approaches and develops the hypotheses. Section 3 describes the methodological approach, including the research environment and data, the analysis techniques, and the identified variables. Section 4 presents the results and discussion. Finally, Section 5 offers the conclusions and proposes recommendations for policymakers.
2. Literature Review and Hypothesis
According to the literature, many factors can affect dropout rates. Following the interactionist approach proposed by Tinto [12], permanence is usually associated with the degree of congruence between the characteristics of the students, such as family background and academic skills and preferences, and institutional aspects focused on the graduation objective and the institution [13]. For research purposes, we grouped dropout determinants into five components: socioeconomic background, individual or personal characteristics, access to support systems, contextual and geographic elements, and characteristics of the studies.
2.1. Socioeconomic Background
Several studies have shown that a higher parental educational level and socioeconomic background offer greater opportunities to complete one’s studies, while students with limited financial resources have higher dropout rates. In general, better-educated parents can provide more moral and financial support to their children, helping them find better programs and, in turn, promoting perseverance in their studies rather than dropping out [14]. Even in contexts with few access restrictions, students from vulnerable socioeconomic backgrounds are more likely to drop out [15] or to choose short-cycle programs. For example, a study conducted at the University of Turin, a public university in Italy, found gender differences and showed that individuals from better backgrounds chose longer programs such as medicine and law; consequently, there is a higher probability of dropout among individuals with a vulnerable background [16].
However, the parents’ socioeconomic background and educational level only influence the beginning of the university period; later, the institution’s type and admission to the student’s first option play an essential role [17]. Some scholars suggest being careful with analyses carried out from the perspective of socioeconomic background. They propose controlling for the unobserved effects of family differences by comparing individuals who drop out with their siblings who complete their studies, and they note that later economic difficulties in terms of employment and income are explained by social and economic antecedents that are not limited to the classroom [18].
2.2. Individual or Personal Characteristics
We include variables such as gender, high school, or academic performance in this component. In particular, the number of semesters, gender, and grades explain both the time needed to complete a course and the risk of dropping out for management students at the Federal University of Brazil [19], which is a public university. In contrast, Bonaldo and Pereira [20], in the same context but at private universities, found that the variables that explain dropout were associated with age, change of marital status during studies, and funding. Olaya et al. [5] highlighted the importance of pre-enrollment attributes, math test scores, number of family members, and attendance at private schools.
Likewise, insufficient knowledge from secondary school and low motivation are highlighted as determinants of dropout [21]. Furthermore, conscientiousness and cognitive skills influenced student dropout rates in a cooperative study program in Germany [22]. Additionally, individuals’ capacities tend to be determinants of access and permanence when they start from vulnerable socioeconomic conditions, mainly because they tend to have acquired lower academic skills by the time they graduate from high school [23]. Along the same lines, Meens et al. [24] found that motivation is associated with academic achievement in the first year of higher education, which, together with a well-explored identity, seems essential for study success. In addition, unexpected changes in academic scores (measured by differences in the college and first-year scores and controlled by the institution, field of study, performance, and gender) can cause difficulties in the program and increase the probability of dropping out [14].
2.3. Access to Aid Programs
Some authors have studied the relationship between financial aid and dropout. In the Chilean context, the implementation of financing programs encourages good students and those from better schools, whose parents have restricted access to credit, to complete their university studies [25]. In the same country, another study highlighted that persistence decisions are also associated with financial aid, mainly grants and loans, especially among lower-income students attending technical institutions. However, when the student received subsidized loans or stayed in a university program, there were no effects on persistence [26]. Furthermore, an empirical study from the USA found that distinct types of aid are differentially associated with dropout rates across income groups over time. In this case, grants reduced the dropout rates of students from low-income families relative to their middle-income peers, but there were no differences if they received loans or work–study aid [27].
Moreover, in a theoretical model based on California laws that includes exogenous tuition rates, two types of students (freshmen and seniors), and two public subsidy rates, tuition costs and the subsidies awarded determine university access and graduation success, although the effects are not the same for first-year students as for others, generating retention benefits for high-ability students [28]. On the other hand, tuition subsidies are much more effective in increasing coverage and degree completion because they are not repaid. In contrast, the choice not to go to university or to take a loan is made considering factors such as the physical costs of studying, the economic returns to education, the opportunity cost, the uncertainty of completing the studies, and labor market outcomes [29].
In addition, dropout is higher for those who receive loans and lower for those who receive work–study aid than for those who receive no aid; this is explained by the assumption that loans can be seen as a drain on future income, while work–study aid can integrate the student into a closer relationship with the institution and provide a convenient source of income [13]. Likewise, financial support programs are designed to help students by allowing them to work less than they otherwise would; since work is a substitute for study time, reducing it increases student retention and eventual graduation [30].
Considering the elements identified in the literature review about socioeconomic background, individual and/or personal characteristics, and aid programs, we propose our first research hypothesis.
Hypothesis 1a (H1a). Attrition is directly related to the existence of inequities and inversely related to programs and policies aimed at compensating them.
Hypothesis 1b (H1b). Individuals with more vulnerable socioeconomic backgrounds are more likely to drop out than those from more advantaged backgrounds.
Hypothesis 1c (H1c). The higher the individual’s academic ability in high school, the lower the probability of dropping out of tertiary studies.
Hypothesis 1d (H1d). The implementation of aid programs promotes a decrease in dropout rates.
2.4. Contextual and Geographical Elements
In this grouping, we include aspects such as proximity to the institutions where higher education is offered, the town’s size, population density, and whether the area is urban or rural. Previous studies show that controlling for differences by location is crucial when addressing the persistence of regional educational inequalities [31]. In addition, differences in labor market structures between regions, in terms of the occupations required and the demand offered, or the concentration of economic activities, lead to educational specialization [32]. This motivates our second hypothesis.
Hypothesis 2 (H2). The existence of regional inequalities influences dropout rates from tertiary studies.
2.5. Characteristics of the Studies
From this perspective, institutional elements and types of study or areas of knowledge were included. Certain studies have highlighted the importance of previous academic performance and the student’s dedication to attending class, the relationship established with teachers in guiding teaching functions, and the school climate in the permanence analyses [33]. Additionally, implementing remedial education and retention measures, which offer previous courses and deal with the repetition of courses within the studies, is an excellent policy to improve the graduation rates of students who start with low performance [34]. However, according to Chen [35], both institutional and financial resources focus on students and their social development, and institutional expenditures on instruction and academic support seem to promote student persistence at their institutions.
Additionally, prospects for future jobs, student study skills or interaction with professors, and institutional resources or extracurricular activities offered by the institution are essential for students to remain in the educational system at the University of Coimbra, Portugal [36]. In the same line, various studies have explored the differences in study areas. For example, the dropout rates increase in biology courses but decrease in administration, language, and literature courses [37]. The risk of dropping out is also higher for STEM students than for business students, but, in general, students prefer to continue their studies in the face of deteriorating conditions in their labor markets [6]. In addition, Lassibille and Navarro [38] found significant differences across subject areas after controlling for skills, motivation, and socioeconomic factors in Spain.
A distinction between public and private education is proposed for schools in Brazil, where the decision is usually motivated by socioeconomic and academic differences; the same distinction exists for universities that differ in tuition costs. The system is accompanied by student financing funds to increase access and permanence in private institutions’ programs [39].
Hence, we consider the third hypothesis.
Hypothesis 3 (H3). There are institutional differences by program cycle and field of study that affect dropout rates.
3. Methodology
3.1. Research Context and Data
Higher education in Colombia is channeled through three undergraduate academic programs: technical, technological, and university. Both technical and technological studies offer training in occupations of an operational and instrumental nature, while university (long-cycle) studies focus on liberal professions or specific academic disciplines. Higher education is given in higher education institutions (HEIs), which can be public or private, depending on whether they are recipients of fiscal resources assigned by the government (public) or if their primary financial resources are obtained from the tuition paid by students (private).
According to data from the Ministry of National Education (MEN) for 2018, there are 292 HEIs throughout the national territory, of which only 62 (21.2%) are public or official, serving just over 1.2 million students (51% of the country’s total enrollment). Enrollment was distributed as follows: 3.5% in technical programs, 27.8% in technological programs, and 68.7% in university programs, and the total number of enrolled students is equivalent to 52.8% of the population between 17 and 21 years old in the country (coverage rate).
We use data from the System for the Prevention of Desertion from Higher Education (SPADIES in Spanish), the Colombian Institute for the Evaluation of Education (ICFES in Spanish), and the Colombian Institute of Educational Credit and Technical Studies Abroad (ICETEX in Spanish). The databases used consolidate information on the academic, socioeconomic, institutional, and individual conditions of the country’s students, insofar as it allows the cross-referencing of the information of the individuals who take the Saber 11 test at the end of their secondary education (ICFES) with which institution they enter higher education (SPADIES) and if they have obtained any financial aid from the government (ICETEX).
We collected data comprising 446,423 observations for the period 2000–2012, of which 279,715 correspond to individuals who did not access the higher education system after completing their secondary studies; that is, roughly 4 out of 10 individuals who completed the requirements to access the system actually did so. For our dropout analysis, we consider the subsample of 166,708 records of those who entered higher education.
Nevertheless, the data have several limitations. First, they are limited by the availability of variables; for example, the data do not provide information about the father’s educational level. However, we consider that the education levels of the two parents are highly correlated and that household income and socioeconomic level capture the same kind of information [15]. Second, some of the variables and groupings used in the first analysis have missing data. In this case, we retain the missing cases as a category.
3.2. Empirical Analysis and Variables
We applied two analysis techniques to explore the determinants of higher education dropout in Colombia. First, multiple correspondence analysis (MCA) is used to reduce many categorical variables and identify the factors associated with socioeconomic background and access to financing. Second, multilevel probit estimates are made to determine how the factors identified in the literature affect attrition, considering that multilevel analysis allows studying the simultaneous effect of individual and contextual characteristics in addition to their interactions.
3.2.1. Exploratory Technique: Multiple Correspondence Analysis
Multiple correspondence analysis (MCA) is an exploratory technique used to analyze the structure of more than two categorical variables. Although several approaches exist, we implemented the Burt method, whose results are interpreted based on the categories’ relative positions and their distribution along the dimensions. To determine the number of dimensions to retain, we use a criterion based on eigenvalues (inertia): we retain the dimensions that together account for at least 75% of the total inertia.
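The retention rule can be illustrated with a minimal sketch. It assumes that the principal inertias (eigenvalues) of the dimensions are already available. The function name `dimensions_to_retain` and the example values are hypothetical and are not the inertias obtained in this study.

```python
from itertools import accumulate


def dimensions_to_retain(inertias, threshold=0.75):
    """Return the smallest number of leading dimensions whose cumulative
    share of the total inertia reaches `threshold`."""
    total = sum(inertias)
    if total <= 0:
        raise ValueError("total inertia must be positive")
    # Dimensions are considered from the largest inertia to the smallest.
    ordered = sorted(inertias, reverse=True)
    for count, cumulative in enumerate(accumulate(ordered), start=1):
        # Small tolerance guards against floating-point rounding at the cutoff.
        if cumulative / total >= threshold - 1e-12:
            return count
    return len(ordered)


# Hypothetical principal inertias, for illustration only.
example_inertias = [0.40, 0.20, 0.10, 0.08, 0.07, 0.06, 0.05, 0.04]
retained = dimensions_to_retain(example_inertias)
```

With these illustrative values, the first three dimensions account for 70% of the inertia and the first four for 78%, so four dimensions would be retained under the 75% rule.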
We carried out the MCA considering socioeconomic background variables and programs to support financial access to higher education. The financing mechanisms refer to the aid offered indirectly with public resources to individuals through the Colombian Institute of Educational Credit and Technical Studies Abroad (ICETEX), which can be subsidies or conventional credits. The subsidies constitute aid that covers only part of the educational costs and are usually aimed at supporting the population in vulnerable conditions. The main long-term credit line, called ACCES, is granted at an agreed interest rate and for a specified term; it finances the tuition’s total value and is aimed at Colombians with good academic performance and limited resources.
Table 1 describes the variables used.
Following the classical approach to MCA [40], in our study, each categorical variable $k$ is coded as an indicator matrix $X_k$ whose elements are 0 and 1, where 1 indicates that an individual is classified into a category and 0 indicates that the individual does not share that characteristic. Here, $X_k$ is an $n \times J_k$ matrix, where $n$ is the number of units and $J_k$ is the number of categories of variable $k$, for $k = 1, 2, \ldots, 5$. The sum of each row of $X_k$ is, therefore, 1. Moreover, $X$ summarizes the status (present/absent) of each of the $n$ individuals classified by the five categorical variables. In this case, $X = [X_1 | \cdots | X_5]$ is the concatenation of the $X_k$ matrices, forming a super-indicator matrix of dimension $n \times J$, where $J = \sum_{k=1}^{5} J_k$ is the total number of categories among the five variables.
Now, we define the $J \times J$ Burt matrix as $B = X'X$ or, in the weighted case, $B = X'D(w)X$, where $w$ is the vector of observation weights for the analysis and $D(w)$ is an $n \times n$ square matrix with the weights on the diagonal and 0 off-diagonal.
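The construction of the super-indicator matrix and the unweighted Burt matrix can be sketched as follows. This is a minimal illustration in plain Python, not the software routine used in the study. The variable names (`income`, `aid`), their categories, and the four records are hypothetical.

```python
def indicator_matrix(records, variables):
    """Build the super-indicator matrix X (n x J) and its column labels.

    Each column corresponds to one (variable, category) pair; an entry is 1
    when the individual falls into that category and 0 otherwise.
    """
    columns = []
    for var in variables:
        categories = sorted({rec[var] for rec in records})
        columns.extend((var, cat) for cat in categories)
    X = [[1 if rec[var] == cat else 0 for (var, cat) in columns]
         for rec in records]
    return X, columns


def burt_matrix(X):
    """Compute the unweighted Burt matrix B = X'X (J x J)."""
    n_cols = len(X[0])
    return [[sum(row[a] * row[b] for row in X) for b in range(n_cols)]
            for a in range(n_cols)]


# Hypothetical records, for illustration only.
records = [
    {"income": "low", "aid": "none"},
    {"income": "low", "aid": "loan"},
    {"income": "high", "aid": "none"},
    {"income": "mid", "aid": "subsidy"},
]
X, columns = indicator_matrix(records, ["income", "aid"])
B = burt_matrix(X)
```

In this sketch, each row of `X` sums to the number of variables (one 1 per variable), the diagonal of `B` holds the category counts, and the off-diagonal blocks are the two-way cross-tabulations between variables.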
Next, we calculated four new variables, or dimensions, from the MCA coordinates. The standard row coordinate for the $t$th dimension for the $i$th observation, with indicator matrix elements $X_{ih}$, was computed as

$$R_{it} = \sum_{h=1}^{J} \frac{X_{ih} A_{ht}}{q \sqrt{\phi_t}}$$

where $A$ is the matrix of standard coordinates, $q$ is the number of active variables in the analysis, and $\phi_t$ is an eigenvalue of the MCA on the Burt matrix.
3.2.2. The Multilevel Hierarchical Models
Regarding the estimation of the probit models, hierarchical linear (multilevel) models capture the effect of variables that operate at different levels, since observations on individuals belonging to the same group cannot be considered independent. If individuals are grouped into clusters, two individuals selected from the same group can be expected to be more similar than two individuals selected from different groups.
In this sense, the first level considers the factors identified in the literature review at the individual level: the variables associated with socioeconomic background, personal and/or individual characteristics, and financial aid. In addition, the second level considers the contextual components and the characteristics of the studies. In a specific way, we estimated four models. In the first model (REG), we performed the estimation with regional characteristics that make up the context in which the individual operates and explain the labor market’s behavior and socioeconomic conditions at the departmental level. A second model (FIELD) incorporates the fields of study in which the programs are grouped. A third model (HEI) differentiates between higher education institutions that offer the programs, and the fourth model (CYCLE) includes the short cycle and long cycle program types.
Table 2 summarizes the groupings of the hierarchical models.
Mathematically, we can consider the two-level model as a probit regression that contains both fixed effects and random effects for a series of $M$ independent clusters as

$$\Pr(y_{ij} = 1 \mid x_{ij}, u_j) = H(x_{ij}\beta + z_{ij}u_j)$$

where $x_{ij}\beta$ represents a set of fixed effects and $z_{ij}u_j$ represents a set of random effects. Here, $j = 1, \ldots, M$ indexes the clusters, with cluster $j$ consisting of $i = 1, \ldots, n_j$ observations. $\beta$ includes the regression coefficients (fixed effects), and we used $z_{ij}$ to represent both random intercepts and random coefficients. Moreover, $x_{ij}$ includes the covariates for the fixed effects, analogous to the covariates in a standard probit regression model, and $u_j$ represents $M$ realizations from a multivariate normal distribution with mean 0 and $q \times q$ variance matrix $\Sigma$. Finally, $H(\cdot)$ is the standard normal cumulative distribution function [41].
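To make the role of the random effects concrete, the following minimal sketch evaluates the dropout probability for one individual. It assumes the usual two-level form Pr(y = 1 | x, u) = H(xβ + zu), with H the standard normal cumulative distribution function. The coefficients, covariates, and cluster effects are hypothetical values chosen for illustration, not estimates from this study.

```python
from statistics import NormalDist


def dropout_probability(x, beta, z, u):
    """Evaluate H(x*beta + z*u) with H the standard normal CDF.

    x, beta: covariates and coefficients of the fixed part.
    z, u: covariates and realized effects of the random part for one cluster.
    """
    if len(x) != len(beta) or len(z) != len(u):
        raise ValueError("covariates and effects must have matching lengths")
    linear = (sum(a * b for a, b in zip(x, beta))
              + sum(a * b for a, b in zip(z, u)))
    return NormalDist().cdf(linear)


# Hypothetical fixed effects: intercept and one individual-level covariate.
beta = [-0.5, 0.3]
x = [1.0, 1.0]
# Random intercept only (z = [1]); two clusters with different realized effects.
p_cluster_a = dropout_probability(x, beta, [1.0], [0.4])
p_cluster_b = dropout_probability(x, beta, [1.0], [-0.4])
```

Two individuals with identical covariates obtain different predicted probabilities (about 0.58 and 0.27 here) only because they belong to clusters with different random intercepts, which is the within-group dependence the multilevel specification is meant to capture.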
On the other hand, the variables used in the estimates are detailed below. The dependent variable dropout is a dummy variable that takes the value 1 if the student does not present enrollment for two consecutive periods or more at the time of the study and 0 otherwise. The explanatory variables correspond to those identified in the theoretical review.
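The coding of the dependent variable can be sketched as follows. This is one possible reading of the definition, under two assumptions that the text does not state explicitly: the absence of enrollment is measured over the most recent periods observed at the time of the study, and graduates are not counted as dropouts. The function name and its arguments are hypothetical.

```python
def dropout_indicator(enrolled_by_period, graduated=False, gap=2):
    """Return 1 if the record ends with at least `gap` consecutive periods
    without enrollment, and 0 otherwise.

    enrolled_by_period: booleans ordered from the first to the last observed
    period. Graduates are coded 0 (an assumption of this sketch).
    """
    if graduated:
        return 0
    trailing_missing = 0
    # Count periods without enrollment backwards from the last observed period.
    for enrolled in reversed(enrolled_by_period):
        if enrolled:
            break
        trailing_missing += 1
    return 1 if trailing_missing >= gap else 0
```

Under this reading, a student who skips a single period, or who skips two periods and later re-enrolls, is coded 0 at the time of the study.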
In the component of individual and/or personal characteristics, there are demographic aspects such as gender and age; academic ability, represented by the score obtained in the high school exit exam (total score); the number of siblings; and whether the student works. In the aid programs component, in addition to the variables used in the correspondence analysis, there are other types of aid, which are granted once the student is part of the higher education system and are offered mainly by educational institutions or in association with them. Here, we consider academic support (tutorials or reinforcement monitoring), financial support (including other types of credits or scholarships), and other types of support in general. In the component of contextual elements, a variable is added that indicates whether the student who joined the system changed departments to do so (origin); in the characteristics of the studies, the HEI’s nature and the program’s cycle are included.
Table 3 describes the variables used in the estimates.
4. Results and Discussion
Following the criteria explained in the previous section, the multiple correspondence analysis implemented allows us to retain four dimensions that explain about 75% of the inertia (see Table 4). The extracted dimensions were used as new variables to represent the individuals’ socioeconomic and financial conditions. These dimensions are called: (i) vulnerable individuals (dim1), (ii) favored individuals without funding (dim2), (iii) individuals with funding (dim3), and (iv) middle class without funding (dim4).
Regarding the estimation of the hierarchical linear models, the empirical approach suggests that socioeconomic background, individual and/or personal characteristics, support systems, contextual elements, and characteristics of the studies influence attrition in higher education in Colombia. Table 5 presents the results for estimating the probit models with two levels, and Table 6 presents the estimation of the probit models with three levels.
The positive coefficients obtained for the variables of age, number of siblings, and student employment, and origin indicate that these factors are positively correlated with dropout rates. All other factors are negatively associated; being a woman, having excellent academic performance in high school, academic and financial support, and favorable socioeconomic conditions reduce the probability of dropping out. These findings are partially in line with what has been found in other studies.
In general, the coefficients obtained for the dimensions found by the multiple correspondence analysis support hypotheses H1b and H1d. Individuals with more vulnerable socioeconomic backgrounds have a higher probability of dropping out than those from more advantaged contexts; this is evidenced by the positive coefficient, statistically significant at 1%, for the variable dim1 (vulnerable individuals) in all groupings, that is, in the hierarchical estimates by region (REG), type of institution (HEI), field of study (FIELD), and cycle (CYCLE).
Moreover, the implementation of support programs promotes a decrease in dropout rates. It is particularly evident in the negative sign of the variable dim3, which brings together individuals with funding, but which is significant only for the HEI model that distinguishes by type of institution, which indicates the importance of public financial aid for access and permanence in the face of tuition costs and differentiated offers. On the other hand, the negative and significant signs at 1% for the variables of academic support, financial support, and other support show the importance of the aid programs implemented within educational institutions or alliances when the student has to overcome access barriers.
Regarding individual and/or personal characteristics, being older, having a greater number of siblings, and combining studies with work increase the probability of dropping out, consistent with what has been found by various studies. However, the negative and statistically significant sign obtained for being a woman indicates a reduction in dropout, which contradicts what has been empirically found by other authors. This result may be due to the increased participation of women in tertiary studies, and it may reflect their greater commitment and responsibility. Additionally, the negative sign, statistically significant at 1%, for the score variable supports hypothesis H1c. In all estimates, the higher the individual’s academic ability in high school, the lower the probability of dropping out of tertiary studies.
The origin variable presents a negative and statistically significant sign in all the estimates, which indicates that a student who changes departments has a lower probability of dropping out. This may be because migration occurs for supply reasons or personal preferences rather than as unwanted displacement, and it may be associated with a stronger student and family commitment to completing the studies.
Along the same lines, the program cycle influences dropout in the estimates that control for the technical and university variables: students in short-cycle technical studies have a higher probability of dropping out than students in technological studies, whereas for long-cycle students the probability of dropping out decreases. The variable that distinguishes public from private universities is not significant in the estimates that include it; this suggests that the differences stem less from the public or private character of the institution than from the diversity among existing institutions (enrollment costs, quality, participation in rankings, teaching staff, among other factors) (see HEI output).
The results obtained in the estimates with three levels are consistent with those previously described. It is highlighted that individuals with funding (dim3) reduce their probability of desertion when both regional differences and field of study are reflected (REG-FIELD model), where the coefficient is negative and significant at 10% level and when considering the differences between the program cycle and the educational institution (HEI-CYC) with a coefficient that is also negative and statistically significant at 5%. The preceding evidence reinforces the importance of implementing financing programs to eliminate access barriers and promote permanence. We also observed the same with the variable that collects academic support, which has important magnitudes considering the characteristics of the studies, that is, in the models that combine the interactions of the field of study and the cycle (FIELD-CYC) and those that combine the institution and the program cycle (HEI-CYC).
All of the estimated variances are significant, with lower magnitudes for the estimates that include regional differences (at the departmental level) and greater magnitudes for the models that incorporate the studies’ characteristics, mainly at the institutional and program cycle levels. Consequently, hypotheses H2 and H3 are partially supported: the existence of regional inequalities influences dropout from tertiary studies, and there are institutional differences by program cycle and field of study that affect dropout rates.
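One common way to read an estimated group-level variance is as a share of the total latent variance. The sketch below assumes a two-level random-intercept probit, in which the latent individual-level residual variance is fixed at 1, so the share is σ²/(σ² + 1). The text does not state how its own variance shares were computed, so this is an illustration of the standard latent-variable calculation rather than a reproduction of the study’s procedure. The function names are hypothetical.

```python
def variance_share(group_variance):
    """Share of latent variance attributable to the grouping level in a
    two-level random-intercept probit (level-1 latent variance fixed at 1)."""
    if group_variance < 0:
        raise ValueError("variance must be non-negative")
    return group_variance / (group_variance + 1.0)


def variance_for_share(share):
    """Inverse mapping: the group variance that implies a given share."""
    if not 0 <= share < 1:
        raise ValueError("share must be in [0, 1)")
    return share / (1.0 - share)
```

For example, under these assumptions a group-level variance of 1 would correspond to a share of 50%, and a share of 11% would correspond to a group-level variance of about 0.12.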
5. Conclusions
We analyzed the dropout rates in higher education in Colombia based on data from the System for the Prevention of Desertion from Higher Education (SPADIES), the Colombian Institute for Evaluation of Education (ICFES), and the Colombian Institute of Educational Credit and Technical Studies Abroad (ICETEX); we used multiple correspondence analysis and multilevel models to assess both the influence of the contextual variables grouped in the regions and the influence of the characteristics of the studies on dropout rates.
Therefore, our study provides evidence of dropout differences by regions, institutions, and field of study and the effects of financial aid on student outcomes across these groups. Findings suggest that academic support, financial support, and other support within educational institutions or alliances when the student has to overcome access barriers significantly reduce dropout rates. This result can be constructive for managers to establish or adjust student retention programs at the institutional level. In this way, they can design aid programs for students who need flexibility and special conditions, such as students who study and work.
The analysis takes into account the significant regional gaps in the country, reflected in fewer educational access opportunities in mainly rural regions. Our results show that the differences between departments seem to explain only 1.4% of dropout, which may be due to the “filter” that operates on access to the system rather than on persistence within it. Likewise, we analyzed inequalities on the supply side, where the variation between institutions (11%) and the interaction between institutions and the program cycle (17.8%) confirm the influence of inequities on dropout, given that the student has to choose between programs that differ in tuition and study costs in general, as well as in quality, social recognition, and links to employment, which widens social gaps.
Finally, the findings obtained from multilevel modeling suggest the existence of contextual and institutional disparities in the dropout phenomenon’s behavior, which are explained mainly by the supply conditions in these regions and the individuals’ socioeconomic backgrounds. These results reinforce the importance of implementing financing programs to eliminate access barriers and promote permanence. Therefore, the government should keep providing loans and subsidies and creating financial aid focused on regions with particular characteristics of supply and gaps; it should even promote persistence in short-cycle studies and the fields of study of relevance in the regions. These elements should be included in public policy programs to improve Colombia’s efficiency and equity in higher education.