Factors influencing university drop out rates


Computers & Education 53 (2009) 563–574


Computers & Education journal homepage: www.elsevier.com/locate/compedu

Francisco Araque a, Concepción Roldán b,*, Alberto Salguero a

a Department of Software Engineering, University of Granada, C/Periodista Daniel Saucedo Aranda s/n, E-18071 Granada (Andalucía), Spain
b Department of Statistics and Operations Research, University of Jaén, Las Lagunillas s/n, E-23071 Jaén (Andalucía), Spain

Article info

Article history:
Received 26 November 2008
Received in revised form 12 March 2009
Accepted 17 March 2009

Keywords: Drop out; Data Warehouse; Evaluation methodologies; Country-specific developments; Innovation

Abstract

This paper develops personalized models for different university degrees to estimate the risk of each student abandoning his or her degree, and analyzes the profile of undergraduates who abandon. Three faculties located in Granada, in the south of Spain, were involved in the study: Software Engineering, with three university degrees and 10,844 students; Humanities, with nineteen degrees and 39,241 students; and Economic Sciences, with five degrees and 25,745 students. Data from 1992 onwards are used to obtain a logistic regression model for each faculty that represents it satisfactorily. These models and the underlying structure of the data show that certain variables appear repeatedly in the explanation of drop out in all of the faculties. These variables include, among others, start age, the father’s and mother’s studies, academic performance, success, average mark in the degree and the access form and, in some cases, the number of rounds needed to pass. Students with weak educational strategies and without persistence to achieve their aims in life have low academic performance and low success rates, and this implies a high risk of abandoning the degree. The results suggest that each university centre could consider similar models to draw up its own action plan to help lower the drop out rate, reducing costs and effort. As concluded in this paper, the profile of the students who tend to abandon their studies depends on the subject studied. For this reason, a general methodology based on a Data Warehouse architecture is proposed. This architecture does most of the work automatically and is general enough to be used at any university centre, because it only takes into account the usual data that students provide when they register for a course and their grades throughout the years.

© 2009 Elsevier Ltd. All rights reserved.

1. Introduction

Higher education in Spain has recently been undergoing an important process of restructuring because of the need to converge with other members of the European Union and to introduce information and communication technologies (ICT) into the teaching processes. These changes demand certain reforms in order to adapt the goals of the institution to the new social needs. The increasing interest in studying university drop out comes from the increase in cases registered in Spanish universities, together with the high cost that the education of every undergraduate represents for the Public Administration. According to the statistics of the Spanish Coordination University Council (National Evaluation’s Plan of the Quality of the Universities, PNECU), presented in December 2002, 26% of undergraduates leave their studies or change their degrees. Data provided by the Organisation for Economic Co-operation and Development (OECD; OCDE in Spanish), presented the same year, show that academic failure in Spain is over 50%, related fundamentally to the drop out rates. Other data, provided by the Spanish Center of Research, Documents and Evaluation (CIDE) of the MEC (Spanish Department of Education and Science), set drop out rates between 30% and 50%. This phenomenon began earlier in the rest of Europe than in Spain, reaching 45% in Austria. According to The Standards for Educational and Psychological Testing (1999; see footnote 1), the percentage of students who abandon their studies or change degree increases every year, a result obtained from the analysis of the rates registered in colleges. Other studies carried out in Central Europe and the United States show a similar percentage, although these studies were made with minority populations, which may explain a higher level of drop out (see Callejo, 2001; Feldman, 2005; Last & Fulbrook, 2003; Orazem, 2000).

* Corresponding author. E-mail address: [email protected] (C. Roldán).
1 The Standards for Educational and Psychological Testing is a set of testing standards developed jointly by the American Educational Research Association (AERA), the American Psychological Association (APA), and the National Council on Measurement in Education (NCME).
0360-1315/$ - see front matter © 2009 Elsevier Ltd. All rights reserved. doi:10.1016/j.compedu.2009.03.013

As far back as 1968, Rubio García Mina carried out one of the first studies on university drop out in Spain, analyzing the cohorts from 1960 to 1966 in the higher technical schools of Madrid. This study and others were approximations to an incipient phenomenon, coinciding with institutional reforms and social changes such as the access of a larger percentage of students to the university, the implementation of the Spanish law for the General Organization of the Educational System (LOGSE), the reform of university curricula, new requirements of higher education (new methodologies, new technologies, practices at companies), etc. The disconnection between the laws governing compulsory education and the university study programs, and the strong linkage of the latter with the business world, together with other institutional circumstances that did not suit the new students’ characteristics well enough, have led to a great increase in the percentage of drop out, especially in technical degrees. These circumstances, together with overcrowding, also occur in humanities and social sciences. The problem remains unsolved: every year the drop out rates increase at all universities and in all degrees, although the differences between them remain significant.

2. Previous work

There have been several attempts to build theoretical models that explain the phenomenon of drop out from university studies. Most of them reveal a series of common characteristics and centre their analyses on the following groups of variables: the student body, the teaching staff, the institution and the family context.
For several authors, including Forbes and Wickens (2005), students’ decision to change or continue their university education is determined basically by the level of social integration that they achieve at the university; that is, students feel more integrated inasmuch as their capacities allow them to cope with the intellectual and other demands of university life. Another kind of factor identified in the student body is the lack of capacities or abilities to face the demands of university studies: inadequate previous knowledge, inappropriate attitudes toward learning, low psychological resilience, etc. (see Kirton, 2000; Wasserman, 2001; Landry, 2003; Lightseym, 2006; Saunders, Davis, Williams, & Williams, 2004). Regarding teaching staff factors, it is pertinent to highlight pedagogic deficiencies (lack of clarity in the presentation of the subject, etc.) and the lack of individualized attention to the student. The association between the student–teacher relationship and dropping out has also been demonstrated in the literature (see Lessard, Fortin, Joly, Royer, & Blaya, 2004). Moreover, conflicts with teachers frequently escalate and ultimately result in a pivotal moment that precipitates leaving education (Lessard et al., 2008). Institution-related risk factors associated with dropping out include the absence of clearly defined objectives, restrictions in the offer of certain degrees (see Thomas, 2002), etc. Another large group of causes of drop out points to family-related risk factors such as social origin, socioeconomic status and family turmoil (see Fortin, Marcotte, Potvin, Royer, & Joly, 2006). Low affective support, low cohesion among family members and a high rate of conflict fuel family turmoil and increase the drop out probability (see Lessard et al., 2008).
With all of this, drop out has become a worrying issue of great interest at all levels. In addition, all these data show the need for studies that identify the causes of drop out, with the aim of putting into practice actions that reduce the incidence of this problem at the institution. Drop out is one of the multiple indicators of the quality of an educational system, and it highlights the existence of serious failures in the processes of orientation, transition, adaptation and promotion of the student body. For this reason, within the framework of the European Higher Education Area, many universities include in their strategic plans a primary objective: reducing the student drop out rate. The objective of this study is to analyze which variables influence drop out and to create a prediction model that measures a student’s risk of drop out using the academic and certain personal data that university centres hold for every student. We define drop out as the situation in which students abandon the university or the degree they started, identified by their not registering for a subsequent school year. The starting hypothesis is that there exists a student profile susceptible to abandoning university studies. Therefore, the use of different statistical techniques is proposed to identify this profile using historical data, preferably stored in a Data Warehouse. Logistic regression techniques (Hosmer & Lemeshow, 2004) are frequently used to analyze factors influencing a response variable of interest in education (see Braak, 2001; Martins, Steil, & Todesco, 2004). Once the profile has been identified, several actions can be included in the strategic plans of the universities.
Taking into account these models and the results obtained, individual students enrolled in the faculty who fit the profile identified as susceptible to abandonment can be found and directed to a guidance or tutorial system.

3. Materials and methods

There are many alternatives for studying the drop out phenomenon. The easiest and most common is to survey the students from time to time. The problem with this alternative is that it is very intrusive: it requires the participation of the students and the processing of the collected data. For this reason it usually covers only a small part of the student population. Alternatively, we can make use of the personal information that students provide when they register for courses every year, together with all of their grades throughout the courses. The latter is less powerful than the former, because surveys can be designed as needed and focused on the drop out issue. The main advantages of the latter alternative are that more data are available and that the process is transparent to the students, since no information is needed beyond what they provide when registering. We opted for the latter alternative. The information technology department of the university provided us with all the data we needed about the students, anonymized to avoid legal issues (the students’ names are confidential and were removed, but the rest of the personal information was kept). After some initial analysis we noticed that the information was not in a form suitable for analysis. The data are structured so that university staff can easily manage them: add, edit and delete information. Since the data by themselves are useless, they must be put together and reorganized to produce useful information. In turn, information becomes the basis for decision making.
To facilitate the decision-making process, a new piece of technology more sophisticated than usual database systems has been developed: the Data Warehouse. A Data Warehouse (DW) can be generally described as a decision-support tool that collects its data from operational databases and various external sources, transforms them into information and makes that information available to decision-makers (top managers) in a consolidated and consistent manner (Inmon, 2005; Kimball & Ross, 2002). The persistence of huge amounts of data opens a new perspective for various statistical analysis methods which are essential for tactical decisions. Inmon (2005) defined a DW as “a subject-oriented, integrated, time-variant, non-volatile collection of data in support of management’s decision-making process”. A DW is a database that stores a copy of operational data whose structure is optimized for query and analysis. The scope is one of the defining issues of a DW: it is the entire organization. The DW is usually implemented using relational databases (Rainardi, 2007) that define multidimensional structures. The generic architecture of the developed system is illustrated in Fig. 1. Data sources are basically the existing operational databases. The data are extracted from the sources and then loaded into the DW using an ETL tool (Araque, Salguero, Martínez, Navarro, & Calero, 2007a; Araque et al., 2007b). ETL stands for extract, transform and load, the processes that enable companies to move data from multiple sources, reformat and cleanse it, and load it into another database or another operational system to support a business process. This is usually the most complex task in the process and, in our case, we had to develop an ad hoc application. It basically consists of the following steps:

– Clean the data. The application first has to clean the data. There are some erroneous and missing values which have to be put right. Some of those values are recoverable: the starting age is assumed to be the student’s age when the first of his or her courses is recorded; the country of origin is assumed to be Spain; it is also assumed that the students are single and that they do not work. Other values can be completed using a mode-value imputation process: the population was considered divisible into a number of disjoint classes, formed using complete variables, and the mode-value imputation method was applied within these classes. All the variables used in the study have a non-response rate below 25% (the maximum value established by the Office of Management and Budget). All variables with more than 25% missing values were removed (Levy & Lemeshow, 2003, 2008).
– Aggregate the data. This is essentially why we use a DW-based approach. The data cannot be analyzed in their original form: they are normalized and there are many relations connecting the facts. The data are aggregated in such a way that the historical information about the courses each student has been enrolled in through the years is summarized in a single record. Some rates, marked with an asterisk in Table 1, are defined for this purpose.
– Generate new measures. The last step of the process consists of determining which of the students have actually abandoned their studies. This information is the basis of the subsequent analysis and is, obviously, not defined in the original data. It has to be derived.

The DW is then used to populate the various subject (or process) oriented Data Marts (DM) and On-Line Analytical Processing (OLAP) servers. Data Marts are subsets of a DW categorized according to functional areas depending on the domain (the problem area being addressed), and OLAP servers are software tools that help a user to prepare data for analysis, query processing, reporting and data mining (Fig. 2). Several software choices are available to build a data mart, such as MS Access, Oracle or any other database management software. MS Access was selected in this research for its user friendliness and easy availability. It is important to note that OLAP software packages such as MS OLAP server or SQL server are also available in the market.
These software packages allow users to write their queries using any chosen analytical model. The advantage is that users can use different graphical wizards to write their queries easily and display results in different graphical formats. Since our intention is not to develop a commercial software product, we have used MS Access. Due to the huge quantity of data, the process takes, for instance, 297 h to process the history of the 25,745 Economic Sciences students on a modern computer (Core2, 8 GB). It has been implemented using the Microsoft .NET Framework and required more than 1500 lines of code. To improve the performance of the application, part of it has been implemented using the .NET native threads library, which makes it possible to parallelize some tasks.
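The three ETL steps described above (cleaning with mode-value imputation, aggregation into one record per student, and derivation of the drop out flag) can be sketched roughly as follows. This is only an illustrative Python sketch, not the authors' .NET application: the field names (`credits`, `examined`, `passed`, `year`), the `graduated` flag and the helper names are assumptions introduced here, and the rule used to flag abandonment is one possible reading of the definition given in Section 2.

```python
from collections import Counter, defaultdict


def impute_mode_by_class(records, target, class_keys):
    """Fill missing `target` values with the mode observed in the record's class.

    Classes are formed from complete variables (`class_keys`), as in the
    mode-value imputation step described in the text.
    """
    observed = defaultdict(list)
    for r in records:
        if r[target] is not None:
            observed[tuple(r[k] for k in class_keys)].append(r[target])
    modes = {cls: Counter(vals).most_common(1)[0][0] for cls, vals in observed.items()}

    imputed = []
    for r in records:
        filled = dict(r)
        if filled[target] is None:
            # Stays None when the class has no observed value at all.
            filled[target] = modes.get(tuple(r[k] for k in class_keys))
        imputed.append(filled)
    return imputed


def summarize_student(enrolments, last_year_in_data, graduated):
    """Collapse all enrolment rows of one student into a single record."""
    enrolled = sum(e["credits"] for e in enrolments)
    examined = sum(e["credits"] for e in enrolments if e["examined"])
    passed = sum(e["credits"] for e in enrolments if e["passed"])
    end_year = max(e["year"] for e in enrolments)
    return {
        # Rates as defined in Table 1.
        "academic_performance_rate": passed / enrolled if enrolled else 0.0,
        "success_rate": passed / examined if examined else 0.0,
        "end_year": end_year,
        # Assumption: a student who stopped registering before the last
        # recorded year without graduating is counted as a drop out.
        "abandon": int(end_year < last_year_in_data and not graduated),
    }


# Hypothetical rows, for illustration only.
students = [
    {"degree": "A", "sex": "M", "marital": "single"},
    {"degree": "A", "sex": "M", "marital": "single"},
    {"degree": "A", "sex": "M", "marital": "married"},
    {"degree": "A", "sex": "M", "marital": None},
]
cleaned = impute_mode_by_class(students, "marital", ["degree", "sex"])

history = [
    {"year": 2001, "credits": 6, "examined": True, "passed": True},
    {"year": 2001, "credits": 6, "examined": True, "passed": False},
    {"year": 2002, "credits": 12, "examined": False, "passed": False},
]
summary = summarize_student(history, last_year_in_data=2005, graduated=False)
```

Keeping the imputation and the aggregation as separate functions mirrors the order of the steps in the text: values are completed first, and only then is each student's history reduced to a single analyzable record.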

4. Student data collection and statistical analysis

The population under investigation consisted of all the students enrolled in the following three faculties of the University of Granada from 1992 onwards: Software Engineering, Humanities and Economic Sciences. These faculties include 27 degrees and have had 75,830 students enrolled. Every year the students fill in the registration form with personal information (family name, first name, address, city, country, sex, age, birth, etc.) and with the subjects they wish to study. This information, together with the academic results, is stored in a database. Therefore, the first step of our study was to develop, on the basis of historical data about the drop out of these students, a Data Warehouse, referred to here as the Data Mart, with the variables needed for the aims of this study.

Fig. 1. System architecture.


Table 1
Description of the variables used in the study.

Variable                      Description
Abandon                       The student leaves the school: yes (1), no (0)
Marital status                Marital status of the student
Student job                   Student job: more (1) or less (2) than 15 h weekly or unemployed (3)
Access mark                   Mark obtained in the degree access
Access year                   Year of the degree access
End degree                    Year of the studies ending
Academic performance rate*    Academic performance rate (passed credits/enrolled credits)
Success rate*                 Success rate (passed credits/examined credits)
Average mark*                 Average mark (from 1 to 4)
Mode round*                   Mode round (first round, second round, third round, etc.)
Start year                    Year the university studies start
End year                      Last year signed on the degree
Degree                        Degree
Sex                           Sex of the student
Birth                         Date of birth
Family city                   Family city
Country                       Native land
Father’s job                  Father’s job
Mother’s job                  Mother’s job
Father’s education level      Father’s education level
Mother’s education level      Mother’s education level
Access form                   Form of the degree access
Exam round                    Sort of round more used (standard or extraordinary)

* Measure defined in the data aggregation step.

Fig. 2. The ETL process.

The statistical analysis carried out for each faculty, using the previous variables and SPSS (Statistical Package for the Social Sciences), included:

– Descriptive analysis: This study shows the main characteristics of the students of each faculty.
– Studies of the relationships between “abandon” and the remaining variables.
– Principal components analysis (PCA): This analysis showed the underlying structure of the data.
– Logistic regression model for “abandon”: Logistic regression modeling was used to draw conclusions about the drop out problem in each faculty and to predict the possible effect of the remaining variables on the dichotomous “abandon” variable.

This paper briefly discusses the main results obtained in the first three studies and focuses on the logistic regression models obtained.
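For the relationship studies in the list above, the paper reports using chi-square tests on contingency tables (run in SPSS). As a rough illustration of what such a test computes for a dichotomous explanatory variable against “abandon”, here is a minimal Python sketch using only the standard library. The counts are invented for the example and the function name is hypothetical; the closed-form p-value used here is valid only for a 2x2 table (one degree of freedom).

```python
import math


def chi_square_2x2(table):
    """Pearson chi-square statistic and p-value for a 2x2 contingency table.

    `table` is [[a, b], [c, d]], e.g. rows = a binary student attribute,
    columns = abandon (1) / does not abandon (0).
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = [a + b, c + d]
    col_totals = [a + c, b + d]

    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected

    # With one degree of freedom, P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Invented counts, for illustration only.
chi2, p = chi_square_2x2([[30, 10], [10, 30]])
```

A small p-value would indicate that the attribute and “abandon” are not independent, which is the kind of difference between groups that the descriptive stage of the study looks for.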

5. Results

In each faculty, the first step after the imputation procedure was a descriptive analysis of the variables in the study and an analysis of the relationships between “abandon” and the remaining variables. For the latter purpose, chi-square tests and contingency tables were employed. This analysis showed differences between the group of students who abandon the degree and the rest of the students for all the variables considered. Next, the logistic regression models were obtained by the partial method, using stepwise selection. The models were calculated for different cut points, choosing the model that best classifies the data into the two categories of the variable “abandon”. The models obtained contained only a few variables, which is explained by the results of the principal component analysis, also included.

5.1. Software Engineering

In general, most students enrolled in these degrees are men: in this study, 81.6% of the students are men. In addition, 96.5% are single and 3.5% are married. Most of them (98.3%) are Spanish and 1.4% are Moroccan. The largest group of fathers (40.5%) have received primary education and 21.4% have a degree, but only 14.5% are classified as “high level professional” and 21% work in the Public Administration. The largest group of mothers (44.5%) have received primary education and only 6.8% have a degree. The largest group of mothers (38.5%) have never had a remunerative job and 11.8% are housewives, but a large percentage of them (16.7%) have a company with more than 10 workers. Most students are unemployed (93.5%), 5.4% work more than 15 h weekly and only 1.1% work less than 15 h weekly. Most students (61.5%) are admitted to the university degree with an entrance examination, 27.3% are admitted as undergraduates, 10.5% come from vocational courses for 14- to 18-year-olds and 0.6% have been admitted by passing a special entrance examination for those over 25. The mean academic performance rate is 0.3868, the mean success rate is 0.61 and the average mark is 1.3169. Most students (97.2%) pass the exam in the first round, 2.3% in the second round, 0.4% in the third round and 0.1% in the fourth round. In this faculty the drop out rate is very high (49.6%). PCA (see Table 2), with a KMO (Kaiser–Meyer–Olkin) measure of 0.647, shows the need to use seven components to describe the data (these components explain 64.27% of the variability). Drop out is associated with the academic performance rate, the success rate and the average mark. The signs of the coefficients in the rotated component matrix indicate that when the academic performance rate, the success rate and the average mark go up, drop out goes down. To explain drop out, a logistic regression model has also been obtained after nine iterations of the stepwise method. This model correctly classifies 78.8% of the data, that is, 78% of the students who do not drop out and 85.3% of the students who abandon.
In addition, at a significance level of 5% the Hosmer and Lemeshow test shows that the model fits the sample data well, and the Wald contrasts show, with a p-value of 0.999, that the model obtained includes the variables of Table 3. Only values with significant odds have been included in Table 3. According to the results of Table 3, the odds ratios give the following information:

– Student’s average mark: If the student’s average mark is increased by one, the odds that the student does not drop out are more than doubled (1/0.410 ≈ 2.4).
– Age of the student at the end of his degree: For an increase of one unit, the odds ratio is significant, but its value is close to 1. For this reason, an increase of 5 years is considered: if the age of the student at the end of his degree is increased by 5 years, the risk of dropping out is multiplied by about 1.5 (e^(5 × 0.074) ≈ 1.45).
– Degree: The risk of abandoning the degree is 181 times greater for the students of Management of Technical Software Engineering than for the students of Software Engineering. The risk of abandoning the degree is 119 times greater for the students of System of Technical Software Engineering than for the students of Software Engineering.

Table 2
Rotated component matrix of the principal component analysis with n = 10,844 (components 1 to 7).

Variable                      Loading
Academic performance rate     0.882
Success rate                  0.929
Abandon                       0.631
Average mark                  0.870
Access year                   0.784
Birth                         0.676
Access form                   0.728
Father’s job                  0.709
Mother’s job                  0.748
Father’s education level      0.644
Mother’s education level      0.726
Marital status                0.672
Student job                   0.643
Family city                   0.815
Country                       0.809
Average round                 0.698
Mode round                    0.892
Degree                        0.677
Sex                           0.539
Exam round                    0.427

Total variance explained is 64.27%.

Table 3
Results of the stepwise logistic regression (a).

Variable          B         SE       Wald      Sig.     Exp(B)
Average mark      -0.893    0.239    13.892    0.000    0.410
End year          0.074     0.037    4.000     0.046    1.077
Degree                               94.176    0.000
Degree (1)        5.201     0.542    92.209    0.000    181.381
Degree (2)        4.778     0.541    78.025    0.000    118.840
Exam round (1)    -1.605    0.400    16.085    0.000    0.201

(a) n = 10,844.

– Exam round: The risk of abandoning the degree is about five times (1/0.201 ≈ 5) greater for those students who take their exams in February and June than for those who take their exams in September and December.
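The odds ratios quoted in this list follow directly from the B coefficients of Table 3, since Exp(B) = e^B and a change of several units multiplies the exponent. The short Python sketch below reproduces that arithmetic. The function name is ours, and the negative signs for the average mark and exam round coefficients are an assumption inferred from their Exp(B) values being below 1.

```python
import math


def odds_ratio(b, delta=1.0):
    """Multiplicative change in the odds for a `delta`-unit increase in a
    predictor whose logistic regression coefficient is `b`."""
    return math.exp(b * delta)


# Coefficients as read from Table 3 (signs inferred from Exp(B)).
avg_mark_or = odds_ratio(-0.893)            # close to the tabulated 0.410
not_dropout_factor = 1.0 / avg_mark_or      # odds of NOT dropping out per extra mark point
end_year_5y_or = odds_ratio(0.074, delta=5)  # five-year increase
degree_1_or = odds_ratio(5.201)             # close to the tabulated 181.381
exam_round_factor = 1.0 / odds_ratio(-1.605)  # roughly five
```

Taking the reciprocal, as the text does for the average mark and the exam round, simply restates the same effect from the point of view of the reference category.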

5.2. Humanities

In this faculty, 40.5% of the students are men. In addition, 95.5% are single and 3.7% are married. Most of them (99.6%) are Spanish and only 0.2% are Moroccan. Most fathers (60.5%) have received primary education and 16.9% have a degree. Ten percent are classified as “high level professional” and most of them (60.6%) work in the Public Administration. Most mothers (63.8%) have received primary education, 9.1% do not have the primary education certificate and only 11.4% have a degree. Most mothers (69.7%) have never had a remunerative job and 12% are housewives. Only 5.8% are classified as “high level professional” and 8.2% work in the Public Administration. Most students are unemployed (95.2%), 3.9% work more than 15 h weekly and only 0.9% work less than 15 h weekly. Most students (51.2%) are admitted to the university degree with an entrance examination, 31.5% are admitted as undergraduates and 14.4% have been admitted by passing a special entrance examination for those over 25. The mean academic performance rate is 0.48, the mean success rate is 0.68 and the average mark is 1.4351. Most students (97.4%) pass the exam in the first round, 2.5% in the second round and 0.4% in the third round. In this faculty the drop out rate is 63.5%, the highest of the three faculties considered. PCA (see Table 4), with a KMO (Kaiser–Meyer–Olkin) measure of 0.636, shows the need to use seven components to describe the data (these components explain 70.51% of the variability). Drop out is again associated with the academic performance rate, the success rate and the average mark. The signs of the coefficients in the rotated component matrix indicate that when the academic performance rate, the success rate and the average mark go up, drop out goes down. To explain the drop out problem, a logistic regression model has also been obtained after eleven iterations of the stepwise method.
This model correctly classifies 83.2% of the data, that is, 88.1% of the students who do not drop out and 79.7% of the students who abandon. In addition, at a significance level of 5% the Hosmer and Lemeshow test shows that the model fits the sample data well, and the Wald contrasts show, with a p-value of 0.999, that the model obtained includes the variables of Table 5. Only values with significant odds have been included in Table 5. According to these results, the odds ratios give the following information:

– Mode round: The risk of abandoning is approximately double (1/0.557 ≈ 1.8) for those students who are admitted in the first round than for those who are admitted in the second round.
– Access form: The risk of abandoning is approximately two times (e^0.576 = 1.779) greater for those students who are admitted by a degree than for those who are admitted by passing the university entrance test.
– Father’s education level: The risk of abandoning is approximately two times (1/0.565 = 1.77) greater for the students whose fathers have no studies than for those whose fathers have completed secondary school. The risk of abandoning is also approximately two times (1/0.529 = 1.89) greater for the students whose fathers have no studies than for those whose fathers are graduates.
– Father’s job: The risk of abandoning is approximately ten times (e^2.278 = 9.754) greater for the students whose fathers do not have a qualified job than for those whose fathers work in the Public Administration.
– Marital status: The risk of dropping out of philosophy is approximately three times (1/0.358 = 2.8) greater for single students than for married students.
– Degree: If the student is enrolled in Philosophy the risk of abandoning is lower; that is, this degree carries less risk of abandonment.

Table 4
Rotated component matrix of the principal component analysis with n = 39,241 (component columns 1 to 8).

Variable                      Loading
Academic performance rate     0.872
Success rate                  0.937
Average mark                  0.881
Average round                 0.654
Abandon                       0.607
Mother’s job                  0.719
Father’s job                  0.852
Mother’s job                  0.902
Access year                   0.732
Marital status                0.735
Student job                   0.674
Degree                        0.841
Birth                         0.758
Mode round                    0.876
Father job                    0.456
Family city                   0.746
Country                       0.754
Sex                           0.816
Exam round                    0.941

Total variance explained is 70.51%.

Table 5
Results of the stepwise logistic regression (a).

Variable                        B         SE       Wald       Sig.     Exp(B)
Academic performance rate       -3.995    0.300    177.018    0.000    0.018
Success rate                    -1.912    0.487    15.414     0.000    0.148
Average mark                    0.469     0.128    13.363     0.000    1.599
Average round                   -3.044    0.317    92.185     0.000    0.048
Mode round                                         32.044     0.000
Mode round (1)                  3.494     0.617    32.044     0.000    32.917
Access year                     -0.197    0.026    56.603     0.000    0.821
Degree                                             131.726    0.000
Degree (1)                      0.646     0.206    9.785      0.002    1.907
Degree (2)                      2.039     0.453    20.253     0.000    7.684
Degree (3)                      1.027     0.263    15.262     0.000    2.792
Degree (4)                      1.213     0.501    5.870      0.015    3.364
Degree (5)                      2.749     0.555    24.569     0.000    15.627
Degree (6)                      2.002     0.366    29.859     0.000    7.405
Degree (7)                      7.101     0.776    83.737     0.000    1213.195
Degree (8)                      1.675     0.424    15.587     0.000    5.341
Degree (9)                      1.069     0.268    15.862     0.000    2.913
Degree (10)                     1.210     0.281    18.607     0.000    3.354
Degree (11)                     2.125     0.560    14.402     0.000    8.376
Degree (12)                     4.523     1.206    14.058     0.000    92.106
Degree (13)                     1.514     0.278    29.580     0.000    4.547
Degree (14)                     1.376     0.261    27.790     0.000    3.961
Degree (18)                     1.046     0.365    8.193      0.004    2.846
Marital status                                     9.977      0.019
Marital status (1)              -1.028    0.338    9.250      0.002    0.358
Father’s job                                       89.804     0.000
Father’s job (22)               2.278     0.710    10.301     0.001    9.754
Father’s education level                           26.687     0.000
Father’s education level (2)    -0.570    0.241    5.593      0.018    0.565
Father’s education level (5)    -0.636    0.269    5.599      0.018    0.529
Access form                                        20.137     0.000
Access form (3)                 0.576     0.155    13.880     0.000    1.779

(a) n = 39,241.

– Access year: If the age of the student at the beginning of his degree is increased by 3 years, the risk of abandoning will be approximately doubled (1/e^(3 × (−0.197)) = 1.8).
– Mode round: The risk of abandoning for students who have passed in the second round is much greater (e^3.494 = 32.917) than for students who have passed in the first round.
– Average round: If the average number of rounds needed to pass the subjects is increased by one, the risk of dropping out will be twenty times higher (1/0.048 = 20.8).
– Average mark: If the student’s average mark is increased by one, the odds that the student does not abandon will be multiplied by 1.6.
– Academic performance rate: If the academic performance rate is increased by 0.3, the odds that the student does not abandon will be approximately doubled (1/e^(0.3 × (−1.912)) ≈ 1.77).
– Success rate: If the success rate is increased by 0.3, the odds that the student does not abandon will be approximately tripled (1/e^(0.3 × (−3.995)) ≈ 3.3).
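The rescaled figures in the list above (an increase of 3 years, or of 0.3 in a rate) rely on a standard property of the logistic model. As a brief reminder, in notation that is ours rather than the paper's (p is the probability of abandoning, x_j a predictor, β_j its coefficient and c the size of the increase):

```latex
\begin{align*}
\ln\frac{p}{1-p} &= \beta_0 + \sum_j \beta_j x_j, \\[4pt]
\frac{\operatorname{odds}(x_j + c)}{\operatorname{odds}(x_j)} &= e^{c\,\beta_j}, \\[4pt]
\text{and, for the complementary outcome,}\quad
\frac{1}{e^{c\,\beta_j}} &= e^{-c\,\beta_j}. \\[4pt]
\text{For example, with } c = 3,\ \beta_j = -0.197:\quad
\frac{1}{e^{3 \times (-0.197)}} &= e^{0.591} \approx 1.81 .
\end{align*}
```

Because the effect is multiplicative on the odds, a coefficient whose one-unit odds ratio looks close to 1 can still correspond to a sizeable change over a realistic range of the variable.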

5.3. Economic Sciences

In this faculty, 45.9% of the students are men, 96.4% are single and 3.1% are married. Most of them (98.2%) are Spanish and only 0.6% are Moroccan. The largest group of fathers (43.1%) have received primary education, 25.2% have a degree and 9.2% do not have the primary education certificate. Of the fathers, 12.9% are classified as "high level professional" and the largest group (38.8%) work in the Public Administration. The largest group of mothers (43.1%) have received primary education, 11% do not have the primary education certificate and 16.7% have a degree. Most mothers (67.3%) have never had a remunerated job, only 4.9% of the mothers are housewives, 7.3% are classified as "high level professional" and 14.6% work in the Public Administration. Most students are unemployed (92.5%), 6.3% work more than 15 h weekly and only 1.1% work less than 15 h weekly. Most students (67.6%) are admitted to the university degree with an entrance examination, 20.3% are admitted as undergraduates and 10% have been admitted by passing a special entrance examination for those over 25. The academic performance rate averages 0.40, the success rate averages 0.60 and the average mark is 1.27. Most students (97.7%) pass the exam in the first round, 2.1% in the second round and 0.1% in the third round. In this faculty the drop out rate is 43.6%, the lowest of the three faculties.

PCA (see Table 6), with a KMO (Kaiser–Meyer–Olkin) measure of 0.677, shows that six components are needed to describe the data (these components explain 67.24% of the variability). Drop out is again associated with the academic performance rate, the success rate and the average mark. The sign of the coefficient in the rotated component matrix indicates that when the academic performance rate, the success rate and the average mark go up, drop out goes down.
To explain the drop out problem, a logistic regression model has also been obtained after eleven iterations of the stepwise method. This model explains 73.9% of the data, that is, 68.9% of the students who do not drop out and 81% of the students who abandon. In addition, at a significance level of 5%, the Hosmer and Lemeshow test shows that the model fits the sample data well, and the Wald tests show, with a p-value of 0.999, that the model obtained includes the variables of Table 7.
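If the percentages above are read as correct-classification rates (an assumption; the text only says that the model "explains" these shares), they can be computed from the observed outcomes and the fitted probabilities as in the following sketch. The function name, the 0.5 cut-off and the toy data are all hypothetical and are not taken from the study.

```python
def classification_rates(y_true, p_hat, cutoff=0.5):
    """Return (overall, non_drop_out, drop_out) fractions correctly classified.

    y_true: 1 if the student dropped out, 0 otherwise.
    p_hat:  predicted drop-out probabilities from a fitted model.
    """
    pairs = [(y, 1 if p >= cutoff else 0) for y, p in zip(y_true, p_hat)]

    def hit_rate(label):
        # Share of students with this observed label that the model predicts correctly.
        preds = [pred for y, pred in pairs if y == label]
        return sum(pred == label for pred in preds) / len(preds)

    overall = sum(y == pred for y, pred in pairs) / len(pairs)
    return overall, hit_rate(0), hit_rate(1)

# Hypothetical toy data, for illustration only.
y = [0, 0, 0, 0, 1, 1, 1, 1]
p = [0.1, 0.4, 0.6, 0.2, 0.9, 0.7, 0.3, 0.8]
overall, non_drop, drop = classification_rates(y, p)
print(overall, non_drop, drop)
```

The overall rate is a weighted average of the two class-wise rates, with weights given by the share of each group in the sample.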


Table 6
Rotated component matrix of the principal component analysis with n = 25,745 (six components).

| Variable | Loading on its component (1 to 6) |
| --- | --- |
| Academic performance rate | 0.869 |
| Success rate | 0.905 |
| Average mark | 0.822 |
| Abandon | 0.683 |
| Father’s job | 0.613 |
| Mother’s job | 0.703 |
| Father’s education level | 0.812 |
| Mother’s education level | 0.847 |
| Access year | 0.834 |
| Birth | 0.815 |
| Marital status | 0.510 |
| Student job | 0.561 |
| Exam round | 0.375 |
| Average round | 0.701 |
| Mode round | 0.860 |
| Family city | 0.761 |
| Country | 0.759 |
| Degree | 0.696 |
| Sex | 0.754 |

Total variance explained is 67.24%.
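How a figure such as "six components explain 67.24% of the variability" arises can be sketched from the eigenvalues of the correlation matrix. The eigenvalues below are hypothetical, and retaining the components whose eigenvalue exceeds 1 (the Kaiser criterion) is an assumption, since the retention rule used in the study is not stated here.

```python
def explained_variance(eigenvalues, min_eigenvalue=1.0):
    """Return (number of retained components, fraction of total variance)
    for the components whose eigenvalue exceeds `min_eigenvalue`."""
    total = sum(eigenvalues)
    retained = [ev for ev in eigenvalues if ev > min_eigenvalue]
    return len(retained), sum(retained) / total

# Hypothetical eigenvalues for 19 standardized variables (they sum to 19).
eigs = [3.9, 2.6, 2.1, 1.7, 1.4, 1.1, 0.9, 0.8, 0.7, 0.7,
        0.6, 0.5, 0.5, 0.4, 0.4, 0.3, 0.2, 0.1, 0.1]

n_components, fraction = explained_variance(eigs)
print(n_components, round(100 * fraction, 2))
```

With standardized variables the eigenvalues sum to the number of variables, so each retained eigenvalue divided by 19 is that component's share of the total variance.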

Table 7
Results of the stepwise logistic regression (a).

| Variable | B | SE | Wald | Sig. | Exp(B) |
| --- | --- | --- | --- | --- | --- |
| Father’s education level | | | 24.405 | 0.000 | |
| Father’s education level (3) | -0.715 | 0.312 | 5.253 | 0.022 | 0.489 |
| Father’s education level (5) | -1.057 | 0.321 | 10.845 | 0.001 | 0.348 |
| Mother’s education level | | | 14.322 | 0.014 | |
| Mother’s education level (5) | 0.862 | 0.364 | 5.610 | 0.018 | 2.368 |
| Academic performance rate | -4.555 | 0.253 | 324.936 | 0.000 | 0.011 |
| Average round | -2.814 | 0.264 | 113.907 | 0.000 | 0.060 |
| Mode round | | | 53.371 | 0.000 | |
| Mode round (1) | 3.279 | 0.466 | 49.621 | 0.000 | 26.557 |
| Mode round (3) | 4.710 | 1.717 | 7.521 | 0.006 | 110.998 |
| Access year | 0.082 | 0.027 | 9.071 | 0.003 | 1.086 |

a n = 25,745.
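The columns of Table 7 are linked by standard identities, which the following sketch checks for one row. It assumes the usual definitions: Wald = (B/SE)², a p-value from a chi-square distribution with one degree of freedom, and Exp(B) = e^B. The function name is illustrative, and the negative sign of B is an assumption inferred from Exp(B) being below 1.

```python
import math

def wald_summary(b, se):
    """Return (Wald statistic, p-value, odds ratio) for one coefficient.

    Wald = (B / SE)^2 is compared with a chi-square distribution with one
    degree of freedom, whose upper tail equals erfc(sqrt(Wald / 2)).
    """
    wald = (b / se) ** 2
    p_value = math.erfc(math.sqrt(wald / 2))
    return wald, p_value, math.exp(b)

# Father's education level (3): B = -0.715 (sign assumed), SE = 0.312.
wald, p_value, odds_ratio = wald_summary(-0.715, 0.312)
print(round(wald, 2), round(p_value, 3), round(odds_ratio, 3))
```

Small differences from the tabulated Wald values are expected, because B and SE are reported to three decimals only.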

Only values with significant odds have been included in Table 7. According to these results, the odds ratios show the following information:

– Father’s education level: The risk of abandoning is two times bigger (1/0.489 = 2.04) for students whose fathers have no studies than for students whose fathers have completed secondary school. The risk of abandoning is three times bigger (1/0.348 = 2.9) for students whose fathers have no studies than for students whose fathers have a degree.
– Mother’s education level: The risk of abandoning is approximately 2.4 times bigger ($e^{0.862} = 2.368$) for students whose mothers have no studies than for students whose mothers have a degree.
– Academic performance rate: If the academic performance rate is increased by 0.2, the odds that the student does not abandon will be multiplied by 2.5 ($1/e^{0.2 \cdot (-4.555)} = 2.49$).
– Mode round: The risk of abandoning for students who usually pass in the second round is 26 times bigger ($e^{3.279} = 26.557$) than for students who usually pass in the first round. The risk of abandoning for students who usually pass in the fourth round is about 111 times bigger ($e^{4.710} = 110.998$) than for students who usually pass in the first round.
– Access year: If the age of the student at the beginning of his degree is increased by 8 years, the risk of abandoning will be approximately doubled ($e^{8 \cdot 0.082} = 1.93$).

In the Economic Sciences faculty, as the access year of the degree increases, the academic performance decreases and the possibility of abandoning the degree increases. If the father has no studies, the student’s drop out tends to increase.

6. Discussion

From the observed results, we could verify that the rate of drop out in the different disciplines studied is over 40% and even exceeds 60% in the case of Humanities.
This corroborates the data of the Spanish Ministry of Education, which puts the drop out rate at Spanish universities at around 40% and rising, with higher rates registered in humanities and in technical sciences like Software Engineering. These data also coincide with various reports of the University Council on Indicators of Academic Performance, which reveal that humanities degrees present a low delay rate (15%) and the highest rate of drop out (43%), and that engineering presents the higher rate of delay and drop out (40%). Public opinion and the press point the same way, with more and more headlines like these: "University studies abandoned in Spain approximately doubles the European average" (the Spanish newspaper ABC, Madrid, March 2006) and "5000 students approximately abandon the university in the islands" (headline news in Canary Islands 7, August 2004). All of this is in agreement with the drop out results that we have seen in the degrees of our study (Fig. 3).

Our objective was to analyze the main variables that seemed to be behind drop out in different faculties: Software Engineering, Humanities and Economic Sciences. The results obtained suggest that certain variables appear repeatedly in the explanation of drop out in all of the faculties. These variables are, among others, start age, the father’s and mother’s studies, academic performance, success, average mark in the degree and the access form and, in some cases, the number of rounds needed to pass (Fig. 4).

Regarding start age, the results show that when start age increases the probability of drop out increases too. In fact, the great majority of the students that abandon a degree began their studies at over 20 years of age, while the great majority of the students that do not abandon began at 18 or 19. In Software Engineering the majority of students that abandon are over 21 and in Humanities they are over 22. This supports previous studies in which many of the students that abandon come from other degrees and are not very clear about which degree to choose for their education.
In addition, many of the students admitted to the university are older than 25 and, in many instances, the adaptation of an older student to the requirements and learning needs, and of the university to different student profiles, is not optimal, so a high rate of drop outs is triggered. For this reason, the variable access form (selectivity, FP2 or admission for those over 25, among others) also relates meaningfully to drop out in the numerous disciplines that have been studied (Figs. 5 and 6).

On the other hand, the father’s and the mother’s studies and job (highly correlated variables in the second or third component of the PCA, see Tables 2, 4 and 6) also have an important influence on drop out for the degrees studied. In nearly all the studies under review, factors of a psychoeducational character have a big influence on drop out at present (Last & Fulbrook, 2003). The relationship between family factors and high school drop out has been well documented (Fortin et al., 2006). We could observe in our study that the parents’ socioeconomic status correlates with drop out, showing that low socioeconomic status and low parental academic performance contribute to increasing the drop out probability. Family economic difficulties or the lack of financial help for studying force some students to study and work simultaneously, which in some cases creates incompatibilities that cause drop out (Sinclair & Dale, 2000). Nowadays a large percentage of students come from broken homes, and the divorce of the parents also contributes to financial hardship (Lessard et al., 2008). Among other causes of drop out, numerous studies show that family pressure is a determinant of great influence.
When students are making vocational decisions of an academic and professional character, many parents exert such strong pressure that the students cannot cope with it, and it leads them to drop out. In their study carried out in Wisconsin with prospective university students hoping to obtain a teaching qualification, Root, Rudawski, Taylor, and Rochon (2003) found that family pressures carried great weight in the decision to leave studies, particularly for men. The need to reproduce the professional roles of the parents in order to continue the family business frequently makes parents force their children to study particular degrees, or to abandon the one they began for lack of connection with the family role (Figs. 7 and 8).

Fig. 3. Drop out rates in the three faculties involved in the study (Software Engineering, Humanities, Economic Sciences; series: Non drop out, Drop out).

Fig. 4. Age the university studies start (Software Engineering, Humanities, Economic Sciences; series: Non drop out, Drop out).

Fig. 5. Statistical results of drop out for students whose fathers have not completed secondary school (Software Engineering, Humanities, Economic Sciences; series: Non drop out, Drop out).

Fig. 6. Statistical results of drop out for students whose fathers have completed secondary school (Software Engineering, Humanities, Economic Sciences; series: Non drop out, Drop out).

Fig. 7. Average academic performance rates (vertical axis: Percentage; Software Engineering, Humanities, Economic Sciences; series: Non drop out, Drop out).

Fig. 8. Average success rates (vertical axis: Percentage; Software Engineering, Humanities, Economic Sciences; series: Non drop out, Drop out).

Regarding academic performance, success and the students’ average mark, we have noticed that these three variables are also strongly associated with drop out: a low level of academic performance and success is connected with a high probability of abandoning. A great majority of studies report that students with high motivation and positive expectations toward academic performance do not consider abandoning, in spite of the fact that many of them face a lot of difficulties, and in the end they often achieve academic success (Landry, 2003). For some students, the adaptation to university life constitutes a challenge and a personal responsibility that leads them to make an effort and to look for the help necessary to attain the goals they have set. However, many of them fail in the attempt and stop half way. We therefore found that students with a good psychological profile for overcoming obstacles have greater persistence and, in consequence, a better adaptation. In this sense, Kirton (2000) found that the perception of the university environment and academic self-efficacy had a great influence on drop out in the first year, during the first semester of the course. Another determining factor is academic failure, that is, the non-attainment of success. This situation has been widely studied in Spain recently, and the conclusions point to a weak previous education that affects certain degrees specifically.

7. Conclusions

Summing up, the conclusions that we obtain after analyzing all the variables of our study are the following:

– Nowadays the drop out rates of our students are higher than previous studies reflected.
– In many instances, the factors associated with drop out have a multi-causal nature, and they are related as much to psychological, vital and generational characteristics as to the student’s educational characteristics.
– Practically all of the variables considered show a significant relation with the variable drop out. However, the logistic regression model obtained only includes some of them. As the structural analysis shows, this happens because some variables are highly correlated among themselves and therefore contribute the same information.
– It is possible to obtain a logistic regression model that enables us to calculate the students’ risk of drop out for each university faculty.
– Characteristics in the student body such as poor control of learning strategies and a low capacity of persistence to attain their aims, which translate into low rates of academic performance and success, imply a high risk of drop out from the degree.

Many universities have begun to design, implement and evaluate programs and strategies to increase the rates of persistence and to reduce the rates of drop out. In the strategic plan of the University of Granada there are numerous reports that have analyzed the university context. A doctoral thesis carried out at the Department of Psychology of the University of Granada has created an intervention program whose efficiency, efficacy and benefit have been confirmed by the University Council of Coordination. The Tutorial Program between Partners has adopted as its main intervention strategy tutoring sessions between students of different ages and academic years, in which the tutors have more knowledge and ability (for example, doctoral students and students from previous courses). A very interesting study for the future would be to evaluate students that use this kind of program and students that do not, in order to check whether significant differences in the drop out rates exist.
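The per-student risk mentioned in the conclusions above can be sketched as follows, assuming the standard logistic form p = 1 / (1 + e^(−z)), where z is the intercept plus the weighted sum of the predictors. This is a deliberately reduced illustration: only the academic performance rate coefficient of Table 7 is used (its negative sign is inferred from Exp(B) being below 1), and the intercept is hypothetical because the constant of the fitted model is not reported.

```python
import math

def drop_out_risk(intercept, coefficients, values):
    """Predicted probability of drop out from a logistic model."""
    z = intercept + sum(coefficients[name] * values[name] for name in coefficients)
    return 1.0 / (1.0 + math.exp(-z))

def odds(p):
    """Convert a probability into odds."""
    return p / (1.0 - p)

# Coefficient from Table 7 (sign assumed); the intercept is hypothetical.
coefs = {"academic_performance_rate": -4.555}
intercept = 1.0

risk_low_performance = drop_out_risk(intercept, coefs, {"academic_performance_rate": 0.2})
risk_high_performance = drop_out_risk(intercept, coefs, {"academic_performance_rate": 0.6})
print(round(risk_low_performance, 3), round(risk_high_performance, 3))
```

Whatever intercept is chosen, the ratio of the two odds depends only on the coefficient and on the difference between the two academic performance rates.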
From the results obtained in this study, it would be advisable for each university, when drawing up its plans to decrease the drop out rate, to take into account the risk of abandoning (based on a logistic regression model, as we did), or at least the academic performance and success rates, in order to design a more effective program that supervises the students at high risk of drop out. As stated before, the drop out phenomenon is highly subject dependent. The profiles of the students vary between faculties in the same university. For this reason, a method that can be applied straightforwardly is needed. We have developed a methodology based on a DW architecture which is general enough to be applied in any university centre. It uses the common data any university has about its students (personal information and grades) without the students having to be involved in the process. The most tedious part is the development of an application capable of transforming all the operational data, i.e. the data used to manage the students and their grades, into a DM focused on drop out. Some of the tasks this tool has to perform are the cleaning of the data, the determination of the drop out of each student and the summary of the academic history of each student in a single record (defining some rates).

Acknowledgements

We are grateful to the referees for their constructive comments and to Mª Carmen Aguilera and Mª Belén García for their assistance with the manuscript. This work has been supported by the Spanish Research Program under projects EA-2007-0228 and TIN2005-09098-C05-03 and by the Research Program under project GR2007/07-2.

References

Araque, F., Salguero, A., Martínez, L., Navarro, E., & Calero, M. D. (2007a). Data warehousing for improving web based learning sites. International Journal of Emerging Technologies in Learning, 2, 1–8.
Araque, F., Salguero, A., Calero, M. D., Fernández-Parra, A., Jiménez, M. I., Vives, M. C., et al. (2007b). E-learning platform as a teaching support in psychology. Lecture Notes in Computer Science, 4739.
Braak, J. (2001). Factors influencing the use of computer mediated communication by teachers in secondary schools. Computers & Education, 36, 41–57.
Callejo, J. (2001). A cohort study on UNED students: An approximation to drop-out analysis. Revista Iberoamericana de Educación a Distancia, 4(2), 33–69.
Feldman, R. S. (2005). Improving the first year of college: Research and practice. Mahwah, NJ: Lawrence Erlbaum Associates.
Forbes, A., & Wickens, E. (2005). A good social life helps students to stay the course. Times Higher Education Supplement, 1676, 58–63.
Fortin, L., Marcotte, D., Potvin, P., Royer, E., & Joly, J. (2006). Typology of student at risk of dropping out of school: Description by personal, family and school factors. European Journal of Psychology of Education, 21(4), 363–383.
Hosmer, D. W., & Lemeshow, S. (2004). Applied logistic regression. Textbook and solutions manual (2nd ed.). New York, USA: John Wiley and Sons.
Inmon, W. H. (2005). Building the data warehouse (3rd ed.). New York, USA: John Wiley and Sons.
Kimball, R., & Ross, M. W. H. (2002). The data warehouse tool kit: The complete guide to dimensional modelling (2nd ed.). John Wiley and Sons.
Kirton, M. J. (2000). Transitional factors influencing the academic persistence of first semester undergraduate freshmen. Dissertation Abstracts International Section A: Humanities and Social Sciences, 61(2-A), 522.
Landry, C. C. (2003). Self-efficacy, motivation and outcome expectation correlates of college students’ intention certainty. Dissertation Abstracts International Section A: Humanities and Social Sciences, 64(3-A), 825.
Last, L., & Fulbrook, P. (2003). Why do student nurses leave? Suggestions from a Delphi study. Nurse Education Today, 23(6), 449–458.
Lessard, A., Fortin, L., Joly, J., Royer, É., & Blaya, C. (2004). Students at-risk for dropping out of school: Are there gender differences among personal, family and school factors? Journal of At-Risk Issues, 10(2), 91–127.
Lessard, A., Butler-Kisber, L., Fortin, L., Marcotte, D., Potvin, P., & Royer, É. (2008). Shades of disengagement: High school dropouts speak out. Social Psychology of Education, 11, 25–42.
Levy, P. S., & Lemeshow, S. (2008). Sampling of populations: Methods and applications (4th ed.). New York, USA: John Wiley and Sons.
Levy, P. S., & Lemeshow, S. (2003). Sampling of populations: Methods and applications. Solutions manual (3rd ed.). New York, USA: John Wiley and Sons.
Lightsey, O. R. (2006). Resilience, meaning and well-being. Counseling Psychologist, 34(1), 96–107.
Martins, C. B. M. J., Steil, A. V., & Todesco, J. L. (2004). Factors influencing the adoption of the internet as a teaching tool at foreign language schools. Computers & Education, 42, 353–374.
Orazem, V. (2000). Understanding why they stay and why they leave: A grounded theory investigation of undecided students at a rural grant institution. Dissertation Abstracts International Section A: Humanities and Social Sciences, 61(6-A), 2214.
Rainardi, V. (2007). Building a data warehouse: With examples in SQL server. Berkeley, California, USA: Apress.
Root, S., Rudawski, A., Taylor, M., & Rochon, R. (2003). Attrition of Hmong students in teacher education programs. Bilingual Research Journal, 27(1), 137–148.
Saunders, J., Davis, L., Williams, T., & Williams, J. (2004). Gender differences in self-perceptions and academic outcomes: A study of African American high school students. Journal of Youth and Adolescence, 33(1), 81–90.
Sinclair, H., & Dale, T. (2000). The effect of student tuition fees on the diversity of intake within a Scottish new university. Paper presented at British Educational Research Association Annual Conference, 7–9 September, 2000, Cardiff University.
Thomas, L. (2002). Student retention in higher education: The role of institutional habitus. Journal of Education Policy, 17(4), 423–442.
Wasserman, K. N. (2000). Psychological and development differences between students who withdraw from college for personal-psychological reasons and continuing students. Dissertation Abstracts International Section A: Humanities and Social Sciences, 62(3-A), 915.