Challenges to assessing motivation in MOOC learners: An application of an argument-based approach


Journal Pre-proof
Kerrie A. Douglas, Hillary E. Merzdorf, Nathan M. Hicks, Muhammad Ihsanulhaq Sarfraz, Peter Bermel
PII: S0360-1315(20)30031-2
DOI: https://doi.org/10.1016/j.compedu.2020.103829
Reference: CAE 103829
To appear in: Computers & Education
Received: 7 September 2019; Revised: 6 January 2020; Accepted: 2 February 2020

Please cite this article as: Douglas, K. A., Merzdorf, H. E., Hicks, N. M., Ihsanulhaq Sarfraz, M., & Bermel, P., Challenges to assessing motivation in MOOC learners: An application of an argument-based approach, Computers & Education (2020), doi: https://doi.org/10.1016/j.compedu.2020.103829. © 2020 Published by Elsevier Ltd.

CRediT author statement
Kerrie A. Douglas: Conceptualization, Methodology, Writing – Original Draft, Writing – Review & Editing.
Hillary E. Merzdorf: Formal Analysis, Writing – Original Draft, Writing – Review & Editing, Visualization.
Nathan M. Hicks: Formal Analysis, Data Curation, Writing – Original Draft, Writing – Review & Editing.
Muhammad Ihsanulhaq Sarfraz: Formal Analysis, Data Curation, Writing – Original Draft, Writing – Review & Editing, Visualization.
Peter Bermel: Conceptualization, Writing – Original Draft, Writing – Review & Editing.

Challenges to Assessing Motivation in MOOC Learners: An Application of an Argument-Based Approach

Kerrie A. Douglas (Corresponding Author), Hillary E. Merzdorf, and Nathan M. Hicks
School of Engineering Education, Purdue University, Seng Liang Wang Hall, 516 Northwestern Ave., West Lafayette, IN 47906, USA. Email: [email protected], [email protected], [email protected]

Muhammad Ihsanulhaq Sarfraz
School of Electrical and Computer Engineering, 465 Northwestern Avenue, Purdue University, West Lafayette, IN 47906, USA. Email: [email protected]

Peter Bermel
School of Electrical and Computer Engineering, Birck Nanotechnology Center, 1205 West State Street, Purdue University, West Lafayette, IN 47907, USA. Email: [email protected]

Abstract

While data for examining learner activity are abundant, there are relatively few frameworks for principled interpretation of that behavior. Through application of Kane's argument-based approach to assessment validation, the authors conducted several analyses to combine motivation assessment with online behavioral data to further validate inferences made about MOOC learners. The EVC motivation scale was administered to learners in three advanced engineering MOOCs, and event log data were collected from the online learning platform. The EVC items were comprehensively examined (n = 661) through factor analysis, item response theory, and linear regression. Results indicated that the instrument retained a three-factor structure as well as strict and structural invariance across age groups and education levels, but the Expectancy and Value items suffered from significant ceiling effects, a key difficulty in assessing motivation using this approach. As part of a regression model with learners' intentions, EVC scores did not account for a significant amount of variance in learners' assessment outcomes and course behavior. Challenges remain for adjusting to learners' high expectancy and value of courses at the beginning, and for fully understanding how scores relate to learner behavior.

Keywords: validation, motivation, assessment, MOOC



1. Introduction

The digital educational landscape has changed in many ways in recent years, increasing the challenges and opportunities in educational assessment and validation. Recent advances in educational technology, data science, and artificial intelligence enable practitioners and researchers to assess learners in innovative ways. Intelligent tutors and artificial intelligence systems can capture every move a learner makes in performing a task and then assess the learner's competence in that skill (e.g., Keshavabhotla et al., 2017). Process data in online courses can contribute to assessment data, as they include behavioral information about how individuals interact with the course (Thille et al., 2014). Learning management systems and other online platforms can capture learner study habits and then link these data to grades (e.g., Kline, 2017). Researchers have begun to make inferences about what students are learning and how they make decisions, in part through their observed clickstream data (e.g., Chao et al., 2017).

Although learner interaction data are at times used for assessment purposes, there is relatively little discussion or research connecting modern views of validity from educational assessment to the new forms of data or the advancing data science methods (Romero & Ventura, 2016). The evidence and rationale supporting inferences based on analyses of learner behavior patterns are frequently unavailable or are based solely on statistical validation techniques not specific to contexts where one seeks to make inferences about a latent variable, such as motivation or knowledge (Williams, Douglas, Yellamraju, & Boutin, 2018). Findings corresponding to predictions made by applying machine learning algorithms to educational data are typically validated with respect to intrinsic metrics (e.g., accuracy of the prediction model with the available data). While model fit is certainly important, it does not capture all of the necessary information for justifiably interpreting what the behavior means from a learning perspective. Interpreting the findings and making inferences about learners' latent traits requires a principled and holistic approach to validation. The more that sophisticated data science methods are used to make decisions of personal consequence, the more evidence and rationale should be provided to justify the interpretation and use.

The educational measurement community has spent decades debating and redefining what the term validity means. Historically, validity was understood according to type – criterion validity, construct validity, content validity (Cronbach & Meehl, 1955). However, in the 1980s and 1990s, Messick argued for a unified validity theory, in which all types of validity were subsumed under construct validity (Messick, 1995). Messick asserted that the historical three types of validity were important, but insufficient to justify all the interpretations and uses of educational assessment (Messick, 1995). Rather than types of validity, the unified approach emphasized that validity is concerned with accurately measuring the latent construct. Messick (1995) identified six aspects of validity, including areas such as consequential and substantive. Validity is not a property of an assessment, but rather of how the assessment results are interpreted


and used (APA, AERA, & NCME, 2014). Kane (1992) extended the conceptualization of unified validity and linked it to the process of argumentation as articulated by Toulmin in the context of legal reasoning. The 1999 Standards (APA, AERA, & NCME, 1999) adopted the unified view of validity, and the 2014 Standards (APA, AERA, & NCME, 2014) reaffirmed unified validation and stated that it is the users' responsibility to ensure that validity evidence supports the intended use. Rather than speaking of whether an assessment is valid in a broad or generic way, validity is an argument about how the assessment scores can be justifiably interpreted and then used. Similar to the understanding that hypotheses can never be proven, only supported, there is no such thing as a perfectly valid assessment (Douglas & Purzer, 2015).

At the core of modern assessment validation is argumentation: making a claim based on relevant evidence (cite Toulmin). The claims made are about learners' competencies, based on theory and research regarding the nature of the competency and other relevant evidence (Kane, 1992; Pellegrino, 2013). The argument-based approach to validity emphasizes evaluating evidence to support a specific use and interpretation, in light of the potential resulting consequences (Kane, 2016). The validity of the use and interpretation of an assessment score is essentially an evaluative judgment formed by testing the plausibility of a desired claim. To make the underlying assumptions more explicit, one first identifies the hypotheses that would need to be supported in order to make the desired claims. From this approach, establishing the validity of interpretations and claims made from scores is the primary vehicle for ethical assessment practice.

With advances in data science used for understanding latent learner attributes, there is a need for principled approaches regarding how to make justifiable inferences based on results. The argument-based approach to validity can serve as a methodological framework guiding how to combine traditional forms of assessment data with learner interaction data analyzed through emerging data analytic techniques. With such a framework, emerging data science techniques can be combined with assessments to provide stronger levels of evidence to justify claims made about learners based on the results.

The purpose of this work is to demonstrate applications of the argument-based approach to validity to support the valid interpretation of assessment scores and learner behavior in the context of learner motivation in massive open online courses (MOOCs). Inferences can be made about learners' latent constructs (e.g., knowledge, motivation) from behavior that is captured via multiple online mechanisms, not just test scores, for a variety of educational decisions and purposes (Kline, 2017; Madhavan & Richey, 2016). Understanding learner motivation requires an instrument validated for use within the setting of a MOOC learning environment and audience. First, we review the recent literature on the motivations of MOOC learners. We then examine the plausibility of using the Expectancy-Value-Cost Scale, a previously published motivation scale established as predictive of academic success in K-12 settings (Kosovich, Hulleman, Barron, & Getty, 2015). To develop an argument


for applying the EVC scale to MOOC environments, we outline the claims we wish to make from the scores and what would need to be true about the instrument to make those claims. We investigate sources of evidence for the validity of using our motivational scale in this context. Using modern approaches to measurement, learner performance, and data science techniques from learning analytics, we articulate the rationale and evidence supporting the valid use and interpretation of the EVC scale in MOOC settings, as well as the corresponding barriers to use. Finally, we conclude with a discussion of the key findings and limitations of this combined attempt at validation, and implications for future work on MOOC assessment validity.

1.1. Literature Review

1.1.1. Massive Open Online Courses

Of the many open educational resources developed in the past two decades, MOOCs have come under the most scrutiny for their potential and performance. Praised early on for their promise of revolutionizing the educational landscape (Head, 2012), they have more recently received criticism for low completion rates (Kolowich, 2013; Jordan, 2015) and ineffective pedagogy (Ross, Sinclair, Knox, Bayne, & Macleod, 2014; Riel & Lawless, 2017). However, criticism grounded in comparisons to more traditional educational formats fails to account for key characteristics that differentiate MOOCs. In addition, just as course quality varies considerably within any university or college campus, MOOCs vary a great deal in quality. MOOCs face a number of challenges that differ from formal education: participation is completely open, voluntary, and free, and there are no penalties or consequences for disengagement. Since MOOC enrollment is open to anyone, there are no mechanisms to prevent individuals who lack crucial prior knowledge from enrolling, a particularly relevant issue in more advanced, technical MOOCs. Anyone in the world with Internet access can become a learner, which means that MOOCs can assemble incredibly large and heterogeneous audiences. These audiences typically include learners from a wide variety of backgrounds, with potentially different or more diverse reasons for participating than in a more traditional educational setting (Liyanagunawardena, Lundqvist, & Williams, 2015; Author et al., 2016c).

As MOOC audiences have grown in size and diversity, MOOC provider platforms have recognized a need to support both independent learners and students at formal educational institutions. Some platforms, for instance, have created structured programs and degrees, such as edX's MicroMasters and Coursera's Micro-Credential, which provide organized progressions of classes for students to earn certificates. Similarly, FutureLearn offers access to courses required for postsecondary degrees at six universities on their platform (FutureLearn, 2019). These course offerings have grown common; the 2018 Class Central Report lists 370 open online versions of full college courses that can be attended for credit by tuition-paying students or can be audited


by non-students (Shah, 2018). Even with recent shifts towards for-profit education, MOOCs represent the evolution of open learning by supporting traditional institutions as well as workplaces and non-traditional students.

While the ability of open online courses to reach massive numbers of students, sometimes in the tens of thousands, is advantageous for improving the global quality of education, MOOCs do present several logistical challenges for course instructors. With large, diverse audiences, instructors often know little about who enrolls in a course and their reasons for doing so. To measure learner and course success, instructors and researchers often rely on high-level data such as demographics and overall ratings of course satisfaction (e.g., Morris, Hotchkiss, & Swinnerton, 2015). Learner completion rate is still a commonly analyzed variable in MOOCs as an indicator of impact or outcome (Pursel, Zhang, Jablokow, Choi, & Velegol, 2016; Jordan, 2015; Allione & Stein, 2016). However, the varying intentions of learners upon enrollment make completion rates a highly questionable metric of performance (Koller, Ng, Do, & Chen, 2013). Furthermore, the intentions and motivation of many groups of learners remain unclear, in part due to inconsistent conceptualizations of motivation in the MOOC literature (Author et al., 2016b). Thus, to help instructors make justifiable interpretations about learners and learner outcomes, we must develop a stronger understanding of MOOC learner motivation.

1.1.2. Motivation in MOOCs

Motivation is an active topic in the MOOC literature, associated with many aspects of learner experiences and intentions (Littlejohn, Hood, Milligan, & Mustain, 2016). Despite being an active topic, there has been little convergence of focus, as motivation has been inconsistently defined and measured across the MOOC literature. While many studies embrace a single view of motivation for student learning, a wide range of conceptualizations have been used. In some cases, multiple perspectives of motivation are accepted (e.g., Barak, Watted, & Haick, 2016). On the other hand, other studies present motivation but do not fully relate it to student learning (e.g., Li, Wang, & Tan, 2018).

Across the MOOC literature, several different views of motivation have been studied. One approach examines the link between intentions and motivation using self-determination theory (Ryan & Deci, 2000), which points to externally controlled or internally autonomous sources of motivation (Zhou, 2016). This perspective suggests that the basic needs of autonomy, competence, relatedness, and belonging characterize learner experiences in MOOCs (Durksen, Chu, Ahmad, Radil, & Daniels, 2016). Another perspective focuses primarily on the internally autonomous, or intrinsic, source of motivation in MOOCs by measuring a variety of goal-directed constructs, including MOOC learners' engagement with others in the course, their control over their learning, and their abilities to attain challenging goals or satisfy curiosities (Shroff, Govel, Coombes, & Lee, 2007). Kizilcec, Pérez-Sanagustín, and Maldonado (2017) examined the ability of learners' intrinsic and extrinsic motivations, such as personal growth,


relevance to job, school, or research, career change, or social interactions, to predict their use of self-regulation strategies. While relevance of the course to research was a significant positive predictor of self-regulation strategies, Kizilcec et al. (2017) found moderate predictiveness of extrinsic motivation factors and low to negative predictiveness of intrinsic motivation. Motivation has also been indirectly measured through learners' future goals and objectives, such as networking, career advancement, and self-improvement (Sooryanarayan & Gupta, 2015). When motivation is defined as the intention to persist, factors predictive of course completion have been common metrics (Pursel et al., 2016; Onah, Sinclair, & Boyatt, 2015). Under this perspective, incomplete assignments, signs of difficulty, and low time commitments indicate low motivation (Thille et al., 2014), while high interactivity, support, and presentation of content suggest high motivation (Deshpande & Chukhlomin, 2017). While these indicators of motivation are supported by educational research, not all may be equally constructive in understanding MOOC learners.

As these different perspectives show, motivation can be measured in a variety of ways, which may not be equally relevant for any given context. For instance, some scales for motivation might be meaningful in a professional work setting, but not in an educational setting. Unfortunately, prior surveys of MOOC learners' motivations and goals possess minimal explanatory power in terms of learner behavior or achievement (Evans, Baker, & Dee, 2016). Thus, the need remains for MOOC surveys of motivation that more accurately assess the constructs they are intended to measure.

We previously attempted to understand learners' motivations in advanced nanotechnology MOOCs by characterizing learners using the intrinsic and extrinsic facets of motivation derived from Ryan and Deci's (2000) self-determination theory (Author et al., 2016c). While some learners in the advanced STEM MOOCs took the courses for official professional development credits, many learners enrolled voluntarily based on specific or broad interests, or aspirations of changing careers or being promoted (Author et al., 2016c). As expected, these learners demonstrated high levels of intrinsic motivation and variable levels of extrinsic motivation. We also saw that learners in these courses exhibited usage behaviors that classified them into one of approximately five behavioral clusters (Author et al., 2016c). However, we found that these two motivation scales were ineffective at identifying differences in motivations across these clusters, due to a ceiling effect for the intrinsic motivation scale and a relatively uniform distribution of extrinsic motivation levels across users. This result points to a key challenge: if motivation drives differences in learner behavior, an alternative theoretical framework is necessary to develop a new set of motivation scales for classifying learners and making predictions about their usage and performance in the class. One candidate for such a theoretical framework and corresponding scale is discussed next.

1.1.3. Expectancy-Value-Cost Theory


An alternative model of motivation is the Expectancy-Value-Cost model (Barron & Hulleman, 2015). As an extension of the older Expectancy-Value model (Eccles et al., 1983), it classifies motivation with regard to three broad questions, each addressing one model component: "Can I do the task?"; "Do I want to do the task?"; and "What will doing this task cost me?" The expectancy dimension measures ability beliefs about students' current self-performance expectations and expectancy beliefs about their anticipated performance. The value dimension measures intrinsic value, where the task is its own reward; utility value of the task for achieving other goals; and attainment value, which reflects students' identification with the task. The cost dimension is believed to subsume costs associated with time, effort, and emotional or psychological well-being. The EVC model's focus on learners' beliefs about their ability to learn and achieve outcomes, about the enjoyability and usefulness of achieving outcomes, and about the cost required to do so makes the model particularly well suited for studying student motivation in coursework (Barron & Hulleman, 2015).

In an early analysis of the EVC scale, Kosovich et al. (2015) conducted a confirmatory factor analysis (CFA) and analyzed measurement invariance across gender, subject area, and time using a group of 547 middle school students. Their results indicated that a three-factor structure remained intact for the given population. They also found that the EVC scale displayed scalar invariance for gender and equal error invariance across academic domains and time (Kosovich et al., 2015). Given its relatively recent introduction, cost has been under-analyzed in the motivation literature compared with the expectancy and value components tested in prior expectancy-value studies (Flake, Barron, Hulleman, McCoach, & Welsh, 2015). Flake et al. (2015) determined that cost is composed of four primary elements: 1) task effort, the time and work required by participation; 2) outside effort, the external commitments and responsibilities that prevent a student from putting in enough time and work; 3) loss of valued alternatives, the other desirable activities sacrificed in order to participate; and 4) emotional costs, the negative reactions of stress, fatigue, or anxiety produced by participation.

In a recent review of 63 learning interventions seeking to improve student motivation, Hulleman et al. (2015) found that expectancy interventions often include retraining cognitive attributions and promoting more growth-oriented mindsets, whereas value interventions emphasize the utility of subject material and allow more student choice and control. Cost interventions frequently helped students reaffirm their personal values and feelings of belonging in the educational context. Such pedagogical changes may or may not be helpful for MOOC learners, as they may have a different sense of expected outcomes, participation value, and necessary costs when enrolling and persisting. While studies have been conducted to understand MOOC learners' motivation in general, audiences can be expected to vary across types of courses. Therefore, learners in more advanced, technical courses warrant their own independent analyses. Once a sensitive


assessment of motivation can be determined, potential interventions to support MOOC learners' achievement of personal goals can begin to be tested.

1.2 Theoretical Framework

Researchers or evaluators often wish to use a published assessment instrument but are not necessarily in a position to conduct a full validation of that instrument themselves. Author et al. (2017) presented an application of Kane's (1992, 2013) framework for users of published instruments. The framework emphasizes the need to examine the coherence, completeness, and plausibility of claims regarding inferences and interpretations drawn from the implementation of an assessment instrument. It suggests a chain of reasoning to determine the appropriate use of a previously published assessment for one's own purpose. The steps are:

1. Articulate the desired use by outlining the purpose in terms of who will be assessed, why, and the information expected to be obtained with the assessment.
2. Specify the intended use in terms of the context in which it will be administered, the decisions expected to be made with the results, and identification of potential consequences of use.
3. Identify the specific desired inferences in terms of how the scores will be interpreted and the claims that will be made.

Once the intended use has been articulated, the framework requires the construction of arguments based on the claims necessary to justify the instrument's use, the assumptions associated with those claims, and the sources of evidence that would support them.

The EVC scale is a strong theoretical model of motivation for traditional educational settings. However, before the model can be used in a non-traditional setting, the measurement instrument must be validated for that particular use case. Following the recommended procedures outlined by Author et al. (2017), the expected purpose, use, and inferences for our assessment are outlined in Figure 1. The next step presented by Author et al. (2017) is to think critically about the assumptions behind each desired claim. To make the inferences described in the articulation for use, it is necessary to identify the underlying claims that contribute to those inferences and the underlying assumptions behind the claims. To make a claim, one must depend upon assumptions about the instrument and the learners. The validity of these assumptions must be supported by empirical evidence using an appropriate method of analysis. Table 1 outlines the claims that we are making to support the use of the EVC instrument to assess learner motivation in MOOCs, the assumptions that must be made to make those claims, and the evidence we will use to support those assumptions in terms of our methods of analysis.

1.3 Research Questions


As the EVC scale is still relatively new, it has not yet been tested on a wide range of populations. In contrast to the students surveyed by Kosovich et al. (2015), students in MOOCs are likely to have a stronger sense of their expected outcomes, participation value, and necessary costs when deciding to enroll in open access courses; however, the population consists of a much wider range of ages, education levels, and nationalities than prior studies, which may influence interpretations of the instrument. Thus, we wanted to determine if the original trait structure seen in traditional educational settings remains consistent when the scale is used in advanced nanotechnology MOOCs. We also sought to obtain statistics on the performance of each item in the scale. Based on our theoretical framework, this goal gives rise to four sub-questions:

1. What is the factor structure of the EVC instrument for learners in advanced nanotechnology MOOCs?
2. How is the factor structure interpreted across sub-populations that differ from previously validated use cases (e.g., across age groups and education levels)?
3. To what extent are items on the EVC scale sensitive enough to capture variation in learners' motivation?
4. To what extent do scores on the EVC scale predict how learners engage with MOOC materials?

In theory, measuring EVC can help support learner success by indicating appropriate expectancy, value, or cost interventions and course modifications that would benefit individual learners or groups of learners. By examining the properties of the scale before implementing it widely across MOOC courses, we will support the validity of inferences and conclusions about MOOC learner motivation using the EVC scale.

2. Methods

2.1 Study One Methods

The first stage of our investigation (Study One) examined the psychometric properties of the EVC instrument using quantitative techniques. We attempted to collect evidence to support our first three claims in Table 1, which map to the first three research questions.

2.1.1. Context, Participants, and Data Cleaning

Data for this study were obtained from three advanced nanotechnology MOOCs offered by [blinded] on the edX platform: [names blinded]. Because these courses are self-paced, learners could participate either in live or archived mode. After removing preview and test responses, Course 1 had 381 learners (live = 262, archived = 119), Course 2 had 291 learners (live = 225, archived = 64), and Course 3 had 442 learners (live = 320, archived = 122). In each of the three


courses, learners are shown several pages of syllabus-like content titled "Course Overview" with course prerequisites, learning objectives, structure, grading policy, schedule, and instructions for using the MOOC platform. Next, learners are expected to take a pre-course assessment, a feature implemented in more recent MOOCs to better estimate the learners' incoming content knowledge. Immediately following the pre-course assessment is the pre-course survey, which includes the EVC scale and consists of between 39 and 46 questions, depending on whether follow-up questions are asked after selected responses. However, like the pre-course assessment, the course design permits a learner to skip the survey instrument altogether. Unlike the traditional learning environments in which the original EVC scale was used, MOOCs attract diverse learners from across the world by age, education level, and nationality. The demographics of the learners across the three courses used in these studies are presented in Table 2.

The pre-course survey consisted of several demographic questions, questions about learning goals, intended use of the course, reasons for taking the course, and questions about motivation. Motivation was measured by a modified version of the EVC instrument, based on the original scale developed by Kosovich et al. (2015) (see Appendix). We removed the first cost item ("My [subject] classwork requires too much time."), as it assumes prior experience with a course to answer. Further, we had to modify the wording of some items slightly to fit our context. Answer choices used a six-point Likert scale from 1 (Strongly Disagree) to 6 (Strongly Agree). Learners who responded to less than two-thirds of the questions within each subscale were removed, leaving n = 719; two learners whose number of page clicks was fewer than their number of responses were also removed, leaving n = 717. This removal resulted in negligible missing data. For learners who answered two-thirds of a given set of subscale questions, we imputed the incomplete responses based on the average of the other two responses in the subscale. Using a cumulative distribution function, we show the completion time of learners by the percentage of total responses to the survey (Figure 2). Based on this plot, we set our cutoff at the initial point of inflection, excluding responses submitted in less than 25 seconds and leaving a final sample of n = 661. Figure 3 shows a histogram of completion times for learners who answered the surveys.

2.1.2 Factor Analysis

We used confirmatory factor analysis (CFA) to model the underlying factor structure of the instrument using the Mplus 8.0 program (Muthén & Muthén, 2017), in order to identify the latent variables measured by each item (Fabrigar & Wegener, 2011). In an exploratory context where researchers cannot predict the instrument's trait structure, factor analysis is useful for selecting a parsimonious number of latent constructs for the model, based on factor loadings and a scree plot of eigenvalues to suggest a likely number of underlying traits (Fabrigar, Wegener, MacCallum, & Strahan, 1999). When an item strongly loads onto a single factor, it is likely measuring a single trait. Given ample literature supporting a three-factor structure for the EVC scale, we used a confirmatory factor analysis only.
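As an illustration of the screening rules and the three-factor model just described, the sketch below applies the 25-second cutoff, the two-thirds subscale rule with single-item mean imputation, and a confirmatory three-factor specification. It is a minimal sketch rather than the Mplus 8.0 workflow used in this study: the column names (e1-e3, v1-v3, c1-c3, completion_sec) and file name are hypothetical, and the Python package semopy assumed here fits the model by maximum likelihood rather than the ordinal WLSMV estimation reported in the Results.

```python
import pandas as pd
from semopy import Model, calc_stats  # assumes the semopy package is installed

# Hypothetical item names for the three subscales (three items each after removing one Cost item).
SUBSCALES = {"Expectancy": ["e1", "e2", "e3"],
             "Value": ["v1", "v2", "v3"],
             "Cost": ["c1", "c2", "c3"]}

def screen_and_impute(df: pd.DataFrame) -> pd.DataFrame:
    """Apply the Section 2.1.1 cleaning rules: drop responses faster than 25 s,
    require at least two of three items per subscale, and impute a single
    missing item from the mean of the other two items in that subscale."""
    df = df[df["completion_sec"] >= 25].copy()
    for items in SUBSCALES.values():
        df = df[df[items].notna().sum(axis=1) >= 2]
        subscale_mean = df[items].mean(axis=1)
        for col in items:
            df[col] = df[col].fillna(subscale_mean)
    return df

# Three-factor CFA: each item loads on exactly one factor (no cross-loadings).
MODEL_DESC = """
Expectancy =~ e1 + e2 + e3
Value =~ v1 + v2 + v3
Cost =~ c1 + c2 + c3
"""

responses = screen_and_impute(pd.read_csv("precourse_survey.csv"))  # hypothetical file
items = [col for cols in SUBSCALES.values() for col in cols]
cfa = Model(MODEL_DESC)
cfa.fit(responses[items])
print(cfa.inspect())    # factor loadings and variances
print(calc_stats(cfa))  # chi-square, CFI, TLI, RMSEA, and other fit indices
```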


2.1.3. Measurement Invariance

Measurement invariance, or multiple-group invariance, is a method to determine the extent to which a factor model is consistent across different groups (Dimitrov, 2010). Using CFA models in Mplus 8.0, we applied the forward (sequential constraint imposition) approach to testing measurement invariance for age and education level (Dimitrov, 2010). The EVC scale has previously been shown to possess acceptable invariance across gender and subject area (Kosovich et al., 2015). Given the previous findings of invariance for subject area and that our three courses were all within electrical and computer engineering, we chose not to test for invariance across courses. Further, the three courses had large sample size differences for gender, making invariance analysis for gender unfeasible; however, the previous findings of invariance across gender were considered sufficient evidence given the circumstances.

Across these analyses, the fit of each model was determined using goodness-of-fit indices (Jöreskog, 1967). Specifically, Brown (2006), Byrne (2012), and Dimitrov (2010) recommend evaluating degrees of freedom, Chi-square, the comparative fit index (CFI), the Root Mean Square Error of Approximation (RMSEA), the change in degrees of freedom, Chi-square, and CFI between nested models, and the p-values of whether Chi-square changes are significant. It is noted, however, that Chi-square values are strongly affected by sample size and often result in significant p-values with large samples. As the distribution of responses to the expectancy and value questions was heavily skewed toward agreement, several subgroups did not select all response categories for all items. Mplus 8.0 requires each group to have data in all response categories to perform measurement invariance. Therefore, the scale was collapsed for the Expectancy and Value items such that all disagreement responses were combined into a single response level, resulting in four levels of response rather than the six levels maintained for the Cost items. Table 3 shows each model tested and their constraints.

2.1.4. Item Response Theory Analysis

We used item response theory (IRT) to model the relationships between the three dimensions of the EVC scale and MOOC learner responses, as well as to obtain discrimination indices and information curves for each item. A graded response model (GRM), also known as a Cumulative Category Response Function (DeMars, 2010), was fit to the data, as it was developed by Thissen, Pommerich, Billeaud, and Williams (1995) for use with scores having ordered polytomous response categories (i.e., responses that can be divided into more than two ordered categories). It is a two-parameter IRT model used for polytomous items that approximates the probability, at a given trait level, of choosing each response option or higher for each question in the instrument (Samejima, 1997). Each likelihood is estimated using equation (1):

P*_ik(θ) = exp[a_i(θ − b_ik)] / {1 + exp[a_i(θ − b_ik)]}     (1)

where P*_ik(θ) represents the probability of selecting response option k or higher for item i, a_i represents the discrimination parameter that determines the slope of the response curves for each item, b_ik is the threshold parameter at which there is a 0.5 probability of choosing the response immediately above or below k, and θ represents each respondent's level on the trait measured by item i. The discrimination parameter, a_i, indicates how steeply the probability of selecting response category k changes for item i as the person's position on the trait continuum θ increases or decreases.
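To make equation (1) concrete, the short sketch below evaluates GRM category probabilities for one item at a given trait level; adjacent boundary curves are differenced so the category probabilities sum to one. The parameter values are hypothetical and are not estimates from our data.

```python
import numpy as np

def grm_boundary_prob(theta: float, a: float, b: float) -> float:
    """P*_ik from equation (1): probability of responding in category k or higher."""
    return 1.0 / (1.0 + np.exp(-a * (theta - b)))

def grm_category_probs(theta: float, a: float, thresholds: list) -> np.ndarray:
    """Probability of each response category for one item at trait level theta.
    `thresholds` holds the ordered b_ik values, one per boundary between categories."""
    boundaries = np.array([1.0]                                   # at or above the lowest category
                          + [grm_boundary_prob(theta, a, b) for b in thresholds]
                          + [0.0])                                # above the highest category
    return boundaries[:-1] - boundaries[1:]                       # P(k) = P*(k) - P*(k+1)

# Hypothetical parameters for a six-category Cost item (five thresholds).
probs = grm_category_probs(theta=0.0, a=1.8, thresholds=[-2.0, -1.0, 0.0, 1.0, 2.0])
print(probs, probs.sum())  # six category probabilities that sum to 1
```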

2.3 Study Two Methods

Following our psychometric analyses, the second stage (Study Two) consisted of a quantitative analysis of behavioral data to investigate the extent to which the pre-course EVC instrument related to learner engagement and performance in a course. A correlation between motivation scores measured through the EVC scale and course engagement would provide external evidence for the validity of using the EVC scale in an advanced nanotechnology MOOC. Pre-course EVC responses were linked to course event logs to perform multiple regression analysis. This analysis is used to support inferences regarding the instructional relevance of EVC scores for practical use in these courses.

2.3.1. Context, Participants, and Data Cleaning

Data for this study are a subset of the data used in Study One and were obtained from an advanced STEM MOOC offered by [Institution Blinded] on the edX platform. The dataset combines pre-course survey data containing the nine Expectancy-Value-Cost items described in Study One, clickstream (event log) data, and assessment scores for one of the three courses. Such data provide a comprehensive overview of what transpires in an advanced STEM MOOC, allowing the combination of self-reported survey data with actual behaviors and outcomes. The pre-course survey administered to students in this MOOC was specifically designed to test the validity of the EVC scale and was created with this evaluation in mind.

The first step was to apply a sampling method for learners who took the pre-course survey similar to that used in Study One. While thousands of students took the course, only 5.1% of learners took the survey, resulting in an initial dataset of n = 445. Learners who responded to less than two-thirds of the questions within each subscale were removed, leaving n = 286. We evaluated how long respondents took on the survey and removed careless responses, identified by respondents going through the survey more quickly than is cognitively plausible (< 25 s). In addition, responses with no variation within each subscale (i.e., respondents who always answered a 5) were removed, leaving n = 273. Finally, we imputed data for learners with incomplete responses who had answered only two items of any subscale, based on their responses to the other items of that subscale. Single imputation was used because the number of cases was


relatively small and because we limited the mean imputation to within a single subscale. Next, we mapped the sampled learners who took the pre-course survey to their event logs and course scores. Clickstream data capture the date and time when the learner accesses any material and consist of the number of lecture views and accesses to PowerPoint slides, tutorials, assignments, and assignment solutions. Data are recorded every time a learner clicks on any course resource. Two types of course materials were made available to the learners every week: learning materials such as lectures, homework, and tutorials, and assessment materials such as quizzes and exams. In this study, we identified a total of 67 course materials during the five weeks of the course. The assessment scores collected for analysis refer to exam and quiz scores. There was one quiz per lecture, each consisting of four to five multiple-choice questions with four choices per item, directly related to material covered in the lecture. In total, there were 28 quizzes and 138 questions to assess learners directly after presentation of the material. In addition, there were three exams, with an average of 12 questions each, aligned to the concepts covered in the preceding weeks. In total, the three exams comprised 37 questions.

2.3.2. K-means Clustering

With the mapping established, we subsequently employed K-means clustering to group learners into clusters of similar behavior (Madhavan et al., 2018). The K-means algorithm groups data points into k partitions such that the distances of individual data points from each cluster's centroid are minimized (Tibshirani, Walther, & Hastie, 2001). The number of clusters, k, is determined using an elbow test. The test examines a plot of variance explained versus number of clusters to identify the number of clusters after which the change in additional explained variance is minimal (i.e., after a large change in slope, or elbow) (Tibshirani et al., 2001). For this particular analysis, each learner was represented by a one-dimensional vector equal to the number of unique modules in the course accessed by the learner.

2.3.3. Multiple Regression Analysis

Because we are dealing with latent constructs, the ideal analysis technique would be a structural equation model, as it includes the measurement model in the analysis and accounts for measurement error. However, due to the small sample size and resulting lack of statistical power, we instead used multiple regression to examine the relationship between scores on the EVC (a score for each factor) and behavior (activity of clusters), as well as between EVC scores and overall course assessment scores. Since the EVC scores follow a non-normal distribution, we performed a rank-order transformation and then modeled the data using non-parametric regression. We observed from the assessment scores that there were users who took the pre-survey but never took any assessments (n = 250). To account for this, we added intention and actual completion of goals as a moderator when we analyzed EVC against behavior and assessment score. Since there is strong correlation between the different kinds of assessments, as shown by the multicollinearity analysis in Table 11,


we took their mean when using it as a moderator to analyze EVC against behavior and assessment scores.
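A minimal sketch of the clustering step described above, using scikit-learn: each learner is represented by a single feature (the number of unique course modules accessed), K-means is fit for a range of candidate k, and the elbow in the within-cluster sum of squares guides the choice of k. The feature values below are hypothetical, and this is an illustration of the technique rather than the exact pipeline used in the study.

```python
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical counts of unique course modules accessed, one value per learner.
unique_modules = np.array([1, 2, 2, 5, 7, 12, 15, 22, 30, 45, 55, 60, 63, 67], dtype=float)
X = unique_modules.reshape(-1, 1)  # one-dimensional feature vector per learner

# Elbow test: within-cluster sum of squares (inertia) for candidate numbers of clusters.
for k in range(1, 8):
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
    print(f"k = {k}: within-cluster SS = {km.inertia_:.1f}")

# After choosing k at the elbow (e.g., five clusters, as in our earlier work), the cluster
# labels can be joined back to survey responses via the learner ID association table.
final = KMeans(n_clusters=5, n_init=10, random_state=0).fit(X)
print(final.labels_)
```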

3. Results

3.1. Study One Results

We conducted a confirmatory factor analysis on pre-survey responses from students in three MOOC courses. After confirming the factor structure of the overall instrument, we examined measurement invariance (via multigroup confirmatory factor analysis) to ensure that the instrument works consistently across groups that are notably different from the groups previously studied with this instrument. Next, we investigated the individual performance of each item on the instrument using item response theory to further analyze the ability of the instrument to differentiate between learners.

3.1.1. Confirmatory Factor Analysis

We tested a confirmatory factor analysis (CFA) assuming a three-factor structure for the Expectancy-Value-Cost scale. CFA examines the fit of a model that requires each item to load onto a single factor, thereby preventing items from cross-loading onto multiple factors. We used Mplus 8.0 and set each item as ordinal with the WLSMV estimator. The expectancy items were set to one factor, the value items to another, and the cost items to the final factor. Individual item loadings were freely estimated and factor variances were set to 1. For model fit indices, the Chi-square value was significant and supported rejecting the null hypothesis that this model fit the data (χ² = 49.130, p < 0.01) (see Table 5). However, the RMSEA, CFI, and TLI all suggested a good model fit. Interpreted together, the fit statistics indicate that the model is an acceptable fit for MOOC learner scores. After fixing the loading of the first item in each factor at 1, we examined the loadings of the remaining two items in each dimension. Factor loadings for the CFA model remained high for all three factors (Table 6), supporting an interpretation of the EVC scale as measuring three separate latent constructs. Item 9 continued to have a weaker loading onto Factor 3 than the other items (0.698); it was not as strongly associated with Cost (Factor 3) as Items 7 and 8.

3.1.2. Measurement Invariance

Performing measurement invariance analyses in Mplus 8.0 for ordinal data using the WLSMV estimator requires that each comparison group contain at least one respondent selecting each level of the ordinal variable. However, the skewed responses for the Expectancy and Value items led to groups lacking some response categories altogether. As a result, it was necessary to collapse all levels of disagreement into a single response category for each of the


Expectancy and Value items in order to conduct the analyses. This had to be performed when examining invariance across both age and education. Dimensional invariance was seen across all groups when comparing age and education. As shown in Table 8, the 3-factor CFA for each independent subgroup resulted in acceptable fit statistics. Although the Chi-square value for the "Age 24 and under" group was statistically significant at the 0.05 level, the remaining fit statistics were within acceptable ranges. Further, the fit statistics for all other subgroups fell into acceptable ranges for good fit while also having nonsignificant Chi-square values. Successively constrained models comparing both age groups and education groups indicated strict and structural measurement invariance for the EVC scale in our courses. Tables 8 and 9 summarize the results, showing acceptable CFIs, RMSEAs, and ∆CFIs for each nested model.

3.1.3. Item Response Theory Analysis

To explore the discrimination ability and difficulty of the nine items on the EVC scale, we fit a two-parameter Graded Response Model (GRM) with three factors to the learner response data. Using weighted least squares means and variance (WLSMV) estimation for categorical data, fit statistics for the model yielded a low Chi-square value (χ² = 47.326, p < 0.0031), a low root mean square error of approximation (RMSEA = 0.038), and a high Tucker-Lewis index (TLI = 0.997), indicating that a two-parameter model is a good fit for the data. Table 11 contains the parameter estimates from the GRM model. By freeing the discrimination parameter, the traditional GRM is less restrictive than other Rasch-type models and allows the category response function slopes to vary for each item. Discrimination indices represent the slope of the response function for each answer choice in the survey. Despite the different intercepts for the Cost items, the discrimination indices show that the probability of choosing each response, depending on motivation level, was consistent across learners.

Intercepts for the Expectancy and Value items suggest that these items capture the higher levels of motivation, as lower categories were initially collapsed for these items, and the first three remaining categories fall well below 0 on the trait scale. Response categories 1 and 4 (see Figure 4) were the most likely to be endorsed by learners. However, the range of motivation on the trait scale is wide, with the intercept between the third and fourth category at approximately 0. These items demonstrated a ceiling effect, where learners with high levels of motivation responded positively to the Expectancy and Value items, and very few chose response options below Agree. For the Cost items, having more than 20 learners responding in each category justified keeping the six-point scale intact. Response categories 1 and 6 again had the highest probability of being selected, but category 1 measured a higher motivation level than the Expectancy and Value questions. These items were distributed around the mean of the motivation trait scale, indicating that middle items measured average motivation, while higher and lower items measured extremes (see Figure 5). Narrow response curves for the items suggest they are useful for measuring a


precise but limited range of motivation. While it is still more probable that learners will choose the lowest and highest response categories, the middle items performed better than the Expectancy and Value items by providing uniform estimates of the trait level. Although the confirmatory factor analysis, measurement invariance analysis, and item analysis demonstrate the psychometric properties of the EVC instrument in a quantitative manner, it is crucial to next apply the EVC instrument to a specific advanced STEM MOOC to better elucidate the relation between motivation and behavior in this context.

3.2. Study Two Results

3.2.1. K-means Clustering

The cluster analysis was used to identify similarities in learners' usage patterns when accessing course content. Figure 6 shows the clusters resulting from the K-means analysis. Running the clickstream data through the clustering algorithm provided the set of edX user IDs representing members of each of the identified clusters. Through an association table, we matched the user IDs with the hash IDs generated by each pre-survey instantiation. This matching enabled the sorting of pre-survey data into the appropriate behavioral clusters. We performed a similar mapping to retrieve the assessment score for each learner who took the pre-survey.

3.2.2. Multiple Regression Analysis

Learner engagement and achievement in MOOCs may differ substantially from other contexts. Approximately 60% of learners enrolling in a MOOC state that they intend to participate in most aspects of the course, a proportion already lower than in a traditional classroom; however, even this level of participation is rarely seen. Additionally, a large percentage of learners who begin fully interacting with course materials disengage within the first two weeks. As a result, it appears that many learners may not be achieving their goals in MOOCs. To better design interventions that assist learners, there is a need to test motivation in the MOOC environment. To that end, this second study aims to examine the concurrent aspects of the validity of using an expectancy-value-cost pre-course survey instrument to predict learner motivation and achievement in advanced STEM MOOCs.

The results suggest that the Expectancy-Value-Cost scale provides a small amount of predictive power in terms of behavior and assessment outcomes. However, since the correlation (as measured by R values, shown in Table 13) is relatively weak, the results may not be fully replicable. We analyzed the data using several methods, including clustering the data first as well as examining all the learners and aggregates without clustering. We tested the predictiveness of Expectancy, Value, and Cost independently as well as combined. In all cases, the correlation was


not strong, as it explained only a small amount of variance (R² < 0.1). While there may be some slight level of predictability, it is unclear whether this would be highly reproducible, given the many factors influencing learners' behaviors and assessment outcomes that were not included in the model. As a result, it is difficult to reject the null hypothesis in this small sample for the tested instrument.
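As an illustration of the variance-explained checks reported above, the sketch below rank-transforms simulated Expectancy, Value, and Cost scores and compares R² for each predictor alone and for the three combined. The data, variable names, and model are made up for illustration; the study's actual models also included learners' intentions as a moderator.

```python
import numpy as np
from scipy.stats import rankdata
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
n = 200
# Simulated subscale means (1-6 Likert) and a simulated outcome (e.g., overall assessment score).
scores = {"Expectancy": rng.uniform(4, 6, n),
          "Value": rng.uniform(4, 6, n),
          "Cost": rng.uniform(1, 6, n)}
outcome = 0.1 * scores["Cost"] + rng.normal(0, 1, n)

def rank_transform(x: np.ndarray) -> np.ndarray:
    """Rank-order transformation, used because the EVC scores are non-normal."""
    return rankdata(x) / len(x)

for name, x in scores.items():
    Xi = rank_transform(x).reshape(-1, 1)
    r2 = LinearRegression().fit(Xi, outcome).score(Xi, outcome)
    print(f"{name} alone: R^2 = {r2:.3f}")

X_all = np.column_stack([rank_transform(x) for x in scores.values()])
r2_all = LinearRegression().fit(X_all, outcome).score(X_all, outcome)
print(f"Expectancy + Value + Cost combined: R^2 = {r2_all:.3f}")
```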

4. Discussion

We evaluated the plausibility of interpreting the EVC scale as an indicator of learners' motivation to achieve and engage in MOOC courses. For this use case, we expect that items measuring the same motivation-related construct possess internal consistency, that the scoring structure makes sense, that items are able to differentiate between learners who truly vary in motivation, that the scoring structure is invariant across groups of learners, and that the EVC scale is related to learner behavior and achievement, as postulated by the theory. We used both learner behavior analytics and psychometric assessment as two sources of data for a validity argument for the EVC scale. Not only do both types represent different aspects of learner interactions, but they serve to triangulate claims of use in other MOOC environments.

We tested the scale for our purposes, in a setting for which its developers had not created it. In this new population, the dimensions of motivation are consistent with the scale's previous use and show that the constructs are perceived consistently by learners across three courses. However, we saw a ceiling effect for expectancy and value, as learners almost exclusively agreed with all items in these dimensions compared to cost, meaning the full range of learner intentions to participate in MOOCs is not represented. We also saw low predictiveness of the EVC scores for learner engagement and assessment outcomes. Based on our findings, the scale should be revised with more sensitive items that precisely capture differences in learners' expectancy and value when enrolling in a MOOC.

Our findings are consistent with previous research showing that learners who enroll in MOOCs are highly intrinsically motivated and self-directed, but that they experience barriers to completing courses (Loizzo, Ertmer, Watson, & Watson, 2017). However, our results did not reflect the positive predictiveness of expectancy and value, or the negative predictiveness of cost, on performance (Barron & Hulleman, 2015). This is not surprising, because MOOC learning environments are dramatically different from traditional K-12 classrooms. MOOCs are self-paced and informal, and often there is no consequence to failing. Learners anticipate performing well and consistently engaging throughout the course, but their intentions are not always reflected in the reality of their persistence (Hicks et al., 2016). Without outside regulation or strong extrinsic motivation, they may experience low commitment and be less inclined to self-regulate their learning (Kizilcec et al., 2017). At the same time, learners highly value course information and often enroll in courses with skill acquisition and career goals in mind. These learners need the support of employers


who value their professional growth and allow time spent learning to be part of their job. It is also the responsibility of MOOC developers to ensure their courses realistically estimate the expected time investment upfront, so that learners can make more informed decisions.

5. Conclusions and Implications

In this paper, we demonstrate applications of the argument-based approach to the validation of the Expectancy-Value-Cost scale (Kosovich et al., 2015). The argument-based approach established by Kane (2013) can be used to evaluate the plausibility of interpreting learner interactions as assessments, or of using learner interaction data as part of the validity argument for traditional methods of assessment. Assessment methods relying on multiple sources of evidence are inherently challenging, as they are affected by contextual factors and as such are often not generalizable to new contexts. Nonetheless, the future of assessment depends on flexibly incorporating new types of information to make inferences and judgments about learning. Continued research is imperative if instructors in future learning contexts are to have high-quality assessment tools at their disposal.

It is an exciting time to be in educational assessment, with new technologies emerging to make sense of diverse forms of learner data. When machine learning and AI techniques produce non-obvious findings or predictions about learners, it is essential to develop more rigorous frameworks that justify educational decisions based on those findings. Validity becomes an evaluative judgment about how these findings are interpreted and used, and about the consequences of their intended use. To fully realize the power of data science in educational assessment, validation must be grounded in principled approaches that best use its potential for assessing learners' competencies and motivation. Applied properly, one can envision a not-too-distant future where even high-stakes educational assessment begins to include AI and deep learning techniques to make validated inferences about learners' proficiencies. Regardless of the domain, whether knowledge, practice, interpersonal, or intrapersonal skills, the argument-based approach can be applied to guide and validate interpretations of learner competencies, ideally resulting in ethical uses of validation approaches to educational assessment.


References

Allione, G., & Stein, R. M. (2016). Mass attrition: An analysis of drop out from principles of microeconomics MOOC. The Journal of Economic Education, 47(2), 174-186. https://doi.org/10.1080/00220485.2016.1146096
American Educational Research Association, American Psychological Association, & National Council of Measurement in Education. (1999). Standards for educational and psychological testing (1999 ed.). Washington, DC: American Educational Research Association.
American Educational Research Association, American Psychological Association, & National Council of Measurement in Education. (2014). Standards for educational and psychological testing (2014 ed.). Washington, DC: American Educational Research Association.
Authors (2016a).
Authors (2016b).
Authors (2016c).
Author (2017).
Barak, M., Watted, A., & Haick, H. (2016). Motivation to learn in massive open online courses: Examining aspects of language and social engagement. Computers & Education, 94, 49–60. doi:10.1016/j.compedu.2015.11.010
de Barba, P. G., Kennedy, G. E., & Ainley, M. D. (2017). The role of students' motivation and participation in predicting performance in a MOOC. Journal of Computer Assisted Learning, 32, 218-231. https://doi.org/10.1111/jcal.12130
Barron, K. E., & Hulleman, C. S. (2015). Expectancy-Value-Cost model of motivation. In J. D. Wright (Ed.), International Encyclopedia of the Social and Behavioral Sciences (2nd ed., Vol. 8, pp. 503–509). Oxford: Elsevier.
Bennett, R. E. (2004). How the Internet will help large-scale assessment reinvent itself. In M. Rabinowitz, F. C. Blumberg, & H. T. Everson (Eds.), The design of instruction and evaluation: Affordances of using media and technology (pp. 101-128). New York, NY: Psychology Press.
Breslow, L., Pritchard, D. E., DeBoer, J., Stump, G. S., Ho, A. D., & Seaton, D. T. (2013). Studying learning in the worldwide classroom: Research into edX's first MOOC. Research and Practice in Assessment, 8, 13-25.
Brown, T. A. (2006). Confirmatory factor analysis for applied research. New York: Guilford Press.
Byrne, B. M. (2012). Structural equation modeling with Mplus. New York: Routledge.
Chao, J., Xie, C., Nourian, S., Chen, G., Bailey, S., Goldstein, M. H., Purzer, S., Adams, R. A., & Tutwiler, M. S. (2017). Bridging the design-science gap with tools: Science learning and design behaviors in a simulated environment for engineering design. Journal of Research in Science Teaching, 54(8), 1049-1096.


Cronbach, L. J., & Meehl, P. E. (1955). Construct validity in psychological tests. Psychological Bulletin, 52(4), 281–302. doi:10.1037/h0040957
DeBoer, J., Ho, A. D., Stump, G. S., & Breslow, L. (2014). Changing "course": Reconceptualizing educational variables for massive open online courses. Educational Researcher, 43(2), 74–84. doi:10.3102/0013189X14523038
DeMars, C. (2010). Item response theory. Oxford, UK: Oxford University Press.
Deshpande, A., & Chukhlomin, V. (2017). What makes a good MOOC: A field study of factors impacting student motivation to learn. American Journal of Distance Education, 31(4), 275-293. https://doi.org/10.1080/08923647.2017.1377513
Dimitrov, D. M. (2010). Testing for factorial invariance in the context of construct validation. Measurement and Evaluation in Counseling and Development, 43(2), 121–149. doi:10.1177/0748175610373459
Douglas, K. A., & Purzer, Ş. (2015). Validity: Meaning and relevancy in assessment for engineering education research. Journal of Engineering Education, 104(2), 108-118. https://doi.org/10.1002/jee.20070
Durksen, T. L., Chu, M. W., Ahmad, Z. F., Radil, A. I., & Daniels, L. M. (2016). Motivation in a MOOC: A probabilistic analysis of online learners' basic psychological needs. Social Psychology of Education, 19(2), 241-260. https://doi.org/10.1007/s11218-015-9331-9
Eccles (Parsons), J., Adler, T. F., Futterman, R., Goff, S. B., Kaczala, C. M., Meece, J. L., & Midgley, C. (1983). Expectancies, values, and academic behaviors. In J. T. Spence (Ed.), Achievement and achievement motivation (pp. 75–146). San Francisco, CA, USA: W. H. Freeman.
Espinosa, B. J. G., Sepúlveda, G. C. T., & Montoya, M. S. R. (2015). Self-motivation challenges for student involvement in the Open Educational Movement with MOOC. International Journal of Educational Technology in Higher Education, 12(1), 91-103. https://doi.org/10.7238/rusc.v12i1.2185
Evans, B. J., Baker, R. B., & Dee, T. S. (2016). Persistence patterns in massive open online courses (MOOCs). The Journal of Higher Education, 87(2), 206-242.
Fabrigar, L. R., & Wegener, D. T. (2011). Exploratory factor analysis. Oxford University Press.
Fabrigar, L. R., Wegener, D. T., MacCallum, R. C., & Strahan, E. J. (1999). Evaluating the use of exploratory factor analysis in psychological research. Psychological Methods, 4(3), 272. https://doi.org/10.1037/1082-989X.4.3.272
Flake, J. K., Barron, K. E., Hulleman, C., McCoach, B. D., & Welsh, M. E. (2015). Measuring cost: The forgotten component of expectancy-value theory. Contemporary Educational Psychology, 41, 232-244. https://doi.org/10.1016/j.cedpsych.2015.03.002
FutureLearn (2019). Online degree courses. [Online]. Retrieved from https://www.futurelearn.com/degrees.
Gamage, D., Fernando, S., & Perera, I. (2015). Quality of MOOCs: A review of literature on effectiveness and quality aspects. In 2015 8th International Conference on Ubi-Media Computing (UMEDIA), 224-229. https://doi.org/10.1109/umedia.2015.7297459


Head, S. K. (2012). MOOC's—The revolution has begun says Moody's. University World News, 23.
Henry, M., & Marrs, D. (2015). Cada Día Spanish: An Analysis of Confidence and Motivation in a Social Learning Language MOOC. International Association for Development of the Information Society.
Hicks, N. M., Roy, D., Shah, S., Douglas, K. A., Bermel, P., Diefes-Dux, H. A., & Madhavan, K. (2016, October). Integrating analytics and surveys to understand fully engaged learners in a highly-technical STEM MOOC. In 2016 IEEE Frontiers in Education Conference (FIE) (pp. 1-9). https://doi.org/10.1109/fie.2016.7757735
Hong, B., Wei, Z., & Yang, Y. (2017). Discovering learning behavior patterns to predict dropout in MOOC. In The 12th International Conference on Computer Science & Education (ICCSE 2017). University of Houston, USA.
Hughes, G., & Dobbins, C. (2015). The utilization of data analysis techniques in predicting student performance in massive open online courses (MOOCs). Research and Practice in Technology Enhanced Learning, 10(1), 10.
Hulleman, C. S., Barron, K. E., Kosovich, J. J., & Lazowski, R. A. (2016). Student motivation: Current theories, constructs, and interventions within an expectancy-value framework. In Psychosocial skills and school systems in the 21st century (pp. 241-278). Springer, Cham.
Jacobs, J. E., Lanza, S., Osgood, D. W., Eccles, J. S., & Wigfield, A. (2002). Changes in children's self-competence and values: Gender and domain differences across grades one through twelve. Child Development, 73(2), 509–527.
Jordan, K. (2015). Massive open online course completion rates revisited: Assessment length and attrition. The International Review of Research in Open and Distributed Learning, 16(3). https://doi.org/10.19173/irrodl.v16i3.2112
Jöreskog, K. G. (1967). A general approach to confirmatory maximum likelihood factor analysis. ETS Research Bulletin Series, 34(2), 183–202. http://doi.org/10.1007/bf02289343
Kane, M. T. (1992). An argument-based approach to validity. Psychological Bulletin, 112(3), 527.
Kane, M. T. (2013). Validating the interpretations and uses of test scores. Journal of Educational Measurement, 50(1), 1–73. https://doi.org/10.1111/jedm.12000
Kane, M. T. (2016). Explicating validity. Assessment in Education: Principles, Policy & Practice, 23(2), 198-211. https://doi.org/10.1080/0969594X.2015.1060192
Khalil, H., & Ebner, M. (2014, June). MOOCs completion rates and possible methods to improve retention-A literature review. In World Conference on Educational Multimedia, Hypermedia and Telecommunications (Vol. 1, pp. 1305-1313).
Keshavabhotla, S., Williford, B., Kumar, S., Hilton, E., Taele, P., Li, W., Linsey, J., & Hammond, T. (2017). Conquering the cube: Learning to sketch primitives in perspective with an intelligent tutoring system. In Proceedings of the Symposium on Sketch-Based Interfaces and Modeling, ACM, Los Angeles, CA, July 28-29, 2017. https://doi.org/10.1145/3092907.3092911


Kline, G. (2017). Forecast student success app now offers pointers for succeeding in individual courses. Purdue Information Technology. Retrieved from https://www.itap.purdue.edu/newsroom/news/170103_forecastcourses.html
Koller, D., Ng, A., Do, C., & Chen, Z. (2013). Retention and intention in massive open online courses: In depth. Educause Review, 48(3), 62-63.
Kolowich, S. (2013). The professors who make the MOOCs. The Chronicle of Higher Education, 3. [Online]. Retrieved from https://www.chronicle.com/article/TheProfessors-Behind-the-MOOC/137905.
Kosovich, J. J., Hulleman, C. S., Barron, K. E., & Getty, S. (2015). A practical measure of student motivation: Establishing validity evidence for the expectancy-value-cost scale in middle school. The Journal of Early Adolescence, 35(5-6), 790-816. https://doi.org/10.1177/0272431614556890
Koutropoulos, A., Gallagher, M. S., Abajian, S. C., de Waard, I., Hogue, R. J., Keskin, N. Ö., & Rodriguez, C. O. (2012). Emotive vocabulary in MOOCs: Context & participant retention. European Journal of Open, Distance and E-Learning, 15(1).
Kulkarni, C. E., Bernstein, M. S., & Klemmer, S. R. (2015, March). PeerStudio: Rapid peer feedback emphasizes revision and improves performance. In Proceedings of the Second (2015) ACM Conference on Learning @ Scale (pp. 75-84). ACM.
Li, B., Wang, X., & Tan, S. C. (2018). Massive online open courses survey. PsycTESTS Dataset. https://doi.org/10.1037/t67807-000
Littlejohn, A., Hood, N., Milligan, C., & Mustain, P. (2016). Learning in MOOCs: Motivations and self-regulated learning in MOOCs. The Internet and Higher Education, 29, 40–48. https://doi.org/10.1016/j.iheduc.2015.12.003
Liyanagunawardena, T. R., Lundqvist, K. Ø., & Williams, S. A. (2015). Who are with us: MOOC learners on a FutureLearn course. British Journal of Educational Technology, 46(3), 557-569. https://doi.org/10.1111/bjet.12261
Loizzo, J., Ertmer, P. A., Watson, W. R., & Watson, S. L. (2017). Adult MOOC learners as self-directed: Perceptions of motivation, success, and completion. Online Learning, 21(2), n2. https://doi.org/10.24059/olj.v21i2.889
Madhavan, K., Douglas, K., Roy, D., & Williams, T. V. (2018). edX course usage clustering pipeline: Beta release update 1. https://doi.org/10.5281/zenodo.1467246
Madhavan, K., & Richey, M. C. (2016). Problems in big data analytics in learning. Journal of Engineering Education, 105(1), 6–14. https://doi.org/10.1002/jee.20113
Messick, S. (1995). Validity of psychological assessment: Validation of inferences from persons' responses and performances as scientific inquiry into score meaning. American Psychologist, 50(9), 741. https://doi.org/10.1037/0003-066x.50.9.741
Morris, N. P., Hotchkiss, S., & Swinnerton, B. (2015). Can demographic information predict MOOC learner outcomes? In Proceedings of the European MOOC Stakeholder Summit 2015 (pp. 199-207). Retrieved from https://www.researchgate.net/publication/278392770.


Na, S., Xumin, L., & Yong, G. (2010). Research on k-means clustering algorithm: An improved k-means clustering algorithm. In Third International Symposium on Intelligent Information Technology and Security Informatics (pp. 63–67). https://doi.org/10.1109/iitsi.2010.74
Onah, D. F. O., Sinclair, J. E., & Boyatt, R. (2015). Forum posting habits and attainment in a dual-mode MOOC. International Journal for Cross-Disciplinary Subjects in Education, 5(2), 2463–2470. https://doi.org/10.20533/ijcdse.2042.6364.2015.0336
Pellegrino, J. (2013). Proficiency in science: Assessment challenges and opportunities. Science, 340(6130), 320–323. https://doi.org/10.1126/science.1232065
Pursel, B. K., Zhang, L., Jablokow, K. W., Choi, G. W., & Velegol, D. (2016). Understanding MOOC students: Motivations and behaviours indicative of MOOC completion. Journal of Computer Assisted Learning, 32(3), 202-217. https://doi.org/10.1111/jcal.12131
Ramesh, A., Goldwasser, D., Huang, B., Daume III, H., & Getoor, L. (2014, June). Learning latent engagement patterns of students in online courses. In Twenty-Eighth AAAI Conference on Artificial Intelligence.
Reich, J. (2015). Rebooting MOOC research: Improve assessment, data sharing, and experimental design. Science, 347(6217), 34-35. https://doi.org/10.1126/science.1261627
Romero, C., & Ventura, S. (2016). Educational data science in massive open online courses. WIREs Data Mining and Knowledge Discovery, 7(1), 1-12. https://doi.org/10.1002/widm.1187
Ross, J., Sinclair, C., Knox, J., Bayne, S., & Macleod, H. (2014). Teacher experiences and academic identity: The missing components of MOOC pedagogy. Journal of Online Learning and Teaching, 10(1), 57-69. Retrieved from https://www.research.ed.ac.uk/portal/files/17513228/JOLT_published.pdf.
Riel, J., & Lawless, K. A. (2017). Developments in MOOC technologies and participation since 2012: Changes since "The Year of the MOOC." In M. Khosrow-Pour (Ed.), Encyclopedia of Information Science and Technology (4th ed.). Hershey, PA: IGI Global.
Ryan, R. M., & Deci, E. L. (2000). Self-determination theory and the facilitation of intrinsic motivation, social development, and well-being. American Psychologist, 55(1), 68. https://doi.org/10.1037//0003-066x.55.1.68
Samejima, F. (1997). Graded response model. In Handbook of modern item response theory (pp. 85-100). Springer New York.
Shah, D. (2018). By the numbers: MOOCs in 2018. [Online].
Shroff, R. H., Vogel, D. R., Coombes, J., & Lee, F. (2007). Student e-learning intrinsic motivation: A qualitative analysis. Communications of the Association for Information Systems, 19, 241–260. doi:10.17705/1CAIS.01912
Smith, P. L., & Ragan, T. J. (1999). Instructional design. New York, NY, USA: Wiley.


Sooryanarayan, D. G., & Gupta, D. (2015, August). Impact of learner motivation on MOOC preferences: Transfer vs. made MOOCs. In Advances in Computing, Communications and Informatics (ICACCI), 2015 International Conference on (pp. 929-934). IEEE.
Swinnerton, B., Hotchkiss, S., & Morris, N. P. (2017). Comments in MOOCs: Who is doing the talking and does it help? Journal of Computer Assisted Learning, 33(1), 51–64. doi:10.1111/jcal.12165
Tibshirani, R., Walther, G., & Hastie, T. (2001). Estimating the number of clusters in a data set via the gap statistic. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 63(2), 411–423. https://doi.org/10.1111/1467-9868.00293
Thille, C., Schneider, E., Kizilcec, R. F., Piech, C., Halawa, S. A., & Greene, D. K. (2014). The future of data-enriched assessment. Research & Practice in Assessment, 9, 5-16.
Thissen, D., Pommerich, M., Billeaud, K., & Williams, V. S. (1995). Item response theory for scores on tests including polytomous items with ordered responses. Applied Psychological Measurement, 19(1), 39-49. https://doi.org/10.1177/014662169501900105
Wigfield, A., & Eccles, J. S. (2000). Expectancy-Value Theory of achievement motivation. Contemporary Educational Psychology, 25(1), 68–81. https://doi.org/10.1006/ceps.1999.1015
Williams, T. V., Douglas, K. A., Yellamraju, T., & Boutin, M. (2018). Characterizing MOOC learners from survey data using modeling and n-TARP clustering. Proceedings from 2018 ASEE Annual Conference and Exposition, Salt Lake City, UT. Retrieved from https://peer.asee.org/characterizing-mooc-learners-from-survey-data-using-modeling-and-n-tarp-clustering.pdf
Zhou, M. (2016). Chinese university students' acceptance of MOOCs: A self-determination perspective. Computers & Education, 92-93, 194-203. https://doi.org/10.1016/j.compedu.2015.10.012


Appendix 1
Expectancy-Value-Cost scale items. Response scale: 1 = Strongly disagree, 2 = Disagree, 3 = Slightly disagree, 4 = Slightly agree, 5 = Agree, 6 = Strongly agree.

E1. I know I can learn the material in this course.
E2. I believe I can be successful in this course.
E3. I am confident that I can understand the material in this course.
V1. I think this course is or will be important.
V2. I value this course.
V3. I think this course is or will be useful.
C1. Because of other things that I do, I do not expect to have time to put into this course.
C2. I think I will be unable to put in the time needed to do well in this course.
C3. I think I may have to give up too much to do well in this course.
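As a point of reference for how the nine items above become subscale scores, the following is a minimal scoring sketch, not the authors' analysis code; it assumes responses are stored in a pandas DataFrame with hypothetical columns E1 through C3 coded 1-6, and simply averages the items within each factor.

import pandas as pd

# Hypothetical responses on the 6-point scale (1 = Strongly disagree ... 6 = Strongly agree)
responses = pd.DataFrame({
    "E1": [6, 5], "E2": [6, 5], "E3": [5, 6],
    "V1": [6, 6], "V2": [6, 5], "V3": [6, 6],
    "C1": [2, 4], "C2": [1, 3], "C3": [2, 3],
})

subscales = {
    "Expectancy": ["E1", "E2", "E3"],
    "Value": ["V1", "V2", "V3"],
    "Cost": ["C1", "C2", "C3"],
}

# Mean of the three items in each factor; higher Cost means greater perceived cost
scores = pd.DataFrame({name: responses[items].mean(axis=1)
                       for name, items in subscales.items()})
print(scores)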

Purpose
Who will be assessed? Advanced nanotechnology MOOC learners who take the pre-course survey.
Why? MOOC completion rates are low as learners have variable intentions. Motivation may help explain in-course behavior.
What do we want? Does learner motivation explain and help predict learner behavior in advanced nanotechnology MOOCs?

Use
How will results be used? Results will be used to analyze and potentially predict learner behavior in advanced MOOCs.
Level of decisions made? Potential targeted interventions to help support learners.
Consequences of use? Improved understanding of learner behavior, design of interventions, and support of learner goals.

Inferences
Interpretation of scores? Learners' motivations within the course, particularly in terms of the expectancy, value, and cost of participation.
What claims do we wish to make? Knowledge of EVC motivation explains behavior and can inform supportive interventions.

Figure 1. Articulation of intended use of the EVC scale in advanced nanotechnology MOOCs.


Table 1
Claims, Assumptions, and Evidence to Support the Use of the EVC Scale in Advanced Nanotechnology MOOCs

Claim: Items intended to measure similar constructs (expectancy, value, and cost) are related.
  Assumption: Items expected to measure the same construct possess internal consistency.
  Evidence: Confirmatory Factor Analyses.

Claim: Scores can be interpreted as measuring students' EVC levels of motivation.
  Assumption: Scoring structure makes sense.
  Evidence: Confirmatory Factor Analysis fits with theory about constructs.
  Assumption: Items are appropriately difficult (i.e., they readily differentiate between learners).
  Evidence: Item Response Theory.

Claim: EVC scores can be interpreted consistently for all MOOC learners.
  Assumption: Scoring structure and internal consistency of constructs are invariant across groups of learners.
  Evidence: Measurement invariance across age groups and levels of education.

Claim: EVC scores are instructionally relevant.
  Assumption: EVC scores relate to learner behavior and performance in the MOOC.
  Evidence: Correlation between scores and behavior and performance.


Figure 2. Cumulative distribution of time to answer Expectancy Value Scale questions. Responses submitted in less than 25 seconds were excluded from further analysis.


Table 2
Demographic Information

Category | Physics of Electronic Polymers (n = 199) | Nanophotonic Modeling (n = 189) | Principles of Electronic Biosensors (n = 273) | Total (n = 661)
Gender
  Male | 151 (76%) | 126 (67%) | 191 (70%) | 468 (71%)
  Female | 28 (14%) | 51 (27%) | 50 (18%) | 129 (20%)
  Transgender | 0 (0%) | 0 (0%) | 0 (0%) | 0 (0%)
  Prefer not to answer | 5 (2%) | 2 (1%) | 2 (1%) | 9 (1%)
  Non-respondent | 15 (8%) | 10 (5%) | 30 (11%) | 55 (8%)
Age
  24 or under | 57 (29%) | 74 (39%) | 98 (36%) | 229 (35%)
  25-34 | 85 (43%) | 66 (35%) | 79 (29%) | 230 (35%)
  35 or older | 42 (21%) | 33 (17%) | 61 (27%) | 136 (21%)
  Non-respondent | 14 (7%) | 12 (6%) | 30 (11%) | 56 (8%)
Education
  Less than a four-year degree | 27 (14%) | 44 (23%) | 63 (23%) | 134 (20%)
  Four-year degree | 49 (25%) | 52 (28%) | 73 (27%) | 174 (26%)
  Master's degree | 71 (36%) | 61 (32%) | 74 (27%) | 206 (31%)
  Doctoral or Professional degree | 40 (20%) | 22 (12%) | 37 (14%) | 99 (15%)
  Non-respondent | 12 (6%) | 10 (5%) | 26 (10%) | 48 (7%)

Note. Values are n (%) within each course. Responses from the initial survey were combined when necessary to provide adequate numbers for analysis, when groups could be justifiably combined (i.e., all education levels below a four-year degree).


Figure 3. Histogram of EVC survey completion time in seconds. The mode of the completion time is between 27 and 51 seconds, with a long tail extending beyond 600 seconds.


Table 3
Measurement Invariance Models

Model | Description
Model 0: Dimensional Invariance | Judgment comparison of results of CFA using the same number of factors for each group
Model 1: Configural Invariance | Constrains each group to the same factor loading patterns
Model 2: Metric Invariance | Nested within Model 1; constrains each group to the same factor loading values for each item
Model 3: Scalar Invariance | Nested within Model 2; constrains each group to the same intercepts for each item
Model 4: Strict Invariance | Nested within Model 3; constrains each group to the same residual variance for each item
Model 5: Structural Invariance | Nested within Model 3; constrains each group to the same factor variance
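For readers less familiar with these nested models, the constraints can be summarized in standard multiple-group CFA notation (a schematic summary; the symbols are generic and not taken from the article):

\[
x_{ig} = \tau_g + \Lambda_g \,\xi_{ig} + \varepsilon_{ig}, \qquad
\varepsilon_{ig} \sim N(0, \Theta_g), \qquad
\xi_{ig} \sim N(\kappa_g, \Phi_g),
\]

where configural invariance imposes the same pattern of free and fixed loadings in \(\Lambda_g\) for every group \(g\); metric invariance adds \(\Lambda_g = \Lambda\); scalar invariance adds \(\tau_g = \tau\); strict invariance adds \(\Theta_g = \Theta\) to the scalar model; and structural invariance adds \(\Phi_g = \Phi\) to the scalar model.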

Table 4
Comparison of Measurement Invariance Model Fit Indices

Model | χ² | df | CFI | TLI | RMSEA | RMSEA 90% CI [LL, UL]
1-factor | 2182.921** | 27 | 0.836 | 0.782 | 0.348 | [0.335, 0.360]
2-factor | 886.649** | 19 | 0.934 | 0.875 | 0.263 | [0.248, 0.278]
3-factor | 16.432 | 12 | 1.000 | 0.999 | 0.024 | [0.000, 0.049]

Note. (n = 661). CFI = comparative fit index; TLI = Tucker-Lewis index; RMSEA = root mean square error of approximation; CI = confidence interval; LL = lower limit; UL = upper limit. * p < 0.05, ** p < 0.01. Variables specified as ordered. Rotation = geomin; row standardization = correlation; oblique rotation.


Table 5
Confirmatory Factor Analysis Model Fit Indices

Model | χ² | df | CFI | TLI | RMSEA | RMSEA 90% CI [LL, UL]
3-factor | 49.130* | 24 | 0.998 | 0.997 | 0.040 | [0.024, 0.056]

Note. (n = 661). CFI = comparative fit index; TLI = Tucker-Lewis index; RMSEA = root mean square error of approximation; CI = confidence interval; LL = lower limit; UL = upper limit. * p < 0.05.
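As a worked check (not part of the original analysis), the RMSEA in Table 5 can be reproduced from the reported chi-square and degrees of freedom with the standard definition:

\[
\mathrm{RMSEA} = \sqrt{\max\!\left(\frac{\chi^2 - df}{df\,(N-1)},\ 0\right)}
= \sqrt{\frac{49.130 - 24}{24 \times 660}} \approx 0.040 .
\]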

Table 6
CFA Model Parameter Estimates

Factor | Item | Factor loading (R)ᵃ | SE | Factor variance (R²) | Residual variance (1 - R²)
Expectancy | 1 (E1) | 0.877 | 0.013 | 0.769 | 0.231
Expectancy | 2 (E2) | 0.923 | 0.012 | 0.853 | 0.147
Expectancy | 3 (E3) | 0.880 | 0.012 | 0.775 | 0.225
Value | 4 (V1) | 0.934 | 0.008 | 0.872 | 0.128
Value | 5 (V2) | 0.943 | 0.010 | 0.890 | 0.110
Value | 6 (V3) | 0.893 | 0.013 | 0.797 | 0.203
Cost | 7 (C1) | 0.806 | 0.016 | 0.649 | 0.351
Cost | 8 (C2) | 0.942 | 0.016 | 0.888 | 0.112
Cost | 9 (C3) | 0.707 | 0.020 | 0.501 | 0.499

Note. (n = 661). ᵃ All factor loadings are statistically significant with p < 0.001.


Table 7
Model Fit Indices of 3-Factor Model for Learner Groups

Variable | Group | n | χ²ᵃ | CFI | TLI | RMSEA | RMSEA 90% CI [LL, UL]
Age | 24 or under | 230 | 37.958* | 0.996 | 0.994 | 0.050 | [0.014, 0.079]
Age | 25-34 | 230 | 29.850 | 0.999 | 0.998 | 0.033 | [0.000, 0.066]
Age | 35 or older | 145 | 24.946 | 1.000 | 1.000 | 0.016 | [0.000, 0.070]
Education | Less than 4-yr. | 134 | 34.669 | 0.996 | 0.994 | 0.058 | [0.000, 0.097]
Education | 4-yr. degree | 174 | 33.464 | 0.997 | 0.995 | 0.048 | [0.000, 0.083]
Education | Master's | 206 | 34.448 | 0.998 | 0.996 | 0.046 | [0.000, 0.078]
Education | PhD, MD, JD | 99 | 32.189 | 0.997 | 0.995 | 0.059 | [0.000, 0.107]

Note. (n = 661). CFI = comparative fit index; TLI = Tucker-Lewis index; RMSEA = root mean square error of approximation; CI = confidence interval; LL = lower limit; UL = upper limit. ᵃ All tests had df = 24. * p < .05.

Table 8
CFA Measurement Invariance Test Results by Age Group

Model | χ² | df | Model comparison | ∆χ² | ∆df | CFI | ∆CFI | RMSEA
M1. Configural (equal factor pattern) | 94.012* | 72 | - | - | - | 0.998 | - | 0.039
M2. Metric (equal loadings) | 109.575* | 84 | M2 - M1 | 15.563 | 12 | 0.998 | 0.000 | 0.039
M3. Scalar (equal intercepts) | 159.011* | 126 | M3 - M2 | 49.436 | 42 | 0.997 | -0.001 | 0.036
M4. Strict (equal error variance) | 222.008** | 144 | M4 - M3 | 62.997** | 18 | 0.994 | -0.003 | 0.052
M5. Structural (equal factor variance) | 213.079 | 129 | M5 - M3 | 54.068** | 3 | 0.993 | -0.004 | 0.057

Note. (n = 661). CFI = comparative fit index, RMSEA = root mean square error of approximation, ∆χ² = nested chi-square difference. ∆CFI ≤ −0.01 suggests lack of invariance for increasingly constrained models. * p < 0.05; ** p < 0.01.
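The nested comparisons in Tables 8 and 9 follow the usual chi-square difference logic; for example, for the metric model in Table 8 (a worked check, not additional analysis):

\[
\Delta\chi^2_{\mathrm{M2-M1}} = 109.575 - 94.012 = 15.563, \qquad \Delta df = 84 - 72 = 12,
\]

with the difference evaluated against a chi-square distribution with \(\Delta df\) degrees of freedom, consistent with the absence of a significance marker on this comparison in the table.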


Table 9
CFA Measurement Invariance Test Results by Education Group

Model | χ² | df | Model comparison | ∆χ² | ∆df | CFI | ∆CFI | RMSEA
M1. Configural (equal factor pattern) | 134.386** | 96 | - | - | - | 0.997 | - | 0.051
M2. Metric (equal loadings) | 154.744** | 114 | M2 - M1 | 20.358 | 18 | 0.997 | 0.000 | 0.048
M3. Scalar (equal intercepts) | 210.36* | 177 | M3 - M2 | 55.616 | 63 | 0.997 | 0.000 | 0.035
M4. Strict (equal error variance) | 286.017** | 204 | M4 - M3 | 75.657** | 27 | 0.993 | -0.004 | 0.051
M5. Structural (equal factor variance) | 258.156** | 180 | M5 - M3 | 47.796** | 3 | 0.994 | -0.003 | 0.053

Note. (n = 661). CFI = comparative fit index, RMSEA = root mean square error of approximation, ∆χ² = nested chi-square difference. ∆CFI ≤ −0.01 suggests lack of invariance for increasingly constrained models. * p < 0.05; ** p < 0.01.

Table 10
Parameter Estimates of 2-Parameter Graded Response Model

Factor | Item | Discrimination (a) | b1 | b2 | b3 | b4 | b5
Expectancy | 1 (E1) | 0.878 | - | - | -2.030 | -1.170 | 0.204
Expectancy | 2 (E2) | 0.925 | - | - | -2.061 | -1.050 | 0.294
Expectancy | 3 (E3) | 0.882 | - | - | -2.001 | -0.876 | 0.436
Value | 4 (V1) | 0.932 | - | - | -2.254 | -0.993 | 0.193
Value | 5 (V2) | 0.943 | - | - | -2.429 | -1.240 | 0.078
Value | 6 (V3) | 0.896 | - | - | -2.208 | -1.257 | 0.154
Cost | 7 (C1) | 0.806 | -1.374 | -0.432 | 0.185 | 0.887 | 1.589
Cost | 8 (C2) | 0.941 | -1.224 | -0.290 | 0.302 | 0.927 | 1.526
Cost | 9 (C3) | 0.708 | -1.155 | -0.275 | 0.338 | 0.975 | 1.576

Note. (n = 661).
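Figures 4 and 5 show item characteristic curves implied by these parameters. As an illustration only (not the authors' plotting code), the sketch below computes graded-response category probabilities for Cost item C1 from the parameters in Table 10, assuming the standard logistic form of Samejima's model, P(X >= k | theta) = 1 / (1 + exp(-a(theta - b_k))):

import numpy as np

# Item C1 parameters from Table 10 (discrimination a and ordered thresholds b1-b5)
a = 0.806
b = np.array([-1.374, -0.432, 0.185, 0.887, 1.589])

def category_probabilities(theta, a, b):
    """Graded response model: P(X = k | theta) for a 6-category item."""
    # Cumulative probabilities P(X >= k | theta) for k = 2..6, bounded by 1 and 0
    p_star = 1.0 / (1.0 + np.exp(-a * (theta - b)))
    cumulative = np.concatenate(([1.0], p_star, [0.0]))
    # Category probabilities are differences of adjacent cumulative probabilities
    return cumulative[:-1] - cumulative[1:]

for theta in np.linspace(-4, 4, 9):
    probs = category_probabilities(theta, a, b)
    print(f"theta = {theta:+.1f}  P(category 1..6) = {np.round(probs, 3)}")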


Figure 4. Expectancy (Item 1) and Value (Item 4) Item Characteristic Curves.

Figure 5. Cost (Item 7) Item Characteristic Curves.


Figure 6. Learners' access to course material, clustered by k-means.
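For context on how a clustering like the one in Figure 6 can be produced, the following is a minimal sketch rather than the study's pipeline; it assumes a hypothetical feature matrix of per-learner counts of course-material accesses by week and uses scikit-learn's KMeans.

import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
# Hypothetical event-log features: rows = learners, columns = material accesses per course week
access_counts = rng.poisson(lam=3, size=(661, 8)).astype(float)

# Standardize so heavy-access weeks do not dominate the distance metric
features = StandardScaler().fit_transform(access_counts)

kmeans = KMeans(n_clusters=4, n_init=10, random_state=0).fit(features)
print("Cluster sizes:", np.bincount(kmeans.labels_))

In practice the number of clusters could be chosen with a criterion such as the gap statistic (Tibshirani, Walther, & Hastie, 2001); here k = 4 is arbitrary.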

Table 11
Multicollinearity Analysis of Intended Completion vs. Actual Completion

Model | Assignments | Quizzes | Exams
Intention, factor variance (R²) | 0.791 | 0.812 | 0.802
Actual, factor variance (R²) | 0.765 | 0.842 | 0.736


Table 12
Nonparametric Regression Analysis of EVC Scores Against Behavior and Assessment Scores

Model | Factor variance (R²) | Residual variance (1 - R²) | F | df
EVC & behavior | 0.017 | 0.983 | 1.540 | 3
EVC & assessment scores | 0.012 | 0.988 | 2.126 | 3
EVC + intention & behavior | 0.038 | 0.962 | 1.018 | 5
EVC + intention & assessment scores | 0.046 | 0.954 | 2.333 | 5

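Table 12 reports the variance in behavior and assessment scores explained by EVC scores (and intentions). The authors' exact nonparametric procedure is not reproduced here; as one illustration of the general approach, the sketch below regresses a hypothetical outcome on the three EVC subscale scores using kernel ridge regression from scikit-learn and reports R².

import numpy as np
from sklearn.kernel_ridge import KernelRidge
from sklearn.metrics import r2_score

rng = np.random.default_rng(1)
# Hypothetical predictors: Expectancy, Value, Cost subscale means for 661 learners
evc = rng.uniform(1, 6, size=(661, 3))
# Hypothetical outcome: an assessment score in [0, 1]
outcome = rng.uniform(0, 1, size=661)

model = KernelRidge(kernel="rbf", alpha=1.0).fit(evc, outcome)
print("R^2 =", round(r2_score(outcome, model.predict(evc)), 3))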
Acknowledgements This work was made possible by the U. S. National Science Foundation (NSF) [grant numbers EHR1544259-PRIME and EEC1454315-CAREER]. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of NSF.

Highlights
1. This paper applies an argument-based approach to assessment validation in advanced engineering MOOCs.
2. With learner activity data and motivation scores, we study the validity of the Expectancy-Value-Cost scale for MOOCs.
3. Learners' Expectancy and Value motivation scores were generally high, creating a ceiling effect.
4. Expectancy and Value scores did not positively predict performance, nor did Cost scores negatively predict it.