Who are interested in online science simulations? Tracking a trend of digital divide in Internet use




Computers & Education 76 (2014) 205–214

Contents lists available at ScienceDirect

Computers & Education journal homepage: www.elsevier.com/locate/compedu

Who are interested in online science simulations? Tracking a trend of digital divide in Internet use Meilan Zhang* College of Education, University of Texas at El Paso, EDUC 801D, 500 West University Ave., El Paso, TX 79968, USA

Article info

Article history: Received 27 February 2014; received in revised form 27 March 2014; accepted 3 April 2014; available online 13 April 2014

Keywords: Science simulations; Internet use; Digital divide; Achievement gap; Web analytics

Abstract

Although the Internet has become a major source for disseminating educational resources for science, technology, engineering, and mathematics (STEM), little is known about the extent to which these resources are being used, their relationship to academic performance, and the type of users accessing these resources online. This study used two innovative tools, Google Trends and Web analytics, to explore interest in and usage of the PhET website, one of the most well-known online science simulation resources. This study found that search interest in the PhET science simulations has been growing continuously since 2005. However, search interest in PhET was positively correlated with academic performance and income, and negatively correlated with the achievement gap between high- and low-performing students. Moreover, Internet users in states with more White students were more interested in the PhET science simulations. Yet Internet users in states with more Black students were less interested in these science simulations. These findings suggest that the way online STEM resources are being used is likely to widen, rather than narrow, the achievement gap. This is the first study to utilize Internet search trend data and Web analytics tools for monitoring Internet use for educational purposes. © 2014 Elsevier Ltd. All rights reserved.

1. Introduction

Science simulations are computer-based interactive representations of real or hypothesized scientific phenomena (National Research Council, 2011). Science simulations help students visualize abstract concepts and observe phenomena invisible to the naked eye, such as molecular motion (Olympiou, Zacharias, & deJong, 2013). Science simulations also allow students to conduct virtual experiments, manipulate variables, and immediately observe results that would naturally require long periods of time, such as the effects of varying amounts of greenhouse gases (Scalise et al., 2011). In addition, scientists routinely create and use simulations to model and understand natural phenomena. Science simulations can help students access and construct the mental models of scientists for understanding and explaining scientific phenomena (Wieman, Adams, & Perkins, 2008).

A growing body of research has explored the potential of science simulations to improve content understanding and student engagement in science education (Jimoyiannis & Komis, 2001; National Research Council, 2011; Olympiou et al., 2013; Scalise et al., 2011). In a recent review of 79 studies on science simulations for Grades 6–12 (Scalise et al., 2011), 96% of the studies reported at least some learning gains, suggesting that well-designed science simulations are a powerful learning tool. These studies frequently reported positive outcomes for science simulations with respect to supporting students in engaging in active and extended scientific inquiry, understanding dynamic processes of scientific phenomena with animated representations, and supporting student collaboration, individualized learning, and continuous assessment (Scalise et al., 2011).

Motivated by the promising potential for science learning and student engagement, science educators and researchers have developed many interactive science simulations and made them freely accessible on the Internet.
One of the most well-known online science simulation resources is that created by the Physics Education Technology (PhET) project (http://phet.colorado.edu/) at the University of Colorado at Boulder (National Research Council, 2011).

* Tel.: +1 734 709 9756; fax: +1 616 777 1305. E-mail address: [email protected]. http://dx.doi.org/10.1016/j.compedu.2014.04.001 0360-1315/© 2014 Elsevier Ltd. All rights reserved.


M. Zhang / Computers & Education 76 (2014) 205–214

2. PhET online science simulations

The PhET website contains a large collection of simulations for topics in physics, chemistry, biology, earth science, and mathematics from elementary school to college levels. All simulations are free to the public. The PhET project was founded in 2002 by Carl Wieman, a Physics Nobel Laureate, with the goal of using interactive simulations to improve science education (Xue, 2012). The development of PhET simulations has been supported by dozens of federal, corporate, and private sponsors, including the National Science Foundation, the William and Flora Hewlett Foundation, and the O’Donnell Foundation.1

PhET simulations were conceptualized and built based on sound design principles (McKagan et al., 2008; Perkins et al., 2006; Wieman et al., 2008). The simulations emphasize real-world connections and use familiar objects (e.g., balloons, faucets) to bridge science concepts and real-life experience. The simulations incorporate visual representations to make the invisible visible and provide multiple representations to promote deeper understanding. In addition, the simulations invite students to explore, to change variables, and to measure and analyze data using embedded tools. The simulations were carefully designed to simplify the complexity of reality and present challenges to students that are neither too easy nor too difficult. Common misconceptions were anticipated, and guidance and feedback were built into the simulations to scaffold student learning (Wieman et al., 2008). PhET simulations can be used in both formal and informal learning environments, such as inquiry-based lessons, laboratory activities, homework, and student-directed self-exploration (Wieman, Adams, Loeblein, & Perkins, 2010).
Rigorous research has been conducted to evaluate the effectiveness of PhET simulations (Adams et al., 2008a, 2008b; Finkelstein et al., 2005; McKagan, Handley, Perkins, & Wieman, 2009).2 For example, a study that compared use of a simulation called “Circuit Construction Kit” with equivalent electric equipment found that students conducted more spontaneous experiments using the simulation and gained a better understanding of the concepts of current and voltage than those who used real circuit equipment (Finkelstein et al., 2005). In addition, the researchers conducted over 200 individual think-aloud interviews with students as they used PhET simulations (Adams et al., 2008a, 2008b). These studies showed that the PhET simulations engaged students, who preferred using the simulations rather than actual equipment, and helped them learn science concepts.

There are currently a total of 128 simulations on the website,3 including 94 in physics, 39 in chemistry, 38 in biology, 19 in earth science, and 31 in mathematics. Some simulations cover multiple subjects. These simulations serve students from elementary school to college, including 47 simulations for elementary school, 83 for middle school, 110 for high school, and 115 for university students.4 Some simulations are listed across grade levels. PhET simulations have been viewed over 110 million times.5 Few online science simulations are as large in scale or as popular. With strong support from a variety of foundations, conceptualized and developed by researchers with high scientific credentials, and empirically tested by rigorous research, PhET simulations are widely recognized as exemplars of science simulations (MERLOT, 2006; National Research Council, 2011). PhET simulations received the Tech Award in 2011, a prestigious award that honors organizations for using technology to make the world a better place (McCracken, 2011).
Although a great deal of research has been conducted on PhET simulations in various educational settings, little is known about how these simulations are being used on the Internet and many questions remain unanswered: To what extent are Internet users interested in and using these online science simulations? What type of users may be more interested in the simulations? How is interest in the simulations related to academic performance? This study aimed to address these questions using Internet search trend data and Web analytics, two innovative tools that have rarely been used in educational research.

Over the last decade, the Internet has become a major source for disseminating educational resources for science, technology, engineering, and mathematics (STEM) (Lee et al., 2011; Porcello & Hsi, 2013). Yet little is known about the extent to which these resources are being used, what type of students use them, and their impact on student learning. The paucity of research in this area is partially due to the lack of an effective method for tracking real-time Internet usage. This study described how to monitor the Web for education through a case study on PhET, one of the most well-known STEM resources on the Internet. The tools and methods used in this study can be applied to analyze other Internet-based STEM resources. Understanding how online STEM resources are being used is important, because prior research has suggested that, given equal access to the Internet, Internet use can enlarge, rather than narrow, the digital divide (Hargittai & Hinnant, 2008; Wainer et al., 2008).

3. Internet use and digital divide

Attewell (2001) differentiated two levels of digital divide. The first digital divide refers to unequal access to computers and the Internet, which creates a gap between the “haves” and “have-nots.” Poor, less educated, and minority families were less likely to have access to computers and the Internet.
Great progress has been made in the last two decades with regard to closing the first digital divide in school and at home (National Center for Education Statistics, 2010; Tsikalas, Lee, & Newkirk, 2007). However, not all computer use is beneficial, which leads to the second digital divide: sociodemographic inequalities in the use of computers and the Internet (Attewell, 2001). Depending on the type of online activities that one engages in, Internet use can reinforce the privileges of the advantaged and enlarge social inequalities. For example, Hargittai and Hinnant (2008) found that young adults with a college degree were more likely to engage in capital-enhancing use of the Web as compared to less educated peers. Such use refers to online activities that can advance career, improve education, increase civic engagement, and inform financial and health choices. Similarly, van Deursen and van Dijk (2014) found that highly educated adults were more likely to use the Internet for personal development, while adults with lower education levels were more likely to go online for gaming and social interaction. In addition, a recent study published in Nature found that the vast majority of students who took

1 Information was retrieved from http://phet.colorado.edu/en/about/sponsors.
2 A more complete list of research on PhET simulations is available at http://phet.colorado.edu/en/research.
3 Information was retrieved from https://phet.colorado.edu/en/simulations/index.
4 Information was retrieved from https://phet.colorado.edu/en/simulations/category/by-level.
5 Information was retrieved from http://phet.colorado.edu/.


the Massive Open Online Courses (MOOCs) offered by the University of Pennsylvania were among the wealthiest and most well educated (Emanuel, 2013). As a result, far from realizing the goal of providing free quality education to all, MOOCs reinforced the gap between the “haves” and “have-nots.” Prior research has also documented a differential impact of computer and Internet use that enlarged the achievement gap at the K–12 level (Vigdor & Ladd, 2010). For example, a large-scale Brazilian study (Wainer et al., 2008) found that the frequency of computer use and Internet access at home had negative associations with mathematics and reading performance in fourth grade students. Moreover, this negative effect was greater for younger students and those in low-income families. In another study, using data from the Early Childhood Longitudinal Study, Chang and Kim (2009) found that home computer access had a significant positive effect on science performance for English-speaking students, but a negative effect for English language learners. Fifth grade Black and Hispanic students who used computers frequently also had lower science performance, but this effect was not observed in fifth grade Caucasian students. Black English language learners displayed the lowest rate of computer use for educational purposes among all groups in the third grade. The above findings suggest that the different ways in which students use computers and the Internet may affect the achievement gap. Existing research has tackled the question mainly from a student perspective, using surveys to assess how students use the Internet. Another way to tackle the question is to look at website usage. Quantifying the relationship between the kind of websites that students use most and their academic performance may provide insight into the mechanism through which Internet use impacts learning. 
In addition, identifying patterns of use by different groups of students may provide information about whether and why Internet use is contributing to a larger achievement gap.

4. Using big data to understand Internet use

Big data refers to extremely large, complex sets of data that can be difficult to obtain and process using traditional methods due to the massive scale of information. In recent years, big data has been used to transform practices in business, science, public health, and information technology (Mayer-Schonberger & Cukier, 2013). However, big data is still a relatively new concept in educational research (Bienkowski, Feng, & Means, 2012; Eynon, 2013). This study introduces two types of publicly available big data, Internet search trend and Web analytics, which can help researchers to monitor dynamic and real-time use of the Web for educational purposes.

Online search engines are the primary means by which Internet users look for information of interest (Purcell, Brenner, & Rainie, 2012). Google is the most popular search engine on the Internet, accounting for 67% of global Internet search share (comScore, 2013). More than 12 billion search queries are conducted on Google each month. Children and adolescents rely heavily on Google to find information, to the extent that they are referred to as the “Google Generation” (Julien & Barker, 2009; Rowlands et al., 2008). Today’s students equate “Googling” with performing research (Purcell, Rainie, et al., 2012). Information seeking is a purposeful behavior, driven by individuals’ interests and needs (Brand-Gruwel, Wopereis, & Walraven, 2009). Therefore, Google search terms can provide useful information about student online interests. Google makes its search query data publicly available via a free tool called Google Trends.

4.1. Google Trends

Google Trends (www.google.com/trends) displays search volumes for queries that Internet users have conducted on Google going back to 2004.
Search volumes on Google Trends are normalized to a value between 0 and 100, where 100 represents the highest search volume for a query within a specified time period. Search volumes at other times are divided by the highest search volume and multiplied by 100. If there is not enough data, 0 is shown. Google Trends only provides data for queries whose search volume exceeds a minimum traffic threshold. Google Trends provides search volumes for all 50 American states, plus the District of Columbia. A state-level search index is generated by comparing the search volume for a particular query to the total Internet search volume for all queries in that region during the specified time period. Thus, state-level search volumes are normalized to account for different population sizes and Internet penetration rates. In addition, Google Trends removes repeated searches within a short time period by the same user, so that those types of queries do not influence the overall search trend.6

The validity of Google Trends as a research tool has been extensively verified and is well accepted by natural and social science researchers (Choi & Varian, 2012; McCarthy, 2010; Ortiz, Zhou, Shay, Neuzil, & Fowlkes, 2011; Ripberger, 2011; Zhu, Wang, Qin, & Wu, 2012). Google Trends has been used to study the public’s interest in many topics, including influenza surveillance (Carneiro & Mylonakis, 2009; Ginsberg et al., 2009), dieting and obesity (Markey & Markey, 2013), environmental issues (Mccallum & Bury, 2013), presidential elections (Reilly, Richey, & Taylor, 2012), and the stock market (Preis, Reith, & Stanley, 2010). However, few studies have used Google Trends for educational research. The present study is the first to use Google Trends and other Web analytics tools to analyze the use of online STEM resources.

4.2. Web analytics tools

Web analytics measure how a website is being used, including the number of visitors, number of visits, number of page views, time spent on the site, geographical location of visitors, and search keywords leading to the traffic (Beasley, 2013). Several third-party Web analytics companies allow the public to access traffic data for millions of popular websites on the Internet. Such Web analytics tools include Alexa (www.alexa.com), Quantcast (www.quantcast.com), Compete (www.compete.com), SEMRush (www.semrush.com), and SimilarWeb (www.similarweb.com). These tools provide traffic data for regular websites that have a root domain address (e.g., www.colorado.edu). The PhET website is a subdomain of www.colorado.edu and this limits the use of Web analytics in the present study. Only SimilarWeb and Compete provide traffic data for the PhET website, and the Compete data is much more limited than for websites with a root domain.

6 More information about Google Trends data is available on its support page at https://support.google.com/trends.


Compete focuses on U.S. Internet users only, whereas SimilarWeb provides only global data with its regular web subscription.7 Data from both SimilarWeb and Compete was used in this study to illustrate the use of the PhET website by global and U.S. users. Web analytics tools monitor and report website traffic using different methods. Compete collects website usage data from two million Internet users in the United States through Internet Service Providers. These participants, called a panel, willingly provide anonymous clickstream data for marketing research. Clickstream data refers to the sequence and timing of the Universal Resource Locators (URLs) used by Internet users. Its panel represents approximately 1% of the total Internet population in the United States. Data collected from the panel is statistically projected to the total U.S. Internet population. Compete claims to have the largest and most diverse panel in the United States.8 SimilarWeb collects data from its browser plug-ins and apps, which Internet users install on their computers. SimilarWeb claims that it has distributed tens of millions of browser plug-ins and apps around the world over the last four years, although no technical details have been disclosed. SimilarWeb also claims to have the largest panel of Internet users in the world.9 Different ways for monitoring traffic may lead to different results. It is important to note that site traffic data from both Compete and SimilarWeb are estimates, rather than exact numbers. In general, the estimates tend to be more accurate for popular websites than websites that do not receive significant traffic volume. Despite these limitations, both companies are industry leaders and have been widely used in Internet marketing research. However, the value of their data for educational research is underexplored. 
Because teachers and students rely heavily on the Internet to find educational resources (Purcell, Heaps, Buchanan, & Friedrich, 2013; Rowlands et al., 2008), Web analytics for educational websites may be a valuable tool for educational research.

5. Purpose of this study

This study used the Internet search trend and Web analytics tools to analyze interest in and usage of the PhET website, which is one of the most popular science simulation websites on the Internet. A Google search for science simulations returned the PhET website in the first position out of millions of results. Thus, the PhET website represents the most illustrative case for studying interest in online science simulations. Prior research has shown that race, income, and language status mediate the impact of Internet use on student learning (Chang & Kim, 2009; Wainer et al., 2008). Therefore, it is important to examine whether interest in online science simulations is associated with racial and socioeconomic status. In addition, few studies have compared whether a search trend for a website from Google Trends and a site visit trend for the same website from Web analytics tools match one another. A close match between data from the two independent sources can yield positive evidence for their validity as research tools. Accordingly, this study examined the following research questions:

1) To what extent are Internet users interested in and using a science simulation website?
2) To what extent is the search trend for a science simulation website correlated with the site visit trend?
3) How are the search volumes for a science simulation website correlated with academic performance and achievement gaps at the state level?
4) How are the search volumes for a science simulation website correlated with race, income, and language status at the state level?

6. Method

6.1. Data sources

6.1.1. Internet search trend

In order to determine the most appropriate search query keyword(s) for the PhET website, data from SimilarWeb was used to identify the major keyword(s) that sent traffic to the PhET website from search engines. According to SimilarWeb, PhET was the leading keyword for traffic to the PhET website. A search for PhET on Google confirmed that the PhET website was listed in the first place out of over one million search results. The first website in the search results enjoys a “winner-take-all” effect (Goldman, 2008), and receives 33% of total traffic generated by the search (Chitika, 2013). Therefore, PhET was selected as the most representative term reflecting Internet users’ interest in the online science simulation website and was used in this study for subsequent search trend analysis.

Google Trends showed that search interest for PhET emerged around September 2005. This defined the start of the time period for monthly search volumes for PhET; thus, the time period over which search trend data in the United States was retrieved from Google Trends for this study was from September 2005 to November 2013 (99 months). In addition to the overall search volume in the United States, Google Trends also provided a search index for PhET in the 99 months for the 50 U.S. states and the District of Columbia. The search index was used in the correlational analysis with academic performance, achievement gaps, and state demographics. In addition, global search volumes for PhET from November 2012 to November 2013 (13 months) were retrieved from Google Trends for correlation with the global site visit trend from SimilarWeb.

6.1.2. Web analytics

Compete provides U.S. traffic data for up to 25 months.
The following data for the PhET website was retrieved from Compete for the period of October 2011 to October 2013: number of unique visitors per month, number of visits per month, time on site per visit, and number of pages viewed per visit. SimilarWeb provides global traffic data for up to 13 months. The following data was retrieved from SimilarWeb for the period of November 2012 to November 2013: number of visits per month, time on site per visit, number of pages viewed per visit, bounce rates, traffic sources, leading search keywords, and referral sites.

7 SimilarWeb provides data by country for enterprise-level users, which was beyond the financial capacity for this study.
8 More information about Compete data is available at https://www.compete.com/about-compete/our-data/.
9 More information about SimilarWeb data is available at http://www.similarweb.com/ourdata.


Fig. 1. Overview of global traffic to the PhET website from November 2012 to October 2013.

6.1.3. Academic performance

Student academic performance data was obtained from the National Assessment of Educational Progress (NAEP), the largest nationally representative assessment and the only assessment available for state comparison. NAEP measures the academic performance of fourth and eighth grade students every two years and reports the average scores by state for mathematics, reading, and science on a scale of 0–500. Performance is also reported as the percentage of students by state attaining each of three achievement levels: basic, proficient, and advanced.10 This study used the following NAEP data (average score and percentage at each achievement level): mathematics and reading for 2009, 2011, and 2013 at Grades 4 and 8; Grade 4 science for 2009; and Grade 8 science for 2009 and 2011. The Grade 4 science data for 2009 and the Grade 8 science data for 2011 represent the latest available assessment data for that subject.

Achievement gap data was obtained from the state comparison tool of the National Center for Education Statistics (NCES).11 For each NAEP assessment, data for the achievement gaps between White and Black students, White and Hispanic students, students eligible and ineligible for free or discounted lunch, and students in the 75th percentile and 25th percentile was used in this study.

6.1.4. State demographics

The following state-level data for the two school years of 2009–2010 and 2010–2011 was retrieved from the U.S. Department of Education, NCES: the percentage of White, Black, and Hispanic students enrolled in public schools, the percentage of public school students eligible for free or discounted lunch, and the percentage of students participating in programs for English language learners.12 In addition, the percentage of persons below the poverty level by state in 2008 was retrieved from the U.S. Department of Commerce, Census Bureau.13

6.2. Data analysis

Descriptive data from SimilarWeb and Compete was used to illustrate the extent to which Internet users worldwide and in the United States were using the PhET site and major traffic sources for the website.

Linear regression was performed to determine whether the change in monthly search volumes for PhET in the 99 months since September 2005 was statistically significant. The predictor was time in months and the dependent variable was the normalized monthly search volume. This method has been used previously in several other studies that used Google Trends to examine changes in public interest (Carr & Dunsiger, 2012; Mccallum & Bury, 2013).

National and global site visit data from Compete and SimilarWeb, respectively, were normalized to a scale of 0–100 to allow for a direct comparison with search trend data from Google Trends, which is similarly normalized. The highest visit volume both in the United States and worldwide occurred in October 2013. Each monthly visit volume was divided by the highest volume in October 2013 and then multiplied by 100 to yield a normalized score for site volume.

To determine the relationship between search trends and site visits, search trend data was compared to normalized site visit data using Pearson correlational analysis. Global search volumes for PhET from Google Trends for the period of November 2012 to November 2013 were correlated with global site visit data from SimilarWeb over the same time period. National search volumes for PhET from Google Trends were correlated with national site visit data from Compete for the period of October 2011 to October 2013.

To determine the relationships between search interest for PhET and student performance, search interest and achievement gap, and search interest and state demographics, Pearson correlational analysis was performed using the search index for PhET in the 99 months and NAEP and state demographics data for each of the 50 U.S. states and the District of Columbia.
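The peak normalization and Pearson correlation described above can be sketched in a few lines of Python. This is an illustrative reconstruction only, not the code used in the study; the function names `normalize_to_peak` and `pearson_r` and all of the monthly figures are hypothetical.

```python
from math import sqrt


def normalize_to_peak(volumes):
    """Scale a series so that its largest value becomes 100."""
    peak = max(volumes)
    if peak <= 0:
        raise ValueError("peak volume must be positive")
    return [100.0 * v / peak for v in volumes]


def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length (at least 2 points)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical monthly figures, invented for illustration only.
monthly_visits = [120_000, 150_000, 90_000, 180_000]  # raw site visits
search_index = [60, 80, 45, 100]                       # already on a 0-100 scale

normalized_visits = normalize_to_peak(monthly_visits)
r = pearson_r(search_index, normalized_visits)
print(normalized_visits, round(r, 3))
```

Because Pearson's r is unchanged by rescaling either series by a positive constant, the normalization affects only how the two trends look when plotted together, not the coefficient itself.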

10 NAEP average scores and percentages of students attaining three achievement levels were retrieved from the NAEP Data Explorer at http://nces.ed.gov/nationsreportcard/naepdata/.
11 NAEP state comparison data was retrieved from http://nces.ed.gov/nationsreportcard/statecomparisons/.
12 Data was retrieved from the Digest of Education Statistics in 2011 and 2012 from http://nces.ed.gov/programs/digest.
13 Data for persons below the poverty level by state in 2008 was retrieved from http://www.census.gov/compendia/statab/rankings.html.


Fig. 2. Monthly U.S. search volume for PhET from Google Trends from September 2005 to November 2013.

7. Results

7.1. Use of the PhET website

Fig. 1 presents an overview of global traffic to the PhET website during a one-year period from November 2012 to October 2013 using data from SimilarWeb. The PhET website received approximately 20,000 visits daily. On average, global Internet users spent 4.3 min on the website and viewed 3.6 pages per visit. Approximately 47% of users viewed only the first page and then left the website, a figure known as the bounce rate. More than one third (37%) of the site traffic came from direct visits, 44% from search engines, and 17% from referrals by other websites. A total of 8082 search terms sent global traffic to the PhET website from search engines, among which the five top keywords were phet, phet simulations, phet.colorado.edu, projectile motion, and density, accounting for 28%, 7%, 3%, 2%, and 2% of traffic share from search engines, respectively. A total of 2703 websites referred traffic to the PhET website, with edmodo.com, thelearningodyssey.com, wikispaces.com, wordpress.com, and weebly.com as major referring sites.

In terms of national statistics, according to Compete, in a typical month from November 2012 to October 2013, the PhET website received 169,067 visits from 110,606 visitors in the United States, who spent 7 min on the website and viewed 4 pages per visit.

7.2. Growing interest in PhET science simulations

According to the regression analysis, the U.S. search trend for PhET showed significant growth from September 2005 to November 2013 (R² = 0.532, p < .001). That is, 53.2% of the variance in the search volumes for PhET over the 99-month period was explained by time. The slope of the regression line was 0.676 (p < .001), meaning that each month the normalized search volume increased by 0.676 on average. As illustrated in Fig. 2, the search trend over the years showed a strong seasonal pattern, consistently mirroring the school-year cycle in the United States. Each year, the search interest built up from September, the beginning of a school year, reached its peak in February and March, and then dropped to the lowest levels in July and August, the summer vacation.
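A regression of this kind, with the month index as the only predictor, can be sketched as follows. The helper `fit_trend` and the short series are hypothetical and stand in for the 99 monthly Google Trends values, which are not reproduced here; the study does not state which software was used.

```python
def fit_trend(values):
    """Ordinary least squares of values on month index 1..n.

    Returns (slope, intercept, r_squared).
    """
    n = len(values)
    if n < 2:
        raise ValueError("need at least two monthly values")
    months = list(range(1, n + 1))
    mean_x = sum(months) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in months)
    syy = sum((y - mean_y) ** 2 for y in values)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(months, values))
    if syy == 0:
        raise ValueError("values are constant; R^2 is undefined")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)  # share of variance explained by time
    return slope, intercept, r_squared


# Hypothetical normalized search volumes, invented for illustration only.
hypothetical_index = [10, 14, 12, 18, 16, 22]
slope, intercept, r_squared = fit_trend(hypothetical_index)
print(slope, intercept, r_squared)
```

For scale, a slope of 0.676 index points per month sustained over the 98 month-to-month steps between September 2005 and November 2013 corresponds to a fitted rise of roughly 66 points on the 0–100 scale.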

7.3. Correlation between search interest and site visits

The Pearson correlation between the normalized global search volumes for PhET from Google Trends and normalized global visit volumes for the PhET website from SimilarWeb yielded a coefficient of 0.863 (n = 13, p < .001), indicating a positive correlation. Overall, there was a relatively good match between the global search trend data for PhET and the global site visit data, as shown in Fig. 3.

The Pearson correlation between the normalized U.S. search volumes for PhET from Google Trends and normalized national visit volumes for the PhET website from Compete yielded a coefficient of 0.909 (n = 25, p < .001), indicating a positive correlation. In other words, national search trend data for PhET closely matched national site visit data, as shown in Fig. 4.
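As a hedged illustration of the comparison described above, the following sketch scales two monthly series to a common peak and computes their Pearson correlation. The scaling rule (peak value set to 100), the function names, and both series are assumptions for illustration only; the study's exact normalization procedure is not reproduced here.

```python
import math


def scale_to_peak(series, peak=100.0):
    """Rescale a series so that its maximum equals `peak` (assumed normalization)."""
    top = max(series)
    return [peak * x / top for x in series]


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    return s_xy / math.sqrt(s_xx * s_yy)


def t_statistic(r, n):
    """t statistic for testing r against zero, with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1 - r * r))


# Made-up monthly values for 13 months (illustration only, not the study's data).
searches = [62, 70, 81, 88, 100, 97, 74, 55, 31, 28, 66, 79, 85]
visits = [410, 455, 520, 600, 640, 655, 480, 390, 210, 205, 430, 500, 560]

r_raw = pearson_r(searches, visits)
r_scaled = pearson_r(scale_to_peak(searches), scale_to_peak(visits))
print(f"r (raw) = {r_raw:.3f}, r (scaled) = {r_scaled:.3f}")
print(f"t for r = 0.863, n = 13: {t_statistic(0.863, 13):.2f}")
```

Because Pearson's r is unchanged by positive rescaling of either series, this kind of normalization makes the two series easier to plot together without altering the coefficient. The t statistic can be compared against a t table to judge significance.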

Fig. 3. Correlation between normalized global search volumes for PhET and normalized global site visit volumes for phet.colorado.edu from November 2012 to November 2013 (13 months).

Fig. 4. Correlation between normalized national search volumes for PhET and normalized national site visit volumes for phet.colorado.edu from October 2011 to October 2013 (25 months).

7.4. Correlation between search interest and academic performance

As shown in Table 1, correlations between the search index for PhET and the average score and percentage of students at or above the proficient level were significantly positive across grade levels, subjects, and years. Also, most correlations between the search volume and the percentage of students at the advanced level were significantly positive. On the other hand, correlations with the percentage of students below the basic level were significantly negative across grade levels, subjects, and years. These findings suggest that Internet users in states with more high-achieving students in mathematics, reading, and science were more likely to search for PhET. Conversely, Internet users in states with more students below the basic achievement level were less likely to search for PhET.

Significant negative correlations were found between the search index for PhET and achievement gap (based on the difference between students in the 75th and 25th percentiles) for Grade 4 and Grade 8 science and reading in all years, except for Grade 4 reading in 2009. Overall, Internet users in states with larger achievement gaps in science and reading were less interested in PhET. Correlations between the search index and achievement gaps between White and Black students, White and Hispanic students, and students eligible and ineligible for free or discounted lunch were not significant.

7.5. Correlation between search interest and race, income, and language status

Table 2 shows the results of correlation analyses between the search index for PhET and state demographics, including percentage of White, Black, and Hispanic students enrolled in public schools; percentage of students eligible for free or discounted lunch; percentage of students participating in programs for English language learners; and percentage of persons below the poverty level.
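The state-level analyses in Sections 7.4 and 7.5 pair one search index value per state with one indicator per state. The sketch below shows, under stated assumptions, how the percentile-based achievement gap could be derived and correlated with a search index, and how the star notation used in the tables maps onto p-value thresholds. The state labels and all numbers are hypothetical, and the p-value is assumed to come from a separate significance test that is not shown.

```python
import math


def achievement_gap(score_p75, score_p25):
    """Gap between the 75th and 25th percentile scores, as defined in Section 7.4."""
    return score_p75 - score_p25


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    return s_xy / math.sqrt(s_xx * s_yy)


def significance_stars(p_value):
    """Star notation used in the tables: *p < .05, **p < .01, ***p < .001."""
    if p_value < 0.001:
        return "***"
    if p_value < 0.01:
        return "**"
    if p_value < 0.05:
        return "*"
    return ""


# Hypothetical records, for illustration only:
# (state label, search index, 75th percentile score, 25th percentile score)
hypothetical_states = [
    ("State A", 100, 262, 220),
    ("State B", 80, 258, 214),
    ("State C", 60, 255, 208),
    ("State D", 45, 250, 200),
    ("State E", 30, 249, 196),
]

search_index = [row[1] for row in hypothetical_states]
gaps = [achievement_gap(row[2], row[3]) for row in hypothetical_states]
r = pearson_r(search_index, gaps)
print(f"r between search index and achievement gap = {r:.3f}")
```

In this made-up example the gap widens as the search index falls, so the coefficient is negative, which is the direction reported for science and reading in Table 1.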
As shown in Table 2, Internet users in states with more Black students, more people below the poverty level, and more students eligible for free or discounted lunch were significantly less interested in searching for PhET. On the other hand, Internet users in states with more White students were more interested in searching for PhET. Correlations between the search index and the percentage of students participating in English language learning programs were not significant for either year.

Table 1
Correlation of the state-level search index for PhET and Grades 4 and 8 mathematics, reading, and science performance in NAEP 2009, 2011, and 2013.

Grade  Subject  Year  Average score  % of Below basic  % of At or above proficient  % of At advanced  Achievement gap: 75th–25th percentile
4      Math     2009  0.518***       −0.508***         0.522***                     0.392**           0.206
4      Math     2011  0.502***       −0.470***         0.515***                     0.397**           0.173
4      Math     2013  0.520***       −0.494***         0.534***                     0.427**           0.130
4      Reading  2009  0.446**        −0.462***         0.477***                     0.376**           0.213
4      Reading  2011  0.385**        −0.430**          0.393**                      0.239             −0.294*
4      Reading  2013  0.428**        −0.464***         0.435**                      0.325*            −0.338*
4      Science  2009  0.513***       −0.519***         0.507***                     N/A(a)            −0.376**
8      Math     2009  0.557***       −0.554***         0.558***                     0.442**           0.087
8      Math     2011  0.599***       −0.617***         0.578***                     0.497***          0.227
8      Math     2013  0.535***       −0.551***         0.527***                     0.423**           0.014
8      Reading  2009  0.532***       −0.563***         0.498***                     0.199             −0.475***
8      Reading  2011  0.582***       −0.605***         0.567***                     0.324*            −0.425**
8      Reading  2013  0.551***       −0.591***         0.523***                     0.292*            −0.398**
8      Science  2009  0.602***       −0.606***         0.580***                     0.296*            −0.473***
8      Science  2011  0.603***       −0.618***         0.634***                     0.387**           −0.492***

*p < .05, **p < .01, ***p < .001.
(a) The analysis is not available because 29 states had missing data for students attaining the advanced level in the NAEP 2009 fourth-grade science assessment.

Table 2
Correlation of the state-level search index for PhET and state demographics.

State demographic                                                                   Search index for PhET
% of White students in public schools in 2010–2011                                  0.461***
% of Black students in public schools in 2010–2011                                  −0.451***
% of Hispanic students in public schools in 2010–2011                               0.066
% of White students in public schools in 2009–2010                                  0.476***
% of Black students in public schools in 2009–2010                                  −0.443**
% of Hispanic students in public schools in 2009–2010                               0.083
% of Students eligible for free/reduced lunch in 2010–2011                          −0.519***
% of Students eligible for free/reduced lunch in 2009–2010                          −0.540***
% of Students participating in programs for English language learners in 2010–2011  0.101
% of Students participating in programs for English language learners in 2009–2010  0.110
% of Persons below the poverty level in 2008                                        −0.423**

*p < .05, **p < .01, ***p < .001.

8. Discussion

This study is the first to use Internet search trend data and Web analytics tools to assess the use of Internet-based STEM resources and the relationship to academic performance and sociodemographic status. The findings show that the PhET website has been widely used, and interest in online science simulations has been growing continuously since 2005.

This study provided new evidence for the digital divide in Internet use. The results of this study agree with previous findings regarding Internet use by adults as discussed earlier (van Deursen & van Dijk, 2014; Hargittai & Hinnant, 2008) and the reinforcing effect of MOOCs on the advantages of wealthy college students (Emanuel, 2013). The results of the present study suggest that high-performing elementary and middle school students are more likely to take advantage of well-designed online science simulations to enhance their learning, compared to lower-performing students. Moreover, the results suggest that White students are more interested in online science simulations than Black students. The implications of this study are that the digital divide in Internet use may be present at an early stage in the educational pathway, potentially exacerbating the achievement gap between socioeconomically advantaged and disadvantaged youth over time.

One possible explanation for the positive correlation between interest in online science simulations and high socioeconomic status is the different cultural and technological capital associated with different populations (Gilbert, 2010), which mediates Internet use by children in school and at home. Wood and Howley (2012) found that teachers in affluent suburban schools had better computer training opportunities and technical support than peers in rural and urban schools. It is possible that teachers in affluent schools had better knowledge about the PhET website and thus were more likely to integrate the simulations in their teaching. Hollingworth, Mansaray, Allen, and Rose (2011) found that middle-class parents were better able to help their children engage in sophisticated use of the Internet than working-class parents, who tended to lack technological proficiency themselves. Also, Li and Ranieri (2013) found that children of highly educated parents showed greater Internet self-efficacy and used the Internet for study more often than peers whose parents were less educated. On the other hand, Black English language learners were less likely to use computers for study than White and Hispanic peers (Chang & Kim, 2009).
As a result, children in advantaged families may be more likely to use educational resources such as PhET simulations than disadvantaged peers.

The findings of this study suggest that policy makers and educational researchers should pay greater attention to the digital divide in Internet use, which is more hidden than the divide in Internet access (Wood & Howley, 2012) and receives less attention in national agendas for education. The Obama administration has launched a new initiative called ConnectED to provide high-speed Internet access to students in the United States in the next five years (The White House, 2013). Yet there is still a lack of public discourse and initiatives that focus on the hidden digital divide in Internet use.

The PhET website is one of the most successful STEM resources with respect to traffic volumes. The finding that search engines and referral websites contributed 61% of traffic to the PhET website has important implications for educational researchers who are interested in disseminating STEM resources on the Internet. The findings of this study suggest that search engine optimization and popular educational social media sites such as Edmodo, Wikispaces, and WordPress may be important for improving the visibility of educational resources on the Web. Educational researchers and designers should also pay close attention to how their educational resources are being used on the Internet. Google Analytics is a widely used Web analytics tool that helps website owners monitor the use of their websites (Beasley, 2013).

This study focused on academic performance at the elementary and middle school levels, drawing upon the NAEP data. Because a great number of PhET simulations serve high school and university students, future research should examine whether the same trends found in this study apply to the high school and university levels.
Future research should also examine whether the findings for the PhET website apply to other science simulation websites and other online educational resources for STEM. In addition, this study found that search volume trends closely matched site visit trends, supporting the validity of Google Trends, SimilarWeb, and Compete as research tools. More research is needed to further understand how these tools can be used for educational research.

Several limitations to this study should be noted. First, aggregated data from Internet search trend analyses and Web analytics tools is anonymous; therefore, the identity or type of user is unknown. One assumption of this study is that Internet users who searched for the PhET website were mainly educators and students, an assumption based on the content of the PhET website. The seasonal nature of the search volume trend, which matched the school-year cycle, supported this assumption. However, there is no data to verify this assumption, which presents a limitation to using Internet search trend analysis and Web analytics data in educational research. Also, companies such as Google, SimilarWeb, and Compete control the data and the level of technical detail regarding how the data was obtained, and this information is not available to researchers. The study was also limited with respect to the time period for the data; SimilarWeb provides data for up to 13 months and Compete for up to 25 months. It would be useful to be able to compare search trend and site visit data over longer periods of time. Finally, the Internet search trend data and Web analytics tools used in this study provided broad-based information about Internet usage, but offered little detail regarding how users interacted with the science simulations.

Despite these limitations, Internet search and Web analytics are valuable tools that can complement existing research methods.
This is particularly true in educational research, as children and adolescents are active Internet and Google users (Purcell, Brenner, et al., 2012; Rowlands et al., 2008). These tools are free or available at low cost and provide ready access to large-scale, real-time data on Internet use that would otherwise be very difficult to obtain using traditional methods. Also, the data reflects actual Internet use behaviors, rather than intentions or attitudes as measured by surveys. These tools may help educational researchers discover important large-scale patterns in Internet use for further empirical research. For example, the significant correlations found in this study between interest in science simulations and academic performance and sociodemographic status suggest an important trend for further study.

To conclude, Internet search trend analysis and Web analytics can be valuable tools for monitoring dynamic and ever-evolving Internet usage in the field of education.

References

Adams, W. K., Reid, S., LeMaster, R., McKagan, S., Perkins, K., Dubson, M., et al. (2008a). A study of educational simulations part II – interface design. Journal of Interactive Learning Research, 19(4), 551–577.
Adams, W. K., Reid, S., LeMaster, R., McKagan, S. B., Perkins, K. K., Dubson, M., et al. (2008b). A study of educational simulations part I – engagement and learning. Journal of Interactive Learning Research, 19(3), 397–419.
Attewell, P. (2001). The first and second digital divides. Sociology of Education, 74(3), 252–259.
Beasley, M. (2013). Practical web analytics for user experience: How analytics can help you understand your users. Waltham, MA: Elsevier.
Bienkowski, M., Feng, M., & Means, B. (2012). Enhancing teaching and learning through educational data mining and learning analytics. Washington, D.C.: U.S. Department of Education, Office of Educational Technology.
Brand-Gruwel, S., Wopereis, I., & Walraven, A. (2009). A descriptive model of information problem solving while using internet. Computers & Education, 53(4), 1207–1217.
Carneiro, H. A., & Mylonakis, E. (2009). Google Trends: a web-based tool for real-time surveillance of disease outbreaks. Clinical Infectious Diseases, 49(10), 1557–1564.
Carr, L. J., & Dunsiger, S. I. (2012). Search query data to monitor interest in behavior change: application for public health. PLoS One, 7(10), e48158.
Chang, M., & Kim, S. (2009). Computer access and computer use for science performance of racial and linguistic minority students. Journal of Educational Computing Research, 40(4), 469–501.
Chitika. (2013, June 7). The value of Google result positioning. Retrieved from http://cdn2.hubspot.net/hub/239330/file-61331237-pdf/ChitikaInsightsValueofGoogleResultsPositioning.pdf.
Choi, H., & Varian, H. (2012). Predicting the present with Google Trends. Economic Record, 88(s1), 2–9.
comScore. (2013, March 13). comScore releases February 2013 U.S. search engine rankings. Retrieved from http://www.comscore.com/Insights/Press_Releases/2013/3/comScore_Releases_February_2013_U.S._Search_Engine_Rankings.
van Deursen, A. J., & van Dijk, J. A. (2014). The digital divide shifts to differences in usage. New Media & Society, 16(3), 507–526. http://dx.doi.org/10.1177/1461444813487959.
Emanuel, E. J. (2013). Online education: MOOCs taken by educated few. Nature, 503(342).
Eynon, R. (2013). The rise of big data: what does it mean for education, technology, and media research? Learning Media and Technology, 38(3), 237–240. http://dx.doi.org/10.1080/17439884.2013.771783.
Finkelstein, N., Adams, W., Keller, C., Kohl, P., Perkins, K., Podolefsky, N., et al. (2005). When learning about the real world is better done virtually: a study of substituting computer simulations for laboratory equipment. Physical Review Special Topics-Physics Education Research, 1(1), 010103.
Gilbert, M. (2010). Theorizing digital and urban inequalities: critical geographies of ‘race’, gender and technological capital. Information, Communication & Society, 13(7), 1000–1018.
Ginsberg, J., Mohebbi, M. H., Patel, R. S., Brammer, L., Smolinski, M. S., & Brilliant, L. (2009). Detecting influenza epidemics using search engine query data. Nature, 457(7232), 1012–1014.
Goldman, E. (2008). Search engine bias and the demise of search engine utopianism. In A. Spink, & M. Zimmer (Eds.), Web search: Multidisciplinary perspectives (pp. 121–133). Springer Berlin Heidelberg.
Hargittai, E., & Hinnant, A. (2008). Digital inequality differences in young adults’ use of the internet. Communication Research, 35(5), 602–621.
Hollingworth, S., Mansaray, A., Allen, K., & Rose, A. (2011). Parents’ perspectives on technology and children’s learning in the home: social class and the role of the habitus. Journal of Computer Assisted Learning, 27(4), 347–360.
Jimoyiannis, A., & Komis, V. (2001). Computer simulations in physics teaching and learning: a case study on students’ understanding of trajectory motion. Computers & Education, 36(2), 183–204.
Julien, H., & Barker, S. (2009). How high-school students find and evaluate scientific information: a basis for information literacy skills development. Library & Information Science Research, 31(1), 12–17.
Lee, S. W.-Y., Tsai, C.-C., Wu, Y.-T., Tsai, M.-J., Liu, T.-C., Hwang, F.-K., et al. (2011). Internet-based science learning: a review of journal publications. International Journal of Science Education, 33(14), 1893–1925.
Li, Y., & Ranieri, M. (2013). Educational and social correlates of the digital divide for rural and urban children: a study on primary school students in a provincial city of China. Computers & Education, 60(1), 197–209. http://dx.doi.org/10.1016/j.compedu.2012.08.001.
Markey, P. M., & Markey, C. N. (2013). Annual variation in internet keyword searches: linking dieting interest to obesity and negative health outcomes. Journal of Health Psychology, 18(7), 875–886.
Mayer-Schonberger, V., & Cukier, K. (2013). Big data: A revolution that will transform how we live, work, and think. New York: Houghton Mifflin Harcourt.
McCallum, M. L., & Bury, G. W. (2013). Google search patterns suggest declining interest in the environment. Biodiversity and Conservation, 22(6–7), 1355–1367.
McCarthy, M. J. (2010). Internet monitoring of suicide risk in the population. Journal of Affective Disorders, 122(3), 277–279.
McCracken, H. (2011, October 21). Meet the winners of this year’s Tech Humanitarian Awards. Retrieved from http://techland.time.com/2011/10/21/meet-the-winners-of-this-years-tech-humanitarian-awards.
McKagan, S. B., Handley, W., Perkins, K. K., & Wieman, C. E. (2009). A research-based curriculum for teaching the photoelectric effect. American Journal of Physics, 77(1), 87–94.
McKagan, S. B., Perkins, K. K., Dubson, M., Malley, C., Reid, S., LeMaster, R., et al. (2008). Developing and researching PhET simulations for teaching quantum mechanics. American Journal of Physics, 76(4), 406–417.
MERLOT. (2006). MERLOT/physics showcase. Retrieved from http://physics.merlot.org/ShowcasePhet.html.
National Center for Education Statistics. (2010). Internet access in U.S. public schools and classrooms: 1994–2005 And educational technology in U.S. public schools: Fall 2008. Washington, D.C.: U.S. Department of Education.
National Research Council. (2011). Learning science through computer games and simulations. Washington, D.C.: The National Academies Press.
Olympiou, G., Zacharias, Z., & deJong, T. (2013). Making the invisible visible: enhancing students’ conceptual understanding by introducing representations of abstract objects in a simulation. Instructional Science, 41(3), 575–596.
Ortiz, J. R., Zhou, H., Shay, D. K., Neuzil, K. M., & Fowlkes, A. L. (2011). Monitoring influenza activity in the United States: a comparison of traditional surveillance systems with Google Flu Trends. PLoS One, 6, e18687.
Perkins, K., Adams, W., Dubson, M., Finkelstein, N., Reid, S., Wieman, C., et al. (2006). PhET: interactive simulations for teaching and learning physics. The Physics Teacher, 44(1), 18–23.
Porcello, D., & Hsi, S. (2013). Crowdsourcing and curating online education resources. Science, 341(6143), 240–241.
Preis, T., Reith, D., & Stanley, H. E. (2010). Complex dynamics of our economic life on different scales: insights from search engine query data. Philosophical Transactions of the Royal Society A, 368, 5707–5719.
Purcell, K., Brenner, J., & Rainie, L. (2012a). Search engine use 2012. Washington, D.C.: Pew Research Center.
Purcell, K., Heaps, A., Buchanan, J., & Friedrich, L. (2013). How teachers are using technology at home and in their classrooms? Washington, D.C.: Pew Research Center.
Purcell, K., Rainie, L., Heaps, A., Buchanan, J., Friedrich, L., Jacklin, A., et al. (2012b). How teens do research in the digital world. Washington, D.C.: Pew Research Center.
Reilly, S., Richey, S., & Taylor, B. J. (2012). Using Google search data for state politics research: an empirical validity test using roll-off data. State Politics & Policy Quarterly, 12(2), 146–159.
Ripberger, J. T. (2011). Capturing curiosity: using internet search trends to measure public attentiveness. Policy Studies Journal, 39(2), 239–259.
Rowlands, I., Nicholas, D., Williams, P., Huntington, P., Fieldhouse, M., Gunter, B., et al. (2008). The Google generation: the information behaviour of the researcher of the future. Aslib Proceedings, 60(4), 290–310.
Scalise, K., Timms, M., Moorjani, A., Clark, L., Holtermann, K., & Irvin, P. S. (2011). Student learning in science simulations: design features that promote learning gains. Journal of Research in Science Teaching, 48(9), 1050–1078.
The White House. (2013, June 6). President Obama unveils ConnectED Initiative to bring America’s students into digital age. Retrieved 20.07.13 from http://www.whitehouse.gov/the-press-office/2013/06/06/president-obama-unveils-connected-initiative-bring-america-s-students-di.
Tsikalas, K., Lee, J., & Newkirk, C. (2007). Home computing, school engagement and academic achievement of low income adolescents: Findings from year one of a three year study of the CFY intervention. New York: Computers for Youth Foundation.

Vigdor, J. L., & Ladd, H. F. (2010). Scaling the digital divide: Home computer technology and student achievement. Cambridge, MA: National Bureau of Economic Research.
Wainer, J., Dwyer, T., Dutra, R. S., Covic, A., Magalhaes, V. B., Ferreira, L. R. R., et al. (2008). Too much computer and internet use is bad for your grades, especially if you are young and poor: results from the 2001 Brazilian SAEB. Computers & Education, 51(4), 1417–1429.
Wieman, C. E., Adams, W., Loeblein, P., & Perkins, K. (2010). Teaching physics using PhET simulations. The Physics Teacher, 48(4), 225–227.
Wieman, C. E., Adams, W. K., & Perkins, K. K. (2008). PhET: simulations that enhance learning. Science, 322(5902), 682–683.
Wood, L., & Howley, A. (2012). Dividing at an early age: the hidden digital divide in Ohio elementary schools. Learning, Media and Technology, 37(1), 20–39.
Xue, K. (2012). Science simulator. Harvard Magazine. Retrieved from http://harvardmagazine.com/2012/07/science-simulator.
Zhu, J. J., Wang, X., Qin, J., & Wu, L. (2012). Assessing public opinion trends on user search queries: validity, reliability, and practicality. In Paper presented at the annual conference of World Association of Public Opinion Research. Hong Kong.

Meilan Zhang is an Assistant Professor of Educational Technology in the Department of Teacher Education at the University of Texas at El Paso. Her research interests focus on improving student learning using mobile technology and understanding Internet use and digital divide using big data from Internet search trends and Web analytics.