Telecommunications Policy 39 (2015) 830–847
Contents lists available at ScienceDirect
Telecommunications Policy URL: www.elsevier.com/locate/telpol
A multiple indicator model for panel data: An application to ICT area-level variation Eva Ventura a,b,n, Albert Satorra a,b a b
Department of Economics and Business, Universitat Pompeu Fabra, Spain Barcelona GSE, Spain
a r t i c l e in f o
abstract
Available online 12 August 2015
Consider the case in which we have data from repeated surveys covering several geographic areas, and our goal is to characterize these areas on a latent trait that underlies multiple indicators. This characterization occurs, for example, in surveys of information and communication technologies (ICT) conducted by statistical agencies, the objective of which is to assess the level of ICT in each area and its variation over time. It is often of interest to evaluate the impact of area-specific covariates on the ICT level of the area. This paper develops a methodology based on structural equations models (SEMs) that allows not only the ability to estimate the level of the latent trait in each of the areas (building an ICT index) but also to assess the variation of this index in time, as well as its association with the area-specific covariates. The methodology is illustrated using the ICT annual survey data collected in the Spanish region of Catalonia for the years 2008–2011. & 2015 Elsevier Ltd. All rights reserved.
Keywords: Structural equations model Confirmatory factor analysis Longitudinal analysis Index Digital divide Information and Communication Technologies (ICT)
1. Introduction Previous studies have established that the widespread use of information and communications technologies (ICT) increases productivity and boosts the economic growth of territories. These technologies contribute to the improvement of productivity resulting from the adoption of more efficient business processes. Also, intensive use of ICT by the population contributes to further stimulate the economy through the introduction of new applications and services (see for instance Crandall, Lehr, & Litan, 2007; Garbacz & Thompson, 2008; Qiang & Rosotto, 2009). Thus given that ICT constitute a powerful tool to enhance economic performance, policy makers are concerned about narrowing (ideally eliminating) the inequalities of access to and use of ICT across the territory as a means of achieving economic and social convergence. Statistical offices worldwide invest resources in surveys to ascertain the levels of information and communication technology (ICT) activity in the population. Often the aim is to inspect the levels of ICT in different geographical areas rather than the levels of ICT of individuals. It is therefore necessary to measure the level of ICT activity for each area and to monitor the area and time variation. In addition, it is important to investigate which elements are responsible for the observed differences in ICT activity across areas. That information will be valuable in order to design better policy interventions aimed for instance at enhancing ICT activity in some lagging areas. Many authors have studied the so-called digital divide across countries or regions. A commonly accepted definition of the term “digital divide” is provided by the OECD (2001). According to this organization, “the term digital divide refers to the gap between n
Corresponding author at: Department of Economics and Business, Universitat Pompeu Fabra, Spain. E-mail addresses:
[email protected] (E. Ventura),
[email protected] (A. Satorra).
http://dx.doi.org/10.1016/j.telpol.2015.07.002 0308-5961/& 2015 Elsevier Ltd. All rights reserved.
E. Ventura, A. Satorra / Telecommunications Policy 39 (2015) 830–847
831
individuals, households, businesses and geographical areas at different socioeconomic levels with regard to their opportunities to access information and communication technologies and to their use of the Internet for a variety of activities”. The digital divide is thus, in nature, a multidimensional concept. In a recent paper, Vicente and López (2011) identify two main types of contribution to the literature of the digital divide. The first type focuses on “measuring and quantifying the extent, evolution, and pace of the digital divide”, while the second tries to explain “the determinants of ICT disparities”. There are a considerable number of studies that examine the relationship between the ICT disparities and several socioeconomic characteristics at the cross-country, regional and even the individual level. Interested readers may consult the work of Vicente and López (2011) for a complete overview. Echoing their work and regarding the measurement issue, there have been many attempts to build composite indices capable of capturing the “multidimensionality of the digital divide” (Al-Mutawkkil, Hesmati, & Hwang, 2009, Bagchi, 2005; Corrocher & Ordanini, 2002; Cruz-Jesús, Oliveira, & Bacao, 2012; Dutta & Jain, 2004; ITU, 2009; Orbicom, 2003; Selhofer & Hüsing, 2002; Wolcott, Press, McHenry, Goodman, & Foster, 2001). Simple compound indices based on the arbitrary weighting of several indicators present certain disadvantages. New technologies are continuously emerging and have to be incorporated into the calculation of the index. Therefore, it is necessary to constantly update both the definition of the index as well as the weights used to calculate it. On the other hand, the weights are arbitrary and “could be the subject of political dispute” (Cruz-Jesús et al., 2012). Thus, a number of these studies rely on exploratory factor analysis to select the weights applied to a number of indicators to produce a single composite index (Al-Mutawkkil et al., 2009; Corrocher & Ordanini, 2002; Cruz-Jesús et al., 2012; Hanafizadeh, Saghaei, & Hanafizadeh, 2009; ITU, 2009; Soupizet, 2004; Vicente & López, 2011). Other authors use multivariate techniques such as multivariate analysis, cluster analysis and discriminant analysis to “explore the digital divide among old, new and candidate member states of the European Union” (Çilan, Bolat, & Coşkun, 2009; Cruz-Jesús et al., 2012). In this line, Vehovar, Sicherl, Hüsing, and Dolnicar (2006) argue that the multivariate approaches such as loglinear modeling, compound measures and time–distance methodology used to analyze the changes in the digital divide are a better alternative to “an oversimplified methodological approach to digital divide studies”. Much of the work performed on measuring and explaining the digital divide has been performed at the cross-country level. Research at the regional level has been conducted in large part for the United States (Atkinson & Andes, 2008; Chaudhuri, Flamm, & Horrigan, 2005; Grubesic, 2006; Horrigan, Stolp, & Wilson, 2006; Kolko, 2000; Mills & Whitacre, 2003; Progressive Policy Institute, 2002; US Department of Commerce, 2000); the European references are fewer (Billón, Ezcurra, & Lera-López, 2008, 2009; Vicente & López, 2007, 2011). There are a few studies that use microdata to determine the urban/ rural digital gap (Chaudhuri et al., 2005; Mills & Whitacre, 2003; Noce & McKeown, 2008; US Department of Commerce, 2000), and others that use the individual and regional characteristics to explain the digital divide (Demoussis & Giannakopoulos, 2006; Horrigan et al., 2006; Schleife, 2010; or Cerno and Pérez-Amaral (2008) for Spanish data). This paper presents a methodology that integrates in a single model both the measurement of the ICT activity for each area in each period, as well as the association of the ICT activity with the area-specific covariates. As we have already mentioned, the idea of an index based on factor analysis is not new. The approach in this paper is unique because the confirmatory factor analysis replaces the principal components approach. This permits us to assess the dimensionality of the index and to study its evolution in time for each of the areas. The geographical areas are meant to be the territories that differ in the level of contingency of several factors that we feel explain the digital gap among the areas. In previous works based on the principal components and exploratory factor models, the estimated factors are regressed into a series of explanatory variables to determine what may cause the differences among the regions or countries (see, for instance, Vicente & López, 2011). In contrast to this two-stage approach (fitting first an exploratory factor model for each year and then applying a regression to the extracted factors), our methodology considers the single-stage analysis for which the level of ICT activity is estimated and simultaneously regressed on the covariates. This singlestage approach leads to a more consistent and efficient estimation. To illustrate our methodology, we use the ICT data collected in the administrative areas of the Spanish region of Catalonia. The Statistics Institute of Catalonia (Idescat) conducts repeated household surveys with the aim of ascertaining the time variation in the ICT at the area level. We combine several survey questions to produce a set of indicators representing the ICT activity at the individual level. These variables are then aggregated within each area to produce the indicators of the ICT activity at the area level. The rest of the paper is organized as follows. Section 2 describes the data to be used. A confirmatory factor analysis with one factor for each year is developed in Section 3. Section 4 presents a multiple indicators dynamic panel data model whereby the factors are decomposed into a permanent and a time-varying component and related to the covariates. Section 5 presents the conclusions.
2. Data and construction of indicators In this application, we use a household survey on the ICT conducted by the Idescat. The survey is a rotating panel, with five shifts of rotation on an annual basis. Here, we use four consecutive surveys, from 2008 to 2011.1 The sample design is by means of a stratified two-stage sample. The strata are 41 Catalan administrative divisions, and the sample is uniformly distributed among them with 75 randomly-selected first-stage units (dwellings) in each administrative division. Within each dwelling, the second-stage unit is one person aged 16 or over (also randomly selected). 1
The survey became biannual after 2011.
832
E. Ventura, A. Satorra / Telecommunications Policy 39 (2015) 830–847
Each one of the surveys asks more than 50 questions, distributed among several thematic blocks such as the main home equipment in terms of the ICT products, home access to the Internet, satisfaction of the respondent,2 use of computers by the respondent, use of Internet by the respondent, e-commerce, language and ICT, the knowledge and use of public Internet access points, trust in digital services, and courses and computer skills. Many of the answers to those questions provide us with dichotomous indicators that are highly correlated to each other. Other indicators attain values on a discrete scale. Some of the questions are redundant, and the surveys also collect information on certain socioeconomic issues of the households. The first step consists of identifying the underlying factors that determine the level of ICT equipment and demand for services. This issue has been explored by many academic articles that address the measuring of the digital divide, and we will rely on their findings. Corrocher and Ordanini (2002) refer to the model elaborated by the OECD Task Force on the digital economy (Colecchia, 2000) and consider three stages of digital development. At the beginning, the factors that determine the differences between countries or regions are related to the speed of adoption of the new technologies. There are three main factors: the communication infrastructures, human resources and competitiveness of the companies providing the information and communication technologies. Subsequently, the technology spreads out though the territory, and the aspects related to the intensity of the adoption of the technology become more important. Here, Corrocher and Ordanini (2002) introduce the market diffusion factors, meaning “the diffusion of different devices for the use of digital services and applications that can determine different patterns of digitalization in different systems” and the size of the digital market. In the third stage, the qualitative aspects of the ICT become more relevant. The factors to consider are related to the impact of digitalization on the social and economic activities, on the structure of production, or on employment. Our data set contains some indicators that can be used to measure several of those factors. We will aggregate and summarize the information they provide into several new variables, which in the tradition of the factor analysis approach will be used as indicators of a common factor of the ICT activity for individuals. The set of indicators is essentially stable along the different years, though there are some minor variations that can be addressed by our methodology. The second and third blocks of the survey questions contain information on the equipment and access to the Internet in the household. The information is summarized in a new variable (equip) that takes values ranging from 0 to 3. Households without a personal computer and without access to the Internet score 0 in this variable, while those with a personal computer and broadband access score 3. Other intermediate situations score 1 or 2 depending on the quality of the equipment and access to the Internet. The two blocks also contain information on the number of operating mobile phones in the household (mobile), and in 2008 we also know the length of time that there has been Internet access at the home3 (sinceInt). This last variable is important to determine the degree of the ICT preparedness of the household and will be correlated with the respondents' ICT skills and with the measures of the social and economic impact of the ICT. It ranges from 0 if the household does not have access to the Internet to 4 if there has been access for over five years. As for the measurement of human resources, we consider several indicators. The survey systematically records all the abilities of the respondents using a computer and/or the Internet. There are seven questions on the tasks performed with a computer (such as the management of files and documents) and also nine questions on the uses of the Internet. We then create new variables (knowPC and knowInt), which add together the number of distinct tasks that the respondent is capable of doing without the help of others, either with a personal computer or through the Internet. The variables range from 0 for the respondents that do not own a computer or do not have access to the Internet to 7 for the respondents with high computing skills and 9 for the respondents with proven abilities using the Internet. We also take into account the intensity of the use of a personal computer or the Internet. The indicators measuring the intensity of use will most likely be correlated with the knowledge variables. The indicator that measures the frequency of use of the home computer (frecPC) takes values ranging from 0 to 5. The value is 0 if there is no computer in the household and 5 if the computer is used daily. As for the frequency of use of the Internet, the newly created variable (frecInt) ranges from 0 to 7. Again, the value is 0 if the household does not have access to the Internet, and it is 7 if the respondent is connected to the Internet for more than 20 h a week. On the chapter regarding the market diffusion factors, the household survey has little to offer. We have chosen to include the indicators related to the perceived level of security in three types of electronic transactions. The survey asks the respondent to what extent she or he thinks it is safe to use a credit card in a restaurant, to buy through the Internet, or to perform bank transactions through the Internet. Further analysis revealed that the security perception when paying in a restaurant with a credit card is only weakly correlated with the rest of the indicators. Therefore, we did not include that variable in the set of newly created variables. The answers to the last two security perception questions are translated into two new variables (secbuy and secbank) with values that range from 0 for a perception of total insecurity to the value of 3 if the respondent declares feeling very secure. Finally, we create a number of synthetic indicators aimed at capturing part of the social and economic impact of the ICT. The social and economic benefits of these technologies should be more evident for the frequent users of the Internet. The variable created to measure the economic impact of the ICT (econ) ranges from 0 to 4. The value is higher for the frequent users performing several tasks of an economic nature, such as telework or buying through the Internet. The survey also
2 These are a group of questions dealing mainly with the number and type of incidences related to the Internet services provided by the companies in the digital sector. 3 This question was not asked in subsequent surveys.
E. Ventura, A. Satorra / Telecommunications Policy 39 (2015) 830–847
833
records several types of activities – most of them related to leisure – that can be viewed as increasing the social exposure of the respondents. For instance, there are questions on the participation in chats, social networks and the use of virtual communities. We sum the answers to these questions to create a new variable (social) that ranges from 0 to 4. The level of interaction with several public health administrations is also measured by combining the response to several survey questions. The new variable (admin) ranges from 0 to 6. Table 1 summarizes these synthetic indicators. A final linear transformation was applied to all the indicators, so they range from 0 to 10. Because we were interested in estimating the level of ICT activity at the area level, we used the sampling weights included in the database to compute the areas' sample averages involved in the posterior analysis. Table 2 reports some descriptive statistics of the indicators across the 41 areas, for the years 2008–2011. The methodology we propose presupposes a specific transformation method from survey data to the multiple indicators to be used in the model. Although it would be technically possible to use the variables directly collected in the survey,4 most often the variables used in the model are summaries of several original variables (e.g. totals, averages, percentages, etc.). Typically the conversion of qualitative variables into quantitative variables will be undertaken by experts in ICT surveys. In our case we created a set of indicators aiming to extract as much relevant information from the survey as possible without incurring redundancies.
3. A confirmatory factor model for longitudinal data We apply a confirmatory factor analysis (CFA) in which the level of the ICT activity of a particular year is a factor (latent variable) underlying the set of indicators of the ICT activity at the area level listed in Table 1. That is, the level of ICT of an area is a complex, multivariate and unobservable concept that can be measured by the observed indicators. This concept is a characteristic of the area and it varies also with time. Formally, we consider one factor5 in each of T consecutive periods. Areas are indicated by subscript i ¼ 1; 2; …; n, and there are K synthetic indicators measuring each factor.6 The measurement equations of the CFA model are: yitk ¼ λtk f ti þ eitk ;
t ¼ 1; 2; …; T;
k ¼ 1; 2; …; K
ð1Þ
Here, yitk is the kth indicator of the factor at time t for the ith area, f it is the factor (level of ICT activity) of area i at timet, and λtk is the factor loading of the kth indicator at time t. The errors eitk are uncorrelated with each other and with the factors. The measurement part of the model for year 20087 is depicted as a path diagram in Fig. 1. Rectangles represent the observed indicators, and the factor f is in a circle. There are arrows going from the factor to the indicators, representing the loadings (λk ). The values of the factor loadings in combination with the variance of the unique factor determine the reliability of the indicators as measures of the latent factor. That is, the factor loading indicates the expected linear change in the value of the corresponding indicator for a unit change in the value of the underlying ICT level. Note that this factor model can be viewed as a multivariate regression model where each indicator is a different dependent variable regressed on a single unobservable variable (the underlying ICT level of the area). Note that in the non-standarized solution (the one reported in Table 3) loadings can be greater than one. To fix the scale of the factors, the loading of variable knowInt (Internet skills) is set to one. There are also arrows going from each disturbance term ek to the corresponding indicator. A similar path diagram applies for the other years. Allowing correlation parameters among the factors ties the four path diagrams into a single CFA model. Fig. 2 displays the path diagram for the four correlated factors whereby a double-headed arrow represents a correlation. That is, the model assumes that there is a level of ICT activity for each area (a latent variable) which is not directly observable and which is measured by the indicator variables. The model also assumes that the level of ICT activity of each area is correlated over time. This is a confirmatory factor model because the indicators of one year have zero loadings on the factors of other years. The estimation of this CFA model is undertaken using EQS (Bentler, 2002).8 Other SEM software, e.g., LISREL, Mplus, CALIS, sem of Stata, AMOS, etc. could have been used. As in regular SEM analysis, we assess the validity of the model using a classic goodness of fit test9 (see Bollen, 1989). The p-value for this test statistic is 0.1 so the model is not rejected by a standard 5% level test. If the model were rejected by the goodness of fit test, diagnostic statistics would be explored for the potential sources of misspecification (LM tests and expected parameter changes are available in regular SEM for model specification searches); see Saris, Satorra, and Sörbom (1987). Note that we use robust statistics so we are protected from deviations from normality. 4 If the original variables are dichotomous or ordinal (with few categories) then the estimation methods could be adapted to that case (see Saris, Satorra, & Coenders, 2004). 5 If necessary we could have introduced more than one factor for each period. In our application, a single-factor model led to an acceptable fit. 6 In our model, the 2008 factor has 12 indicators, while the rest of the factors have only 11. For the sake of simplicity, we do not take this detail into account. 7 This measurement part is almost identical every year; the only difference is that one of the indicators (sinceInt) is only present in year 2008. 8 In EQS, we use the least squares (LS) estimation option with robust standard errors (to protect the inferences from possible heteroskedasticity of the error terms and non-normality). For the technical details of this method, see Satorra and Bentler (1990, 1994). 9 We used the mean-and-variance-adjusted chi-square test that protects for non-normality.
834
E. Ventura, A. Satorra / Telecommunications Policy 39 (2015) 830–847
Table 1 Set of synthetic indicators. Variable
econ mobile sinceInt knowPC knowInt frecPC frecInt secbuy secbank econ social admin
Description
Values
Level of equipment in the household Number of mobile phones in the household Since when has access to Internet Number of tasks performed with a computer Number of tasks performed through Internet Intensity of use of personal computer Intensity of use of Internet Security perception: buying through Internet Security perception: bank transactions through Internet Level of economic impact of ICT actions from home Level of social impact of ICT actions from home Level of interaction with public administrations through Internet
2008
2009
2010
2011
0–3 0–5 0–4 0–7 0–9 0–5 0–7 0–3 0–3 0–4 0–4 0–6
0–3 0–5 – 0–7 0–9 0–5 0–7 0–3 0–3 0–4 0–4 0–6
0–3 0–5 – 0–7 0–9 0–5 0–7 0–3 0–3 0–4 0–6 0–7
0–3 0–5 – 0–7 0–9 0–5 0–7 0–3 0–3 0–2 0–3 0–7
Table 2 Sample averages and standard deviations across areas and time. Variable
equip mobile sinceInt knowPC knowInt frecPC frecInt secbuy secbank econ social admin
Average
Standard deviation
2008
2009
2010
2011
2008
2009
2010
2011
6.151 4.384 4.321 3.311 3.177 5.641 3.875 3.217 1.110 1.747 3.178 1.014
6.169 4.509 – 4.168 3.582 5.679 4.262 3.781 1.229 1.855 3.496 1.109
6.597 4.519 – 3.958 3.397 5.787 4.493 3.991 1.359 2.095 3.962 1.060
6.830 4.583 – 4.691 4.051 6.088 5.952 3.991 1.362 3.587 5.061 1.336
0.770 0.437 0.643 0.550 0.632 0.740 0.708 0.492 0.189 0.437 0.641 0.278
0.772 0.440 – 0.659 0.645 0.798 0.723 0.529 0.172 0.436 0.675 0.317
0.792 0.458 – 0.663 0.585 0.813 0.750 0.404 0.152 0.474 0.734 0.306
0.654 0.364 – 0.780 0.634 0.736 0.810 0.485 0.185 0.741 0.949 0.329
The estimation results for this CFA model10 are displayed in Tables 3 and 4. The first table reports the estimates of the factor loadings for the full model. Each column corresponds to the factor loadings of one year, with the factors designated f 1 in 2008 to f 4 in 2011. The second table displays the estimated correlation matrix of the factors. All the factor loadings estimates are statistically significant and positive. The indicator with the highest loading is frecInt (frequency of use of the Internet) followed by frecPC (frequency of use of personal computer) and equip (equipment and access to Internet in the home). The relevance of indicator knowPC (personal computer skills) grows with time, with factor scores of 0.795 in 2008 and 1.179 in 2011. In addition, the indicator social (the use of ICT to engage in social activities related to leisure) is associated to a high (and increasing with time) factor loading. The two indicators related to security perceptions (secbank for the bank transactions by Internet and secbuy for electronic commerce) exhibit lower values for their respective factor loadings. The value for the factor loading is also low for the variables measuring the economic impact (econ) and the interaction with public administrations (admin) when using the ICT. The number of mobile phones in the home (mobile) seems to be moderately important in the year 2008 (a factor loading of 0.564) but becomes less important in consecutive waves. This is consistent with the widespread implementation of such devices and therefore the indicator mobile becomes less important to explain the variation in the level of the ICT activity across the areas. As observed in Table 4, the four factors are highly correlated, suggesting a very stable level of ICT activity across the years at the area level. This points to the fact that there are time invariant structural characteristics of the area that account for most of the observed variation in the ICT variables. We believe that this fact is relevant for policy making; for example, it may help in deciding the frequency of ICT surveys. The empirical best linear unbiased estimators (EBLUP) of the factors were computed. These are the factor scores provided by EQS when the regression method is used (see Satorra & Bentler, 2014). These factor scores are displayed in Table 5 where the rows correspond to the areas and the columns correspond to years. The score for each area in each of the four columns
10 We display the estimated results without imposing an equality constraint on the loadings across time. The same model could be estimated with the loadings set as equal across time. Such a model would be an alternative specification because the robust (mean- and variance-adjusted) chi-square goodness of fit test yielded a nearly acceptable fit (χ2 ¼ 25.627 with 15 degrees of freedom, p-value¼ 0.042).
E. Ventura, A. Satorra / Telecommunications Policy 39 (2015) 830–847
equip
e1
mobile
e2
λ3
sinceInt
e3
λ4
knowPC
e4
1
knowInt
e5
λ6
frecPC
e6
λ7
frecInt
e7
λ8
secbuy
e8
λ9
secbank
e9
econ
e10
social
e11
admin
e12
λ1
λ2
f
835
λ 10 λ 11 λ 12
Fig. 1. Path diagram of the 2008 section of the CFA model. Table 3 Factor loadings estimates in the CFA model. Indicator
f1 2008
f2 2009
f3 2010
f4 2011
equip
1.084 (0.157) 0.564 (0.099) 0.910 (0.123) 0.795 (0.111) 1 1.035 (0.207) 1.158 (0.059) 0.527 (0.094) 0.215 (0.038) 0.532 (0.088) 1.004 (0.066) 0.325 (0.064)
1.187 (0.197) 0.589 (0.096) –
1.136 (0.156) 0.355 (0.152) –
0.793 (0.123) 0.231 (0.092) –
0.978 (0.041) 1 1.174 (0.114) 1.152 (0.067) 0.405 (0.102) 0.135 (0.037) 0.604 (0.071) 1.012 (0.097) 0.378 (0.053)
1.129 (0.064) 1 1.109 (0.074) 1.302 (0.050) 0.344 (0.096) 0.149 (0.032) 0.731 (0.072) 1.246 (0.063) 0.399 (0.058)
1.179 (0.071) 1 1.039 (0.113) 1.151 (0.080) 0.511 (0.092) 0.160 (0.037) 0.691 (0.126) 1.304 (0.143) 0.377 (0.062)
mobile sinceInt knowPC knowInt frecPC frecInt secbuy secbank econ social admin
Mean- and variance-adjusted goodness-of-fit model test: χ 2 ¼ 30.746, d.f. ¼ 22 (p-value ¼0.101). Robust standard errors are displayed below the estimates, in parenthesis.
corresponds to an estimate of the ICT level of activity for the corresponding year. These area scores are relevant for ranking the areas according to their ICT activity, as well as to assess the ICT variation across time. At this point, however, any ranking of areas is problematic because we have four scores for each area. To display the time evolution of ICT activity for each area, the factor scores of Table 4 are represented in Fig. 3. We created four groups to facilitate the graphical representation. Group 1 includes the areas that have a factor score greater than 0.5 in 2008. Group 2 includes areas with a factor score in 2008 lower than 0.5 but greater than 0. The areas with negative factor scores in 2008 down to 0.5 are in group 3. The areas with the poorest scores are in group 4.
836
E. Ventura, A. Satorra / Telecommunications Policy 39 (2015) 830–847
f1
f2
f4
f3 Fig. 2. Path diagram of the correlated factors. Table 4 Estimated correlation matrix of the factors (CFA model).
f1 f2 f3 f4
f1
f2
f3
f4
1 0.879 0.807 0.730
1 0.802 0.610
1 0.780
1
We observe that in group 1 there are four areas (Tarragonès, Vallès Occidental, Maresme and Gironès) slightly ahead of the rest in terms of ICT activity (the factor scores are over 0.8 for each year). These are dynamic and heavily urbanized areas whose economic activity is mainly based on industry and services. At the other extreme, group 4, there are five areas (Ripollès, Pallars Jussà, Garrigues, Priorat and Terra Alta) that are at the bottom for ICT activity (the factor scores are below 0.79). These are rural, low density areas with an ageing population. We also observe that the factor scores remain very stable for all the groups. In any case we observe a slight increase in the best positioned groups, 1 and 2, and a slight decrease in the other two groups. The differences between areas are not reduced over time. Two main results emerge from this CFA analysis. First, we observe a high variability in the ICT level across areas for each of the years. It is observed that in general, the urban, dynamic, and industrialized areas tend to be at the top of the classification, while the rural, agricultural and ageing areas are at the bottom. Second, there is little variation of ICT activity over time. The results suggest the need to expand the CFA model to the one developed in the next section. 4. A multiple indicator panel data model with covariates We now propose a multiple indicators dynamic panel data model in which the ICT factors underlying the multiple indicators ðf 1 –f 4 Þ will be decomposed into two components: a permanent component and a transient one, with the permanent component being regressed on the observable socio-economic characteristics of the areas (covariates). An autoregressive structure is assumed for the time-varying component. The model is expressed by the following equations. First, we have the measurement Eq. (1) of the previous section that relates the multiple indicators with the factors. Second, factors (f it ) are decomposed into permanent (pi ) and transitory components (qit ) f it ¼ pi þqit
ð2Þ
with i ¼ 1; 2; …; n and t ¼ 1; 2; …; T. In addition, we specify an autoregressive structure for the transitory component qit ¼ ρqit 1 þ νit
ð3Þ
with i ¼ 1; 2; …; n and t ¼ 2; …; T, where νit is a disturbance term uncorrelated with qit 1 . Finally, we have the regression equation pi ¼ β1 x1i þ β2 x2i þ ⋯ þ βm xmi þ δi
ð4Þ
where the permanent component pi depends on marea-level covariates xj ; j ¼ 1; 2; …; m. The full set of Eqs. (1)–(4) is a dynamic factor model specification which differs from a typical panel data model in that the typical response variable in panel data is a latent variable, the factorf . As in any typical factor model there is an unobserved heterogeneity which is represented in our model by the common factorp. In our application this is an ICT activity level that is constant across time. In addition our panel model for the unobservable f is dynamic in the sense that it incorporates the unobservable autoregressive components, the factorsq. In our application this represents those temporal shocks on ICT activity in the area that vanish over time. Since the response variables f of the panel data model are latent the panel model equations need to be expanded to include the measurement equations for the factors, that is, the linear equations that relate the actual indicators to the factor (see Eq. (1)). This is the multiple indicator panel data model introduced in Bou and Satorra (2007), further extended in
E. Ventura, A. Satorra / Telecommunications Policy 39 (2015) 830–847
837
Table 5 Factor scores estimates of the CFA model (ICT index by area and time). Area
f1 2008
f2 2009
f3 2010
f4 2011
Tarragonès Vallès Occidental Maresme Gironès Baix Llobregat Garraf Barcelonès Baix Empordà Baix Camp Anoia Vallès Oriental Val d'Aran Osona Pla de l'Estany Alt Penedès Garrotxa Alt Camp Bages Segarra Pla de l'Urgell Urgell Alta Ribagorça Segrià Selva Alt Empordà Baix Penedès Solsona Cerdanya Berguedà Alt Urgell Baix Ebre Conca de Barberà Ribera d'Ebre Pallars Sobirà Montsià Noguera Ripollès Pallars Jussà Garrigues Priorat Terra Alta
0.982 0.922 0.913 0.840 0.673 0.639 0.630 0.625 0.572 0.551 0.549 0.450 0.427 0.424 0.301 0.227 0.186 0.101 0.099 0.079 0.050 0.047 0.005 0.144 0.152 0.183 0.273 0.372 0.416 0.422 0.450 0.481 0.502 0.504 0.521 0.530 0.824 0.909 0.968 1.092 1.545
0.985 0.917 0.912 0.866 0.696 0.647 0.629 0.665 0.553 0.560 0.558 0.429 0.413 0.445 0.297 0.225 0.190 0.089 0.113 0.064 0.078 0.057 0.032 0.108 0.140 0.152 0.244 0.376 0.455 0.450 0.449 0.506 0.499 0.521 0.521 0.527 0.852 0.925 0.997 1.052 1.581
0.947 0.909 0.908 0.799 0.614 0.618 0.646 0.611 0.587 0.495 0.533 0.422 0.369 0.415 0.316 0.202 0.211 0.096 0.130 0.050 0.033 0.075 0.007 0.159 0.186 0.226 0.243 0.320 0.388 0.366 0.445 0.465 0.488 0.466 0.535 0.506 0.791 0.889 0.920 1.113 1.485
1.031 1.000 1.038 0.858 0.632 0.672 0.719 0.577 0.703 0.514 0.597 0.534 0.436 0.425 0.336 0.198 0.261 0.131 0.119 0.079 0.034 0.066 0.085 0.293 0.239 0.307 0.305 0.341 0.348 0.339 0.515 0.503 0.538 0.473 0.595 0.565 0.840 0.939 0.976 1.284 1.578
Note: Areas have been ordered in decreasing to values (top to bottom) of the row average.
Bou and Satorra (2014) that can be specified as a special case of SEM. In contrast with other approaches that are descriptive or exploratory, our confirmatory approach allows integrating a specific behavioral hypothesis into the analysis of the ICT survey data. Our statistical methodology provides a goodness of fit test for the model thus providing evidence to support/reject the behavioral hypothesis (see Bollen, 1989, Chapter 7). This, we feel, may constitute a novel way to enhance ICT research. In our application, the permanent component pi is likely to be substantial given the high correlation among the factor scores observed in the previous section (see Table 3). The practical relevance of this component will be determined by its variance. At one extreme, we could have a case in which the whole variance of f it in Eq. (2) is captured by pi (all the variation across the areas is permanent); and, at the other extreme, we could have a case in which the variance of pi is negligible compared with the variance of the transitory component, so the relevant variation is fully captured by the transitory component. The autoregressive structure of the transitory component qit in Eq. (3) simply assumes that the time-dependent ICT activity in an area has some persistence but vanishes with time. The speed of the time erosion of this component is given by the autoregressive parameter ρ. When ρ is zero the temporary component is independent across the years, while a lasting effect arises when ρ approaches one. According to the previous contributions on this issue, the likely causes of the observed disparities in the ICT level of areas were the differences in the average regional economic development, productive structures (Billón et al., 2008, 2009), human capital (Demoussis & Giannakopoulos, 2006), demography (Holloway, 2005), investment in research and development (Vicente & López, 2006, 2011), cultural and institutional factors (Howard, Anderson, Bush, & Nafus, 2009), or the access costs to the ICT equipment and services (Kiiski & Pohjola, 2002). In our study the choice of the covariates of Eq. (4) is largely determined by the availability of the socioeconomic data at a highly disaggregated level. The Idescat publishes data at the
838
E. Ventura, A. Satorra / Telecommunications Policy 39 (2015) 830–847
Fig. 3. Time evolution of the factor scores of the CFA model.
administrative-unit level on several socioeconomic variables such as the gross domestic product (GDP), the percentage of gross value added (GVA) by the type of economic activity, the percentage of occupied workforce by the type of economic activity, the unemployment rate, the population share by age, the educational level or the population density. However, information on the investment research and development and the cultural and institutional factors is not available. Table B1 in Appendix B exhibits the correlations between the available set of variables (measured in 2008 or earlier) and the ICT activity at the area level. A goodness-of-fit assessment of the alternative SEM specifications led us to the choice of the two covariates for our panel data model, namely, the percentage of younger population (variable x1 ) and the percentage of gross value added in the agricultural sector (x2 ). Fig. 4 displays the path diagram corresponding to Eqs. (2)–(4) of the panel data model. The measurement part of the model is identical to that of the CFA model except that we have equated the loadings across time. We also equated across time the variances of the disturbance terms of Eq. (3).11 This multiple-indicators panel data model is estimated with the EQS under LS and robust statistics. A global goodness of fit test of the multiple indicator dynamic panel model is undertaken and reported in Table 6. The p-value for the chi-square statistic is 0.38, indicating that the model is not rejected at the usual significance levels. The table also shows that all the factor loadings in Eq. (1) differ significantly from 0 indicating that all the variables are relevant for measuring the ICT level. Note that we have now imposed that each factor loading is invariant over time. As before, frecInt (the frequency of use of Internet) is the most reliable indicator, followed by frecPC (frequency of use of a personal computer) social (social impact of ICT actions from home) and equip (level of equipment in the household). Again, the less reliable indicators are those of security perception secbuy and secbank and admin (level of interaction with public administrations). The regression coefficients in Eq. (4) are both statistically significant, with variable x1 (the share of the young population) positively affecting the permanent part of the ICT activity and variable x2 (the share of agricultural activity) affecting it negatively. The standardized solutions for Eqs. (2)–(4) are reported in Appendix D. The determination coefficient R2 associated to the standardized version of Eq. (2) is 0.82, revealing that a high percentage of the variation of the permanent component of ICT activity is explained by the two covariates. 11
Both sets of restrictions have been validated by goodness-of-fit testing.
E. Ventura, A. Satorra / Telecommunications Policy 39 (2015) 830–847
839
δ β1
1
1
f1
1
f3
f4
1
q1
ρ
x2
1
f2
1
β2
p
x1
1
q2
ρ
ν2
1 ρ
q3
q4
ν3
ν4
Fig. 4. Path diagram representation of factors and covariates.
Table 6 Results for the panel data model. Parameter
Estimate
Robust S.E.
λ1 (equip) λ2 (mobile) λ3 (sinceInt) λ4 (knowPC) λ5 (knowInt) λ6 (frecPC) λ7 (frecInt) λ8 (secbuy) λ9 (secbank) λ10 (econ) λ11 (social) λ12 (admin) β1 β2 ρ
1.013 0.385 0.948 1.007 1 1.119 1.135 0.386 0.160 0.635 1.134 0.368 0.183 0.047 0.371
0.107 0.074 0.135 0.040 – 0.087 0.046 0.060 0.017 0.050 0.061 0.044 0.032 0.015 0.0146
Mean and variance goodness-of-fit χ 2 ¼ 11.813, d.f. ¼ 11 (p-value ¼0.378).
model
test:
A scatter plot of the estimates of the permanent component versus its predicted value by Eq. (4) is displayed in Fig. 5 (the key to the abbreviations used in the figure in place of the full names of the areas can be found in Appendix A). This graphic visualizes the ordering of the areas and their deviations from the predicted values of the (permanent) ICT activity given the covariates. As in regular regression analysis, areas that are above (below) the straight line have permanent ICT activity above (below) what would be expected given the characteristics of the area. For example, we observe that area BXP has a much lower ICT activity than was predicted given a relatively young population and a relatively low gross value added in agriculture. Inspection of the deviations of the areas from the group pattern could be highly informative for policy makers. For example, high deviations from the trend could lead to the discovery of other determinants of ICT activity (new variables) not included in the model. The autoregressive coefficient ρ in Eq. (3) is positive and significant. However, its absolute value is relatively small. This implies that the time-varying component of the ICT level of the area persists for a short period of time. These temporary shocks in practice can be considered uncorrelated over time. The factor scores (EQS, regression method) for both the permanent and temporal components of ICT activity are displayed in Table C1 of Appendix C. Fig. 6 provides a graphical representation of the relative size of these components. The vertical axis of the graph corresponds to the values of the permanent or transitory components, and the horizontal axis contains the 41 areas ordered from the higher to the lower values of the permanent component. The solid line that crosses the graph from the value 0.88 for MAR to the value 1.694 for TAL corresponds to the permanent component. The four other lines correspond to the four temporary components, and we observe a much narrower range of variation.
840
E. Ventura, A. Satorra / Telecommunications Policy 39 (2015) 830–847
1 MAR VOC TAR BXC BXL GIR GRF ANO PLE VOR
BXM VAR ACA
Permanent component
BAR
OSO APE
BAG PLU
0
ARI NOG
PSO BXE SOL CBA PJU RIE AUR BER MON
GRG
-1
GRT URG SGR
SEL
SGA AEM CER
BXP
RIP PRI
TAL
-2 1.5
2
2.5
3
3.5
Predicted permanent component Fig. 5. Predictors of ICT activity.
1,5
1,0
ICT components
0,5
0,0
-0,5
-1,0
Permanent component Transitory component 2008 Transitory component 2009
-1,5
Transitory component 2010
MAR VOC TAR BXC BXM BXL GIR ANO GRF VOR PLE VAR OSO APE ACA BAR BAG SEL PLU GRT URG SGR ARI SGA AEM BXP CER PSO BXE SOL NOG CBA PJU RIE AUR BER MON GRG RIP PRI TAL
Transitory component 2011 -2,0
Fig. 6. Permanent and transitory components of ICT activity.
Note that the factor scores estimate obtained in the analysis is in fact an estimate of the ICT level of the area. This estimate, in itself, should be of interest to policy makers since it informs of these areas with high and low values of ICT activity (e.g. for ranking purposes). Knowing which structural characteristics of the area are driving the variation of the ICT activity should also be of interest to policy makers. The ten best positioned areas are the same as in the previous section, although the relative ranking varies slightly. These are economically developed areas, with a below average percentage of gross value added in agriculture and a well above average percentage of young population. The worst positioned areas are also the same as in the previous section although the relative positions differ somewhat more. All of them are below average regarding the percentage of young population and the gross value added in agriculture is above average with the notable exception of Ribera d'Ebre and Ripollès, though somewhat less. An important result of the analysis is that most of the variance (across areas) of the ICT level is explained by the permanent component. The standardized solution of Eq. (2) indicates that approximately 70% of the variation in the level of ICT activity is accounted for by the permanent component (see Appendix D). Because the covariates explain a large percentage of the permanent component and the permanent component is a dominant part of the ICT variation, we conclude that the structural characteristics of the areas highly determine the ICT variation. This issue is important for policy making and could not have been detected simply by observing the correlations among the indicators and covariates. For instance, it suggests that surveys do not need to be repeated so frequently.12
12
In fact, the Idescat survey has become biannual since year 2011.
E. Ventura, A. Satorra / Telecommunications Policy 39 (2015) 830–847
841
5. Concluding remarks In this paper we deal with the measurement of the level of ICT activity when we have data from repeated surveys covering several geographical areas. Statistical offices may be interested in assessing not only the level of ICT in each area and its variation over time, but to evaluate the impact of area-specific covariates on the level of ICT. We presented a panel data methodology in which a common factor underlying a set of indicators of each time period is decomposed into the permanent and transitory components, and the permanent component is regressed on the covariates. The methodology uses standard SEM software and is applied to the data from repeated surveys on ICT activity in the districts of the region of Catalonia, Spain. The analysis indicates a high degree of persistence in the level of ICT activity (with small variation over time) with the permanent component accounting for most of the ICT variation. A regression equation of the model demonstrates that approximately 82% of the variation of this permanent component is accounted for by the variation in the observable socioeconomic characteristics of the areas. The factor scores corresponding to the permanent component can be interpreted as an index of the ICT activity at the area level and thus allow the areas to be ranked. Deviation of this index from the value predicted by the covariates permits the assessment of the differential performance of the areas. In the context of our data, the marginal variance of the transient component (compared with the permanent component) raises the substantive questions of whether it is necessary for Idescat to repeat a costly ICT survey every year. In our application, lower levels of ICT activity (measured by the permanent component) correspond to areas where the economic weight of the agricultural sector is high with respect to the Catalan average. These are areas with below average GDP per capita as suggested by the negative correlation between GDP per capita and the percentage of gross value added in the agricultural sector observed in Table B1 of Appendix B. The population of these areas is aging and the percentage of population with higher education is lower than average. Households in areas with lower levels of ICT activity are worse equipped (especially regarding broadband connection), have less ICT skills, and report lower frequency of use of both computers and the Internet. The social uses of ICTs are also limited in these areas. The modeling approach proposed in this paper could apply to other areas of official statistics for which surveys are periodically repeated, with multiple indicators tapping a latent construct that is the focus of interest (in our case, the level of ICT activity). The method integrates into a single model the multiple stages of analysis that are commonly performed using exploratory or dimension-reduction techniques, such as principal components, discriminant analysis, and other multivariate methods. Integrating into a single framework the different stages obviously leads to a more accurate and informative analysis. In our single-stage approach, we are able to neatly define the factor underlying the indicators, even to assess its dimensionality, and enable the regular test statistics to detect the structural changes in the model that are likely to arise in the applications. In addition, in our approach we obtain an overall goodness of fit test to validate our model. For applied researchers, it is important to note that our analysis can be routinely implemented using standard SEM software. In our application, debate could arise at the stage of defining the multiple area-level indicators that have been used. However, for the methodological emphasis of the paper, we feel they are sufficient indicators of ICT activity. A requirement of our model is a sufficiently large sample size to ensure a negligible sample-survey variance of the indicators at the area level. In addition, it is necessary to incorporate a sufficient number of areas to implement the inferences of the SEM analysis. See Satorra and Bentler (2013) for corroboration that our data set-up fulfills these requirements and for the alternative methods to be used when the sample size in each area is small.
Acknowledgments The authors would like to thank the Statistics Institute of Catalonia (Idescat) for facilitating the data used in this study, with special recognition to Maribel García, Marcos Pardal and the director Cristina Rovira. Recognition is also due to Alex Costa Saenz de San Pedro for encouraging our cooperation with Idescat. In addition, the authors wish to acknowledge the financial support from the Spanish Ministry of Science and Innovation through Grant ECO2011-28875. Appendix A. Names of the areas Table A1 exhibits the complete names of the 41 administrative units of Catalonia and the corresponding abbreviations used in this paper. Table A1 Complete names of the areas and their abbreviations. Name of area
Abbreviation
Name of area
Abbreviation
Alt Alt Alt Alt
ACA AEM APE AUR
Montsià Noguera Osona Pallars Jussà
MON NOG OSO PJU
Camp Empordà Penedès Urgell
842
E. Ventura, A. Satorra / Telecommunications Policy 39 (2015) 830–847
Table A1 (continued ) Name of area
Abbreviation
Name of area
Abbreviation
Alta Ribagorça Anoia Bages Baix Camp Baix Ebre Baix Empordà Baix Llobregat Baix Penedès Barcelonès Berguedà Cerdanya Conca de Barberà Garraf Garrigues Garrotxa Gironès Maresme
ARI ANO BAG BXC BXE BXM BXL BXP BAR BER CER CBA GRF GRG GRT GIR MAR
Pallars Sobirà Pla d'Urgell Pla de l'Estany Priorat Ribera d'Ebre Ripollès Segarra Segrià Selva Solsonès Tarragonès Terra Alta Urgell Vall d'Aran Vallès Occidental Vallès Oriental
PSO PLU PLE PRI RIE RIP SGA SGR SEL SOL TAR TAL URG VAR VOC VOR
Appendix B. Socioeconomic characteristics of the areas Table B1 reports the (Pearson) correlations between the available socioeconomic variables and the across-time average ICT activity of the areas in the CFA model. The correlation between the GDP per capita and the average ICT activity of the areas (administrative units) is lower than 0.3. Other authors found a correlation of approximately 0.9 at the cross-country level (Hanafizadeh et al., 2009) and approximately 0.7 at the regional level (Vicente & López, 2011). The correlation between the employment share in agriculture and the average ICT activity was approximately zero, and we do not report it in Table B1. For the rest of the main economic sectors, the correlations with the average ICT activity are lower than 0.3. The differences in the productive structures are most likely better measured by the shares of the gross value added in agriculture, industry, construction and services. Here, the correlation between the share of the gross value added in agriculture and the average ICT activity is 0.613. The correlations between the shares in construction or services and the average ICT activity are both less than 0.5 in absolute value. The correlation for industry was approximately zero and has not been reported in the table. Human capital is measured by the percentage of the population with a college degree in 2001, the latest observation available in our data. The correlation between this percentage and the average ICT activity is approximately 0.5, very close to the value found by Vicente and López (2011) at the cross-regional level. Population density does not seem to have any relevance to the ICT level of the areas. In contrast, the shares of the population under 14 and over 65 years of age are strongly correlated to the factor scores.
Table B1 Correlation matrix of socio-economic characteristics. Source: Idescat.
(1) (2) (3) (4) (5) (6) (7) (8) (9) (10) (11) (12)
(1)
(2)
(3)
(4)
(5)
(6)
(7)
(8)
(9)
(10)
(11)
(12)
1.000 0.256 0.486 0.773 0.824 0.262 0.285 0.285 0.285 0.613 0.486 0.353
1.000 0.258 0.005 0.173 0.184 0.083 0.083 0.083 0.371 0.608 0.214
1.000 0.086 0.353 0.455 0.098 0.099 0.100 0.381 0.210 0.607
1.000 0.836 0.110 0.280 0.279 0.278 0.432 0.340 0.125
1.000 0.004 0.242 0.242 0.242 0.587 0.336 0.438
1.000 0.054 0.054 0.054 0.218 0.311 0.359
1.000 1.000 1.000 0.206 0.218 0.020
1.000 1.000 0.206 0.217 0.022
1.000 0.206 0.217 0.023
1.000 0.398 0.332
1.000 0.173
1.000
(1) Average of ICT activity (CFA); (2) GDP per capita, 2008; (3) College degree, 2001 (% of population); (4) Population o14 years old, 2008 (% of population); (5) Population 465 years old, 2008 (% of population); (6) Population density, 2008; (7) Employment in industry, 2007 (% of population); (8) Employment in construction, 2007 (% of population); (9) Employment in services, 2007 (% of population); (10) GVA in agriculture, 2008 (% of total); (11) GVA in construction, 2008 (% of total); (12) GVA in services, 2008 (% of total). Correlations are based on 41 observations.
E. Ventura, A. Satorra / Telecommunications Policy 39 (2015) 830–847
843
Appendix C. Estimation results for the panel data model Table C1 displays the estimated factor scores of the panel data model. The areas have been ordered according to the value of the permanent component (last column). Table C1 Estimates of the factor scores of the panel data model. Area
f1 2008
f2 2009
f3 2010
f4 2011
q1 2008
q2 2009
q3 2010
q4 2011
p
MAR VOC TAR BXC BXM BXL GIR ANO GRF VOR PLE VAR OSO APE ACA BAR BAG SEL PLU GRT URG SGR ARI SGA AEM BXP CER PSO BXE SOL NOG CBA PJU RIE AUR BER MON GRG RIP PRI TAL
0.934 0.876 0.864 0.798 0.827 0.799 0.762 0.784 0.699 0.635 0.615 0.571 0.503 0.395 0.393 0.372 0.226 0.113 0.094 0.046 0.017 0.078 0.101 0.155 0.158 0.225 0.344 0.407 0.373 0.456 0.410 0.494 0.447 0.535 0.697 0.707 0.674 0.930 0.986 1.310 1.836
1.085 0.893 0.965 0.836 0.928 0.881 0.845 0.783 0.881 0.880 0.750 0.603 0.568 0.405 0.503 0.523 0.242 0.089 0.007 0.041 0.058 0.167 0.050 0.197 0.238 0.281 0.442 0.360 0.410 0.405 0.431 0.710 0.644 0.595 0.683 0.809 0.755 1.063 1.171 1.274 2.064
1.150 1.038 1.079 0.930 0.939 0.809 0.846 0.709 0.854 0.809 0.712 0.547 0.425 0.510 0.451 0.685 0.237 0.130 0.061 0.066 0.004 0.153 0.056 0.033 0.269 0.464 0.321 0.379 0.461 0.407 0.506 0.641 0.797 0.608 0.565 0.686 0.812 1.066 1.127 1.561 2.068
1.171 1.062 1.015 0.964 0.745 0.796 0.815 0.697 0.821 0.784 0.613 0.709 0.526 0.513 0.479 0.615 0.267 0.093 0.068 0.050 0.087 0.034 0.018 0.083 0.272 0.478 0.320 0.352 0.537 0.467 0.520 0.608 0.737 0.579 0.545 0.571 0.853 0.985 1.080 1.619 1.974
0.053 0.040 0.051 0.041 0.097 0.071 0.050 0.082 0.036 0.011 0.042 0.053 0.042 0.000 0.021 0.009 0.012 0.023 0.043 0.005 0.018 0.014 0.015 0.046 0.005 0.012 0.050 0.041 0.007 0.056 0.010 0.033 0.037 0.038 0.088 0.079 0.042 0.034 0.070 0.075 0.142
0.203 0.056 0.152 0.080 0.199 0.153 0.133 0.081 0.218 0.257 0.177 0.085 0.107 0.010 0.130 0.160 0.028 0.002 0.058 0.001 0.059 0.103 0.035 0.088 0.085 0.068 0.148 0.006 0.043 0.005 0.011 0.250 0.160 0.097 0.074 0.182 0.123 0.167 0.255 0.039 0.370
0.269 0.202 0.267 0.173 0.209 0.081 0.134 0.007 0.191 0.186 0.140 0.029 0.037 0.115 0.078 0.322 0.023 0.040 0.112 0.025 0.005 0.089 0.142 0.076 0.116 0.251 0.027 0.013 0.094 0.006 0.086 0.181 0.313 0.111 0.044 0.059 0.179 0.170 0.212 0.326 0.374
0.290 0.226 0.202 0.208 0.016 0.068 0.103 0.005 0.158 0.161 0.041 0.191 0.064 0.118 0.106 0.252 0.053 0.184 0.017 0.008 0.087 0.098 0.104 0.026 0.119 0.265 0.026 0.014 0.171 0.067 0.100 0.148 0.253 0.082 0.063 0.056 0.221 0.089 0.164 0.383 0.280
0.881 0.836 0.813 0.756 0.730 0.728 0.712 0.703 0.663 0.623 0.573 0.518 0.462 0.395 0.372 0.363 0.214 0.090 0.051 0.041 0.000 0.064 0.086 0.109 0.153 0.213 0.294 0.366 0.367 0.400 0.420 0.460 0.484 0.497 0.609 0.627 0.632 0.896 0.916 1.236 1.694
Appendix D. Standardized solutions as extracted from EQS output The standardized solutions for Eqs. (2)–(4) are displayed below. f 1i ¼ 0:907pi þ 0:422q1i f 2i ¼ 0:815pi þ 0:580q2i f 3i ¼ 0:804pi þ 0:597q3i f 4i ¼ 0:802pi þ 0:422q4i q2i ¼ 0:242q1i þ 0:970ν2i q3i ¼ 0:357q3i þ 0:934ν3i q4i ¼ 0:369q3i þ 0:930ν4i pi ¼ 0:683x1i 0:367x2i þ 0:427δi
R2 ¼ 0:059 R2 ¼ 0:127 R2 ¼ 0:136 R2 ¼ 0:818
844
Table E1 Correlation matrix of the indicator variables. 2008
2009
equip mobile sinceInt knowPC knowInt frecPC frecInt secbuy secbank econ social admin equip mobile knowPC knowInt frecPC frecInt secbuy secbank econ social admin 1.00 0.68 0.92 0.53 0.71 0.41 0.73 0.46 0.40 0.47 0.75 0.59
2009 equip mobile knowPC knowInt frecPC frecInt secbuy secbank econ social admin
0.71 0.59 0.59 0.59 0.54 0.57 0.11 0.19 0.45 0.47 0.47 2010
2010
2011
1.00 0.71 0.51 0.70 0.31 0.69 0.42 0.36 0.62 0.68 0.63
1.00 0.54 0.75 0.41 0.78 0.47 0.51 0.52 0.78 0.59
1.00 0.84 0.80 0.77 0.58 0.56 0.62 0.75 0.50
1.00 0.63 0.95 0.49 0.50 0.76 0.94 0.63
1.00 0.67 0.54 0.59 0.49 0.57 0.39
1.00 0.50 0.53 0.75 0.94 0.66
1.00 0.72 0.44 0.47 0.42
1.00 0.43 0.49 0.35
1.00 0.69 1.00 0.61 0.67
0.53 0.61 0.55 0.56 0.52 0.57 0.34 0.36 0.57 0.55 0.38
0.71 0.64 0.58 0.62 0.60 0.63 0.22 0.28 0.53 0.55 0.46
0.61 0.45 0.70 0.71 0.69 0.71 0.31 0.33 0.51 0.69 0.53
0.70 0.63 0.76 0.77 0.77 0.77 0.24 0.27 0.60 0.72 0.56
0.61 0.48 0.65 0.69 0.70 0.72 0.40 0.38 0.56 0.70 0.56
0.71 0.64 0.73 0.75 0.78 0.77 0.19 0.24 0.60 0.72 0.52
0.60 0.43 0.52 0.55 0.47 0.54 0.46 0.43 0.45 0.58 0.32
0.60 0.47 0.62 0.59 0.65 0.62 0.40 0.46 0.49 0.58 0.45
0.50 0.48 0.61 0.62 0.68 0.68 0.24 0.31 0.55 0.66 0.45
0.69 0.65 0.71 0.75 0.74 0.73 0.20 0.17 0.57 0.69 0.55
equip
mobile
knowPC
knowInt
frecPC
frecInt
secbuy
secbank
econ
social
equip mobile knowPC knowInt frecPC frecInt secbuy secbank econ social admin
1.00 0.36 0.71 0.71 0.71 0.7 0.33 0.43 0.65 0.7 0.52
1.00 0.3 0.24 0.25 0.36 0.1 0.2 0.34 0.24 0.09
1.00 0.93 0.9 0.91 0.51 0.51 0.83 0.91 0.7
1.00 0.93 0.95 0.51 0.57 0.89 0.97 0.77
1.00 0.94 0.45 0.51 0.85 0.93 0.65
1.00 0.49 0.48 0.89 0.95 0.74
1.00 0.66 0.44 0.55 0.58
1.00 0.44 0.57 0.58
1.00 0.88 0.67
1.00 0.74
1.00
equip mobile knowPC knowInt frecPC frecInt secbuy secbank
0.53 0.33 0.66 0.67 0.51 0.56 0.44 0.46
0.39 0.47 0.42 0.38 0.28 0.38 0.3 0.33
0.53 0.18 0.76 0.78 0.68 0.66 0.51 0.39
0.51 0.15 0.69 0.75 0.66 0.63 0.47 0.33
0.56 0.25 0.77 0.8 0.76 0.74 0.52 0.4
0.57 0.26 0.74 0.77 0.71 0.69 0.51 0.39
0.33 0 0.44 0.41 0.35 0.32 0.37 0.39
0.39 0 0.46 0.53 0.42 0.43 0.36 0.38
0.46 0.24 0.63 0.66 0.56 0.59 0.41 0.32
0.55 0.2 0.74 0.77 0.71 0.68 0.48 0.38
0.37 0.11 0.51 0.54 0.4 0.39 0.31 0.19
1.00 0.49 1.00 0.42 0.72 0.47 0.72 0.52 0.76 0.50 0.71 0.54 0.81 -0.07 0.37 0.09 0.42 0.46 0.66 0.52 0.71 0.38 0.53 2011
admin
1.00 0.61 0.65 0.66 0.62 0.38 0.33 0.47 0.58 0.38
1.00 0.96 0.87 0.89 0.43 0.54 0.71 0.77 0.71
1.00 0.87 0.93 0.47 0.54 0.75 0.84 0.74
1.00 0.91 0.37 0.48 0.72 0.80 0.65
1.00 0.48 0.58 0.82 0.92 0.73
1.00 0.81 0.59 0.56 0.45
1.00 0.58 0.55 0.48
equip
mobile
knowPC
knowInt
frecPC
frecInt
secbuy
secbank
1.00 0.59 0.71 0.7 0.76 0.72 0.56 0.54
1.00 0.27 0.33 0.29 0.36 0.2 0.32
1.00 0.95 0.85 0.9 0.68 0.63
1.00 0.84 0.9 0.7 0.6
1.00 0.92 0.63 0.52
1.00 0.67 0.58
1.00 0.8
1.00
1.00 0.85 1.00 0.73 0.77
econ
social
1.00
admin
E. Ventura, A. Satorra / Telecommunications Policy 39 (2015) 830–847
2008 equip mobile sinceInt knowPC knowInt frecPC frecInt secbuy secbank econ social admin
econ social admin
0.27 0.2 0.32
0.46 0.63 0.5
0.37 0.61 0.44
0.47 0.69 0.53
0.38 0.64 0.54
0.31 0.36 0.37
0.35 0.51 0.28
0.33 0.57 0.39
0.39 0.67 0.48
0.14 0.51 0.53
0.43 0.62 0.5 2009
0.21 0.37 0.21
0.64 0.8 0.7
0.64 0.86 0.7
0.66 0.77 0.54
0.7 0.86 0.61
0.68 0.61 0.51
0.6 0.52 0.43
1.00 0.61 0.32
1.00 0.64
1.00
equip mobile knowPC knowInt frecPC frecInt secbuy secbank econ social admin equip mobile knowPC knowInt frecPC frecInt secbuy secbank econ
social admin
0.57 0.32 0.61 0.61 0.60 0.63 0.37 0.29 0.59 0.62 0.59 0.50 0.20 0.65 0.63 0.57 0.57 0.34 0.25 0.28 0.62 0.65
0.56 0.29 0.65 0.68 0.65 0.64 0.24 0.40 0.64 0.69 0.48 0.38 0.21 0.54 0.61 0.43 0.45 0.25 0.31 0.28 0.56 0.34
0.49 0.44 0.50 0.55 0.57 0.58 0.18 0.12 0.58 0.53 0.49 0.44 0.34 0.56 0.58 0.44 0.52 0.39 0.29 0.30 0.52 0.45
0.53 0.36 0.57 0.63 0.60 0.67 0.40 0.34 0.61 0.64 0.61 0.54 0.17 0.65 0.65 0.59 0.58 0.42 0.32 0.25 0.61 0.63
0.61 0.22 0.62 0.60 0.60 0.61 0.32 0.40 0.47 0.57 0.48 0.60 0.42 0.56 0.58 0.54 0.49 0.48 0.36 0.31 0.48 0.50
0.59 0.36 0.65 0.68 0.64 0.68 0.28 0.37 0.60 0.64 0.55 0.57 0.36 0.62 0.64 0.56 0.53 0.37 0.28 0.26 0.50 0.52
0.50 0.17 0.72 0.67 0.68 0.64 0.27 0.43 0.50 0.60 0.47 0.49 0.27 0.56 0.63 0.59 0.53 0.48 0.30 0.49 0.52 0.44
0.61 0.39 0.67 0.72 0.70 0.71 0.26 0.39 0.63 0.67 0.53 0.62 0.36 0.65 0.69 0.62 0.59 0.43 0.32 0.34 0.56 0.54
0.52 0.24 0.47 0.42 0.50 0.50 0.34 0.15 0.38 0.41 0.31 0.33 0.19 0.40 0.37 0.27 0.26 0.33 0.24 0.34 0.30 0.45
0.47 0.22 0.44 0.49 0.48 0.52 0.31 0.41 0.48 0.42 0.45 0.45 0.30 0.34 0.42 0.31 0.28 0.35 0.23 0.32 0.36 0.38
0.51 0.43 0.51 0.52 0.43 0.49 0.18 0.19 0.44 0.45 0.41 0.37 0.35 0.41 0.45 0.34 0.36 0.34 0.32 0.28 0.34 0.32
0.58 0.30 0.64 0.67 0.65 0.67 0.34 0.37 0.63 0.64 0.58 0.61 0.36 0.63 0.65 0.57 0.54 0.39 0.26 0.23 0.54 0.59
0.46 0.33 0.53 0.55 0.53 0.57 0.36 0.18 0.54 0.55 0.58 0.40 0.30 0.45 0.43 0.37 0.35 0.37 0.32 0.17 0.40 0.49
0.65 0.36 0.75 0.77 0.74 0.80 0.27 0.28 0.70 0.75 0.55 0.47 0.22 0.57 0.63 0.57 0.55 0.24 0.07 0.26 0.56 0.52
0.45 0.51 0.60 0.67 0.64 0.72 0.24 0.10 0.64 0.65 0.53 0.49 0.35 0.52 0.56 0.58 0.53 0.30 0.06 0.30 0.48 0.40
0.63 0.37 0.70 0.70 0.66 0.66 0.27 0.36 0.63 0.64 0.55 0.38 0.29 0.49 0.56 0.38 0.37 0.26 0.17 0.26 0.45 0.37
0.62 0.36 0.73 0.72 0.69 0.70 0.28 0.33 0.64 0.67 0.56 0.38 0.23 0.54 0.60 0.44 0.41 0.32 0.19 0.26 0.49 0.44
0.55 0.46 0.63 0.64 0.62 0.63 0.27 0.38 0.57 0.58 0.53 0.48 0.40 0.50 0.57 0.45 0.43 0.26 0.21 0.28 0.45 0.45
0.63 0.40 0.71 0.71 0.70 0.70 0.23 0.34 0.61 0.67 0.54 0.41 0.31 0.54 0.63 0.50 0.49 0.33 0.24 0.32 0.57 0.49
0.25 0.16 0.34 0.32 0.38 0.36 0.13 0.14 0.28 0.33 0.17 0.14 0.08 0.31 0.36 0.24 0.33 0.21 0.10 0.30 0.36 0.23
0.17 0.16 0.33 0.37 0.39 0.38 0.17 0.18 0.32 0.37 0.29 -0.01 0.01 0.18 0.26 0.11 0.18 0.08 -0.01 0.19 0.31 0.22
0.59 0.30 0.66 0.66 0.65 0.63 0.26 0.36 0.50 0.64 0.49 0.47 0.28 0.54 0.63 0.51 0.49 0.38 0.29 0.34 0.58 0.48
E. Ventura, A. Satorra / Telecommunications Policy 39 (2015) 830–847
2010 equip mobile knowPC knowInt frecPC frecInt secbuy secbank econ social admin 2011 equip mobile knowPC knowInt frecPC frecInt secbuy secbank econ social admin
0.3 0.51 0.45 2008
845
846
E. Ventura, A. Satorra / Telecommunications Policy 39 (2015) 830–847
Appendix E See Table E1. References Al-Mutawkkil, A., Hesmati, A., & Hwang, J. (2009). Development of telecommunication and broadcasting infrastructure indices at the global level. Telecommunications Policy, 33(3–4), 176–199. Atkinson, R., & Andes, S. (2008). The 2008 state new economy index: Benchmarking economic transformation in the states. Retrieved from http://ssrn.com/ abstract=1323828%20or%20http:/dx.doi.org/10.2139/ssrn.1323828. Bagchi, K. (2005). Factors contributing to global digital divide: Some empirical results. Joumal of Global Information Technology Management, 8(3), 47–65. Bentler, P. M. (2002). EQS 6 structural equations program manual. Encino, CA: Multivariate Software, Inc. Billón, M., Ezcurra, R., & Lera-López, F. (2008). The spatial distribution of the internet in the European Union. Does geographical proximity matter?. European Planning Studies, 16(1), 119–142. Billón, M., Ezcurra, R., & Lera-López, F. (2009). Spatial effects in website adoption by firms in European regions. Growth and Change, 40(1), 54–84. Bollen, K. A. (1989). Structural equations with latent variables. New York, Chichester, Brisbane, Toronto, Singapore: John Wiley & Sons, inc. Bou, J. C., & Satorra, A. (2007). The persistence of abnormal returns at industry and firm levels: Evidence from Spain. Strategic Management Journal, 28, 707–722. Bou, J. C. & Satorra, A. (2014). Univariate versus multivariate modeling of panel data. WP 1417. Universitat Pompeu Fabra, Department of Economics and Business. Cerno, L., & Pérez-Amaral, T. (2008). Medición de la brecha digital en España. In Jesús Banegas Núñez & Rafael Myro (Eds.), El impacto de las tecnologías de la Información en la Sociedad Española. Thomson Civitas, 2008. Chaudhuri, A., Flamm, S., & Horrigan, J. (2005). An analysis of the determinants of internet access. Telecommunications Policy, 29(9-10), 731–755. Çilan, C. A., Bolat, B. A., & Coşkun, E. (2009). Analyzing digital divide within and between member and candidate countries of European Union. Government Information Quarterly, 26(1), 98–105. Colecchia, A. (2000). Defining and measuring electronic commerce – Towards the development of an OECD methodology. Paris: OECD Press. Corrocher, N., & Ordanini, A. (2002). Measuring the digital divide: A framework for the analysis of cross-country differences. Journal of Information Technology, 17(1), 9–19. Crandall, R., Lehr, W., & Litan, R. (2007). The effects of broadband deployment on output and employment: A cross-sectional analysis of U.S. data Issues in Economic Policy discussion paper no. 6. Brookings Institution, 2007. Cruz-Jesús, F., Oliveira, T., & Bacao, F. (2012). Digital divide across the European Union. Information & Management, 49, 278–291. Demoussis, M., & Giannakopoulos, N. (2006). Facets of the digital divide in Europe: Determination and extent of internet use. Economics of Innovation and New Technology, 15(3), 235–246. Dutta, S., & Jain, A. (2004). The networked readiness index 2003–2004: Overview and analysis framework. In S. Dutta, B. Lanvin, & F. Paul (Eds.), The global information technology report 2003–2004 (pp. 3–22). New York: Oxford University Press. Garbacz, C. & Thompson, H. G. (2008). Broadband impacts on state GDP: Direct and indirect impacts. Paper submitted to the Telcommunications Policy Research Conference, 2008. Available at: 〈http://www.itu.int/wsis/stocktaking/docs/activities/1287145862/Ohio_University.pdf〉. Grubesic, T. H. (2006). A spatial taxonomy of broadband regions in the United States. Information Economics and Policy, 18(4), 423–448. Hanafizadeh, M. R., Saghaei, A., & Hanafizadeh, P. (2009). An index for cross-country analysis of ICT infrastructure and access. Telecommunications Policy, 33 (7), 385–405. Holloway, D. (2005). The digital divide in Sydney. A socio-spatial analysis. Information, Communication & Society, 8, 168–193. Horrigan, J. B., Stolp, E., & Wilson, R. H. (2006). Broadband utilization in space: Effects of population and economic structure. The Information Society, 22, 341–356. Howard, P. N., Anderson, K., Bush, L., & Nafus, D. (2009). Sizing-up information societies: Towards a better metric for the cultures of ICT adoption. The information society, 25, 208–219. ITU (2009). Measuring the information society. The ICT development index 2009. Geneva: ITU. Kiiski, S., & Pohjola, M. (2002). Cross-country diffusion of the Internet. Information Economics and Policy, 14, 297–310. Kolko, J. (2000). The death of cities? The death of distance? Evidence from the geography of commercial internet usage. In I. Vogelsang, & B. M. Compaine (Eds.), The internet upheaval (pp. 73–97). Cambridge, MA: MIT Press. Mills, B. F., & Whitacre, B. E. (2003). Understanding the non-metropolitan–metropolitan digital divide. Growth and Change, 24(2), 219–243. Noce, A. A., & McKeown, L. (2008). A new benchmark for internet use: A logistic modeling of factors influencing internet use in Canada, 2005. Government Information Quarterly, 25(3), 462–476. OECD, D. L. (2001). Understanding the digital divide. Paris: Organization for Economic Cooperation and Development. Orbicom. (2003). In G. Sciades (Ed.), Monitoring the digital divide…and beyond. Retrieved from 〈http://orbicom.uqam.ca/upload/files/research_projects/ 2003_dd_pdf_en.pdf〉. Progressive Policy Institute (2002). The 2002 state new economy index: Benchmarking economic transformation in the states. Retrieved from 〈http://www. itif.org/files/2002-new-state-econ-index.pdf〉. Qiang, C. Z., & Rosotto, C. (2009). Economic impact of broadband. 2009 Information and communication for development: Extending reach and increasing impact. Washington, DC: World Bank. Saris, W. E., Satorra, A., & Sörbom, D. (1987). Detection and correction of specification on errors in structural equation model. Sociological Methodology, 17, 105–129. Saris, W. E., Satorra, A., & G. Coenders, G. (2004). A new approach for evaluating the quality of measurement instruments: Split ballot MTMM design. Sociological Methodology, 34, 311–347. Satorra, A., & Bentler, P. M. (1990). Model conditions for asymptotic robustness in the analysis of linear relations. In A. van Eye, & C. C. Clogg (Eds.), Computational Statistics & Data Analysis, 10 (pp. 285–305). Sage Publications, 1990. Satorra, A., & Bentler, P. M. (1994). Corrections to test statistics and standard errors in covariance structure analysis. Latent variable analysis in developmental research, 285–305. Satorra, A., & Bentler, P. M. (2013). A longitudinal model for repeated cross sectional data with clustering. Proceedings of the JSM 2013 – business and economic statistics section (pp. 3297–3311). Alexandria, VA 22314-1943: American Statistical Association. Schleife, K. (2010). What really matters: Regional versus individual determinants of the digital divide in Germany. Research Policy, 39(1), 173–185. Selhofer, H., & Hüsing, T. (2002). The digital divide index: A measure of social inequalities in the adoption of lCT. In S. Wrycza (Ed.), Proceedings of the Xth European Conference on Information Systems ECIS 2002 – Information systems and the future of the digital economy. June 6–8, 2002 (pp. 1273–1286). Gdansk: ECIS. Soupizet, J. F. (2004). The north–south digital divide. Paper presented at the digital divides international conference, November 18–19. Paris: Université Paris Sud. US Department of Commerce, T. (2000). Falling through the net: Toward digital inclusion. Washington, DC: US Government Printing Office. Vehovar, V., Sicherl, P., Hüsing, T., & Dolnicar, V. (2006). Methodological challenges of digital. The Information Society, 22(5), 279–290.
E. Ventura, A. Satorra / Telecommunications Policy 39 (2015) 830–847
847
Vicente, M. R., & López, A. J. (2006). A multivariate framework for the analysis of the digital divide: Evidence for the European Union-15. Information & Management, 43, 756–766. Vicente, M. R., & López, A. J. (2011). Assessing the regional digital divide across the European union-27. Telecommunications Policy, 35(3), 220–237. Vicente, M. R., & López, A. J. (2007). Measuring the digital divide at the regional level: Evidence for Spain. Paper presented at the regional studies association international conference: Regions in Focus? April 2–5. Portugal: University of Lisbon. Wolcott, P., Press, L., McHenry, W., Goodman, S., & Foster, W. (2001). A framework for assessing the global diffusion of the internet. Journal of the Association for Information Systems, 2, 1.