
Intern. J. of Research in Marketing 20 (2003) 287 – 295 www.elsevier.com/locate/ijresmar

Note

Assessing generalizability of scales used in cross-national research

Subhash Sharma a,*, Danny Weathers b

a Moore School of Business, University of South Carolina, Columbia, SC 29208, USA
b Ourso College of Business Administration, Louisiana State University, Baton Rouge, LA, USA

Received 14 June 2002; received in revised form 17 January 2003; accepted 28 January 2003

Abstract

Increased globalization has generated considerable interest in testing theories in other countries that were developed in the United States. A requirement for any cross-national research should be that measures (scales) of constructs are generalizable across countries, and an important related issue is how many items and subjects are needed to generalize the scale across countries. This article discusses the use of generalizability theory (G theory) to assess the extent to which a previously administered scale can be generalized across countries and also to estimate the number of scale items and subjects needed to obtain a desired level of generalizability for future studies. The article further illustrates how G theory can be used in conjunction with a confirmatory factor analysis (CFA) framework to provide supplementary evidence of measurement invariance across countries. An empirical illustration of both G theory and the CFA-based framework is provided using data collected from multiple countries. The article concludes with a discussion of the unique benefits of CFA and G theory and suggests that the two techniques should be treated as complementary techniques for assessing measurement invariance and estimating the number of items and subjects needed to generalize the scale across countries.
© 2003 Elsevier B.V. All rights reserved.

Keywords: Confirmatory factor analysis; Generalizability theory; Scale development; Cross-national research; Measurement equivalency

* Corresponding author. Tel.: +1-803-777-4912; fax: +1-803-777-6876. E-mail address: [email protected] (S. Sharma).
0167-8116/$ - see front matter © 2003 Elsevier B.V. All rights reserved. doi:10.1016/S0167-8116(03)00038-7

1. Introduction

Increased globalization and the trend towards removing or easing of trade barriers have generated considerable interest in cross-national or cross-cultural research (e.g., Deshpandé, Farley, & Webster, 1993). Much of the emphasis in this research is on determining whether theories, constructs, and/or scales developed predominantly in the United States can be generalized to other countries (e.g., Gurhan-Canli & Maheswaran, 2000; Netemeyer, Durvasula, & Lichtenstein, 1991; Sharma, Shimp, & Shin, 1995). In order to extend theories and their associated constructs to other countries, an important first step is to assess the degree to which a given scale is cross-nationally invariant (Douglas & Craig, 1997; Hui & Triandis, 1985). Measurement invariance, or equivalence, refers to "whether or not, under different conditions of observing and studying phenomena, measurement operations yield measures of the same attribute" (Horn & McArdle, 1992, p. 117).


Researchers have proposed frameworks that use confirmatory factor analysis (CFA) to assess the measurement invariance of scales (Durvasula, Andrews, Lysonski, & Netemeyer, 1993; Singh, 1995; Steenkamp & Baumgartner, 1998), and these frameworks can be used during the scale development process or for existing scales. The impetus for developing these frameworks was that if a scale's items vary across countries, the use of such scales for cross-national research is problematic, as one would not know the extent to which responses of subjects in different countries are due to differences in interpretation of the scale items across countries. Suppose that a scale does not have measurement invariance across countries and that parameters are estimated without constraining the scale's items to have equal loadings across countries (i.e., no artificial invariance is imposed). The parameter estimates will not be biased as the model is correctly specified. However, problems arise in interpreting the results because the scale may not have the same meaning, or the same construct may not be measured, across countries. On the other hand, if the scale does not possess measurement equivalency and the parameters are estimated by constraining the loadings to be equal (i.e., artificial measurement invariance is imposed), then the parameter estimates will be biased as the estimated model is misspecified. Consequently, without evidence supporting measurement invariance, "conclusions based on that scale are at best ambiguous and at worst erroneous" (Steenkamp & Baumgartner, 1998, p. 78).

A related yet distinct concept is that of generalizability: the degree to which one can generalize the data collected to a larger universe (Rentz, 1987). If measures are not generalizable, cross-national measurement invariance is meaningless, as invariance may not be obtained under different conditions. Generalizability theory (G theory), formally presented by Cronbach, Gleser, Nanda, and Rajaratnam (1972) and introduced in the marketing literature by Peter (1979) and Rentz (1987), can be used to provide an index assessing the degree of generalizability achieved for a given scale. It also provides supplementary evidence of measurement invariance across countries.

An important related issue for researchers conducting cross-national studies is determining how many scale items and subjects are necessary to generalize the results of one study to other settings (Steenkamp & Baumgartner, 1998). G theory can be used to determine the number of items and subjects needed to obtain a desired level of generalizability for future studies. Using a minimum number of items is desirable in order to reduce the length of the questionnaire, which, in turn, should reduce respondent fatigue and the cost of the study (Haynes, Nelson, & Blaine, 1999; Steenkamp & Baumgartner, 1995). Given the costs of gathering cross-national data, avoiding oversampling should also reduce study expenses.

The purpose of this article is to empirically illustrate how G theory can be used to examine the extent to which scales are country-specific, and more importantly, to determine the number of scale items and subjects required to achieve a specified level of generalizability for future studies. In addition, the CFA framework proposed by Steenkamp and Baumgartner (1998) will be illustrated, and a comparison of this CFA approach with G theory will be provided. Because both techniques have been presented in detail elsewhere, an in-depth discussion is not provided here.¹ However, since the assessment of scale generalizability depends on whether factors are treated as random or fixed, we discuss the difference between random and fixed factors in G theory analysis.

¹ For a detailed discussion of generalizability theory, see the textbook by Shavelson and Webb (1991). The CFA approach for assessing the generalizability of constructs has been presented by Singh (1995) and Steenkamp and Baumgartner (1998).

2. Random and fixed factors in G theory

Suppose that a k-item scale measuring a single construct is administered to n_j subjects in each of j countries. Items, subjects, countries, and their interactions are factors contributing to the variance in subjects' responses or scores. This study can be treated as a three-factor mixed ANOVA design with the following sources of variability: (1) countries (C); (2) subjects-within-countries (S:C) (note that variation due to S:C is confounded with variation due to the (S:C) × C interaction); (3) items (I); (4) items-by-countries interaction (I × C); and (5) subjects-within-countries-by-items interaction ((S:C) × I). This last source is confounded with the (S:C) × I × C three-way interaction. Since there is only one observation per cell, the (S:C) × I and (S:C) × I × C interactions are also confounded with the error term. Consequently, (S:C) × I and (S:C) × I × C are used as the error variance estimate. Variance due to these sources can be estimated using the BMDP8V procedure in the BIOMED package (Dixon, Brown, Engelman, & Jennrich, 1990), PROC VARCOMP in SAS (1994), or the variance component procedure in SPSS (1998).

2.1. Random factors

Suppose that in the three-factor study described above, both countries and subjects within each country are selected at random. The scale is administered to each subject, and variances due to the five sources are estimated. Subjects' scores are expected to differ, as subjects would obviously vary with respect to the construct of interest. Suppose that the researcher also expects scores to differ across countries due to, for example, different cultural orientations. Subjects and countries, therefore, contribute desired or expected variance, and in G theory terminology, these factors are known as differentiation factors or objects of measurement. One would also expect some variance of scores across items, as no variance would imply complete item redundancy; however, large variance across items is not desired, as this would imply lack of internal consistency. That is, one would like to control for variation due to items, and such factors in G theory are called controllable or generalization factors. The extent to which subjects' and countries' scores can be generalized across items is given by the generalizability coefficient (GC), which can be computed from the following equation:

GC = \frac{\sigma^2_C + \sigma^2_{S:C}}{\sigma^2_C + \sigma^2_{S:C} + \left[ \dfrac{\sigma^2_{I \times C}}{k} + \dfrac{\sigma^2_{\mathrm{error}}}{k} \right]}    (1)

In this equation, the numerator is the variance due to the differentiation factors, and the bracketed term is error variance.²

² See Shavelson and Webb (1991) for a detailed list of terms and equations related to the computation of the GC.
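To make the estimation step concrete, the following minimal Python sketch (ours, not the authors' SAS/SPSS code) back-solves the five variance components from the ANOVA mean squares of a balanced design using the standard expected-mean-square identities, and then evaluates Eq. (1); the mean squares plugged in are the ones reported later in Table 2a, and all function names are ours.

```python
# Minimal sketch (not the authors' code): variance components of the balanced
# three-factor mixed design from its ANOVA mean squares, then the GC of Eq. (1).

def variance_components(ms_c, ms_sc, ms_i, ms_ic, ms_err, c, n, k):
    """Back-solve the expected-mean-square identities for c countries,
    n subjects per country, and k items (balanced design)."""
    var_err = ms_err                                   # (S:C) x I, (S:C) x I x C, error
    var_ic = (ms_ic - ms_err) / n                      # items-by-countries interaction
    var_sc = (ms_sc - ms_err) / k                      # subjects-within-countries
    var_i = (ms_i - ms_err - n * var_ic) / (n * c)     # items
    var_c = (ms_c - ms_err - k * var_sc - n * var_ic) / (n * k)   # countries
    return var_c, var_sc, var_i, var_ic, var_err

def gc_eq1(var_c, var_sc, var_ic, var_err, k):
    """Eq. (1): generalizability of subjects' and countries' scores over k items."""
    return (var_c + var_sc) / (var_c + var_sc + var_ic / k + var_err / k)

# Mean squares reported later in Table 2a (c = 4, n = 70, k = 17)
var_c, var_sc, var_i, var_ic, var_err = variance_components(
    230.750, 15.700, 74.586, 10.566, 1.067, c=4, n=70, k=17)
print(round(var_c, 4), round(var_sc, 4), round(var_ic, 4))       # 0.1727 0.8608 0.1357
print(round(gc_eq1(var_c, var_sc, var_ic, var_err, k=17), 3))    # 0.936
```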


It is important to note that although the sources of variation for a given study do not change, the categorization of factors into differentiation and generalization factors depends on the study's objectives. Suppose that the objective is to compare the scale scores across countries and thus determine the extent to which country scores can be generalized across items and subjects. For example, in comparing a construct such as consumer ethnocentrism across countries, the researcher would want to ensure that each country's computed score does not depend on the specific set of scale items or subjects used in computing the score. For this research objective, countries is the differentiation factor, while items and subjects are generalization factors. The following equation can be used to compute the GC:

GC = \frac{\sigma^2_C}{\sigma^2_C + \left[ \dfrac{\sigma^2_{I \times C}}{k} + \dfrac{\sigma^2_{S:C}}{n} + \dfrac{\sigma^2_{\mathrm{error}}}{nk} \right]}    (2)

2.2. Fixed factors

Assume, on the other hand, that the countries were not selected at random. Rather, the researcher's interest focused on only a few specific countries. In this case, countries is a fixed factor and, consequently, cannot be considered as one of the sources of variation in the ANOVA design. Instead, a separate analysis needs to be conducted for each level of the fixed factor (i.e., each country). The sources of variation for each country are: subjects (S), items (I), and a subjects-by-items (S × I) interaction, which is confounded with the error. The GC for each country is computed by the following equation, which represents the extent to which subject scores for the respective country can be generalized across items:

GC = \frac{\sigma^2_S}{\sigma^2_S + \dfrac{\sigma^2_{\mathrm{error}}}{k}}    (3)

3. Empirical illustration

3.1. Data

We use Shimp and Sharma's (1987) consumer ethnocentrism (CET) scale to provide the empirical illustration.


The data are taken from Netemeyer et al. (1991). Samples of 71 subjects in the US, 70 subjects in France, 76 subjects in Japan, and 73 subjects in Germany completed the 17-item scale. The different number of subjects across countries created an unbalanced design, potentially complicating the estimation of the variance components required for G theory. We addressed this issue by conducting the G theory analyses in three ways. First, we randomly selected an equal number of subjects (70) from each country, which is an approach previously used in similar situations (e.g., Finn & Kayandé, 1997). Second, we conducted the analyses using all subjects, leading to slightly unequal sample sizes across countries. Third, to assess the robustness of variance estimation procedures when the assumption of equal sample sizes is violated, we randomly selected 40, 50, 60, or 70 subjects from each country, creating large imbalances in the sample sizes across countries. For each approach, variance components were estimated via PROC VARCOMP in SAS. Results were very similar across the three approaches, suggesting that the procedure to estimate variance components is quite robust and, therefore, having unequal sample sizes is not a severe limitation. In the interest of space, we present only the results obtained with equal sample sizes across countries.
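As an illustration of the first approach (an equal subsample per country), the sketch below draws 70 subjects from each country out of a long-format response table. The data frame and its column names (country, subject, item, score) are hypothetical stand-ins, not the Netemeyer et al. (1991) data.

```python
import numpy as np
import pandas as pd

# Hypothetical long-format responses: one row per subject x item (column names are ours).
rng = np.random.default_rng(0)
country_sizes = {"US": 71, "France": 70, "Japan": 76, "Germany": 73}
rows = [(c, f"{c}_{s}", i, int(rng.integers(1, 8)))      # 7-point item responses
        for c, size in country_sizes.items() for s in range(size) for i in range(17)]
responses = pd.DataFrame(rows, columns=["country", "subject", "item", "score"])

# Randomly keep 70 subjects per country to obtain a balanced design
keep = (responses[["country", "subject"]].drop_duplicates()
        .groupby("country").sample(n=70, random_state=42))
balanced = responses.merge(keep, on=["country", "subject"])
print(balanced.groupby("country")["subject"].nunique())  # 70 subjects in every country
```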

3.2. Assessing measurement equivalency using confirmatory factor analysis

Measurement equivalency of the CET scale was first examined using Steenkamp and Baumgartner's (1998) framework, henceforth referred to as the confirmatory factor analysis (CFA) approach. To assess configural equivalence, exploratory factor analysis was performed on the data from each country and on a data set created by combining the data from all four countries. The percent of variance explained by the first factor ranged from 43.2% to 57.3% for the various samples, while the percent of variance explained by the second factor ranged from 6.6% to 9.7%. The substantial decreases after the first eigenvalue for the individual and combined samples suggested a one-factor model (Sharma, 1996). Though the factor loadings are not reported, all 17 items had acceptable loadings on the first factor for both the combined and individual samples. This establishes configural equivalency, meaning that the factor structure is the same across countries.

Table 1 presents the CFA results for the various models outlined by Steenkamp and Baumgartner (1998). The first step in Steenkamp and Baumgartner's approach is to test for equality of covariance matrices, but this test could not be performed because the sample sizes were less than the number of parameters to be estimated. It was possible to perform analyses for the remaining steps. Chi-square difference tests provided support for metric and factor variance equivalencies, which imply that the factor loadings and variances of the constructs are equivalent across countries. Error variance equivalency was not supported. However, when the objective is to test theories across countries in which the constructs are embedded in a nomological network, only configural and metric equivalencies are required (Steenkamp & Baumgartner, 1998). Since our focus is on theory testing, we can conclude that based on the CFA approach, the 17-item CET scale is invariant across countries.³

³ If researchers are interested in comparing mean scores across countries or groups, it is also imperative to examine the scalar invariance of the scale (see Steenkamp & Baumgartner, 1998).


Table 1
Results of measurement equivalency using confirmatory factor analysis

Unconstrained and constrained models

Model | Chi-square | df | RMSEA | TLI | RNI
(1) Unconstrained | 1089.42 | 476 | 0.13 | 0.77 | 0.79
(2) Equal loadings (Λ1 = Λ2 = ... = Λj) | 1152.82 | 524 | 0.12 | 0.79 | 0.79
(3) Equal loadings, equal construct variances (Λ1 = Λ2 = ... = Λj and φ1 = φ2 = ... = φj) | 1159.62 | 527 | 0.12 | 0.79 | 0.79
(4) Equal loadings, equal construct variances, equal error variances (Λ1 = Λ2 = ... = Λj, φ1 = φ2 = ... = φj, and Θδ1 = Θδ2 = ... = Θδj) | 1400.35 | 578 | 0.13 | 0.75 | 0.72

Equivalency tests (chi-square difference tests)

Equivalency test | Model comparison | Difference in χ² | Difference in df | p-value | Conclusion
Metric equivalency | (2) – (1) | 63.40 | 48 | 0.07 | supported
Metric and factor variance equivalency | (3) – (2) | 6.80 | 3 | 0.08 | supported
Metric, factor variance and error variance equivalency | (4) – (3) | 240.73 | 51 | 0.00 | not supported
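The p-values of the chi-square difference tests in Table 1 can be reproduced from the reported model chi-squares and degrees of freedom; a minimal sketch using scipy (our illustration, not part of the original analysis):

```python
from scipy.stats import chi2

# Model chi-squares and df from the upper panel of Table 1
models = {1: (1089.42, 476), 2: (1152.82, 524), 3: (1159.62, 527), 4: (1400.35, 578)}
tests = [("Metric equivalency", 2, 1),
         ("Metric and factor variance equivalency", 3, 2),
         ("Metric, factor variance and error variance equivalency", 4, 3)]
for label, a, b in tests:
    d_chi = models[a][0] - models[b][0]
    d_df = models[a][1] - models[b][1]
    p = chi2.sf(d_chi, d_df)          # upper-tail probability of the difference
    print(f"{label}: diff chi-square = {d_chi:.2f}, diff df = {d_df}, p = {p:.2f}")
# Prints p-values of roughly 0.07, 0.08, and 0.00, matching Table 1.
```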

3.3. Assessing measurement equivalency using generalizability theory

3.3.1. Countries as a random factor

First, consider countries as a random factor. Table 2a presents the estimated variance components. Countries accounts for only a small portion of the variance (7.01%), indicating that CET scores do not differ greatly across the countries from which data were collected. Almost 35% of the variance is due to the subjects-within-countries factor, suggesting, not surprisingly, that responses to the items vary across subjects. However, of greater interest presently is the interaction between items and countries, which allows for an assessment of the pattern of subjects' responses to scale items across countries. Absence of a large interaction suggests that although mean responses may differ across countries, the pattern of responses to scale items is the same across countries. That is, the scale items are not country-specific and have the same meaning across countries. The items-by-countries term accounts for only 5.51% of the variance in CET scores, which is the smallest source of variation. This suggests that there is consistency in item response patterns across countries. The items measuring the construct do not appear to be country-specific, further suggesting that they have the same meanings across countries.
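As a quick arithmetic check (ours), the percentages quoted above, and the Percent column of Table 2a, are simply each estimated variance component's share of the total variance:

```python
# Each variance component from Table 2a as a share of the total variance
components = {
    "Countries (C)": 0.1727,
    "Subjects-within-countries (S:C)": 0.8608,
    "Items (I)": 0.2286,
    "Items x Countries (I x C)": 0.1357,
    "Error": 1.0670,
}
total = sum(components.values())                       # 2.4648
for source, var in components.items():
    print(f"{source:35s}{100 * var / total:6.2f}%")    # 7.01, 34.92, 9.27, 5.51, 43.29
```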

3.3.2. Countries as a fixed factor

If the countries factor were assumed to be fixed, and if the effects due to this fixed factor were large, it would be appropriate to analyze each level of the fixed factor separately and pool or average the results over these levels. Although the variance due to countries was not large, separate analyses were conducted for each country, and the individual and pooled results are presented in Table 2b.

Table 2
Estimates of variance components for CET data

(a) Countries as a random factor

Source | df | Sum of squares | Mean square | Variance | Percent (%)
Countries (C) | 3 | 692.251 | 230.750 | 0.1727 | 7.01
Subjects-within-countries [(S:C), (S:C) × C] | 276 | 4333.312 | 15.700 | 0.8608 | 34.92
Items (I) | 16 | 1193.387 | 74.586 | 0.2286 | 9.27
I × C | 48 | 507.145 | 10.566 | 0.1357 | 5.51
S:C × I, S:C × I × C, error | 4416 | 4711.703 | 1.067 | 1.0670 | 43.29
Total | 4759 | 11,437.798 | | 2.4648 |

(b) Countries as a fixed factor; each cell shows the variance component with its percent of the country total in parentheses

Variance component | US (n = 71) | France (n = 70) | Japan (n = 76) | Germany (n = 73) | Pooled results
Items (I) | 0.422 (15.79) | 0.413 (17.17) | 0.289 (13.96) | 0.318 (15.51) | 0.263 (11.42)
Subjects (S) | 1.200 (44.91) | 0.799 (33.22) | 0.672 (32.46) | 0.809 (39.44) | 0.861 (37.39)
S × I (error) | 1.050 (39.30) | 1.193 (49.61) | 1.109 (53.58) | 0.924 (45.05) | 1.179 (51.19)
Total | 2.672 | 2.405 | 2.07 | 2.051 | 2.303


The pooled results were computed as in Shavelson and Webb (1991, pp. 74–75). The variance components are very similar across countries. Apart from the error (S × I) term, the subjects factor accounts for the highest percent of the total variance. Items accounts for a much smaller percentage of variance. Some item variation should be expected because low or no variation suggests that some items may be redundant. On the other hand, one should not expect item variation to be large, as this would imply inconsistencies in responses across items. More importantly, the variance due to items is not significantly different across countries (Bartlett test chi-square = 0.827, three df, p-value = 0.843; Neter, Kutner, Nachtsheim, & Wasserman, 1996), suggesting that the items are not country-specific. Thus, the conclusion reached is the same regardless of whether countries is assumed to be a fixed or random factor.

3.4. Generalizability coefficient

Suppose that the researcher wishes to generalize over scale items and countries is treated as a random factor. Using the estimated variance components (Table 2a) and Eq. (1), the GC is given by the following:

GC = (0.1727 + 0.8608) / [(0.1727 + 0.8608) + (1.067/17 + 0.1357/17)] = 0.936

This value is greater than the cutoff value of 0.90 suggested by Shavelson and Webb (1991), suggesting that the CET scale can indeed be generalized over items and further implying that the scale is not country-specific.

Suppose the primary interest is in determining the extent to which country scores can be generalized across subjects and items. Using Eq. (2), the GC for this objective is given by the following:

GC = 0.1727 / [0.1727 + (1.0670/(17 × 70)) + (0.1357/17) + (0.8608/70)] = 0.891,

which is only slightly less than the recommended value of 0.90. This implies that the country scores can be generalized across items and subjects and the scale items are not country-specific.
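The coefficient above, as well as the per-country coefficients reported in the next paragraph, can be verified directly from the variance components in Table 2; a minimal Python sketch (ours, not the authors' code):

```python
# GC for countries as the differentiation factor (Eq. 2) and for each country
# treated as fixed (Eq. 3), using the components of Tables 2a and 2b.

def gc_eq2(var_c, var_sc, var_ic, var_err, k, n):
    """Eq. (2): generalizability of country scores over k items and n subjects."""
    return var_c / (var_c + var_ic / k + var_sc / n + var_err / (n * k))

def gc_eq3(var_s, var_err, k):
    """Eq. (3): generalizability of subject scores over k items within one country."""
    return var_s / (var_s + var_err / k)

print(round(gc_eq2(0.1727, 0.8608, 0.1357, 1.0670, k=17, n=70), 3))   # 0.891

table_2b = {"US": (1.200, 1.050), "France": (0.799, 1.193),
            "Japan": (0.672, 1.109), "Germany": (0.809, 0.924), "Pooled": (0.861, 1.179)}
for country, (var_s, var_err) in table_2b.items():
    print(country, round(gc_eq3(var_s, var_err, k=17), 3))
# US 0.951, France 0.919, Japan 0.912, Germany 0.937, Pooled 0.925
```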

When the countries factor is assumed to be fixed, the GC can be computed for individual countries via Eq. (3). Using the variance estimates from Table 2b yields similar values across countries, all of which are greater than 0.90 (GC_US = 0.951, GC_France = 0.919, GC_Japan = 0.912, GC_Germany = 0.937, GC_pooled = 0.925). Thus, it appears that subjects' scores can be generalized over items for each country.

3.5. Decision analysis

3.5.1. Number of items

G theory can be extended to do a decision analysis, or D study, which employs information from G studies to design future studies that minimize error due to controllable sources and increase study generalizability. Suppose that the interest is in determining the number of items that should be used for subsequent administrations of the CET scale. When countries is assumed to be a random factor, the GC can be computed as a function of the number of items (i.e., k can be varied) using Eq. (1) and the variance estimates from Table 2a. If the recommended value of 0.90 is desired for the GC, 11 items would be sufficient to obtain this value (GC = 0.905). Thus, 11 of the 17 items could be randomly selected to create a shorter scale that has an acceptable level of generalizability. It is interesting to note that in one of the studies reported in Shimp and Sharma (1987), 10 items were randomly selected from the 17-item CET scale, and the reliability of the 10-item scale was found to be acceptable. In subsequent research, Netemeyer et al. (1991) also validated a 10-item CET scale. Thus, it appears that a 10- or 11-item CET scale would be acceptably generalizable.

When countries is assumed to be a fixed factor, the GC can be computed as a function of the number of items for each country using Eq. (3) and the variance estimates from Table 2b. To obtain a GC of 0.90, the number of items required would be eight for the US, 14 for France, 15 for Japan, and 11 for Germany.

3.5.2. Number of subjects

Perhaps more important than determining a minimum number of items, decision analysis can be used to determine the minimum number of subjects necessary to achieve a desired level of generalizability. Once the decision regarding the number of items has been made, a second decision analysis can be performed to determine the required sample size. If countries were viewed as a random factor and 11 items were selected, the GC could be computed as a function of the number of subjects by varying n in Eq. (2). One hundred and forty subjects would be required to obtain the desired GC of 0.90. That is, in order to generalize the findings of an 11-item scale across subjects and items, samples larger than those collected by Netemeyer et al. (1991) would be needed because the variation due to the subjects-within-countries source was high. In general, all else being equal, the higher (lower) the variance of a generalization factor, the greater (fewer) the number of levels needed of that factor to obtain a desired value of the GC.

If countries were viewed as a fixed factor, there is no GC equation that allows one to compute necessary sample sizes for the individual countries. However, G theory can still provide some guidance in determining sample sizes for future studies by using Eq. (2) and the maximum number of items from the countries-as-a-fixed-factor decision analysis. While this analysis violates the assumption of countries as a random factor, it does provide sample size information that would not be available otherwise. For the current illustration, using 15 items and the variance components from Table 2a, 100 subjects from each country would be necessary to yield a GC greater than 0.90.
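The decision analyses described above amount to increasing the number of items (or subjects) until the relevant GC equation first reaches 0.90. A sketch of that search using the Table 2 variance components (our illustration; the helper functions are ours):

```python
# D study: smallest k (items) or n (subjects) giving GC >= 0.90

def gc_eq1(var_c, var_sc, var_ic, var_err, k):
    return (var_c + var_sc) / (var_c + var_sc + var_ic / k + var_err / k)

def gc_eq2(var_c, var_sc, var_ic, var_err, k, n):
    return var_c / (var_c + var_ic / k + var_sc / n + var_err / (n * k))

def gc_eq3(var_s, var_err, k):
    return var_s / (var_s + var_err / k)

C, SC, IC, ERR = 0.1727, 0.8608, 0.1357, 1.0670   # Table 2a components

# Countries as a random factor: minimum number of items (Eq. 1)
k_min = next(k for k in range(1, 18) if gc_eq1(C, SC, IC, ERR, k) >= 0.90)
print("items:", k_min)                            # 11, as reported in the text

# With 11 items, minimum number of subjects per country (Eq. 2)
n_min = next(n for n in range(1, 1000) if gc_eq2(C, SC, IC, ERR, 11, n) >= 0.90)
print("subjects:", n_min)                         # 140

# Countries as a fixed factor: minimum items per country (Eq. 3), Table 2b components
for country, (var_s, var_err) in {"US": (1.200, 1.050), "France": (0.799, 1.193),
                                  "Japan": (0.672, 1.109), "Germany": (0.809, 0.924)}.items():
    k_c = next(k for k in range(1, 100) if gc_eq3(var_s, var_err, k) >= 0.90)
    print(country, k_c)                           # US 8, France 14, Japan 15, Germany 11
```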

4. Summary and discussion

This article illustrates the use of two approaches, generalizability theory and Steenkamp and Baumgartner's (1998) CFA framework, for assessing the extent to which scales can be used to make valid cross-national comparisons. As discussed below, each of these techniques offers unique information and advantages.

The CFA approach has the advantage of using statistical criteria (i.e., chi-square tests) for comparing model fits to assess measurement equivalency. With G theory, conversely, measurement equivalency of a scale across countries is subjectively assessed by the amount of variance due to the items-by-countries interaction when the countries factor is assumed to be random. When the countries factor is fixed, invariance of items across countries is assessed by a Bartlett test to determine if the variance due to items differs across countries. Thus, CFA can have greater statistical rigor than G theory. However, in CFA, it is well known that the chi-square test is sensitive to sample size. Therefore, researchers often resort to alternate goodness-of-fit indices (which, presumably, are less sensitive to sample size) with arbitrary cutoff points, including the goodness-of-fit index (GFI), Tucker-Lewis index (TLI), relative noncentrality index (RNI), and RMSEA. Using these goodness-of-fit indices and arbitrary cut-off values in lieu of chi-square tests somewhat reduces the statistical rigor of CFA.

The CFA approach has the distinct advantage of identifying problem items if measurement equivalency of a scale is not established. Once these items are identified, they can be deleted to improve the generalizability of the scale, or one can assess partial measurement equivalency.⁴ Further, CFA can be used to address the etic-emic dilemma in cross-national research where each country's scale consists of some core items that are the same across countries (i.e., etic items) and some items that are specific to each country (i.e., emic items) (Baumgartner & Steenkamp, 1998). Therefore, the CFA approach is useful during the initial stages of the scale development process, for establishing partial measurement equivalency when full measurement equivalency is not achieved, or for addressing the etic-emic dilemma. G theory, on the other hand, does not provide any diagnostic information during the scale development process or procedures for establishing partial equivalency, and further research is needed to extend G theory to the etic-emic context.

Both G theory and CFA provide information on the extent to which the scale is not generalizable. In the CFA analysis, the differences in the chi-square statistics and goodness-of-fit indices can be used to gauge the extent to which measurement invariance is violated. When countries is viewed as a random factor, G theory gives an estimate of the percent of variation that is due to differences in scale items across countries (i.e., the items-by-countries interaction).

⁴ The interested reader is referred to Steenkamp and Baumgartner (1998) for further details on partial measurement equivalency.


When countries is viewed as a fixed factor, country effects are assessed through comparing across countries the variance due to items, which can be tested statistically with a Bartlett test. Additionally, G theory provides an estimate of the G coefficient, which could be useful when the researcher is unable to further refine a scale or is working with an existing scale. In such cases, the researcher can take this information into consideration in determining his/her confidence in comparing results across countries. A large GC, small items-by-countries interaction, or insignificant differences in item variance across countries would suggest that the scale can be used to test theories across countries. On the other hand, if the GC is small, the items-by-countries interaction is large, or the variance due to items differs significantly across countries, this might suggest that some of the items do not have the same meanings across countries, perhaps due to lack of construct equivalence caused by translation inaccuracies or other data collection problems. As discussed, in such instances the researcher can use CFA to identify problem items.

The CFA approach does not provide guidelines regarding the number of items or subjects needed to obtain a given level of generalizability for future studies. Conversely, G theory allows the researcher to estimate the number of items and subjects needed to achieve this desired level of generalizability. Knowledge of the minimum number of required items and subjects helps in reducing the effort and cost of conducting cross-national studies. In such cases, a smaller set of items could be selected at random from the larger set of initial items, and the psychometric properties of the shortened scale could then be examined. In instances where only partial metric equivalency is established (i.e., not all the items have equal loadings across countries), one could use the CFA results to identify and select only the equivalent items.

CFA does not require that the sample sizes be equal across countries or groups, whereas estimation of variance components in G theory requires a (nearly) balanced design. One can still estimate the variance components for unbalanced designs by using general linear models, but the estimated variances will not be additive. However, as discussed earlier, analyses to assess the effect of unequal cell sizes on variance estimation suggest that this limitation may not be severe.

To conclude, given that CFA and G theory have distinct benefits and limitations, we feel that the two techniques should be viewed as complementary, rather than competing, techniques. Through assessing the cross-national validity of scales via multiple methods, researchers should be better able to develop scales that have conceptual and measurement equivalency across countries, thus improving confidence in cross-national research findings.

Acknowledgements

This research was funded by a grant from the Center for International Business Education and Research, Moore School of Business, University of South Carolina.

References

Baumgartner, H., & Steenkamp, J. E. M. (1998, February). Multigroup latent variable models for varying numbers of items and factors with cross-national and longitudinal applications. Marketing Letters, 9, 21–35.
Cronbach, L. J., Gleser, G. C., Nanda, H., & Rajaratnam, N. (1972). The dependability of behavioral measurements: Theory of generalizability for scores and profiles. New York: Wiley.
Deshpandé, R., Farley, J. U., & Webster Jr., F. E. (1993, January). Corporate culture, customer orientation, and innovativeness in Japanese firms: A quadrad analysis. Journal of Marketing, 57, 23–37.
Dixon, W. F., Brown, M. B., Engelman, L., & Jennrich, R. I. (Eds.) (1990). BMDP statistical software manual, vols. 1 and 2 (pp. 1353–1360). Los Angeles, CA: University of California Press.
Douglas, S. P., & Craig, S. C. (1997). The changing dynamic of consumer behavior: Implications for cross-cultural research. International Journal of Research in Marketing, 14, 379–395.
Durvasula, S., Andrews, J. C., Lysonski, S., & Netemeyer, R. G. (1993, March). Assessing the cross-national applicability of consumer behavior models: A model of attitude toward advertising in general. Journal of Consumer Research, 19, 626–636.
Finn, A., & Kayandé, U. (1997, May). Reliability assessment and optimization of marketing measurement. Journal of Marketing Research, 34, 262–275.
Gurhan-Canli, Z., & Maheswaran, D. (2000, August). Cultural variations in country of origin effects. Journal of Marketing Research, 37, 309–317.
Haynes, S. N., Nelson, K., & Blaine, D. D. (1999). Psychometric issues in assessment research. In P. C. Kendall, J. N. Butcher, & G. N. Holmbeck (Eds.), Handbook of research methods in clinical psychology (2nd ed., pp. 125–154). New York: Wiley.
Horn, J. L., & McArdle, J. J. (1992, Fall–Winter). A practical and theoretical guide to measurement invariance in aging research. Experimental Aging Research, 18, 117–144.
Hui, C. H., & Triandis, H. C. (1985, June). Measurement in cross-cultural psychology: A review and comparison of strategies. Journal of Cross-Cultural Psychology, 16, 131–152.
Netemeyer, R. G., Durvasula, S., & Lichtenstein, D. R. (1991, August). A cross-national assessment of the reliability and validity of the CETSCALE. Journal of Marketing Research, 28, 320–327.
Neter, J., Kutner, M. H., Nachtsheim, C. J., & Wasserman, W. (1996). Applied linear statistical models (4th ed.). Chicago, IL: Richard D. Irwin.
Peter, J. P. (1979, February). Reliability: A review of psychometric basics and recent marketing practices. Journal of Marketing Research, 16, 6–17.
Rentz, J. O. (1987, February). Generalizability theory: A comprehensive method for assessing and improving the dependability of marketing measures. Journal of Marketing Research, 24, 19–28.
SAS Institute (1994). SAS language: Reference, version 6 (1st ed.). Cary, NC: author.
Sharma, S. (1996). Applied multivariate techniques. New York: Wiley.
Sharma, S., Shimp, T. A., & Shin, J. (1995, Winter). Consumer ethnocentrism: A test of antecedents and moderators. Journal of the Academy of Marketing Science, 23, 26–37.
Shavelson, R. J., & Webb, N. M. (1991). Generalizability theory: A primer. Newbury Park, CA: Sage Publications.
Shimp, T. A., & Sharma, S. (1987, August). Consumer ethnocentrism: Construction and validation of the CETSCALE. Journal of Marketing Research, 24, 280–289.
Singh, J. (1995). Measurement issues in cross-national research. Journal of International Business Studies, 26(3), 597–619.
SPSS (1998). SPSS reference guide. Chicago, IL: author.
Steenkamp, J. E. M., & Baumgartner, H. (1995). Development and cross-cultural validation of a short form of CSI as a measure of optimum stimulation level. International Journal of Research in Marketing, 12, 97–104.
Steenkamp, J. E. M., & Baumgartner, H. (1998, June). Assessing measurement invariance in cross-national consumer research. Journal of Consumer Research, 25, 78–90.