Longitudinal satisfaction measurement using latent growth curve models and extensions


ARTICLE IN PRESS Journal of Retailing and Consumer Services 17 (2010) 321–331

Contents lists available at ScienceDirect

Journal of Retailing and Consumer Services journal homepage: www.elsevier.com/locate/jretconser

Longitudinal satisfaction measurement using latent growth curve models and extensions Christian Weismayer  Vienna University of Economics and Business Administration, Department of Cross-Border Business, Institute for Tourism and Leisure Studies, Vienna 1090, Augasse 2-6, Austria

Article info

Keywords: Panel study Latent growth curve modeling Growth mixture modeling

Abstract

Latent growth curve modeling (LGCM) is used to describe changes in latent aspects over time as manifested in observed indicators. A case study of satisfaction indicators of cinema visitors observed over 12 months is used to detect transitions from excitement factors to performance factors to basic factors, as described in the Kano-model. The sample is split into groups depending on slope trajectories and intercepts. More precisely, a growth mixture model (GMM) with random slopes and random intercepts is incorporated, offering the possibility of visualizations that include individual intercept and slope values. Such a figure allows deeper insight into changes over time. © 2010 Elsevier Ltd. All rights reserved.

1. Introduction

Satisfaction measurements are used to describe the effects of service quality standards. These standards generate expectations and alter people's ongoing evaluation of a service. Differences between expectations prior to utilization and fulfilled expectations afterwards are described by the confirmation/disconfirmation theory. Most empirical social studies use cross-sectional data and successfully detect non-linear influences of different service criteria on overall satisfaction. The lack of longitudinal data prohibits the inclusion of time-dependent aspects. But all societies develop over time, and thus service and product quality expectations never persist at a steady state. Different satisfaction factors change over time. Excitement factors delight people because they are new and therefore surprising. These factors contribute only to satisfaction, not to dissatisfaction. Over time they become well known and transform into performance factors. The latter also have an impact on dissatisfaction. Hence, the transition of satisfaction factors generally shifts into the negative satisfaction part and eventually levels off at the basic factors, which provoke only dissatisfaction. This tendency reflects a critical movement of which service providers have to be forewarned. As a result, leisure activities and tourism attractions have to offer increasingly higher quality standards. These standards are driven by rising demand for improved product features emerging from a quality adaptation of leisure time possibilities. Latent growth curve modeling (LGCM) is used here to identify higher order aspects that alter these quality levels over time, which leads to the following questions:

 Tel.: + 43 1 313364583; fax: +43 1 3171205.

E-mail address: [email protected]

0969-6989/$ - see front matter © 2010 Elsevier Ltd. All rights reserved. doi:10.1016/j.jretconser.2010.03.013

Which dimensions experience a steady level over a certain time interval, and which dimensions are affected by a positive or negative trend over time? The main advantage of LGCM, as alternative names such as multivariate latent growth modeling or second order latent growth modeling point out, is the use of multiple indicators that are abstracted by latent factors. Based on these latent factors, the transition of these aspects is measured by second order latent growth factors. Using growth models, “there are essentially two possible conceptualizations of the model; one within the random coefficient regression framework that we refer to as the random coefficient growth model (RCGM) and the other within the structural equation modeling (SEM) framework that is commonly referred to as the latent growth curve (LGC)” (Palardy and Vermunt, in press). Compared with multilevel models, the SEM approach is preferred here because the outcome of interest cannot be directly observed but is measured indirectly through a set of indicators at each occasion (Steele, 2008). The resulting LGCM findings offer insights into changing non-linear trends. They provide information about increasing or decreasing positive and negative satisfaction shifts. It is the possibility of estimating person-specific changes over time that leads to conclusions concerning the satisfaction level trajectories of different segments. Several studies are being conducted to detect different consumer segments resulting in selective marketing behavior. For example, an Importance Performance Analysis (IPA) leads to interpretations of upcoming trends and the resulting management reactions if explored in a time-dependent way (Martilla and James, 1977). Mittal et al. (1999, 2001) demonstrate the temporally variant impact of attribute performance on overall satisfaction. In line with these findings, the case study at hand provides an insight into general satisfaction shifts in the retailing sector.


C. Weismayer / Journal of Retailing and Consumer Services 17 (2010) 321–331

1.1. Theoretical foundation

If these changes are monitored, any kind of service or product provider can detect threatening transitions from excitement factors to one-dimensional factors, and from one-dimensional factors to basic factors (e.g., Johnston, 1995), as mentioned in the Kano-model and its modifications (e.g., Matzler et al., 1996; Matzler and Hinterhuber, 1998; Matzler and Sauerwein, 2002; Lee et al., 2006; Chen et al., 2006; Mazanec, 2006; Audrain-Pontevia, 2006). The often noted time-dependent change of Kano-factors (e.g., Matzler et al., 2004; Fuchs and Weiermair, 2004) is empirically measured and described for cinemas in the present article. The underlying thought originally emerged from the theory of human motivation (Maslow, 1943). Various implementations in SEM that define different attribute levels and asymmetric effects in customer satisfaction (e.g., Mittal et al., 1998; Füller et al., 2006), and similar methods implementing such findings into general market models or loyalty models (e.g., Anderson and Mittal, 2000; Fullerton and Taylor, 2002; Finn, 2005), can be found in a multitude of articles. Related to the aforementioned IPA, service factors gaining satisfaction lose importance too (Sampson and Showalter, 1999). Being informed of such changes gives an operator the possibility to react in order to meet future expectations (Torres and Kline, 2006). As lifestyles change, it becomes clear that satisfaction changes are also fostered by such influences. Finite mixture models (FMM) cope with such changes with the aim of discovering previously unknown subpopulations, namely latent classes, by GMM or the more restrictive LCGM, in an explorative data-driven way. This one-step solution avoids the disadvantage of a two-step solution such as a subsequently conducted clustering approach.

Growth trajectories of different a priori or a posteriori determined segments are frequently analyzed in psychology, medical or educational science (e.g., Reddy et al., 2003; Schaeffer et al., 2003; Delucchi et al., 2004). Methods grounded in the family of FMM are very complex when it comes to estimating the parameters of interest to detect latent classes, for example in aspects that change satisfaction. The underlying higher order aspects that drive heterogeneity are perhaps continuous, which makes it difficult to describe influences that change trends in terms of categorical classes. The present case study harnesses the widespread growth modeling approach for the as yet unexplored longitudinal transformation of latent satisfaction.

2. Study description and intermediate analysis

The aforementioned time-dependent aspect is exemplified in the field of leisure activities. Cinema visits are influenced by many different aspects prior to, during and after the movie screening. These aspects contain different higher order characteristics that are, in turn, influenced differently by the time factor. Some of these latent constructs are rather constant over time. But some of them undergo a dramatic modification, and diversified features of different cinema service aspects are indicators for these trends. To test different service features, a 12-month online questionnaire panel was conducted in Vienna, Austria, from October 2006 to September 2007 with more than 5000 respondents, evaluating 11 different cinema service aspects. Every month panel respondents were asked to evaluate their satisfaction with one of their cinema visits in the last month on a 6-point Likert scale. If they were not able to rate one or more of the 11 satisfaction aspects, they could skip the question. This was necessary because somebody who, for example, had not bought any snacks or drinks at the concession stand would not have been able to answer correctly. For this reason many values are missing. If the respondent had not gone to the cinema in the past month, values are missing on all indicators.

2.1. Descriptive sample statistics

In total 5664 people answered at least one questionnaire. 203 respondents had never gone to the cinema and thus were not able to fill out the satisfaction questions, resulting in a final number of 5461 respondents offering information on satisfaction. Descriptive data is available for most of the visitors. The mean age was 27.52 years; 3213 were female and 2153 were male. 4446 respondents came from Vienna and the rest from Viennese suburbs. The sample consists of 485 primary and secondary school students, 51 apprentices, 1899 university students, 45 persons currently serving their federally mandated military or volunteer service, 1955 employed persons, 267 self-employed persons, 263 public servants, 67 retirees and 334 others. One can see from Table 1 that most of the panel respondents had made 3, 4, 5, 6 or 7 visits in the 12-month span. The total number of visits for all survey participants equals 31,899 visitations, with an average of 2,658.25 evaluated visits every month. The 11 service aspect indicators have lower frequencies because people were only forced to give an answer on overall satisfaction. Respondents were also asked if they regularly visit ‘only one favorite cinema’ (714), ‘different cinemas’ (1557) or if they ‘have one favorite but nevertheless visit different cinemas’ (3095). The numbers in brackets indicate that most of the respondents prefer one cinema, but many also visit different cinemas. From the last result, we can conclude that most panel participants had formed an expected service quality level from different cinemas. This is necessary to be able to assume a roughly stable level of service quality expectations. Panel mortality was very low. Table 2 contains the number of completed questionnaires, with an average of 4,344.83 every month. Large numbers of observations are needed to gain relevant results from LGCM.

2.2. Principal component analysis

An exploratory factor analysis is used to evaluate factor loadings on the 11 underlying observed indicators. Table 3 shows the most reasonable three factor solution. Indicator values are treated as if they were measured on an ordinal scale. Using listwise deletion for the 12-month dataset, 10,626 observations of the 31,964 visitations are included. 58.53% of variance is explained by the three factors. Factors are named in a chronological way. Factor 1, called ‘screen,’ describes service items while watching the film. Factor 2, ‘inside the cinema,’ loads heavily on quality aspects cinema visitors are confronted with upon entering the movie theatre. Factor 3, called ‘prior visitation,’ includes items a person is confronted with before deciding to go

Table 1 Frequency table of visitations.

Visits   0     1     2     3     4     5     6     7     8     9     10    11    12
People   203   436   495   580   561   545   559   547   473   453   375   274   163
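
The totals quoted in Section 2.1 can be re-derived from Table 1. The short Python sketch below is an illustration added for the reader and not part of the original analysis; the variable names are hypothetical, and the counts are copied from Table 1.

```python
# Frequency table from Table 1: number of visits -> number of panel respondents.
visit_counts = {
    0: 203, 1: 436, 2: 495, 3: 580, 4: 561, 5: 545, 6: 559,
    7: 547, 8: 473, 9: 453, 10: 375, 11: 274, 12: 163,
}

respondents_total = sum(visit_counts.values())
# Respondents with at least one visit can report on satisfaction.
respondents_with_visits = respondents_total - visit_counts[0]
visits_total = sum(visits * people for visits, people in visit_counts.items())
visits_per_month = visits_total / 12  # the panel covered 12 months

print(respondents_total, respondents_with_visits, visits_total, visits_per_month)
```

Under these inputs the script reproduces the 5664 respondents, the 5461 respondents with satisfaction information, the 31,899 visitations and the monthly average of 2,658.25 reported in the text.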


Table 2 Response rate.

Month    September   October   November   December   January   February
People   4318        4536      4926       4813       4669      4376

Month    March   April   May    June   July   September
People   4274    4368    3742   4211   4126   3779
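
As a quick plausibility check, the monthly average of completed questionnaires quoted in Section 2.1 can be recomputed from Table 2. This is a minimal sketch with hypothetical variable names; the counts are copied in the order in which Table 2 lists them.

```python
# Completed questionnaires per month, in the order listed in Table 2.
completed = [4318, 4536, 4926, 4813, 4669, 4376,
             4274, 4368, 3742, 4211, 4126, 3779]

mean_completed = sum(completed) / len(completed)
# Spread between the best and the weakest month as a rough view on panel mortality.
spread = max(completed) - min(completed)

print(round(mean_completed, 2), spread)
```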

Table 3 Factor loadings.

Indicator                         Factor 1: screen   Factor 2: inside the cinema   Factor 3: prior visitation
Technique/picture/sound (tps)     0.669              0.181                         0.286
Comfort (com)                     0.675              0.334                         0.238
Employees (emp)                   0.364              0.599                         0.252
Value-for-money (vfm)             0.251              0.543                         0.154
Buffet (buf)                      0.113              0.697                         0.219
Atmosphere (atm)                  0.185              0.639                         0.445
Location (loc)                    0.115              0.263                         0.509
Movies offered (film)             0.252              0.113                         0.714
Information offered (inf)         0.255              0.255                         0.725
Reservation possibilities (res)   0.169              0.255                         0.507
Image/appearance (ima)            0.240              0.507                         0.563
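
To make the three factor solution easier to read, the following sketch assigns each indicator to the factor on which it loads most strongly. The assignment rule (largest loading wins) is an assumption chosen for illustration, not a procedure stated in the article; the loadings are copied from Table 3.

```python
# Loadings from Table 3: indicator -> (screen, inside the cinema, prior visitation).
loadings = {
    "tps":  (0.669, 0.181, 0.286),
    "com":  (0.675, 0.334, 0.238),
    "emp":  (0.364, 0.599, 0.252),
    "vfm":  (0.251, 0.543, 0.154),
    "buf":  (0.113, 0.697, 0.219),
    "atm":  (0.185, 0.639, 0.445),
    "loc":  (0.115, 0.263, 0.509),
    "film": (0.252, 0.113, 0.714),
    "inf":  (0.255, 0.255, 0.725),
    "res":  (0.169, 0.255, 0.507),
    "ima":  (0.240, 0.507, 0.563),
}
factors = ("screen", "inside the cinema", "prior visitation")


def dominant_factor(row):
    """Return the name of the factor with the largest loading in `row`."""
    best_index = max(range(len(row)), key=row.__getitem__)
    return factors[best_index]


assignment = {name: dominant_factor(row) for name, row in loadings.items()}
for name, factor in assignment.items():
    print(f"{name:5s} -> {factor}")
```

Under this rule, tps and com fall on ‘screen’, emp, vfm, buf and atm on ‘inside the cinema’, and the remaining five indicators on ‘prior visitation’. Note that ima also loads noticeably on the second factor (0.507).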

to a specific cinema. LGCM is used to describe changes in these three latent dimensions over a time interval of 12 months.

3. Latent growth modeling steps

3.1. Data preparation

At the beginning, visitations are pooled regardless of the month in which they occurred so as to avoid a missing value replacement solution. The dataset is cut at the end, losing the information of the last visitations, because only the first observations are used until the chosen number of visitations is reached. This solution certainly distorts the findings to some degree because many of the datasets are cut or deleted. Various numbers of observations per respondent are tested. The more observations are used, the smaller the number of respondents and the harder it is to detect changing satisfaction aspects. Furthermore, different solutions are examined from a content interpretation perspective. Tougher restrictions from version one to three lead to altered interpretations:

(1) The simplest option is to use all satisfaction evaluations without restriction of any kind. In this model, the detected changes may be amplified by the panel questionnaire itself. Although the largest part of the detected changes is expected to stem from general changes in satisfaction levels, this model can give evidence of an overall changing satisfaction trend concerning all cinemas.

(2) Another solution is to carry out the calculations by location. Only visitors who evaluate the same cinema at every time point are taken into account, but cinemas are allowed to vary between respondents. This possibility again fosters predictions concerning movie theatres in a general sense. Aspects that can be detected will be underlying aspects for all cinema locations included in the dataset.

(3) The hardest constraint is to take into account only people who evaluate the same cinema over the considered time points. Only one movie theatre is included, and it has to be identical among all respondents. These models lose most of the datasets. However, significant slope values can also be detected this way.
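
The three restriction levels above can be sketched as a small selection routine. This is a minimal illustration, not the author's procedure: the record layout `(respondent, month, cinema)`, the function name, and the rule used for option (2), namely keeping the cinema a respondent rated most often, are all assumptions made for the example.

```python
from collections import Counter, defaultdict


def first_n_evaluations(records, n_obs, restriction="all", cinema=None):
    """Keep the first n_obs evaluations per respondent under a restriction.

    records: iterable of (respondent, month, cinema) tuples.
    restriction: "all" (option 1), "same_cinema" (option 2, cinema may differ
    between respondents) or "one_cinema" (option 3, `cinema` is fixed for all).
    Respondents with fewer than n_obs usable evaluations are dropped.
    """
    by_person = defaultdict(list)
    for respondent, month, cin in sorted(records):
        by_person[respondent].append((month, cin))

    selected = {}
    for respondent, evals in by_person.items():
        if restriction == "same_cinema":
            # Assumption: use the cinema this respondent evaluated most often.
            target = Counter(cin for _, cin in evals).most_common(1)[0][0]
            evals = [e for e in evals if e[1] == target]
        elif restriction == "one_cinema":
            evals = [e for e in evals if e[1] == cinema]
        elif restriction != "all":
            raise ValueError(f"unknown restriction: {restriction}")
        if len(evals) >= n_obs:
            selected[respondent] = evals[:n_obs]  # later visits are cut off
    return selected


# Hypothetical toy data: three respondents, cinemas X, Y and Z.
records = [
    ("a", 1, "X"), ("a", 2, "X"), ("a", 3, "Y"), ("a", 4, "X"),
    ("b", 1, "Y"), ("b", 2, "Y"), ("b", 3, "Y"),
    ("c", 1, "X"), ("c", 2, "Z"),
]
for mode in ("all", "same_cinema", "one_cinema"):
    print(mode, sorted(first_n_evaluations(records, 3, mode, cinema="X")))
```

The toy data shows how the sample shrinks as the restriction tightens: two respondents remain under options (1) and (2), and only one under option (3).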

3.2. General model specifications

To improve the model, information on time scores is considered, containing the time point information of each occasion. Only the datasets of respondents that include one specific cinema location are used, but cinema locations are free to vary over the dataset. There are two different possibilities for defining these time scores. The first possibility also counts evaluations of other cinemas when calculating the time scores, while months without evaluations are not considered. This compresses the time spans between two real time points at which the specific location was evaluated. Therefore, the critical confrontation with the questionnaire is stressed, because theoretically an evaluation is assigned to every month regardless of whether there was one. The second possibility takes real time spans into consideration when calculating the time scores, regardless of whether there was an evaluation between two evaluations of the specific cinema. This second method puts more stress on the confrontation with a specific location and on the time effect, because evaluations of other locations are not counted as before. However, these additional variables do not really influence the estimated parameters.

Ordinal and interval estimation outcomes do not differ substantially from the aforementioned principal component analysis. This is why an estimation method is used that treats indicator values as being measured on an interval scale. Treating indicators as ordinal increases the computational effort for the threshold parameters, and sometimes the model can no longer be estimated. t-Values can be used to generalize results, but should be treated with caution so as to avoid misinterpretation concerning the underlying cinema visitor population. The intercept is always the same over time. Slope parameters are tested as being linear or quadratic, and they are also estimated freely.
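
The two time-score definitions described in this section can be sketched as follows. This is one possible reading of the description, not the author's implementation: the function name, the `(month, cinema)` record layout and the mode labels `"compressed"` and `"real"` are hypothetical.

```python
def time_scores(evaluations, cinema, mode="compressed"):
    """Time scores for one respondent's evaluations of `cinema`, starting at 0.

    evaluations: list of (month, cinema) tuples, at most one per month.
    mode="compressed": count every evaluation (of any cinema) as one step and
    ignore months without an evaluation.
    mode="real": use the elapsed calendar months between evaluations of `cinema`.
    """
    ordered = sorted(evaluations)
    target = [(index, month) for index, (month, cin) in enumerate(ordered)
              if cin == cinema]
    if not target:
        return []
    first_index, first_month = target[0]
    if mode == "compressed":
        return [index - first_index for index, _ in target]
    if mode == "real":
        return [month - first_month for _, month in target]
    raise ValueError(f"unknown mode: {mode}")


# Hypothetical respondent: cinema X in months 1, 5 and 9, cinema Y in month 2.
evaluations = [(1, "X"), (2, "Y"), (5, "X"), (9, "X")]
compressed = time_scores(evaluations, "X", "compressed")
real = time_scores(evaluations, "X", "real")
print(compressed, real)
```

In the toy example the compressed scores are shorter than the real time spans because the months without any evaluation are skipped.
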
Quadratic slope arrangements do not show useful results here. To ensure that the three aforementioned latent dimensions are defined as being the same over time, factor structure loadings on


Fig. 1. Latent growth curve model.

the manifested indicators have to be the same over all time points. It is common to permit the disturbances to follow a first-order autoregressive structure, in which the covariances of adjacent disturbance terms are specified as free parameters (Preacher et al., 2008). However, most of the literature says that indicator-specific diachronic residual correlations should be allowed (Urban, 2004). They are implemented here between one and the same indicator over all time points, but not between different indicators. Dimensions that cannot be picked up by a factor at time point one can also not be explained by the model at time point two or at later time points. Therefore, homoscedasticity cannot be assumed. Implementing residual covariances across time points has a positive impact on model fit statistics, but does not noticeably influence the estimated growth factors. A further possible extension not used here is to set the disturbance variance parameters equal to each other, which is stricter than first-order autocorrelated residuals. But it is assumed that there are other service quality indicators influencing growth factors in a stronger way than the possibly unexplained slope errors that cannot be absorbed by the actual estimated parameters. Fig. 1 shows intercept and slope loadings of the latent dimensions (see footnote 1).

3.3. Slope and intercept specifications

All intercept loadings in Fig. 1 are fixed to 1. Slope loadings start at 0 in period one. The second slope loading is fixed to 1 in period two. Therefore, in period two, the estimated parameter, called the slope mean, defines the change between period one and period two. The other three slope loadings are fixed to 2, 3 and 4, so slope loadings are arranged in a linear way. The estimated parameter is multiplied by the respective loading and added to the intercept to obtain the satisfaction level at the desired time point. Slope loadings of the quadratic solution are the squared time spans and look like this: 0, 1, 4, 9, 16 and 25. Stars in brackets in Fig. 1 symbolize freely estimated slope loadings of the unspecified trajectory model. Intercept parameters are defined in the same way as before. The first slope loading is again fixed to 0, and the one at time point two, representing the change in the satisfaction value between period one and period two, is fixed to 1. The estimated slope mean again defines satisfaction changes in real values. The slope mean has to be multiplied by the fixed and freely estimated slope loadings to compute the satisfaction value for the given time point. So after estimating the *-slope loadings, the slope mean is stretched or compressed, depending on the strength of the satisfaction variation over time.

It can also be interesting to detect different satisfaction trajectories of respondents over time with piecewise models. For example, two slope parameters can be defined as in Fig. 2. All loadings of the intercepts are fixed to 1. Loadings of the first slope for five time points are fixed to 0, 1, 2, 2 and 2, and loadings of the second slope are fixed to 0, 0, 0, 1 and 2. This test was arranged to find out whether the individual trajectories are steeper at the beginning because familiarization with the questionnaire sets in only later. No deeper insights are detected this way, apart from a significant negative correlation between the two slopes with a very low value of −0.016 (t-value: 3.803). Such a correlation can lead to interesting conclusions because respondents that experience a steeper trend at the beginning have flatter rates for the remaining time, and vice versa.

Footnote 1. Legend: t stands for time point, here the sequence of evaluations (1 (1), 2 (2), 3 (3*), 4 (4*), 5 (5*)). Numbers with stars in brackets signal free slope loadings, indicating unspecified trajectories at the related order of evaluation. The manifested satisfaction indicators are t/p/s = ‘technique/picture/sound’ and com = ‘comfort’. The analyzed latent dimension is called ‘screen’, predefined by findings from the exploratory principal component analysis.
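
The loading schemes described in this section (linear, quadratic, piecewise and freely estimated) and the rule "intercept plus slope mean times loading" can be written down in a few lines. The sketch below is illustrative only; the function names are hypothetical, and the numerical inputs are the slope means and free loadings reported in Tables 4 and 5.

```python
def slope_loadings(n_points, shape="linear"):
    """Fixed slope loadings for n_points equally spaced occasions."""
    if shape == "linear":
        return [float(t) for t in range(n_points)]
    if shape == "quadratic":
        return [float(t * t) for t in range(n_points)]
    raise ValueError(f"unknown shape: {shape}")


def piecewise_loadings(n_points, knot):
    """Loadings of two linear slope factors that meet at 0-based index `knot`."""
    first = [float(min(t, knot)) for t in range(n_points)]
    second = [float(max(t - knot, 0)) for t in range(n_points)]
    return first, second


def implied_means(intercept_mean, slope_mean, loadings):
    """Model-implied mean at each occasion: intercept + slope mean * loading."""
    return [intercept_mean + slope_mean * loading for loading in loadings]


linear = slope_loadings(5)
free = [0.0, 1.0, 1.709, 2.251, 2.398]  # estimated loadings reported in Table 5

# The intercept is set to 0 so that the output is the cumulative change.
linear_change = implied_means(0.0, -0.005, linear)
free_change = implied_means(0.0, -0.010, free)
print([round(x, 5) for x in linear_change])
print([round(x, 5) for x in free_change])
print(piecewise_loadings(5, knot=2))
```

With a knot at the third occasion, `piecewise_loadings` returns the two loading vectors 0, 1, 2, 2, 2 and 0, 0, 0, 1, 2 used for the piecewise model.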

3.4. Empirical results and model comparison

All models in this article are estimated using Mplus 5. In the first model, maximum likelihood estimation (ML) is used


Fig. 2. Piecewise latent growth curve model.

Table 4 Parameter estimates of the linear slope (linear solution).

Time point   Loadings   Difference   t-Values   Slope     Cumulative change
1            0.000      –            0.000      −0.005    0.000
2            1.000      1            0.000      −0.005    −0.005
3            2.000      1            4.123      −0.005    −0.010
4            3.000      1            3.826      −0.005    −0.015
5            4.000      1            3.752      −0.005    −0.020

Table 5 Parameter estimates of the non-linear slope/unspecified trajectory (non-linear solution).

Time point   Loadings   Difference   t-Values   Slope     Cumulative change
1            0.000      –            0.000      −0.010    0.00000
2            1.000      1.000        0.000      −0.010    −0.01000
3            1.709      0.709        4.123      −0.010    −0.01709
4            2.251      0.542        3.826      −0.010    −0.02251
5            2.398      0.147        3.752      −0.010    −0.02398

including five evaluations with the data matrix described under point (1) in Section 3.1. The structure of the first model is identical to that of Fig. 1, without covariances of residual values. This model is estimated for the factor ‘prior visitation’, and results are listed in Table 4 for the linear trend with slope factor loadings of 0, 1, 2, 3 and 4. The aim here is to find out whether there is a general trend of decreasing or increasing satisfaction over time, independent of the location visited at different time points. Changes can be positive, for example through advancements in service quality or perhaps gentler evaluations over time. More critical evaluations, due to a cognitive confrontation with the cinema topic, or worse service quality can be the reason for satisfaction evaluations below the level at the beginning of the study. Steady satisfaction levels signal that no changes appeared or that positive and negative changes cancelled each other out. A general satisfaction statement about the service provider ‘cinema’ will be possible. The linear model in Table 4 shows a slope mean value of −0.005 (t-value: −2.151), using 892 visitors.

Over the first five evaluations, a satisfaction value of 0.005 is subtracted for every time span. If freely estimated slope loadings hypothetically reproduced a linear trend, the differences between the loadings at time points one and two, two and three, and so on, would all have to be equal to one. If the model is estimated with free slope loadings, namely as an unspecified trajectory model as in Table 5, the general slope value of −0.010 has a nearly significant t-value of −1.945. Satisfaction movements do not occur with the same strength across equal time spans. The negative effect decreases over time, indicating higher satisfaction changes at the beginning, with a value of −0.01, and lower ones at later time points, namely −0.00709 at time point three, −0.00542 at time point four and −0.00147 at time point five. t-Values of the free estimated slope loadings are significant at a 5% probability of error. Cumulative satisfaction changes are also listed. At the end both reach a similar value, comparing the linear solution with a


Fig. 3. Linear and non-linear slopes with confidence intervals.

satisfaction decrease of 0.020 and the non-linear solution with a decrease of 0.024. To visualize the difference between the linear model and the unspecified trajectory model, the 95% confidence intervals are calculated to see whether they overlap. Fig. 3 makes clear that the difference between the linear and the unspecified trajectory model is not significant. However, one should keep in mind that the observed indicators do not follow a normal distribution. “Freeing some of the loadings on a linear slope factor to create a shape factor allows direct comparison of the two models using a nested-model difference test, essentially a test of departure from linearity” (Preacher et al., 2008). Given the roughly 300 degrees of freedom and the 892 respondents with evaluations at five time points, the model comparison will not result in meaningful interpretations. Nevertheless, it is exemplified here. The chi-square test of model fit is 6,439.924 with 304 degrees of freedom for the non-linear model and 6,445.045 with 307 degrees of freedom for the linear model. The difference of three degrees of freedom results from the three free estimated slope loadings at time points three, four and five. A chi-square difference test with a deviance of 5.121 and three degrees of freedom shows a non-significant difference between the two models, with a p-value of 0.16315. Therefore, the chi-square deviance test favors the linear model over the more complex non-linear model. This conclusion follows from Occam’s razor, the principle of parsimony.

Significant location-specific satisfaction changes can also be detected. A model containing five time points, a linear slope parameter, residual correlations, and only evaluations of one specific location without replacing missing values, previously described as the hardest restriction on the data matrix in Section 3.1, provides a significant slope value of −0.007 (t-value: 2.098) in 223 observations for the latent dimension ‘prior visitation’. These satisfaction shifts can be due to a general decrease in satisfaction with cinema visits, but can also have location-specific reasons. Table 6 summarizes the applicability of the different growth factor loading specifications concerning the satisfaction shifts for

Table 6 Model comparison.

                                Slope factor loadings   Case study results
Linear slope loadings           0, 1, 2, 3, 4, 5, …     Best alternative
Quadratic slope loadings        0, 1, 4, 9, 16, 25, …   Not confirmed
Free estimated slope loadings   0, 1, *, *, *, *, …     Transition detected but not necessary because fluctuating changes are of minor importance

the case study at hand. A stable linear trend is best suited to describe a general satisfaction shift over a long period of 12 months. Therefore, no short-term actions of cinema operators override the expected invariant trend.
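
The nested-model comparison reported in this section can be retraced numerically. The sketch below recomputes the deviance and its p-value from the two chi-square statistics given in the text. It is an added illustration, not output from Mplus; the helper `chi2_sf_df3` is a hypothetical name and relies on the closed-form upper-tail probability of a chi-square distribution with exactly three degrees of freedom.

```python
import math


def chi2_sf_df3(x):
    """Upper-tail probability of a chi-square variable with 3 degrees of freedom.

    Closed form valid for df = 3 only:
    erfc(sqrt(x / 2)) + sqrt(2 * x / pi) * exp(-x / 2).
    """
    return math.erfc(math.sqrt(x / 2.0)) + math.sqrt(2.0 * x / math.pi) * math.exp(-x / 2.0)


# Model fit statistics as reported in the text.
chi2_linear, df_linear = 6445.045, 307
chi2_free, df_free = 6439.924, 304

deviance = chi2_linear - chi2_free
df_diff = df_linear - df_free
p_value = chi2_sf_df3(deviance)

print(round(deviance, 3), df_diff, round(p_value, 5))
```

The p-value is well above 0.05, so freeing the three slope loadings does not improve the fit significantly, which is why the more parsimonious linear model is retained.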

4. Further thoughts

Correlations between respondents’ individual intercepts and slopes can be examined because the model mentioned above was specified as a random intercept random slope model, allowing individual intercept and slope variation. Fig. 4 shows a scatter plot for the first linear model mentioned, with a correlation value of −0.007 (t-value: −3.236). Each person is represented by a square on the dimension ‘screen’. The correlation value shows that the lower the satisfaction at the beginning, the more a satisfaction increase is possible over the 12-month period. Conversely, respondents who started at a high satisfaction intercept had a lower chance of increasing satisfaction. Visitors located at a medium satisfaction level move around this area. In sum, they all move towards this level. Visitors who are unsatisfied do not become more unsatisfied over time. If changes occur, they are pushed into a more positive satisfaction area. Moreover, it is hard to make visitors who are already very satisfied even more satisfied. They experience a steady high satisfaction level and are only sensitive to negative experiences because there


Fig. 4. Scatter plot of individual intercepts and slopes.

Table 7 Intercept and slope means, variances and loadings for single indicators for the linear and the unspecified trajectory models.

Indicator (n)       Linear intercept   Linear slope      Free intercept    Free slope
tps (5448)     m    5.434 (571.596)    −0.009 (−6.269)   5.475 (442.912)   −0.074 (−4.699)
               v    0.162 (17.058)     0.001 (5.575)     0.149 (8.531)     0.040 (2.628)
com (5452)     m    5.174 (483.797)    0.000 (0.164)     5.185 (436.430)   −0.003 (−0.744)
               v    0.222 (18.203)     0.002 (6.816)     0.217 (11.220)    0.006 (0.558)
vfm (5441)     m    4.651 (330.811)    0.008 (4.201)     4.613 (151.219)   0.024 (0.992)
               v    0.511 (24.799)     0.003 (8.811)     0.536 (18.469)    0.009 (0.546)
ima (5437)     m    4.995 (440.337)    −0.005 (−3.552)   5.055 (332.147)   −0.055 (−3.878)
               v    0.274 (20.271)     0.001 (5.500)     0.227 (7.584)     0.028 (1.856)

Time parameters (free slope loadings at time points 1 to 12, t-values in brackets)
tps   0.000(0.000), 1.000(0.000), 0.863(4.913), 0.656(3.835), 1.412(6.009), 1.452(5.765), 1.778(5.686), 1.550(5.220), 2.024(5.679), 1.332(5.352), 1.537(5.455), 1.519(5.562)
com   0.000(0.000), 1.000(0.000), 1.078(1.227), 1.795(1.390), 2.973(1.346), 4.006(1.199), 4.054(1.252), 4.720(1.236), 5.833(1.218), 5.466(1.200), 6.087(1.141), 3.278(1.166)
vfm   0.000(0.000), 1.000(0.000), 1.448(1.795), 3.596(1.457), 3.499(1.528), 2.605(1.612), 3.169(1.538), 2.926(1.427), 3.968(1.352), 5.858(1.147), 5.368(1.189), 6.500(1.201)
ima   0.000(0.000), 1.000(0.000), 1.374(5.049), 1.595(5.026), 2.548(4.916), 1.639(4.568), 1.860(4.505), 2.085(4.878), 1.651(4.881), 2.307(5.007), 2.192(4.638), 1.568(4.362)

are no surprising positive service experiences. Later on this argument can be disproved for a group of visitors. Furthermore, the model shows significant variances for the dimension ‘screen’, namely 0.139 (t-value: 12.343) for the intercept values and 0.004 (t-value: 5.324) for the slope values. This is the reason why a random slope model makes more sense than a fixed slope model with equal slopes over all respondents. The same holds for the intercept values, which argues against a fixed intercept model. This raises the question of whether there are groups of respondents that cannot be lumped together, which is answered later in this paper. However, it also makes sense to estimate slopes from single indicators in order to describe change on an individual indicator basis and not only on a factor level. This is why intercepts and slopes are calculated separately for the 11 items. Intercept and slope means (m) and variances (v), with t-values in brackets, and the number of individuals contained in the models with replaced missing values (given with the indicator name) are listed by way of example for five satisfaction indicators in Table 7. The meanings of the indicator abbreviations can be found in Table 3. Observed indicator values are treated as if they were measured on an interval scale. All intercept means and variances are highly significant for both the linear and the unspecified trajectory model. All of the linear slope means, except comfort, are significant. There are positive ones, indicating higher satisfaction over time, and negative ones, indicating lower satisfaction over time. Interestingly, all of the slope variances are significant. Only the slope means of technique/picture/sound, location, film, reservation possibilities and image of the unspecified trajectory model are significant, and all of them

are negative, following a general negative non-linear trend over time. This can be interpreted as an indicator of decreasing general satisfaction levels over time. The corresponding freely estimated slope loadings are also all highly significant but are very low and level off at around two. Compared with the linear slope means, the slope means of the unspecified trajectory models are higher, but the accumulated satisfaction shifts level off at the same new satisfaction level after 12 months. None of the non-significant slope means of the unspecified trajectory models shows significant variance across the sample. Only two of the five significant slope means of the unspecified trajectory models show significant variances, namely technique/picture/sound and variety of films. Many of these findings can be seen as signs of sample heterogeneity.
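As a compact reminder of the two trajectory specifications compared above, the single-indicator growth model can be sketched as follows. The symbols ($y$, $\eta$, $\lambda$, $\zeta$, $\psi$, $\varepsilon$) are illustrative assumptions and do not necessarily match the notation used elsewhere in this article. The exact coding of the linear loadings is likewise an assumption.

```latex
% Sketch of a single-indicator latent growth model for respondent i
% at time point t = 1, ..., 12 (notation assumed for illustration).
\begin{align}
y_{it}    &= \eta_{0i} + \lambda_t \, \eta_{1i} + \varepsilon_{it} \\
\eta_{0i} &= \mu_0 + \zeta_{0i}, \qquad \operatorname{Var}(\zeta_{0i}) = \psi_{00} \\
\eta_{1i} &= \mu_1 + \zeta_{1i}, \qquad \operatorname{Var}(\zeta_{1i}) = \psi_{11}
\end{align}
% Linear trajectory: all loadings fixed to equally spaced values,
%   for instance \lambda_t = t - 1.
% Unspecified trajectory: \lambda_1 = 0 and \lambda_2 = 1 are fixed,
%   \lambda_3, ..., \lambda_{12} are freely estimated.
% Fixed slope model: \psi_{11} = 0; random slope model: \psi_{11} > 0.
```

In this notation, a significant slope variance $\psi_{11}$ is what motivates the random slope model over the fixed slope model.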

5. Mixture modeling

The following model is used to shed light on the underlying differences expressed in the slope and intercept variances. It is used to detect latent groups within the sample that explain these differences across respondents. In the next two models, the overall satisfaction values over 12 time points are used, independent of which cinema is evaluated. If missing values are not replaced, only 163 respondents remain after listwise deletion, and the models either fail to converge or yield negative variance estimates. If the data are pooled as before in the model with five observations, it is primarily people who seldom visit a movie theatre who are deleted. This will distort results in favor of frequent visitors. The problem of pairwise deletion when creating


the covariance–variance matrix can lead to useless negative variances later on in the estimation procedure.
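To make the trade-off between listwise and pairwise deletion concrete, the following minimal sketch counts how many respondents or respondent pairs survive each strategy. The function name `missing_data_summary` and the simulated ratings are hypothetical and serve only as an illustration; they are not the study data.

```python
import numpy as np

def missing_data_summary(y):
    """Summarize how much data survives listwise and pairwise deletion.

    y: 2-D float array (respondents x time points), np.nan marks a missing rating.
    """
    observed = ~np.isnan(y)
    # Listwise deletion keeps only respondents observed at every time point.
    n_listwise = int(observed.all(axis=1).sum())
    # Pairwise deletion uses, for each pair of time points, every respondent
    # observed at both; entry (s, t) of pair_n is that count.
    pair_n = observed.T.astype(int) @ observed.astype(int)
    return {
        "share_missing": float(1.0 - observed.mean()),
        "n_listwise": n_listwise,
        "min_pairwise_n": int(pair_n.min()),
        "max_pairwise_n": int(pair_n.max()),
        "n_any_observed": int(observed.any(axis=1).sum()),
    }

# Hypothetical panel: 2000 respondents, 12 monthly ratings on a 1-6 scale,
# with roughly half of all ratings missing at random positions.
rng = np.random.default_rng(0)
ratings = rng.integers(1, 7, size=(2000, 12)).astype(float)
ratings[rng.random(ratings.shape) < 0.5] = np.nan
summary = missing_data_summary(ratings)
print(summary)
```

Because each covariance under pairwise deletion is based on a different subsample, the resulting matrix is not guaranteed to be positive semi-definite, which is one way in which inadmissible estimates can arise.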

5.1. Estimation and fit

One of the main underlying problems of this study is thus the high proportion of missing values on the observed indicator, namely 53.11% of the overall satisfaction ratings. A strategy for coping with missing values is therefore necessary. For data that are not missing completely at random (MCAR), full quasi-likelihood (FQL) estimation is preferred to listwise quasi-likelihood (LQL) and the pairwise present approach (PPA) (Muthén et al., 1987). A further study compares listwise deletion, pairwise deletion, similar response pattern imputation and full information maximum likelihood (FIML) estimation (Enders and Bandalos, 2001). It uses four evaluation criteria, namely proportion of convergence failures, parameter estimate bias, parameter estimate efficiency and model goodness of fit. Datasets are created by varying factor loading, sample size and percentage of missing data. On all four evaluation criteria, FIML estimation is preferred for missing at random (MAR) and MCAR data across all dataset variations, including a high percentage of missing values of 25%. FIML estimation is recommended because, whereas pairwise and listwise deletion omit some data from consideration, FIML uses all available information to estimate parameters (Preacher et al., 2008). Especially for the cohort-sequential designs often used in LGC, FIML offers a good way to cope with missing values that arise logically and by design. Therefore, the FIML estimation method is used here with robust standard errors. If only visitors with a minimum of eight visits are included in the estimation procedure, 29.55% of the values are missing in the final model with 1738 visitors.

Citing Asparouhov and Muthén (2006), “the maximum likelihood estimation of mixture models in general is susceptible to local maximum solutions.” Therefore, “… initial sets of random starting values are first selected (here 250). Partial optimization is performed for all starting value sets followed by complete optimization for the best starting value sets (here 25)”. It is not clear how many starting value sets should be used in general, and even a high number of random starts for the EM algorithm is no guarantee of reaching the global maximum.

A discrete latent class variable is used so that differences between real classes can be described, rather than the fictive classes that result when a continuous latent variable is used. Examples of how to detect groups are presented in the literature, starting with an unconditional model with one class, followed by an unconditional model with more than one class. The previously mentioned GMM can be extended by incorporating background covariates, distal outcome variables, and direct paths from covariates to distal outcomes of the latent growth classes (Li et al., 2001). These models are named general growth mixture models (GGMM). Here GMM is used, allowing within-class variation of individuals on the intercept and slope factors.
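The two-stage random start strategy quoted above can be sketched in a few lines. The example below is only an illustration under simplifying assumptions: it fits a one-dimensional Gaussian mixture with a hand-written EM loop rather than a growth mixture model, and all function names, the variance floor and the simulated data are hypothetical. It is not the procedure implemented in the software used for this study.

```python
import numpy as np

def weighted_density(x, means, sds, weights):
    """Return an (n, K) array with weight_k * Normal(x_i | mean_k, sd_k)."""
    z = (x[:, None] - means) / sds
    return weights * np.exp(-0.5 * z ** 2) / (sds * np.sqrt(2.0 * np.pi))

def log_likelihood(x, means, sds, weights):
    """Mixture log-likelihood; the tiny constant avoids log(0)."""
    return float(np.log(weighted_density(x, means, sds, weights).sum(axis=1) + 1e-300).sum())

def em_steps(x, means, sds, weights, n_iter):
    """Run n_iter EM iterations and return the updated parameters."""
    for _ in range(n_iter):
        dens = weighted_density(x, means, sds, weights)
        resp = dens / (dens.sum(axis=1, keepdims=True) + 1e-300)  # E-step
        nk = resp.sum(axis=0) + 1e-12                              # M-step
        weights = nk / nk.sum()
        means = (resp * x[:, None]).sum(axis=0) / nk
        var = (resp * (x[:, None] - means) ** 2).sum(axis=0) / nk
        sds = np.maximum(np.sqrt(var), 0.05)  # floor against degenerate classes
    return means, sds, weights

def multistart_em(x, n_classes=2, n_initial=250, n_final=25,
                  partial_iter=10, full_iter=200, seed=0):
    """Partial optimization for all starts, full optimization for the best ones."""
    rng = np.random.default_rng(seed)
    starts = []
    for _ in range(n_initial):
        means = rng.choice(x, size=n_classes, replace=False)
        sds = np.full(n_classes, x.std())
        weights = np.full(n_classes, 1.0 / n_classes)
        params = em_steps(x, means, sds, weights, partial_iter)
        starts.append((log_likelihood(x, *params), params))
    starts.sort(key=lambda s: s[0], reverse=True)
    finals = []
    for _, params in starts[:n_final]:
        params = em_steps(x, *params, full_iter)
        finals.append((log_likelihood(x, *params), params))
    best_ll, best_params = max(finals, key=lambda f: f[0])
    return best_ll, best_params, [f[0] for f in finals]

# Hypothetical data: a large high-satisfaction group and a small dispersed group.
rng = np.random.default_rng(1)
x = np.concatenate([rng.normal(5.2, 0.3, 255), rng.normal(4.0, 1.0, 45)])
best_ll, (means, sds, weights), final_lls = multistart_em(x)
```

If several of the fully optimized starts in `final_lls` reach the same highest log-likelihood, this is usually read as a sign that the solution is not merely a local maximum, although it remains no guarantee.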

5.2. Two class model

Data are treated as interval scaled, and a linear slope is estimated for two predefined groups and later on for three groups, both without specified residual correlations. Factor loadings are fixed for every indicator over all 12 time points and all classes to ensure invariant content of the latent factors between classes and over time. Of course, one can argue for a cognitive development of latent aspects that changes the information inherent in the dimensions over time. For example, O’Neill (2003) detected decreasing service quality perceptions over time but pointed out different complexities of the SERVQUAL factor structure at different time points. This idea will hopefully stimulate further research. The Vuong-Lo-Mendell-Rubin likelihood ratio test and the parametric bootstrapped likelihood ratio test, for one versus two classes as well as for two versus three classes, both show significant p-values, indicating an improvement in model fit for the solution with the higher number of classes. Because of convergence problems, the random coefficient of the slope is fixed to zero in class one. Consequently, there is no within-class variation of the slope in this class. Latent class one contains 1545 (85.3%) visitors and latent class two 192 (14.7%) visitors. The entropy value, which summarizes how clearly individuals can be assigned to a group on the basis of their estimated conditional class probabilities, is 0.736. The entropy criterion measures a model’s suitability by balancing model fit and model complexity (Celeux and Soromenho, 1996), supporting the decision to choose the mixture model that provides the greatest evidence for a clustering structure.

Results are visualized in Fig. 5. Class one has an intercept mean of 5.159 (t-value: 211.457) and a slope mean of 0.003 (t-value: 1.128). The within-class variance of the intercepts is 0.089 (t-value: 9.142). There is no variance for the slope values and no correlation between slope and intercept values because the slope variance is fixed to zero. Class two has an intercept mean of 4.036 (t-value: 40.111) and a slope mean of 0.015 (t-value: 1.076). The within-class variance is 0.993 (t-value: 7.622) for the intercepts and 0.027 (t-value: 7.607) for the slopes. The correlation between intercept and slope is significant with a value of −0.121 (t-value: −6.340). Class mean values of overall satisfaction near six denote highly satisfied evaluations and those near one represent very unsatisfied evaluations. Group one represents a large, very satisfied group whose satisfaction values no longer change, resulting in a non-significant slope mean. The second group represents a medium-sized satisfied group, also with a non-significant satisfaction trajectory. Variances are significant for the intercept values in both groups and also for the slope values in group two. They differ between the classes because the model allows class-specific variances. In this way, groups can be identified that are more heterogeneous or more homogeneous than others. LCGM offers another option, in which intercept and slope values are not allowed to vary individually within classes. In complete contrast to these models, finite growth mixture models (FGMM) offer a possibility where different

Fig. 5. Scatter plot of the two class model.


growth trajectories are hypothesized to be captured by class-varying random coefficients, including class-varying growth factor means, variance and covariance structure, and time-specific error variances (Li et al., 2001). This is beyond the scope of this article, given that increasing model complexity (i.e., multiple latent classes, many indicators including predictor and outcome variables) can add to execution time, convergence problems, and the likelihood of improper solutions (Li et al., 2001). Again, parts of the satisfaction starting levels described by the intercepts, and parts of the satisfaction changes described by the slopes, cannot be explained. Compared with all the other possible random models, the one presented above was the best with regard to the sample-size-adjusted BIC, with a value of 51,505.742. Duncan et al. (2002) recommend a comparison by the sample-size-adjusted BIC: “Model fit for a mixture analysis is performed by the log likelihood value. Using chi-square-based statistics (i.e., the log likelihood ratio), fit for nested models can be examined. It is, however, not appropriate to use such values for comparing models with different numbers of classes, given that this involves inadmissible parameter values of zero class probabilities. In these instances, Akaike Information Criterion (AIC) and Bayesian Information Criterion (BIC) can be used instead. Mplus also provides a sample-size-adjusted BIC (ABIC), which has been shown to give superior performance in a simulation study for latent class analysis models”.

5.3. Three class model

Because of the high variance of intercept and slope factors, a three class model is presented in Fig. 6. It has an ABIC value of 51,483.962 and an entropy value of 0.535. The entropy value decreases because, in this model, the intercept as well as the slope variances have to be fixed to zero for two of the three classes. Class three shows a significant intercept variance of 0.952 (t-value: 7.040) and a significant slope variance of 0.028 (t-value: 8.009). This unexplained variance points to a solution with more classes. Classification of individuals based on their most likely latent class membership yields the following proportions: class one (#868, 49.97%), class two (#684, 39.38%) and class three (#185, 10.65%). The average latent class probabilities for the most likely latent class membership (rows) by latent class (columns) are shown in Table 8 and look reliable. Table 9 shows the mean intercept and slope values for the three classes. The model shows significant intercept means for all classes and a significant slope mean in class one. A linear trend is assumed for all three classes, that is, the same trajectory shape. This restriction has to be made here because the main aim of the paper is to detect a generally strong decreasing trend across all cinema visitors. This trend is assumed to be greater than the individual changes of service quality levels in specific locations. If this is the case, only negative slope means will be detected for the three classes. Otherwise, the impact of other influences like changes in location-specific cinema quality indicators will be able


to overwhelm the hypothesized general negative satisfaction trajectory. Given that a significant positive slope mean is detected for class one, the sample contains respondents with increasing satisfaction values over time, and the hypothesis of a superior, generally decreasing satisfaction trend that overwhelms the other individual trends can be rejected. The correlation between intercepts and slopes is only available for class three, namely −0.122 (t-value: −6.268). In the other classes, there is not enough heterogeneity to model intercept or slope variance and therefore no covariance between them is available. If visitors in class three start at a high level, their satisfaction value is going to decrease over time and vice versa. This spread is captured in the model by the significant variances of the intercept, 0.952 (t-value: 7.040), and the slope, 0.028 (t-value: 8.009), as mentioned above. Additionally, class three, the most dispersed group of visitors, shows the lowest mean satisfaction level of all three groups. This group deserves the most attention. It includes visitors who are unsatisfied but experience an increasing trend as well as visitors who are satisfied but experience a decreasing trend. From a long-term perspective, all visitors of class three will level off around a low satisfaction level of 3.962 without intervention. This is certainly not the desired outcome. The main aim for cinema operators concerning class three is to channel satisfaction shifts in a positive direction. Class two represents something like a mixture of classes one and three. Its satisfaction level is closer to that of class one than to that of class three. Unfortunately, it has stopped at a level that should be improved, taking class one as the reference. Class one starts with a high satisfaction value of 5.381. Nevertheless, satisfaction in this class continues to increase and visitors become more satisfied over time. They end up with a satisfaction level of 5.501 (5.381 + 12 × 0.010) after 12 months. This level indicates customer delight and represents exactly what a cinema operator would like to see: satisfied visitors on the way to delight. The reasons for this high satisfaction level should stimulate cinema operators to take action in classes two and three.
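The two fit statistics quoted for the two and three class models, the sample-size-adjusted BIC and the entropy value, can be computed along the following lines. This is a hedged sketch using the commonly cited definitions (ABIC replaces n by (n + 2)/24 in the BIC penalty; entropy is rescaled to the interval from 0 to 1). The function names and the log-likelihood and parameter count in the usage lines are hypothetical, since the article does not report them.

```python
import math

def information_criteria(log_likelihood, n_params, n_obs):
    """Return AIC, BIC and sample-size-adjusted BIC for a fitted model."""
    deviance = -2.0 * log_likelihood
    return {
        "AIC": deviance + 2.0 * n_params,
        "BIC": deviance + n_params * math.log(n_obs),
        # ABIC: BIC with n replaced by (n + 2) / 24 in the penalty term.
        "ABIC": deviance + n_params * math.log((n_obs + 2.0) / 24.0),
    }

def relative_entropy(posteriors):
    """Rescaled entropy of posterior class probabilities.

    posteriors: list of rows, one per individual, each holding K class
    probabilities that sum to one. Values near 1 indicate clear
    classification, values near 0 indicate that classes are hard to separate.
    """
    n = len(posteriors)
    k = len(posteriors[0])
    total = 0.0
    for row in posteriors:
        for p in row:
            if p > 0.0:  # p * log(p) tends to 0 as p tends to 0
                total -= p * math.log(p)
    return 1.0 - total / (n * math.log(k))

# Hypothetical usage: log-likelihood and parameter count are made up.
criteria = information_criteria(log_likelihood=-25700.0, n_params=9, n_obs=1737)

# ABIC values reported in the text; lower values indicate better fit.
abic_two_class, abic_three_class = 51505.742, 51483.962
abic_difference = abic_two_class - abic_three_class
```

Lower information criteria are preferred, whereas higher entropy is preferred, so the two statistics can point in different directions when classes are added.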

Fig. 6. Scatter plot of the three class model.

Table 8 Latent class probabilities (rows: most likely latent class membership; columns: latent class).

Most likely class    Class 1    Class 2    Class 3
1                    0.824      0.163      0.014
2                    0.195      0.702      0.103
3                    0.006      0.102      0.892

Table 9 Intercept and slope means of the growth mixture model (t-values in parentheses).

                  Class 1            Class 2            Class 3
Intercept mean    5.381 (167.699)    4.880 (95.612)     3.962 (39.388)
Slope mean        0.010 (2.802)      −0.008 (−1.394)    −0.008 (−0.539)
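The class means in Table 9 imply a linear mean trajectory for each class, which can be traced with a short calculation. The sketch below follows the arithmetic used in the text for class one (intercept plus 12 times the slope). The slope means of classes two and three are taken as −0.008, which is an assumption about their sign, and the names `CLASS_MEANS` and `implied_mean` are hypothetical.

```python
# Intercept and slope means per class, as read from Table 9.
# The negative sign of the class two and class three slopes is an assumption.
CLASS_MEANS = {
    1: (5.381, 0.010),
    2: (4.880, -0.008),
    3: (3.962, -0.008),
}

def implied_mean(intercept, slope, months):
    """Model-implied mean satisfaction after the given number of months."""
    return intercept + slope * months

# Mean trajectory for months 0 to 12, rounded to three decimals.
trajectories = {
    cls: [round(implied_mean(intercept, slope, m), 3) for m in range(13)]
    for cls, (intercept, slope) in CLASS_MEANS.items()
}

for cls, path in trajectories.items():
    print(f"class {cls}: start {path[0]:.3f}, after 12 months {path[-1]:.3f}")
```

Only the class one slope is reported as significant, so the implied changes for classes two and three should be read as statistically indistinguishable from a flat trajectory.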


6. Discussion

The main aim of this paper is to gain deeper insight into satisfaction transitions. A comparison between the linear and the non-linear solution favors the linear one for the present case study. However, these results cannot be generalized to all satisfaction attributes or higher dimensions, nor to other customer segments, and even less to different service industries. In the long run these shifts will level off at the positive or negative satisfaction boundaries. A reason is given by Pollack (2008), who shows that dissatisfiers, after reaching an acceptable quality level, cannot increase satisfaction any further, so that subsequent unlimited service quality increases make no sense. Consequently, dissatisfiers have to show progressive satisfaction shifts. The asymmetric connection between attribute performance and overall satisfaction and other theoretically driven arguments (Vargo et al., 2007) support these statements. Stable shifts over time indicate that the actual satisfaction level must be somewhere between these two extremes. Methodological approaches for detecting excitement, performance and basic factors, like the dummy-regression technique or Vavra’s importance grid, do not show clear results for the cinema data set. This is not surprising, as the reason may lie in the heterogeneity of customer satisfaction. The less satisfied customer segments have more room for overall satisfaction increases than the highly satisfied ones. Segments with different levels as well as diverse transformations have to be considered simultaneously in future research, as done here. An example of the Kano-factor shift cannot be provided, but the present methodological approach results in a clear statement: Kano-factor identification techniques may lead to different segment- and time-specific results for the same item. In summary, cinema operators have to match new performance levels to keep up to date.

The underlying changing quality levels give a direction and benchmark for future standards. Excitement factors today are market entry barriers tomorrow and standards of living for the next generation. It will be necessary to identify different customer segments with different quality expectations and satisfaction levels by further describing them with socio-demographic aspects like age, sex, income, or other characteristics like the ownership of a cinema customer card. Psychographic aspects are relevant too, like a favorite genre, as are social aspects by which people are persuaded to visit a cinema. If a service provider succeeds in identifying different trends, the necessary steps can be taken in time, before changes become too powerful.

The problem of missing values is one of the most important issues to be mentioned here because it distorts results. Multilevel models (MLM) are better able to cope with missing values. Stoel et al. (2003) describe the differences and similarities between the two statistical methods in detail. Multilevel mixture models (MLMM) offer the possibility of estimating latent classes in the same estimation run as the growth parameters. Consequently, they also combine the two characteristics of GMM in one step. It is intended to apply these models to the present study.

References

Anderson, E.W., Mittal, V., 2000. Strengthening the satisfaction-profit chain. Journal of Service Research 3 (2), 107–120.
Asparouhov, T., Muthén, B.O., 2006. Multilevel mixture models, Version 3. In: Hancock, G.R., Samuelsen, K.M. (Eds.), 2007. Advances in Latent Variable Mixture Models, Part I: Multilevel and Longitudinal Systems.
Audrain-Pontevia, A.F., 2006. Kohonen self-organizing maps: a neural approach for studying the links between attributes and overall satisfaction in a services context. Journal of Consumer Satisfaction, Dissatisfaction and Complaining Behaviour 19, 128–137.
Celeux, G., Soromenho, G., 1996. An entropy criterion for assessing the number of clusters in a mixture model. Journal of Classification 13, 195–212.
Chen, T.-L., Lee, Y.-H., Hua, C., 2006. Kano two-dimensional quality model and importance-performance analysis in the student’s dormitory service quality evaluation in Taiwan. Journal of American Academy of Business, Cambridge 9 (2), 324–330.
Delucchi, K.L., Matzger, H., Weisner, C., 2004. Dependent and problem drinking over 5 years: a latent class growth analysis. Drug and Alcohol Dependence 74 (3), 235–244.
Duncan, T.E., Duncan, S.C., Strycker, L.A., Okut, H., Li, F., 2002. Growth Mixture Modelling of Adolescent Alcohol Use Data: Chapter Addendum to ‘An Introduction to Latent Variable Growth Curve Modelling: Concepts, Issues, and Applications’. Oregon Research Institute, Eugene, OR.
Enders, C.K., Bandalos, D.L., 2001. The relative performance of full information maximum likelihood estimation for missing data in structural equation models. Structural Equation Modeling 8 (3), 430–457.
Finn, A., 2005. Reassessing the foundations of customer delight. Journal of Service Research 8 (2), 103–116.
Fuchs, M., Weiermair, K., 2004. Destination benchmarking: an indicator-system’s potential for exploring guest satisfaction. Journal of Travel Research 42, 212–225.
Füller, J., Matzler, K., Faullant, R., 2006. Asymmetric effects in customer satisfaction. Annals of Tourism Research 33 (4), 1159–1163.
Fullerton, G., Taylor, S., 2002. Mediating, interactive, and non-linear effects in service quality and satisfaction with services research. Canadian Journal of Administrative Sciences 19 (2), 124–136.
Johnston, R., 1995. The determinants of service quality: satisfiers and dissatisfiers. International Journal of Service Industry Management 6 (5), 53–71.
Lee, Y.-H., Chen, T.-L., Hua, C., 2006. A Kano two-dimensional quality model in Taiwan’s hot spring hotels service quality evaluations. Journal of American Academy of Business, Cambridge 8 (2), 301–305.
Li, F., Duncan, T.E., Duncan, S.C., Acock, A., 2001. Latent growth modelling of longitudinal data: a finite growth mixture modelling approach. Structural Equation Modeling 8 (4), 493–530.
Martilla, J.A., James, J.C., 1977. Importance-performance analysis. Journal of Marketing 41 (1), 77–79.
Maslow, A.H., 1943. A theory of human motivation. Psychological Review 50 (4), 370–396.
Matzler, K., Hinterhuber, H.H., Bailom, F., Sauerwein, E., 1996. How to delight your customers. Journal of Product and Brand Management 5 (2), 6–18.
Matzler, K., Hinterhuber, H.H., 1998. How to make product development projects more successful by integrating Kano’s model of customer satisfaction into quality function deployment. Technovation 18 (1), 25–38.
Matzler, K., Bailom, F., Hinterhuber, H.H., Renzl, B., Pichler, J., 2004. The asymmetric relationship between attribute-level performance and overall customer satisfaction: a reconsideration of the importance-performance analysis. Industrial Marketing Management 33 (4), 271–277.
Matzler, K., Sauerwein, E., 2002. The factor structure of customer satisfaction: an empirical test of the importance grid and the penalty-reward-contrast analysis. International Journal of Service Industry Management 13 (4), 314–332.
Mazanec, J.A., 2006. Exploring tourist satisfaction with non-linear structural equation modelling and inferred causation analysis. Journal of Travel and Tourism Marketing 21 (4), 73–90.
Mittal, V., Ross Jr., W.T., Baldasare, P.M., 1998. The asymmetric impact of negative and positive attribute-level performance on overall satisfaction and repurchase intentions. Journal of Marketing 62 (1), 33–47.
Mittal, V., Kumar, P., Tsiros, M., 1999. Attribute-level performance, satisfaction, and behavioral intentions over time: a consumption-system approach. Journal of Marketing 63, 88–101.
Mittal, V., Katrichis, J.M., Kumar, P., 2001. Attribute performance and customer satisfaction over time: evidence from two field studies. Journal of Services Marketing 15 (5), 343–356.
Muthén, B., Kaplan, D., Hollis, M., 1987. On structural equation modelling with data that are not missing completely at random. Psychometrika 52 (3), 431–462.
O’Neill, M., 2003. The influence of time on student perceptions of service quality: the need for longitudinal measures. Journal of Educational Administration 41 (3), 310–324.
Palardy, G., Vermunt, J.K. Multilevel growth mixture models for classifying group-level observations. Journal of Educational and Behavioral Statistics, in press.
Pollack, B.L., 2008. The nature of the service quality and satisfaction relationship: empirical evidence for the existence of satisfiers and dissatisfiers. Managing Service Quality 18 (6), 537–558.
Preacher, K.J., Wichman, A.L., MacCallum, R.C., Briggs, N.E., 2008. Latent Growth Curve Modeling. Series: Quantitative Applications in the Social Sciences. Sage.
Reddy, R., Rhodes, J.E., Mulhall, P., 2003. The influence of teacher support on student adjustment in the middle school years: a latent growth curve study. Development and Psychopathology 15, 119–138.
Sampson, S.E., Showalter, M.J., 1999. The performance-importance response function: observations and implications. The Service Industries Journal 19 (3), 1–25.
Schaeffer, C.M., Petras, H., Ialongo, N., Poduska, J., Kellam, S., 2003. Modeling growth in boys’ aggressive behavior across elementary school: links to later criminal involvement, conduct disorder, and antisocial personality disorder. Developmental Psychology 39 (6), 1020–1035.


Steele, F., 2008. Multilevel models for longitudinal data. Journal of the Royal Statistical Society: Series A (Statistics in Society) 171 (Part 1), 5–19.
Stoel, R.D., Van den Wittenboer, G., Hox, J.J., 2003. Analyzing longitudinal data using multilevel regression and latent growth curve analysis. Metodologia de las Ciencias del Comportamiento 5, 21–42.
Torres, E.N., Kline, S., 2006. From satisfaction to delight: a model for the hotel industry. International Journal of Contemporary Hospitality Management 18 (4), 290–301.
Urban, D., 2004. Neue Methoden der Längsschnittanalyse, Zur Anwendung von latenten Wachstumskurvenmodellen in Einstellungs- und Sozialisationsforschung. Lit Verlag Münster.
Vargo, S.L., Nagao, K., He, Y., Morgan, F.W., 2007. Satisfiers, dissatisfiers, criticals, and neutrals: a review of their relative effects on customer (dis)satisfaction. Academy of Marketing Science Review 11 (2), 1–19.