Validation of a Subjective Outcome Evaluation Tool for Participants in a Positive Youth Development Program in Hong Kong

Original Study

Daniel T.L. Shek PhD, FHKPS, BBS, SBS, JP 1,2,3,4,*, Cecilia M.S. Ma PhD 1

1 Department of Applied Social Sciences, The Hong Kong Polytechnic University, Hong Kong, P.R. China
2 Centre for Innovative Programmes for Adolescents and Families, The Hong Kong Polytechnic University, Hong Kong, P.R. China
3 Department of Social Work, East China Normal University, Shanghai, P.R. China
4 Kiang Wu Nursing College of Macau, Macau, P.R. China

Abstract

Study Objective: Utilizing primary-factor and hierarchical confirmatory factor analyses, this study examined the factor structure of a subjective outcome evaluation tool for program participants in the Project P.A.T.H.S. in Hong Kong.

Design and Participants: A subjective outcome evaluation scale was used to assess the views of program participants on the program, the implementers, and the effectiveness of the Project P.A.T.H.S. A total of 28,431 Secondary 2 students responded to this measure after they had completed the program.

Results: Consistent with the conceptual model, findings based on confirmatory factor analyses supported both the primary-factor model and the higher-order factor model containing 3 primary factors. By randomly splitting the total sample into 2 subsamples, support for different forms of factorial invariance was found. There was also support for the internal consistency of the total scale and the 3 subscales.

Conclusion: Confirmatory factor analyses supported the factorial validity of the subjective outcome evaluation instrument designed for program participants in the Project P.A.T.H.S. in Hong Kong.

Key Words: Confirmatory factor analysis, Factorial invariance, Subjective outcome evaluation

Introduction

In the field of youth work, the development of innovative education programs is commonly carried out by youth workers and researchers. For example, a youth worker may design training programs that aim to promote resilience, optimism, and positive identity in young people, based on advances in positive psychology. Similarly, in the field of education, teachers design new courses in response to a changing socio-cultural environment, such as globalization, migration, and global financial crises. How far do such innovative education programs help young people? This is fundamentally a scientific evaluation question. Although different evaluation strategies can be employed (such as experimental evaluation), subjective outcome evaluation is commonly used.1 Subjective outcome evaluation adopts the client satisfaction approach.2 Generally speaking, program participants are invited to indicate whether they are satisfied with a program (such as its design, the teachers' skills and attitudes, and the logistic arrangements) and whether they perceive the program to be effective.3-5 Some examples from the field of higher education can illustrate the use of subjective outcome evaluation, where there are numerous subjective outcome evaluation tools to

The authors indicate no conflicts of interest.
* Address correspondence to: Professor Daniel T.L. Shek, PhD, FHKPS, BBS, SBS, JP, Associate Vice President (Undergraduate Programme) and Chair Professor of Applied Social Sciences, Department of Applied Social Sciences, Faculty of Health and Social Sciences, The Hong Kong Polytechnic University, Room HJ407, Core H, Hunghom, Hong Kong
E-mail address: [email protected] (D.T.L. Shek).

collect feedback from students on a subject. Cohen6 proposed that 6 dimensions of teaching (skills, rapport, structure, difficulty, interaction, and feedback) could be used to understand student ratings. In another popular measure, the Students' Evaluations of Educational Quality (SEEQ), Marsh and Roche7 proposed 9 aspects of educational quality: learning/value, teacher enthusiasm, organization/clarity, group interaction, individual rapport, breadth of coverage, examinations and grading, assignments and readings, and workload. In another study, Litzelman et al8 proposed 7 dimensions of teaching effectiveness: positive learning environment, control of the teaching session, communicating goals to the learners, promoting understanding and retention, evaluation of achievement of goals, feedback to the learners, and promotion of self-directed learning. In the Course Experience Questionnaire, Waugh9 identified 6 dimensions of course experience: student support, learning resources, learning community, intellectual motivation, course organization, and graduate qualities. Spooren et al10 developed a theory of teaching quality with 8 dimensions and 22 sub-dimensions; the major dimensions include course objectives, subject matter, course structure, teaching activities, course materials, course feasibility, coaching, and evaluation. Their confirmatory factor analyses showed 10 empirically supported dimensions. In short, subjective outcome evaluation is commonly used in different education settings, with the primary purpose of gauging student feedback.

Subjective outcome evaluation is also widely used in the medical field. Le et al11 used 6 items to assess the effectiveness of a distance asthma learning program for

1083-3188/$ - see front matter © 2014 North American Society for Pediatric and Adolescent Gynecology. Published by Elsevier Inc. http://dx.doi.org/10.1016/j.jpag.2014.02.011


D.T.L. Shek, C.M.S. Ma / J Pediatr Adolesc Gynecol 27 (2014) S43-S49

pediatricians. The items assessed (a) whether the education was useful, understandable, and interesting; (b) whether the format was easy to use and valuable; and (c) whether the participants learned a lot from the program. Bjerre et al12 evaluated the Measure of Processes of Care among Swedish participants; the measure was assumed to have 5 dimensions, including enabling and partnership, providing general information, providing specific information about the child, coordinated and comprehensive care, and respectful and supportive care. They concluded that the measure showed good sensitivity as an assessment tool and recommended it for future research and practical use. Wilkins et al13 examined the relationship between care experiences and parents' ratings of care in a Medicaid population, and showed that care experiences were related to parent ratings across different racial, ethnic, and language subgroups.

Several observations can be made about the subjective outcome evaluation tools reported in the literature. First, while dimensions were proposed for some of the measures, empirical support for these dimensions was weak. For example, in the study of Oermann et al,14 17 items on chest physiotherapy (eg, efficacy, convenience, comfort, overall satisfaction) and 4 general questions (eg, disease severity, importance of therapies, prescribed versus missed therapies) were used to assess patient satisfaction. Unfortunately, there was no empirical support for the presence of the dimensions. Second, although factor analyses were performed in some studies, their quality was not high.
For example, Seid et al15 used exploratory factor analysis to understand the Parent's Perceptions of Primary Care measure (P3C) and concluded that "with few exceptions, the items of the 6 factors are consistent with the a priori hypothesized P3C subscales" (p 267). Unfortunately, although a large sample was used in the study, the stability of the factors was not assessed across different subsamples. Similarly, Garratt et al16 used exploratory factor analysis to examine the factor structure of the Parent Experiences of Paediatric Questionnaire. While general support for the measure was found, there was some overlap between the "doctor services" and "organization/information-examinations and tests" factors. Furthermore, the stability of the extracted factors was not evaluated in the study. Third, few studies used confirmatory factor analysis to explore the different aspects of subjective outcome evaluation scales. In exploratory factor analysis (EFA), different models can be tested and researchers usually apply a set of criteria to select the "best" model, but the conclusions are usually tentative and support can be found for several competing models. In contrast, a priori models are commonly proposed and tested in confirmatory factor analysis (CFA), where multiple goodness-of-fit indicators can be used to assess model fit.17-19 Empirically, CFA constitutes a much more powerful tool for testing the dimensionality of subjective outcome evaluation scales. Researchers can perform CFA using commercially available software such as LISREL, Mplus, AMOS, and EQS; however, as such programs are usually not included in generic statistical packages (eg, SPSS), researchers may have to use additional funds to purchase the software.

Fourth, few studies have examined hierarchical factor structures of subjective outcome evaluation tools. In scales where different aspects of satisfaction are proposed, it is expected that they are subsumed under an umbrella of "global" satisfaction. In CFA, if the first-order factors are proposed to be correlated, it is logical to propose that the correlations among them are explained by higher-order factors. Testing a higher-order factor model enables practitioners and researchers to use both the total scale and the subscales to assess satisfaction with a program. Finally, a survey of the literature shows that tests of factorial invariance are almost nonexistent in the current literature on subjective outcome evaluation. Generally speaking, there are 3 types of factorial invariance: configural invariance, measurement invariance, and structural invariance.20-22 Configural invariance refers to whether the pattern of the instrument is equal across different populations, such as having the same number of factors in different groups. Measurement invariance addresses how the items measure the latent construct across groups, including whether the factor loadings are equal across groups (metric invariance), whether intercepts (item means) are equal across groups (scalar invariance), and whether item error variances and/or covariances are equal across groups (item uniqueness invariance). The third type, structural invariance, refers to the invariance of factor variances and covariances in different populations.
Based on a review of the invariance literature, Vandenberg and Lance23 concluded that the most frequently conducted tests of factorial invariance included configural invariance (equivalence of factorial pattern), metric invariance (invariance of factor loadings), structural invariance (equivalence of factor variances and covariances), and item uniqueness invariance (equal item error variances and covariances).

In this study, we examined the factorial validity of the Subjective Outcome Evaluation Scale for Students (SOES-S) in the Project P.A.T.H.S. in Hong Kong (ie, Form A in the Tier 1 Program). P.A.T.H.S. denotes Positive Adolescent Training through Holistic Social Programmes. The Project P.A.T.H.S. is a positive youth development program that attempts to promote the holistic development of junior secondary school students in Hong Kong.24,25 There are 2 tiers of programs in the project. In the Tier 1 Program, junior secondary school students from Grade 7 to Grade 9 joined a curriculum-based positive youth development program in which positive youth development constructs are incorporated. In the Tier 2 Program, positive youth development programs were developed for adolescents with greater psychosocial needs; school social workers worked with the schools to design positive youth development programs for the students concerned, utilizing positive youth development constructs.

In the present paper, the dimensionality of the SOES-S used in the Tier 1 Program of the Project P.A.T.H.S. was examined. Using CFA, factor models with primary factors and a higher-order factor were tested. In addition, factorial invariance was tested across 2 randomly formed subsamples. As the present study was not an experimental study, no control group


Fig. 1. Hypothesized hierarchical structure of the subjective outcome evaluation form.

was included. The use of client satisfaction data to understand the dimensionality of a subjective outcome evaluation measure is common in the field of client satisfaction.3-10

Methods

Participants

In the academic year 2010-2011, 236 secondary schools joined the Project P.A.T.H.S. in the fifth year of the Full Implementation Phase. The average number of students per school was 144.88, ranging from 4 to 240 students. The mean number of classes per school was 4.42, ranging from 1 to 12 classes. Details of the data can be found in another paper in this Supplement. In the present study, the data of Secondary 2 students (N = 28,431) were used.

Procedures

Participants completed a questionnaire in the classroom at the last session of the program. During the data collection, trained research assistants distributed the questionnaires and explained to the students the purposes of the study, the confidentiality of the data, voluntary participation, and the administration of the assessment. In general, participants took 5-10 minutes to complete this self-administered questionnaire. The institutional review board of the principal investigator's institution approved the study protocol prior to implementation of the research.

Measures

The SOES-S was divided into 3 parts as follows:

- 10 items assessing participants' perceptions of the program, including program objectives, design, classroom atmosphere, interaction among the students, and the overall assessment of the program.
- 10 items assessing participants' perceptions of the program implementers, including the helpfulness of the instructor.
- 16 items assessing participants' perceptions of the effectiveness of the program, such as the promotion of moral competence and emotional competence in the program participants.

Data Analytic Strategy

To examine the hierarchical structure of the SOES-S (see Fig. 1), multiple-group confirmatory factor analysis via Mplus version 7.11 was performed.26 As shown in Fig. 1, the 3 first-order factors of the SOES-S (ie, program content, program implementer, and program effectiveness) loaded on a second-order factor (ie, overall program effectiveness). Prior to the measurement invariance test, the whole sample was divided by case number into 2 subsamples: an odd group and an even group. The maximum likelihood estimator was used, as all observed variables were normally distributed and missing data were less than 5% (ie, n = 89).27 Following the suggestions of other researchers,20,28 a series of models was tested: (a) baseline models for the 2 subsamples (Model 1: odd group; Model 2: even group); (b) configural invariance (Model 3); (c) first-order factor loadings constrained to be equal (Model 4); (d) second-order factor loadings constrained to be equal (Model 5); (e) intercepts of observed variables constrained to be equal (Model 6); (f) intercepts of first-order factors constrained to be equal (Model 7); (g) disturbances of first-order factors constrained to be equal (Model 8); and (h) residuals of all observed variables constrained to be equal (Model 9). The above analyses were performed using the mean and covariance matrix.29-31

Model Evaluation

To evaluate the overall fit of the above models, a number of fit statistics were used, including the chi-square goodness-of-fit test (χ²), comparative fit index (CFI), Tucker-Lewis index (TLI), standardized root-mean-square residual (SRMR), and root-mean-square error of approximation (RMSEA). For CFI and TLI, values of .95 or greater indicate a satisfactory fit to the data.32,33 Values of SRMR and RMSEA less


Table 1
Descriptive Statistics and Internal Consistency by Groups

                                          Total (N = 28,462)     Odd (n = 14,111)      Even (n = 14,351)
Variables                                 Mean   SD    α         Mean   SD    α        Mean   SD    α
Program content (10 items)                4.33   .95   .95       4.33   .95   .95      4.33   .95   .95
Program implementer (10 items)            4.61   .98   .97       4.60   .99   .97      4.61   .98   .97
Program effectiveness (16 items)          3.45   .86   .98       3.45   .86   .98      3.45   .86   .98
Overall program effectiveness (36 items)  4.02   .79   .98       4.02   .80   .98      4.02   .79   .98
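The α values in Table 1 summarize internal consistency. For readers reproducing such figures, Cronbach's α can be computed from item-level responses as below; this is a sketch on simulated data (not the study's data), and `cronbach_alpha` is a helper name introduced here.

```python
import numpy as np

def cronbach_alpha(items):
    """Cronbach's alpha for an (n_respondents, n_items) array of item scores."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)          # per-item variances
    total_var = items.sum(axis=1).var(ddof=1)      # variance of the total score
    return k / (k - 1) * (1 - item_vars.sum() / total_var)

# Simulated data: 500 respondents, 10 items driven by one latent factor.
rng = np.random.default_rng(0)
latent = rng.normal(size=(500, 1))
items = latent + rng.normal(scale=0.5, size=(500, 10))

print(round(cronbach_alpha(items), 2))  # high alpha, in the range reported in Table 1
```

With strongly correlated items like these, α approaches the .95-.98 values reported for the SOES-S subscales; with uncorrelated items it falls toward 0.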

than .08 and .06, respectively, represent acceptable model-data fit.33 To compare the relative fit of 2 nested models, a nonsignificant chi-square difference test (Δχ²) and a change in CFI (ΔCFI) less than or equal to .01 indicate that the null hypothesis of invariance should not be rejected.34

Results
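The fit and invariance statistics reported in this section can be spot-checked by hand. The following Python sketch uses 2 sets of values from Table 3; `rmsea` and `chisq_diff` are helper names introduced here, not part of the paper's analyses.

```python
import math

def rmsea(chisq, df, n):
    """RMSEA point estimate: sqrt(max(chi2 - df, 0) / (df * (n - 1)))."""
    return math.sqrt(max(chisq - df, 0.0) / (df * (n - 1)))

def chisq_diff(chisq_nested, df_nested, chisq_base, df_base):
    """Chi-square difference and df difference for 2 nested models."""
    return chisq_nested - chisq_base, df_nested - df_base

# Model 3 (baseline) vs Model 4 (first-order loadings constrained), from Table 3
d_chi2, d_df = chisq_diff(73517.47, 1215, 73485.38, 1182)
print(round(d_chi2, 2), d_df)  # 32.09 33: below E[chi-square(33)] = 33, so nonsignificant

# Model 1 (odd group): chi-square(591) = 36391.24, n = 14,111
print(round(rmsea(36391.24, 591, 14111), 2))  # 0.07, matching Table 3
```

A Δχ² at or below its degrees of freedom (the expected value of a χ² variate) cannot be significant, which is why Models 4-8 in Table 3 all support invariance.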

Descriptive statistics and internal consistency of all variables for the odd and even groups are shown in Table 1. The interrelationships among the latent factors for the total sample are presented in Table 2. The hierarchical structure of the SOES-S yielded a good fit in both groups (Model 1: χ²(591) = 36391.24, P < .01; CFI = .94; TLI = .93; RMSEA = .07; SRMR = .03; Model 2: χ²(591) = 37094.13, P < .01; CFI = .94; TLI = .93; RMSEA = .07; SRMR = .03; Table 3). In both groups, all factor loadings were significant (z-scores > 1.96, P < .05) and above .63 (Table 4).

Given the satisfactory fit of the baseline models in both groups, multiple-group CFAs were conducted to test measurement invariance across groups. In Model 3, no equality constraints were imposed. As shown in Table 3, this model yielded a good fit (Model 3: χ²(1182) = 73485.38, P < .01; CFI = .94; TLI = .93; RMSEA = .07; SRMR = .03), suggesting that the number of factors and the pattern of factor loadings were invariant across groups. In other words, the configural invariance of the SOES-S is supported. To test metric invariance, the first- and second-order factor loadings were constrained in Model 4 and Model 5, respectively. These models fit the data well, as indicated by the nonsignificant changes in Δχ² and ΔCFI (Model 3 vs Model 4: Δχ²(33) = 32.09, P > .05, ΔCFI = .00; Model 3 vs Model 5: Δχ²(35) = 33.89, P > .05, ΔCFI = .00). We concluded that the factor loadings of the first- and second-order factors were invariant in both groups. As metric invariance was supported, the intercepts of the observed variables and of the first-order factors were constrained to be equal in Model 6 and Model 7, respectively. Compared to the baseline model (ie, Model 3), there was no significant decrease in model fit (Model 3 vs Model 6: Δχ²(68) = 51.76, P > .05, ΔCFI = .00; Model 3 vs Model 7: Δχ²(71) = 53.68, P > .05, ΔCFI = .00), suggesting that the item and factor intercepts were invariant across groups (ie, scalar invariance). Lastly, the disturbances of the first-order factors and the residual variances of the observed variables were constrained to be equal in Models 8 and 9, respectively. There was no significant change in Δχ² or ΔCFI (Model 3 vs Model 8: Δχ²(74) = 54.60, P > .05, ΔCFI = .00; Model 3 vs Model 9: Δχ²(110) = 125.91, P > .05, ΔCFI = .00), suggesting that residual variances and disturbances were equivalent in both groups. In summary, measurement invariance of the SOES-S was supported across groups: the results suggest an invariant factor structure of the SOES-S in both groups.

Table 2
Correlation Coefficients among Factors (N = 28,462)

Factor                      1     2     3
1. Program content          -
2. Program implementer      .76   -
3. Program effectiveness    .60   .53   -

All parameters were significant (P < .05).

Discussion

The primary objective of this study was to examine the factor structure of the SOES-S for the Tier 1 Program of the Project P.A.T.H.S. in Hong Kong. Using CFA, the present findings strongly suggested that the SOES-S possesses good factorial validity, and the related measures showed high internal consistency. Basically, the findings showed that 3 factors could be extracted from the scale and that these 3 factors could be subsumed under a higher-order factor reflecting global satisfaction. These findings are generally consistent with those reported by Shek and Ma35 based on data collected in the early phase of the project. With the dimensions of the scale established, researchers and practitioners can use this scale to examine both global and specific satisfaction with programs similar to the Tier 1 Program of the Project P.A.T.H.S. in Hong Kong. Indeed, the SOES-S has been used as a subjective outcome evaluation tool in some other programs, and the present finding is an important contribution to the literature.36,37

One may ask why it is necessary to provide support for the factorial validity and internal consistency of the subscales of the SOES-S of the Tier 1 Program. Fundamentally, it can be argued that with the growing emphasis on accountability of human services professionals,38,39 there is a strong need to develop objective measures of subjective outcome evaluation. The development of standardized subjective outcome evaluation tools can enable practitioners and researchers to measure client satisfaction objectively so that program improvement and refinement are possible. Second, many studies show that patient satisfaction with an intervention affects health outcomes. As pointed out by Garratt et al,16 "the measurement of patient experiences and satisfaction with health care is now recognized as an important component in the evaluation of health care


Table 3
Summary of Goodness of Fit for All CFA Models

Model  Description                                                       χ²         df    RMSEA (90% CI)  CFI  TLI  SRMR  Δχ²                    Δdf  ΔCFI
1      Odd group (second-order factor)                                   36391.24*  591   .07 (.07-.07)   .94  .93  .03   -                      -    -
2      Even group (second-order factor)                                  37094.13*  591   .07 (.07-.07)   .94  .93  .03   -                      -    -
3      Configural invariance (baseline model)                            73485.38*  1182  .07 (.08-.09)   .94  .93  .03   -                      -    -
4      First-order factor loading invariance                             73517.47*  1215  .07 (.06-.07)   .94  .93  .03   32.09 (Model 3 vs 4)   33   .00
5      Second-order factor loading invariance                            73519.27*  1217  .07 (.06-.07)   .94  .93  .03   33.89 (Model 3 vs 5)   35   .00
6      Intercept invariance of measured variables                        73537.14*  1250  .06 (.06-.06)   .94  .94  .03   51.76 (Model 3 vs 6)   68   .00
7      Intercept invariance of first-order factors                       73539.06*  1253  .06 (.06-.06)   .94  .94  .03   53.68 (Model 3 vs 7)   71   .00
8      Uniqueness invariance of first-order factors (disturbances)       73539.98*  1256  .06 (.06-.06)   .94  .94  .03   54.60 (Model 3 vs 8)   74   .00
9      Uniqueness invariance of measured variables (residual variances)  73611.29*  1292  .06 (.06-.06)   .94  .94  .03   125.91 (Model 3 vs 9)  110  .00

CFA, confirmatory factor analysis; CFI, comparative fit index; CI, confidence interval; RMSEA, root mean square error of approximation; SRMR, standardized root mean square residual; TLI, Tucker-Lewis index; Δχ², change in goodness-of-fit χ² relative to Model 3 (baseline); Δdf, change in degrees of freedom relative to Model 3; ΔCFI, change in CFI relative to Model 3. n_odd = 14,111; n_even = 14,351.
In Model 3, no equality constraints were imposed; in Model 4, equality constraints were imposed on all first-order factor loadings; in Model 5, on all first- and second-order factor loadings; in Model 6, additionally on the intercepts of the measured variables; in Model 7, additionally on the intercepts of the first-order latent factors; in Model 8, additionally on the disturbances of all first-order factors; in Model 9, additionally on the residual variances of all measured variables.
* P < .01.

interventions and for assessing service quality … greater satisfaction with health services results in better treatment adherence which leads to better health outcomes" (p 246). As echoed by Seid et al,15 "to improve the quality of pediatric primary care, a reliable and valid measure must exist" (p 264).

There are several implications of the present findings. First, the findings suggest that the SOES-S is a measure with excellent factorial validity and internal consistency. This is important because psychometrically sound subjective outcome evaluation tools are sparse in the youth work context. For example, Garratt et al16 pointed out that in the field of neonatal and paediatric services, most satisfaction tools have "insufficient evidence for reliability and validity" (p 246). In addition, establishing the factorial validity of the SOES-S is important because there are few validated subjective outcome evaluation tools in the Chinese culture. Furthermore, "as adolescents who are direct recipients of health services are seldom included in patient satisfaction surveys"40 (p 1396), the validation of the SOES-S is important because it is a tool designed for early adolescents. The tool can be used to evaluate youth programs, such as health education and training programs, in the Chinese context.

The second implication is that the present study underscores the importance of using CFA to understand the dimensionality of subjective outcome evaluation tools. As mentioned above, while EFA is commonly employed in the field of subjective outcome evaluation, comparatively few studies use CFA. CFA has several advantages. First, conceptual models are tested in CFA: instead of "exploring" different possible factors in a scale, as in exploratory factor analysis, specific models can be tested. Second, many goodness-of-fit indicators are available in CFA, permitting acceptance or rejection of a model. Third, because of these characteristics, CFA enables researchers to test and compare different models.

The third implication is that the present study illustrates the importance of understanding higher-order factors in CFA. Researchers suggest that there are rewards in both proposing and testing factor models with higher-order factors.19,41 For example, by accounting for the intercorrelations among the lower-order factors, a hierarchical factor model offers a more parsimonious view of the inter-relationships among the different aspects of the construct under examination. It is also an elegant tool for examining the "whole-part" relationship between the whole scale and its subscales (ie, the unity versus diversity issue). From a practice point of view, support for higher-order factor models implies that it is conceptually justified to use both the whole scale and the subscales to understand client satisfaction.

Finally, the present study illustrates the importance of examining factorial invariance. According to Wu et al,42 "an observed score is said to be measurement invariant if a person's probability of an observed score does not depend on his/her group membership, conditional on the true score. That is, respondents from different groups, but with the same true score, will have the same observed score … More formally, given a person's true score, knowing a person's group membership does not alter the person's probability of getting a specific observed score" (p 2). In the present study, we tested different types of factorial invariance: configural invariance, first-order factor loading invariance, second-order factor loading invariance, invariance of the intercepts of observed variables, invariance of the intercepts of first-order factors, invariance of the disturbances of first-order factors, and invariance of the residuals of all observed variables. Using these different indicators of factorial invariance, the present findings showed that the factors are very stable across the odd and even groups. Assessment of factorial invariance is important because client satisfaction measures are commonly used in different populations. For example, in the context of positive youth development, the developed tool may be implemented in different grades (eg, junior secondary versus senior secondary school students) and with students with different needs (students in mainstream schools versus special schools).

There are several unique features of the study. First, as there are few related studies examining subjective outcome


Table 4
Completely Standardized Factor Loadings, Uniqueness, and Squared Multiple Correlations for the Models

Second-order factors:

                            Model 1 (odd)        Model 2 (even)
Factor                      FL*   SMC   D        FL*   SMC   D
Program content (PC)        .95   .90   .10      .95   .90   .10
Program implementer (PI)    .83   .70   .30      .83   .68   .32
Program effectiveness (EF)  .65   .73   .57      .66   .43   .57

First-order factors:

        Model 1 (odd)        Model 2 (even)
Item    FL*   SMC   U        FL*   SMC   U
PC 1    .79   .63   .37      .80   .64   .36
PC 2    .86   .73   .27      .86   .74   .26
PC 3    .85   .72   .28      .85   .72   .28
PC 4    .79   .63   .37      .79   .62   .38
PC 5    .74   .54   .46      .74   .54   .46
PC 6    .76   .58   .42      .76   .58   .42
PC 7    .80   .65   .35      .81   .65   .35
PC 8    .85   .71   .29      .85   .72   .28
PC 9    .87   .75   .25      .87   .75   .25
PC 10   .86   .74   .26      .86   .73   .27
PI 1    .87   .76   .24      .87   .76   .24
PI 2    .89   .80   .20      .89   .80   .20
PI 3    .90   .81   .19      .90   .81   .19
PI 4    .90   .83   .17      .91   .82   .18
PI 5    .89   .79   .21      .89   .78   .21
PI 6    .87   .76   .24      .87   .76   .24
PI 7    .87   .76   .24      .87   .76   .24
PI 8    .88   .77   .23      .88   .77   .23
PI 9    .85   .73   .27      .85   .72   .28
PI 10   .88   .78   .22      .89   .78   .22
EF 1    .82   .67   .33      .82   .67   .33
EF 2    .86   .74   .26      .85   .73   .27
EF 3    .86   .74   .27      .86   .73   .27
EF 4    .86   .74   .27      .86   .73   .27
EF 5    .84   .71   .29      .84   .71   .29
EF 6    .83   .70   .30      .83   .68   .32
EF 7    .85   .72   .28      .84   .71   .29
EF 8    .85   .73   .27      .85   .72   .28
EF 9    .82   .67   .33      .81   .66   .34
EF 10   .84   .71   .29      .84   .71   .29
EF 11   .84   .71   .29      .84   .71   .29
EF 12   .86   .73   .27      .86   .73   .27
EF 13   .84   .71   .29      .84   .71   .29
EF 14   .83   .70   .30      .84   .70   .30
EF 15   .84   .70   .30      .84   .71   .30
EF 16   .84   .71   .29      .84   .70   .30

D, disturbance; EF, program effectiveness; FL, completely standardized factor loading; PC, program content; PI, program implementer; SMC, squared multiple correlation; U, uniqueness. n_odd = 14,111; n_even = 14,351. All parameters were significant (P < .05).
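Under the higher-order model, the correlation between any 2 first-order factors should approximately equal the product of their standardized second-order loadings. The quick Python check below is a sketch using the rounded loadings reported for the odd group in Table 4; small gaps from the Table 2 values reflect rounding and estimation.

```python
# Second-order standardized loadings (odd group, rounded values from Table 4)
loadings = {"PC": 0.95, "PI": 0.83, "EF": 0.65}

# Inter-factor correlations reported in Table 2
observed = {("PC", "PI"): 0.76, ("PC", "EF"): 0.60, ("PI", "EF"): 0.53}

for (a, b), r in observed.items():
    implied = loadings[a] * loadings[b]
    # e.g. PI-EF: 0.83 * 0.65 = 0.54 vs the reported 0.53
    print(f"{a}-{b}: implied {implied:.2f}, reported {r:.2f}")
```

All 3 implied correlations land within about .03 of the reported values, which is consistent with the good fit of the hierarchical model.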

evaluation tools in different Chinese contexts,24,25 this is a pioneering attempt. With the call for greater accountability in human services in different Chinese contexts, the support for the psychometric properties of the SOES-S can help researchers and practitioners assess client satisfaction in a more objective manner. Second, a large sample of participants (N = 28,431) was employed; this is a distinctive feature because studies in this area usually do not have such a large sample size. Third, to assess the stability of the factor analyses conducted, 2 random subsamples were created to examine the stability of the factor structure.

Despite the above-mentioned strengths, several limitations of the present study must be noted. First, although this study generated robust findings, one should be cautious about the generalizability of the findings to different Chinese communities; researchers should examine the different types of factorial invariance in different populations. Second, only factorial validity was examined in this study, so other types of validity of the SOES-S should be explored. For example, it would be illuminating to see how assessment based on the SOES-S relates to evaluation findings based on other methods, such as qualitative interviews and focus group discussions. Third, as there are only 3 aspects of client satisfaction in this scale, it would be theoretically interesting to include other aspects of satisfaction, such as arrangements, assignments, and the extent of reflective learning, in the evaluation.

As far as future research directions are concerned, there are 3 suggestions. First, it would be worthwhile to see whether similar dimensions of the scale could be found for the SOES-S in other Chinese adolescent samples, such as senior high school students in mainland China, Taiwan, and Macau (ie, generalizability across different populations). Second, it would be helpful to examine the psychometric properties of the SOES-S in different adolescent programs in both pediatric and allied professional settings (ie, generalizability across different programs). Finally, it would be illuminating to examine the cross-cultural applicability of the measure in

D.T.L. Shek, C.M.S. Ma / J Pediatr Adolesc Gynecol 27 (2014) S43-S49

non-Western contexts (ie, generalizability across different cultures).

Acknowledgments

This paper and the Project P.A.T.H.S. are financially supported by The Hong Kong Jockey Club Charities Trust. The authorship is equally shared by the first and second authors.

References

1. Ginsberg LH: Social Work Evaluation: Principles and Methods. Boston, Allyn and Bacon, 2001, pp 32-44
2. Weinbach RW: Evaluating Social Work Services and Programs. Boston, Allyn and Bacon, 2005, pp 172-195
3. Winefield HR, Barlow JA: Client and worker satisfaction in a child protection agency. Child Abuse Negl 1995; 19:897
4. Peterson A, Esbensen FA: The outlook is G.R.E.A.T. What educators say about school-based prevention and the Gang Resistance Education and Training (G.R.E.A.T.) program. Eval Rev 2004; 28:218
5. Najavits LM, Ghinassi F, van Horn A, et al: Therapist satisfaction with four manual-based treatments on a national multisite trial: an exploratory study. Psychother Theor Res Pract Train 2004; 41:26
6. Cohen PA: Student ratings of instruction and student achievement: a meta-analysis of multi-section validity studies. Rev Educ Res 1981; 51:281
7. Marsh HW, Roche LA: Making students' evaluations of teaching effectiveness effective: the critical issues of validity, bias, and utility. Am Psychol 1997; 52:1187
8. Litzelman DK, Stratos GA, Marriott DJ, et al: Factorial validation of a widely disseminated educational framework for evaluating clinical teachers. Acad Med 1998; 73:688
9. Waugh RF: The Course Experience Questionnaire: a Rasch measurement model analysis. High Educ Res Dev 1998; 17:45
10. Spooren P, Mortelmans D, Denekens J: Student evaluation of teaching quality in higher education: development of an instrument based on 10 Likert-scales. Assess Eval High Educ 2007; 32:667
11. Le TT, Rait MA, Jarlsberg LG, et al: A randomized controlled trial to evaluate the effectiveness of a distance asthma learning program for pediatricians. J Asthma 2010; 47:245
12. Bjerre IA, Larsson M, Franzon AM, et al: Measure of Processes of Care (MPOC) applied to measure parent's perception of the habilitation process in Sweden. Child Care Health Dev 2004; 30:123
13. Wilkins V, Elliott MN, Richardson A, et al: The association between care experiences and parent ratings of care for different racial, ethnic, and language groups in a Medicaid population. Health Serv Res 2011; 46:821
14. Oermann CM, Swank PR, Sockrider MA: Validation of an instrument measuring patient satisfaction with chest physiotherapy techniques in cystic fibrosis. Chest 2000; 118:92
15. Seid M, Varni JW, Bermudez LO, et al: Parents' perceptions of primary care: measuring parents' experiences of pediatric primary care quality. Pediatrics 2001; 108:264
16. Garratt AM, Bjertnæs OA, Barlinn J: Parent experiences of paediatric care (PEPC) questionnaire: reliability and validity following a national survey. Acta Paediatr 2007; 96:246
17. Byrne BM: Structural Equation Modeling with LISREL, PRELIS, and SIMPLIS: Basic Concepts, Applications, and Programming. Mahwah, NJ, Lawrence Erlbaum Associates Inc, 1998, pp 3-42


18. Quintana SM, Maxwell SE: Implications of recent developments in structural equations modeling for counseling psychology. Counseling Psychol 1999; 27:485
19. Brown TA: Confirmatory Factor Analysis for Applied Research. New York, Guilford, 2006, pp 113-131
20. Chen FF, Sousa KH, West SG: Testing measurement invariance of second-order factor models. Struct Equ Modeling 2005; 12:471
21. Meredith W: Measurement invariance, factor analysis and factorial invariance. Psychometrika 1993; 58:525
22. Gregorich SE: Do self-report instruments allow meaningful comparisons across diverse population groups? Testing measurement invariance using the confirmatory factor analysis framework. Med Care 2006; 44:S78
23. Vandenberg RJ, Lance CE: A review and synthesis of the measurement invariance literature: suggestions, practices, and recommendations for organizational research. Organ Res Meth 2000; 3:4
24. Shek DT, Sun RC: The Project P.A.T.H.S. in Hong Kong: development, training, implementation, and evaluation. J Pediatr Adolesc Gynecol 2013; 26(3 Suppl):S2
25. Shek DT, Sun RC, editors: Development and Evaluation of Positive Adolescent Training through Holistic Social Programs (P.A.T.H.S.). Berlin, Springer, 2013
26. Muthén LK, Muthén BO: Mplus User's Guide, 5th ed. Los Angeles, Muthén & Muthén, 2013
27. Roth P: Missing data: a conceptual review for applied psychologists. Pers Psychol 1994; 47:537
28. Byrne BM, Stewart SM: The MACS approach to testing for multigroup invariance of a second-order structure: a walk through the process. Struct Equ Modeling 2006; 13:287
29. Chou C, Bentler PM: Estimates and tests in structural equation modeling. In: Hoyle RH, editor. Structural Equation Modeling: Concepts, Issues, and Applications. Thousand Oaks, CA, Sage, 1995, pp 37-54
30. Curran PJ, West SG, Finch JF: The robustness of test statistics to nonnormality and specification error in confirmatory factor analysis. Psychol Methods 1996; 1:16
31. Finney SJ, DiStefano C: Non-normal and categorical data in structural equation modeling. In: Hancock GR, Mueller RO, editors. Structural Equation Modeling: A Second Course. Greenwich, CT, Information Age Publishing, 2006, pp 269-312
32. Bentler PM: Comparative fit indexes in structural models. Psychol Bull 1990; 107:238
33. Hu L, Bentler PM: Cutoff criteria for fit indexes in covariance structure analysis: conventional criteria versus new alternatives. Struct Equ Modeling 1999; 6:1
34. Cheung GW, Rensvold RB: Evaluating goodness-of-fit indexes for testing measurement invariance. Struct Equ Modeling 2002; 9:233
35. Shek DT, Ma CMS: The use of confirmatory factor analyses in adolescent research: Project P.A.T.H.S. in Hong Kong. Int J Disabil Hum Dev (in press)
36. Shek DT, Sun RC: Post-course subjective outcome evaluation of a course promoting leadership and intrapersonal development in university students in Hong Kong. Int J Disabil Hum Dev 2013; 12:193
37. Shek DT, Sun RC: Participants' evaluation of the Project P.A.T.H.S.: are findings based on different datasets consistent? ScientificWorldJournal 2012; 2012:187450
38. Gambrill E: Evidence-based practice: an alternative to authority-based practice. Fam Soc 1999; 80:341
39. Sackett DL, Richardson WS, Rosenberg W, et al: Evidence-based Medicine: How to Practice and Teach EBM. New York, Churchill Livingstone, 1997, pp 205-220
40. Mah JK, Tough S, Fung T, et al: Parents' global rating of mental health correlates with SF-36 scores and health services satisfaction. Qual Life Res 2006; 15:1395
41. Gustafsson J, Balke G: General and specific abilities as predictors of school achievement. Multivar Behav Res 1993; 28:407
42. Wu AD, Li Z, Zumbo BD: Decoding the meaning of factorial invariance and updating the practice of multi-group confirmatory factor analysis: a demonstration with TIMSS data. Practical Assess Res Eval 2007; 12:1