
Author’s Accepted Manuscript
To appear in: Journal of Contextual Behavioral Science
PII: S2212-1447(17)30156-4
DOI: https://doi.org/10.1016/j.jcbs.2017.12.001
Reference: JCBS211
Received: 26 November 2016; Revised: 13 November 2017; Accepted: 28 December 2017
Cite this article as: Tyler L. Renshaw, Probing the Relative Psychometric Validity of Three Measures of Psychological Inflexibility, Journal of Contextual Behavioral Science, https://doi.org/10.1016/j.jcbs.2017.12.001


Probing the Relative Psychometric Validity of Three Measures of Psychological Inflexibility

Tyler L. Renshaw*
Louisiana State University

Author Note: Tyler L. Renshaw, Department of Psychology, Louisiana State University. *Correspondence to Tyler Renshaw, 236 Audubon Hall, Department of Psychology, Louisiana State University, Baton Rouge, LA 70803. Email: [email protected]

Abstract

The present study probed the relative structural and concurrent validity of responses to three self-report measures of psychological inflexibility with a large sample of college students (N = 797): the revised version of the Acceptance and Action Questionnaire (AAQ-II), the shorter version of the Avoidance and Fusion Questionnaire for Youth (AFQ-Y8), and the longer version of the Avoidance and Fusion Questionnaire for Youth (AFQ-Y17). Structural validity findings showed that responses to the AAQ-II and AFQ-Y8 indicated good data–model fit and latent construct reliability, whereas the data–model fit for responses to the AFQ-Y17 was poor, despite strong latent construct reliability. Concurrent validity findings demonstrated that scores derived from all three measures of psychological inflexibility had comparable correlations with several concurrent indicators of negative mental health (i.e., depression, anxiety, global negative affect), positive mental health (i.e., happiness, hope, global positive affect), and theoretically-similar therapeutic processes (i.e., mindfulness skills). Yet findings from hierarchical regressions evidenced some incremental validity when scores from the AAQ-II, AFQ-Y8, and AFQ-Y17 were taken together to predict concurrent mental health outcomes—suggesting potential differential
construct representation among these three measures. Limitations of the present study and future directions for research and practice are discussed.

Keywords: psychological inflexibility, self-report measurement, psychometric validity


Probing the Relative Psychometric Validity of Three Measures of Psychological Inflexibility

Psychological flexibility is the core construct of the transdiagnostic theory of mental health that underpins Acceptance and Commitment Therapy (ACT; Hayes, Strosahl, & Wilson, 2011). According to this theory, psychologically flexible behavior supports and maintains human wellbeing, whereas psychologically inflexible behavior facilitates the development and maintenance of mental health problems. Given that psychological flexibility refers to “changing or persisting in behavior to serve chosen values” (Biglan & Hayes, 2016, p. 56), psychological inflexibility is posited to contribute to mental health problems “when language and cognition interact with direct contingencies to produce an inability to persist or change behavior in the service of long-term valued ends” (Hayes, Luoma, Bond, Masuda, & Lillis, 2006, p. 6). Guided by this transdiagnostic theory, the first step within ACT is to identify inflexible behavior that interferes with wellbeing. After accomplishing this, the second step in ACT is to train skills and manipulate environmental events that support flexible behavioral repertoires that promote wellbeing (Hayes, Strosahl, & Wilson, 2011; Renshaw, Bolognino, Roberson, Upton, & Hammons, 2017). Meta-analyses and systematic reviews of treatments targeting psychological inflexibility suggest that these approaches are generally effective for improving a variety of mental health concerns. Specifically, syntheses of ACT studies indicate small-to-large treatment effects in comparison to both waitlist and treatment-as-usual controls (Hayes et al., 2006; Öst, 2008; Öst, 2014; Powers, Vörding, & Emmelkamp, 2009; Ruiz, 2012) and show that changes in psychological flexibility mediate changes in outcomes (Hayes et al., 2006). Findings from other contextually-oriented behavior therapies, which target psychological flexibility using interventions similar in nature to those used in ACT (Hayes, Villatte, Levin, & Hildebrandt,
2011), show comparatively positive treatment effects across a range of mental health problems and wellbeing behaviors (e.g., Khoury et al., 2013; Klingbeil et al., 2017). Furthermore, a recent meta-analysis of laboratory-based component studies aligned with the lower-order behavioral processes that comprise psychological flexibility indicates small-to-moderate therapeutic effects for each component process across a variety of mental health outcomes, with larger effects observed for interventions targeting values-based processes in combination with one or more other lower-order processes (Levin, Hildebrandt, Lillis, & Hayes, 2012). This line of treatment research is corroborated by indirect evidence drawn from cross-sectional studies investigating the relationships among scores derived from psychological flexibility measures and other measures of mental health processes and problems. Findings from these basic science studies indicate that psychological flexibility is a substantive correlate of psychopathology (Kashdan & Rottenberg, 2010), and that, compared to other cognitive-behavioral therapeutic processes (e.g., cognitive reappraisal and emotional regulation), it has incremental validity for predicting mental health outcomes (e.g., Kashdan, Barrios, Forsyth, & Steger, 2006; Gloster, Klotsche, Chaker, Hummel, & Hoyer, 2011).

Measures of Psychological Inflexibility

Within the context of ACT, psychological inflexibility is commonly assessed via interviews and experiential exercises (Hayes, Strosahl, & Wilson, 2011). Yet researchers have also undertaken to develop and validate self-report behavior rating scales that can function as formal measures of psychological inflexibility for both scientific and practical purposes. The first and most well-researched self-report measure of psychological flexibility is the Acceptance and Action Questionnaire (AAQ; Hayes et al., 2004). The original development study of the AAQ produced a 9-item measure that yielded responses with adequate internal consistency reliability
and good data–model fit to a unidimensional measurement model. Concurrent validity analyses conducted with several clinical and non-clinical samples showed that scores derived from the AAQ had generally small positive correlations with other measures of therapeutic processes (e.g., thought suppression and behavioral avoidance) as well as small-to-moderate positive associations with various measures of mental health problems (i.e., anxiety, depression, trauma, and general symptomology; Hayes et al., 2004). Following its development, the AAQ was used as the primary measure of psychological inflexibility in a series of basic science studies. A meta-analysis of 27 studies employing the AAQ showed that responses to the measure yielded small-to-moderate correlations with mental health problems and therapeutic processes as well as with adaptive functioning and wellbeing behavior (Hayes et al., 2006). Despite this promising evidence, applied researchers have expressed concern with the complexity of the measure’s item wording, and some studies have indicated that responses to the measure have low internal consistency reliability—ultimately warranting a need to revise the measure (Bond et al., 2011). The development study for the revised version of the AAQ, which is referred to as the AAQ-II, produced a 7-item measure that had completely different items from the original AAQ (Bond et al., 2011). Results from this study showed that responses to the AAQ-II yielded strong internal consistency reliability and good data–model fit with a unidimensional measurement model. Using both clinical and non-clinical samples, findings also indicated that scores from the AAQ-II had strong positive correlations with scores from the original version of the measure, and that they showed moderate positive correlations with responses to concurrent measures of other mental health problems, including anxiety, depression, thought suppression, and general symptomology (Bond et al., 2011). Subsequent studies provided further evidence in favor of the structural, concurrent, and incremental validity of responses to the AAQ-II, with both clinical
and non-clinical samples (e.g., Fledderus et al., 2012) as well as with cross-cultural samples and special populations (e.g., Pennato, Berrocal, Bernini, & Rivas, 2013; Ruiz, Herrera, Luciano, Cangas, & Beltrán, 2013; Zhang, Chung, Si, & Liu, 2014). So far, however, the AAQ-II has only been validated with adults, as the readability of the items is still considered too advanced for use with older children and adolescents (see Fergus et al., 2012). Given the readability limitation of the AAQ and AAQ-II, the Avoidance and Fusion Questionnaire for Youth (AFQ-Y) was explicitly developed for the purpose of measuring psychological flexibility via self-report among youth (Greco, Lambert, & Baer, 2008). The original development study of the AFQ-Y produced both 17-item and 8-item versions of the measure—the longer version (AFQ-Y17) being intended for use in clinical practice and the shorter version (AFQ-Y8) being intended for use as a population-based screening instrument (Greco et al., 2008). Results from this development study indicated that scores derived from both versions of the measure had strong internal consistency and at least adequate data–model fit to a unidimensional measurement model. Concurrent validity analyses with two large community samples of youth indicated that scores derived from both versions of the measure had large positive correlations with scores from measures of overall mental health problems as well as self-reported anxiety, moderate-to-large positive relationships with scores from a self-reported measure of thought suppression, moderate positive associations with scores from a measure of self-reported somatization, and negligible-to-small positive relationships with scores derived from teacher-reported measures of behavior problems (Greco et al., 2008). Moreover, divergent validity analyses with these same samples indicated scores from both versions of the measure had moderate-to-large negative correlations with scores of self-reported mindfulness, small-to-moderate negative associations with scores of self-reported quality-of-life, and negligible-to-small
negative correlations with teacher-reported social skills and academic competence (Greco et al., 2008). Since Greco et al.’s (2008) original development study, the technical adequacy of the AFQ-Y17 has been confirmed with a general sample of college students (Fergus et al., 2012), an inpatient sample of adults with anxiety disorders (Fergus et al., 2012), an inpatient sample of youth (Venta, Sharp, & Hart, 2012), and a general sample of high school students (Renshaw, 2017). Yet the technical adequacy of the AFQ-Y8 has only been confirmed with a general sample of high school students (Renshaw, 2017). Findings from these studies have indicated that responses to both the longer and shorter versions of the measure yielded at least adequate data–model fit with a unidimensional measurement model and were characterized by strong internal consistency reliability. Concurrent validity analyses conducted with these samples further indicated positive associations between scores derived from both versions of the AFQ-Y and scores from self-reported measures of internalizing problems and externalizing problems (Renshaw, 2017; Venta et al., 2012). Results from the adult inpatient sample also demonstrated that scores from the AFQ-Y17 shared approximately 50% of their variance with scores from the AAQ-II (Fergus et al., 2012). Thus, unlike the AAQ-II, the structural and concurrent validity of responses to the AFQ-Y have been demonstrated for both clinical and non-clinical samples of adults and youth. That said, it is noteworthy that only one previous study has directly investigated the relative validity of responses to both the AAQ-II and the AFQ-Y17 with the same sample (Fergus et al., 2012), and that the only aspect of validity probed for both measures in that study was concurrent validity.

Purpose of the Present Study

The purpose of the present study was to advance the scientific basis of self-report
measurement of psychological inflexibility by probing the relative psychometric validity of responses to all three instruments—the AAQ-II, AFQ-Y8, and AFQ-Y17—with a general sample of young adults. Within the framework of test validation outlined in the Standards for Educational and Psychological Testing (Joint Committee of the American Educational Research Association [AERA], American Psychological Association [APA], & National Council on Measurement in Education [NCME], 2014), the notion of validity is defined as “the degree to which evidence and theory support the interpretations of test scores for proposed test uses” (p. 11). The Standards outline several sources of potential validity evidence, but the present study takes up just two types: “evidence based on internal structure” and “evidence based on relations to other variables.” The first type of evidence, referred to hereafter as structural validity, is concerned with the degree to which responses to test items conform to the construct purported to be measured by the test. Given the AAQ-II and both versions of the AFQ-Y purport to measure the general construct of psychological flexibility, it was expected that responses to the items for each instrument would (a) have robust loadings onto a single latent factor, (b) demonstrate at least adequate internal consistency, and (c) have at least adequate data–model fit to a unidimensional measurement model. Such psychometric evidence would provide validation in favor of interpreting scores derived from each measure, for either basic research or practical purposes, as representing a single general construct. Furthermore, comparing the structural coefficients obtained for each measure would allow for exploring the relative validity of responses across measures. Although one study to date has tested the psychometrics of the AAQ-II and AFQ-Y with the same general sample of young adults (Fergus et al., 2012), this study’s analytic approach prevented direct considerations of relative validity in two key ways: it tested the structural validity of the AFQ-Y17 but not the AFQ-Y8, and it did not test the structural
validity of the AAQ-II. Regarding the second type of validity evidence, the present study focused on what is referred to hereafter as concurrent validity, which tests the relation of target test scores to other theoretically meaningful variables obtained at the same time point. Given that the AAQ-II and both versions of the AFQ-Y are purported to measure psychological inflexibility, and that this construct is posited to be theoretically integral to both human wellbeing and mental health problems (Hayes et al., 2011), it was expected that scores derived from each measure would demonstrate meaningful relationships with scores from an array of other mental health measures. More specifically, considering the patterns of validity evidence observed in previous research (see above for a review), it was expected that scores from each measure would show (a) strong positive associations with each other, (b) significant positive associations with mental health problems, (c) significant negative associations with wellbeing indicators, and (d) substantial associations with theoretically-similar therapeutic process variables (i.e., mindfulness skills; see Hayes & Shenk, 2004). Evidence demonstrating such broad relationships would provide validation of scores from these measures as representing a construct that is key to mental health and wellbeing in general, as opposed to a particular mental health problem or specific wellbeing behavior. Also, comparing the concurrent validity coefficients obtained for each measure would allow for exploring the relative validity of responses across measures. Although one study has demonstrated that both the AAQ-II and AFQ-Y have comparable concurrent validity coefficients with the same sample (Fergus et al., 2012), this study’s analytic approach was limited in two key ways: it tested responses to the AFQ-Y17 but not the AFQ-Y8, and it focused solely on concurrent mental health problems—ignoring validation with wellbeing indicators as well as with theoretically-similar therapeutic process variables.
Assuming the AAQ-II and both versions of the AFQ-Y may demonstrate similar evidence in favor of concurrent validity, the present study also undertook to test a more particular type of concurrent validity, known as incremental validity, which refers to the degree to which an additional measure contributes to the predictive ability of a preexisting measure. Given that the AAQ-II was originally intended for use with adults, and considering that the AFQ-Y was originally intended for use with youth but has since been generalized to adults with promising results (Fergus et al., 2012), the present study probed whether both versions of the latter measure (AFQ-Y8 and AFQ-Y17) have incremental validity in comparison to the former (AAQ-II) for predicting an array of concurrent mental health indicators (described above). The one available study testing the AAQ-II and AFQ-Y17 with the same sample demonstrated that scores from the AFQ-Y17 provided incremental validity in comparison to AAQ-II scores for predicting a range of concurrent mental health problems (Fergus et al., 2012). Yet no study, so far, has explored the potential incremental validity of scores from the AFQ-Y8 in comparison to AAQ-II scores, nor has any study probed the incremental validity of scores from these three measures for predicting concurrent wellbeing—or positive mental health—outcomes. Taken together, then, findings from the present study were intended to contribute to the body of psychometric validation research for self-report psychological inflexibility measures by, first, directly comparing the structural validity coefficients of all three measures with the same sample, and, second, extending the repertoire of concurrent validity measures to target a broader range of mental health constructs that are germane to psychological inflexibility theory.

Method

Participants
Participants were 797 undergraduate college students attending a large university located in the southern region of the U.S. The majority of participants were female (84.6%) and self-identified as White (76.7%). Participants who self-identified as Black or African American (12.3%) and other or mixed ethnicities (11%) were also represented within the sample. The mean age of participants was 20 years (SD = 2.25), and, at the time of the study, participants were in various years of enrollment at the university (first year = 32.6%, second year = 22.3%, third year = 27%, fourth year or more = 18.1%). Participants were recruited via an online research management system administered by the university’s Department of Psychology, which was only accessible to students enrolled in undergraduate psychology courses. Participation in the study required that students be at least 18 years of age but was not restricted by any other personal characteristics. Each participant used a secure online server to complete the study survey, which consisted of a series of demographic questions followed by various self-report instruments (see the Measures subsection). All participants received partial course credit for completing the survey, which took approximately 20–30 minutes. Approval from the university’s Institutional Review Board was obtained prior to beginning the study, and informed consent was acquired for all participants prior to initiating the online survey.

Measures

AAQ-II. The AAQ-II (Bond et al., 2011) is a 7-item measure of psychological inflexibility. All items are directly phrased to address the construct of interest (e.g., “I’m afraid of my feelings”), requiring no reverse scoring, and are arranged along a seven-point, agreement-based response scale (1 = never true, 2 = very seldom true, 3 = seldom true, 4 = sometimes true, 5 = frequently true, 6 = almost always true, 7 = always true). The AAQ-II is scored by summing all item
responses, with higher scale scores indicating greater levels of psychological inflexibility. Previous research regarding the psychometrics of responses to the AAQ-II was reviewed above (see the Introduction section). Psychometrics regarding responses to the measure with the present sample are presented below (see the Results section).

AFQ-Y. The AFQ-Y (Greco et al., 2008) is a 17-item measure of psychological inflexibility. A shorter, 8-item version of the measure is obtained by removing several items from the full, longer version. Both versions of the AFQ-Y were originally validated in the same measure development study (see the Introduction section for a review). All items are directly phrased (e.g., “The bad things I think about myself must be true”), requiring no reverse scoring, and are arranged along a five-point, agreement-based response scale (0 = not at all true, 1 = a little true, 2 = pretty true, 3 = true, 4 = very true). Composite scores for both the long and short versions of the measure are created by summing the respective item responses. Previous research regarding the psychometrics of responses to both versions of the AFQ-Y was reviewed above (see the Introduction section). Psychometrics regarding responses to both the AFQ-Y17 and AFQ-Y8 with the present sample are presented below (see the Results section).

Beck Depression Inventory-2 (BDI-2). The BDI-2 (Beck, Steer, & Brown, 1996) is a 21-item measure of general depression. Items are brief statements that are directly phrased to assess depressive symptoms (e.g., “sadness,” “loss of pleasure,” and “self-dislike”), which are arranged along a 4-point response scale that is unique for each item and indicates progressing symptom severity (e.g., “Sadness”: 0 = I do not feel sad, 1 = I feel sad much of the time, 2 = I am sad all of the time, 3 = I am so sad or unhappy that I can’t stand it; “Self-dislike”: 0 = I feel the same about myself as ever, 1 = I have
lost confidence in myself, 2 = I am disappointed in myself, 3 = I dislike myself). A composite score is produced by summing all item responses and no reverse scoring is necessary. Previous research shows that responses to the BDI-2 have strong internal consistency reliability and concurrent validity with other mental health variables (Beck et al., 1996). Responses to the BDI-2 with the present sample yielded strong internal consistency reliability (α = .93).

Beck Anxiety Inventory (BAI). The BAI (Beck, Epstein, Brown, & Steer, 1988) is a 21-item measure of general anxiety. Items are brief and directly phrased statements of anxious symptoms (e.g., “Nervous,” “Fear of losing control,” and “Scared”), which are arranged along a 4-point response scale that indicates levels of subjective symptom severity during the past month (0 = not at all, 1 = mildly—but it didn’t bother me much, 2 = moderately—it wasn’t pleasant at times, 3 = severely—it bothered me a lot). A total scale score is calculated by summing all item responses and no reverse scoring is necessary. Previous research shows that responses to the BAI have strong internal consistency reliability and concurrent validity with other mental health indicators (Beck et al., 1988). Responses to the BAI with the present sample yielded strong internal consistency reliability: α = .92.

Positive and Negative Affect Schedule (PANAS). The PANAS (Watson et al., 1988) is a 20-item measure consisting of two subscales, one assessing global positive affect (10 items) and the other global negative affect (10 items). Items consist of feeling words (e.g., “interested,” “irritable,” “attentive,” and “ashamed”), which are arranged along a five-point, relative-frequency based response scale (1 = not at all, 2 = a little, 3 = moderately, 4 = quite a bit, 5 = extremely). Subscale scores are produced by summing the respective responses to negative and positive items, requiring no reverse scoring. Previous
research shows that responses to both subscales of the PANAS have strong internal consistency reliability and concurrent validity with other mental health variables (Watson et al., 1988). Responses to these scales with the present sample yielded strong internal consistency reliability: positive affect α = .89, negative affect α = .91.

Adult Hope Scale (AHS). The AHS (Snyder et al., 1991) is an 8-item measure for assessing domain-general hope. Items are directly phrased to represent the construct of interest (e.g., “I energetically pursue my goals”) and are arranged along an eight-point, agreement-based response scale (1 = definitely false, 2 = mostly false, 3 = somewhat false, 4 = slightly false, 5 = slightly true, 6 = somewhat true, 7 = mostly true, 8 = definitely true). A composite score is derived by summing the responses to all items, with no reverse scoring necessary. Previous research shows that responses to the AHS have strong internal consistency reliability and concurrent validity with other mental health variables (Snyder et al., 1991). Responses to the AHS with the present sample yielded strong internal consistency reliability: α = .88.

Subjective Happiness Scale (SHS). The SHS (Lyubomirsky & Lepper, 1999) is a 4-item measure for assessing domain-general happiness. Three of the four items are positively worded, with two items assessing general self-perceptions of happiness (e.g., “In general, I consider myself . . .”) and the other two items assessing perceptions of one’s happiness relative to one’s peers (e.g., “Some people are generally not very happy. Although they are not depressed, they never seem as happy as they might be. To what extent does this characterization describe you?”). The one negatively phrased item is reverse-scored, and all items are arranged along seven-point response scales that have differing qualitative anchors, depending on the item stem (1 = not a very happy person . . . 7 = a
very happy person; 1 = less happy . . . 7 = more happy; 1 = not at all . . . 7 = a great deal). Previous research shows that responses to the SHS have adequate-to-strong internal consistency reliability and concurrent validity with various variables relevant to mental health functioning (Lyubomirsky & Lepper, 1999). Responses to the SHS with the present sample yielded strong internal consistency reliability: α = .85.

Kentucky Inventory of Mindfulness Skills (KIMS). The KIMS (Baer, Smith, & Allen, 2004) is a 39-item self-report measure for assessing four classes of mindfulness skills: observing (12 items), describing (8 items), acting with awareness (10 items), and accepting without judgment (9 items). All items are arranged along a five-point, agreement-based response scale (1 = never or very rarely true, 2 = rarely true, 3 = sometimes true, 4 = often true, 5 = very often or always true), with higher scale scores indicating greater mindfulness skills. Approximately half of the items are worded to directly address the constructs of interest (e.g., “I intentionally stay aware of my feelings”), whereas the other half are phrased to represent inverse constructs (e.g., “I don’t pay attention to what I’m doing . . .”) and thus require reverse scoring. Subscale scores are derived for each of the four skill areas by summing item responses. Previous research indicates that responses to all KIMS subscales are characterized by at least adequate internal consistency reliability as well as concurrent validity with other measures of mindfulness and mental health (Baer et al., 2004). Responses to the KIMS with the present sample yielded adequate-to-strong internal consistency reliability: observing α = .84, describing α = .88, acting with awareness α = .74, accepting without judgment α = .90.
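The internal consistency coefficients reported above can be reproduced from raw item responses in a few lines of code. Below is a minimal sketch of Cronbach's alpha in Python; the item matrix is simulated for illustration and is not the study data, and the reverse-scoring step in the comments assumes a hypothetical index of negatively worded items.

```python
import numpy as np

def cronbach_alpha(items):
    """Cronbach's alpha for an (n_respondents x n_items) array:
    alpha = k / (k - 1) * (1 - sum of item variances / variance of total scores)."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_variances = items.var(axis=0, ddof=1).sum()
    total_variance = items.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_variances / total_variance)

# Negatively worded items (e.g., on the KIMS) would first be reverse-scored,
# e.g., items[:, reversed_idx] = 6 - items[:, reversed_idx] for a 1-5 scale.

# Simulated 7-item scale on a 1-7 response format (not the study data).
rng = np.random.default_rng(0)
trait = rng.normal(size=(500, 1))
responses = np.clip(np.rint(4 + 1.2 * trait + rng.normal(size=(500, 7))), 1, 7)
print(round(cronbach_alpha(responses), 2))
```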

Data Analyses

The structural validity of the measurement models for the AAQ-II, AFQ-Y17, and AFQ-Y8 was examined via confirmatory factor analyses (CFA) using the maximum likelihood estimator. To determine the goodness of data–model fit, a combination of absolute and incremental fit indices was evaluated. Tucker-Lewis index (TLI) and comparative fit index (CFI) values between .90–.95 and standardized root mean square residual (SRMR) values < .08 were considered indicative of adequate data–model fit, while TLI and CFI values > .95 and SRMR values < .05 were considered indicative of good data–model fit (Kenny, 2015; Kline, 2011). Although the root mean square error of approximation (RMSEA) is another commonly used fit index for evaluating measurement models, this particular index was not used in the present study, given that the difference in degrees of freedom was substantial between measurement models for the AAQ-II and AFQ-Y8 (df = 14 and 20, respectively) compared to the AFQ-Y17 (df = 119), and RMSEA statistics have been shown to be biased in favor of models with larger degrees of freedom (Kenny, Kaniskan, & McCoach, 2015). Regarding factor loadings, λ ≥ .50 were considered strong, as they accounted for ≥ 25% of the variance extracted from each item by the latent factor, yet λ ≥ .33 were considered adequate, accounting for at least 10% of extracted variance. For latent construct internal reliability, H ≥ .70 was deemed desirable, as this suggests a strong intrafactor correlation over repeated administrations (Mueller & Hancock, 2008). CFA were conducted using Amos for SPSS version 22.
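As a compact restatement of the fit decision rules just described, the helper below classifies a solution from its TLI, CFI, and SRMR values. This is an expository sketch using the thresholds stated above, not part of the original Amos analysis.

```python
def classify_fit(tli, cfi, srmr):
    """Classify data-model fit per the decision rules adopted in this study."""
    if tli > .95 and cfi > .95 and srmr < .05:
        return "good"
    if tli >= .90 and cfi >= .90 and srmr < .08:
        return "adequate"
    return "poor"

# The revised AAQ-II model reported below (TLI = .945, CFI = .966, SRMR = .036)
# classifies as "adequate" under these strict cutoffs, with TLI just under .95.
print(classify_fit(.945, .966, .036))
```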

The concurrent validity of scores derived from the AAQ-II, AFQ-Y17, and AFQ-Y8 was investigated via three phases of data analyses. First, bivariate correlations (Pearson r) were conducted among scores derived from each of the psychological inflexibility measures as well as between scores derived from these measures and each of the concurrent validity measures representing negative mental health functioning (i.e., BAI, BDI-2, and the negative affect scale [NAS] of the PANAS), positive mental health functioning (i.e., SHS, AHS, and the positive affect scale [PAS] of the PANAS), and mindfulness (i.e., the four subscales of the KIMS). The magnitude of these correlations was interpreted using common effect-size decision rules: .10–.29 = small, .30–.49 = moderate, ≥ .50 = large (Cohen, 1988). Following, the incremental validity of scores derived from the AAQ-II, AFQ-Y8, and AFQ-Y17 was tested via a series of hierarchical regressions predicting the scores derived from the measures of negative and positive mental health functioning (i.e., BAI, BDI-2, NAS, PAS, SHS, and AHS). The effect size of interest for these analyses was R2, which was interpreted using common decision rules: .01–.08 = small, .09–.24 = moderate, ≥ .25 = large (Cohen, 1988). All concurrent validity analyses were conducted using SPSS version 22.

Results

Structural Validity

AAQ-II. Fit indices from the original CFA (Model 1) for the 7-item AAQ-II measurement model, which regressed each of the items onto a single latent factor without the addition of any further parameter constraints, suggested poor-to-adequate data–model fit: χ2 = 322.32, df = 14, p < .001, SRMR = .051, TLI = .870, CFI = .913. Examination of modification indices indicated that adding a covariance between the error terms of the two items targeting painful experiences (i.e., “My painful experiences and memories make it difficult for me to live a life that I would value” and “My painful memories prevent me from having a fulfilling life”) would substantially improve data–model fit. Given the conceptual similarity among these items, this covariance was added and the revised model (Model 2) was reanalyzed, resulting in substantially improved data–model fit: χ2 = 133.63, df = 13, p < .001, SRMR = .036, TLI = .945, CFI = .966.

Standardized factor loadings for Model 2 were robust, with λ ranging from .73–.84 (p < .001), and the latent construct reliability of the underlying psychological inflexibility factor was strong, H = .92.
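For readers unfamiliar with SRMR, the sketch below shows how the statistic summarizes the residuals between an observed correlation matrix and the correlation matrix implied by a one-factor model with standardized loadings. It is an expository approximation with hypothetical inputs, not the computation Amos performs on the study data.

```python
import numpy as np

def implied_corr(loadings):
    """Implied correlation matrix for a one-factor model with standardized
    loadings and unit item variances: lam * lam' off-diagonal, ones on the diagonal."""
    lam = np.asarray(loadings, dtype=float)
    sigma = np.outer(lam, lam)
    np.fill_diagonal(sigma, 1.0)
    return sigma

def srmr(observed, implied):
    """Root mean square of residual correlations over the unique elements."""
    idx = np.tril_indices(observed.shape[0])
    resid = (observed - implied)[idx]
    return float(np.sqrt(np.mean(resid ** 2)))

# Hypothetical loadings in the range reported for the AAQ-II, plus a perturbed
# 'observed' matrix standing in for sample correlations.
lam = np.array([.73, .75, .78, .80, .82, .84, .74])
sigma_hat = implied_corr(lam)
rng = np.random.default_rng(1)
noise = rng.normal(scale=.03, size=sigma_hat.shape)
observed = sigma_hat + (noise + noise.T) / 2
np.fill_diagonal(observed, 1.0)
print(round(srmr(observed, sigma_hat), 3))
```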

AFQ-Y8. Fit indices from the original CFA (Model 1) for the AFQ-Y8 measurement model, which regressed each of the items onto a single latent factor without the addition of any further parameter constraints, suggested adequate-to-good data–model fit: χ2 = 149.30, df = 20, p < .001, SRMR = .042, TLI = .902, CFI = .930. Examination of modification indices indicated that adding a covariance between the error terms of the two items targeting impaired performance (i.e., “I stop doing things that are important to me whenever I feel bad” and “I do worse in school when I have thoughts that make me feel bad”) would improve data–model fit even further. Given the conceptual similarity among these items, this covariance was added and the revised model (Model 2) was reanalyzed, resulting in improved data–model fit: χ2 = 74.64, df = 19, p < .001, SRMR = .031, TLI = .956, CFI = .970. Standardized factor loadings for Model 2 were moderate-to-robust, with λ ranging from .42–.74 (p < .001), and the latent construct reliability of the underlying psychological inflexibility factor was strong, H = .84.

AFQ-Y17. Fit indices from the original CFA (Model 1) for the AFQ-Y17 measurement model, which regressed each of the items onto a single latent factor without the addition of any further parameter constraints, suggested poor data–model fit: χ2 = 905.40, df = 119, p < .001, SRMR = .062, TLI = .796, CFI = .822. Examination of modification indices indicated that adding covariances between the error terms of several item sets would substantially improve data–model fit, but that a covariance between one particular item set would yield enormous improvement (i.e., “I try hard to erase hurtful memories from my mind” and “I push away thoughts and feelings that I don’t like”). This covariance was added and the revised model (Model 2) was reanalyzed, resulting in improved yet still poor data–model fit: χ2 = 774.80, df = 118, p < .001, SRMR = .058, TLI = .828, CFI = .851. Modification indices were again examined, suggesting that adding covariances between the error terms of two additional item sets (i.e., “I stop doing things that are important to me whenever I feel bad” and “I do worse in school when I have thoughts that make me feel bad”; “I don’t try out new things if I’m afraid of messing up” and “I do all I can to make sure I don’t look dumb in front of other people”) would add incremental improvements to data–model fit. These two covariances were added and the revised model (Model 3) was reanalyzed, resulting in improved yet still sub-optimal data–model fit: χ2 = 620.80, df = 116, p < .001, SRMR = .053, TLI = .866, CFI = .886. Modification indices were again examined, but no other parameter changes were identified that were both conceptually coherent and statistically relevant. Thus, Model 3 was selected as the preferred measurement model, despite sub-optimal data–model fit. Standardized factor loadings for Model 3 were weak-to-robust, with λ ranging from .33 to .75 (p < .001), and the latent construct reliability of the underlying psychological inflexibility factor was strong, H = .90.
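The latent construct reliability statistic reported above is Hancock and Mueller's coefficient H, a simple function of the standardized loadings. A minimal sketch follows; the loadings are hypothetical stand-ins in the range reported for the AAQ-II, not the Amos estimates.

```python
import numpy as np

def coefficient_h(loadings):
    """Hancock & Mueller's construct reliability:
    H = sum(lam^2 / (1 - lam^2)) / (1 + sum(lam^2 / (1 - lam^2)))."""
    lam = np.asarray(loadings, dtype=float)
    info = np.sum(lam ** 2 / (1 - lam ** 2))
    return info / (1 + info)

# Hypothetical standardized loadings spanning the .73-.84 range reported above.
print(round(coefficient_h([.73, .75, .78, .80, .82, .84, .74]), 2))
```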

Concurrent Validity

Bivariate correlations. Findings from bivariate correlations indicated strong positive correlations between scores from the AAQ-II and both versions of the AFQ-Y, as well as a very strong positive correlation between scores from both versions of the AFQ-Y (see Table 2). Calculation of the disattenuated correlations between these scores, which account for unreliability, yielded very strong coefficients, indicating that all measures of psychological inflexibility were essentially assessing the same construct: AAQ-II and AFQ-Y17 r = 1.0, AAQ-II and AFQ-Y8 r = .97, AFQ-Y17 and AFQ-Y8 r = .91. Strong positive correlations were also observed between scores for all psychological inflexibility measures and scores from each of the measures of negative mental health functioning (i.e., BAI, BDI-2, and NAS; see Table 2). Moderate-to-strong negative correlations were observed between all psychological inflexibility scores and measures of positive mental health functioning (i.e., SHS, AHS, and PAS; see Table 2). Results for the correlations of scores from the AAQ-II and AFQ-Y with the mindfulness measures varied depending on the construct of interest, showing strong negative associations with accepting without judgment, moderate negative associations with describing and acting with awareness, and small positive associations with observing (see Table 2).

Hierarchical regressions. Results from the hierarchical linear regression models showed that scores derived from both the AAQ-II and AFQ-Y8 were substantive predictors of scores from the several measures of negative mental health (i.e., BAI, BDI-2, and NAS; see Table 3) and positive mental health (i.e., SHS, AHS, and PAS; see Table 4). Comparatively, scores derived from the AFQ-Y17 had negligible predictive power in relation to measures of negative mental health (see Table 3) and small predictive power in relation to measures of positive mental health (see Table 4). Furthermore, findings showed that when added in Step 2 of the regression models, scores from the AFQ-Y8 substantially reduced the magnitude of the standardized regression coefficients for the AAQ-II predictor and added incremental improvements to the amount of variance accounted for by the model for all concurrent mental health measures (∆R2 ranging from .01–.06; see Table 3 and Table 4). Adding the AFQ-Y17 in Step 3 of the models had a negligible effect on the magnitude of the other predictors and variance accounted for in the negative mental health
measures (see Table 3), yet had a small positive predictive effect that slightly increased the variance accounted for in two of the positive mental health measures (see Table 4).

Discussion

Interpretation of Results

The present study aimed to advance the scientific basis of self-report measurement of psychological inflexibility by directly investigating the relative structural and concurrent validity of responses to the AAQ-II, AFQ-Y8, and AFQ-Y17 with a general sample of college students. Although one previous study explored the relative validity of responses to the AAQ-II and AFQ-Y17 within the scope of the same sample (Fergus et al., 2012), that study was limited in that it only tested the structural validity of the longer version of the AFQ-Y and only explored concurrent validity at the level of mental health problems. Thus, the present study intended to contribute to the validation evidence supporting the use of the AAQ-II, AFQ-Y8, and AFQ-Y17 by directly testing the structural validity of all three measures with the same sample and expanding the repertoire of concurrent validity measures to include both positive mental health indicators and theoretically-similar therapeutic processes.

Regarding structural validity, findings demonstrated that responses to both the AAQ-II and AFQ-Y8 yielded good and comparable data–model fit as well as strong latent construct reliability, suggesting comparable structural validity evidence. However, the measurement model for the AFQ-Y17 was generally poor fitting. Although it is likely that removing several of the lower-loading items (λ range .30–.40) from the 17-item measurement model would substantially improve model fit, the purpose of the present study was to investigate the fit of the proposed measurement model with all original items—not to optimize the model with the present sample. Therefore, despite indicating strong latent construct reliability, the conclusion is that responses to
the AFQ-Y17 did not evidence adequate structural validity. Interpreted within the framework of test validation outlined in the Standards for Educational and Psychological Testing (Joint Committee of the AERA, APA, & NCME, 2014), these findings suggest that scores from the AAQ-II and AFQ-Y8 could be reasonably interpreted as representing a single latent construct, whereas scores from the AFQ-Y17 should not be interpreted as such. It is noteworthy that these findings regarding poor data–model fit for the AFQ-Y17 differ from those found by Fergus et al. (2012), who reported adequate fit with a similarly general sample of young adults. Further research is therefore warranted to replicate and generalize these findings with additional samples, as the adequacy of data–model fit may be partially a function of sample demographics.

Regarding concurrent validity, bivariate correlations indicated meaningful associations in the expected directions among scores derived from all psychological inflexibility measures and the three types of concurrent validity measures: negative mental health functioning, positive mental health functioning, and theoretically-similar therapeutic processes (see Table 2). Especially noteworthy is the fact that the magnitude of correlations between each of the concurrent validity measures and each of the psychological inflexibility measures was consistently similar, only differing from measure to measure by a Δr range of .03–.08. These results are both theoretically and empirically consistent with findings from the previous studies investigating the independent validity of scores derived from these measures (e.g., Bond et al., 2011; Greco et al., 2008; Fergus et al., 2012). Taken together, these findings also suggest the conclusion that there are no meaningful differences in the concurrent validity evidence of scores derived from the AAQ-II, AFQ-Y8, and AFQ-Y17 when analyzed independently, as the difference in the magnitude of effect sizes yielded by all measures across all correlations was unremarkable. Within the framework of validation presented in the Standards for Educational
and Psychological Testing (Joint Committee of the AERA, APA, & NCME, 2014), this suggests that all measures of psychological inflexibility represent a construct that has meaningful relations with an array of mental health indicators, which is congruent with the purported role of psychological inflexibility in its supporting theory (see Hayes, Strosahl, & Wilson, 2011).

The present study also yielded some results that warrant further empirical attention. Specifically, the correlation observed between scores derived from the two versions of the AFQ-Y indicated 85% shared variance, suggesting the measures are largely empirically redundant, yet the association between scores from the AAQ-II and both versions of the AFQ-Y indicated far less shared variance (61–62%), suggesting greater empirical distinctness. Compared to the highest amount of variance shared by the scores from the psychological inflexibility measures and any of the concurrent validity variables (R2 = .48), the variance shared by scores from psychological inflexibility measures was characterized by a ΔR2 of at least .14, which suggests the target measures were indeed tapping more similar constructs. Yet the substantial difference in shared variance (ΔR2 = .23) between the different measures of psychological inflexibility (i.e., the AAQ-II with both versions of the AFQ-Y, compared with the correlation between both versions of the AFQ-Y) is greater than that observed among the associations of the target and concurrent validity measures, suggesting that the AAQ-II and both versions of the AFQ-Y are likely tapping some unique aspects of the construct.

This suggestion of potential differential construct representation is further supported by the incremental validity analyses, which indicated small-to-moderate changes in the variance accounted for in the concurrent mental health measures when adding the AFQ-Y8 in addition to the AAQ-II alone (ΔR2 range = .01–.06; see Table 3 and Table 4). Interestingly, however, the addition of the AFQ-Y17 in the final step of the regression models added no additional power in
predicting the negative mental health measures (see Table 3), suggesting redundancy with the AFQ-Y8. Yet the addition of the AFQ-Y17 did make minor positive contributions to the prediction of two of the three positive mental health measures (see Table 4), which suggests some empirical distinctness. On the whole, then, the conclusion seems to be that the relative concurrent validity evidence for all measures of psychological inflexibility is similar when examined independently (i.e., via bivariate correlations), but that some incremental validity is evidenced when these same scores are considered in tandem (i.e., via hierarchical regressions)—suggesting potential differences in construct representation. These findings are generally congruent with the pattern of incremental validity findings observed by Fergus et al. (2012), yet the additional variance accounted for by the AFQ-Y scores in the present study was generally smaller than in that study. Moreover, it is noteworthy that Fergus et al. (2012) only probed the longer (17-item) version of the AFQ-Y, and therefore the present study offers the first relative validity evidence for responses to both versions of the AFQ-Y in relation to the AAQ-II as well as in relation to an array of concurrent mental health outcomes.

Limitations and Future Directions

Findings from the present study should be considered in light of a few key methodological limitations. First, the sample was largely homogeneous according to ethnic and gender demographics (i.e., majority White and female), which prevented the possibility of more nuanced analyses, such as measurement invariance across demographic groups and further exploration of concurrent validity as a function of demographic factors. To remedy this limitation, further research is warranted to replicate these findings with larger and more diverse samples of young adults. Given the general nature of the sample, follow-up studies that investigate the comparative validity of responses to the AAQ-II and both versions of the AFQ-Y
with targeted clinical samples are likely to be especially informative (cf. Bond et al., 2011). Second, considering that all of the positive and negative mental health measures were obtained via self-report, conclusions related to the relative concurrent validity of scores from the AAQ-II and AFQ-Y are limited to a mono-method source. Future research is therefore warranted to expand the methodological repertoire of validity measures by testing the functionality of scores derived from both measures of psychological inflexibility in relation to (a) longitudinal self-reports, (b) concurrent clinician-reports of relevant variables (e.g., historical mental health diagnosis or rating of current symptomology), and (c) direct behavioral responses indicative of psychological flexibility, such as those developed for use in laboratory-based experiments (e.g., the implicit relational assessment procedure; Power, D. Barnes-Holmes, Y. Barnes-Holmes, & Stewart, 2009). Ultimately, establishing validity evidence via multimethod and multisource data will allow for more nuanced analyses and stronger conclusions regarding the psychometric validity of scores derived from these measures. Finally, the results of the present study are limited to the extent that they provide no direct evidence regarding the relative treatment validity—or usefulness for informing the selection and evaluation of interventions (see Hayes, Nelson, & Jarrett, 1987)—of scores derived from these measures. Although psychometric validation is often considered a prerequisite to treatment validation, it is imperative to recognize that evidence in favor of the former does not suggest the latter, and vice versa (Ciarrochi et al., 2016). Thus, a companion line of treatment validity research investigating the relative functionality of scores derived from the AAQ-II and AFQ-Y for informing intervention is warranted on its own terms. Within the framework of test development provided in the Standards for Educational and Psychological Testing (Joint Committee of the AERA, APA, & NCME, 2014), this type of inquiry falls under the category of
“evidence for validity based on the consequences of testing,” and is distinct from the other types of validity evidence taken up in the present study. Since its inception, the intention of the psychological flexibility model of mental health and wellbeing has been to inform actual clinical practice (Hayes et al., 2011), and some researchers have even suggested that interventions based on this model can be scaled up to promote the mental health of entire communities (Levin, Lillis, & Biglan, 2016). So far, however, the validation evidence with treatment implications is primarily indirect in nature, showing that measures of psychological inflexibility partially mediate intervention outcomes (Levin & Villatte, 2016). Researchers interested in progressing this line of work are therefore encouraged to continue probing the relative psychometric validity of scores derived from self-report measures of psychological inflexibility, while also investigating the consequences of using these measures in practice.

References

Baer, R. A., Smith, G. T., & Allen, K. B. (2004). Assessment of mindfulness by self-report: The Kentucky Inventory of Mindfulness Skills. Assessment, 11, 191–206. doi:10.1177/1073191104268029
Beck, A. T., Steer, R. A., & Brown, G. K. (1996). Manual for the Beck Depression Inventory-2. San Antonio, TX: Psychological Corporation.
Beck, A. T., & Steer, R. A. (1993). Manual for the Beck Anxiety Inventory. San Antonio, TX: Psychological Corporation.
Biglan, A., & Hayes, S. C. (2016). Functional contextualism and contextual behavioral science. In R. D. Zettle, S. C. Hayes, D. Barnes-Holmes, & A. Biglan (Eds.), The Wiley Handbook of Contextual Behavioral Science (pp. 37–61). West Sussex, UK: Wiley.
Bond, F. W., Hayes, S. C., Baer, R. A., Carpenter, C. M., Guenole, N., Orcutt, H. K., . . . Zettle, R. D. (2011). Preliminary psychometric properties of the Acceptance and Action Questionnaire-II: A revised measure of psychological inflexibility and experiential avoidance. Behavior Therapy, 42, 676–688. doi:10.1016/j.beth.2011.03.007
Ciarrochi, J., Zettle, R. D., Brockman, R., Duguid, J., Parker, P., Sahdra, B., & Kashdan, T. B. (2016). Measures that make a difference: A functional contextualistic approach to optimizing psychological measurement in clinical research and practice. In R. D. Zettle, S. C. Hayes, D. Barnes-Holmes, & A. Biglan (Eds.), The Wiley Handbook of Contextual Behavioral Science (pp. 320–346). West Sussex, UK: Wiley.
Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.). Hillsdale, NJ: Lawrence Erlbaum.
Fergus, T. A., Valentiner, D. P., Gillen, M. J., Hiraoka, R., Twohig, M. P., Abramowitz, J. S., &
McGrath, P. B. (2012). Assessing psychological inflexibility: The psychometric properties of the Avoidance and Fusion Questionnaire for Youth in two adult samples. Psychological Assessment, 24, 402–408. doi:10.1037/a0025776
Greco, L. A., Lambert, W., & Baer, R. A. (2008). Psychological inflexibility in childhood and adolescence: Development and evaluation of the Avoidance and Fusion Questionnaire for Youth. Psychological Assessment, 20, 93–102. doi:10.1037/1040-3590.20.2.93
Hayes, S. C., Luoma, J. B., Bond, F. W., Masuda, A., & Lillis, J. (2006). Acceptance and commitment therapy: Model, processes and outcomes. Behaviour Research and Therapy, 44, 1–25. doi:10.1016/j.brat.2005.06.006
Hayes, S. C., Nelson, R. O., & Jarrett, R. B. (1987). The treatment utility of assessment: A functional approach to evaluating assessment quality. American Psychologist, 42, 963–974. doi:10.1037/0003-066X.42.11.963
Hayes, S. C., Strosahl, K. D., & Wilson, K. G. (2011). Acceptance and commitment therapy: The process and practice of mindful change (2nd ed.). New York, NY: Guilford.
Hayes, S. C., Strosahl, K. D., Wilson, K. G., Bissett, R. T., Pistorello, J., Toarmino, D., . . . McCurry, S. M. (2004). Measuring experiential avoidance: A preliminary test of a working model. The Psychological Record, 54, 553–578.
Hayes, S. C., & Shenk, C. (2004). Operationalizing mindfulness without unnecessary attachments. Clinical Psychology: Science and Practice, 11, 249–254. doi:10.1093/clipsy.bph079
Hayes, S. C., Villatte, M., Levin, M., & Hildebrandt, M. (2011). Open, aware, and active: Contextual approaches as an emerging trend in behavioral and cognitive therapies. Annual Review of Clinical Psychology, 7, 141–168. doi:10.1146/annurev-clinpsy-032210-104449

Joint Committee of the American Educational Research Association, American Psychological Association, & National Council on Measurement in Education. (2014). Standards for educational and psychological testing. Washington, DC: American Educational Research Association.
Kenny, D. A. (2015). Measuring model fit in structural equation modeling. Retrieved from www.davidakenny.net/cm/fit.htm
Kenny, D. A., Kaniskan, B., & McCoach, D. B. (2015). The performance of RMSEA in models with small degrees of freedom. Sociological Methods & Research, 44, 486–507. doi:10.1177/0049124114543236
Khoury, B., Lecomte, T., Fortin, G., Masse, M., Therien, P., Bouchard, V., . . . Hofmann, S. G. (2013). Mindfulness-based therapy: A comprehensive meta-analysis. Clinical Psychology Review, 33, 763–771. doi:10.1016/j.cpr.2013.05.005
Klingbeil, D. A., Renshaw, T. L., Willenbrink, J. B., Copek, R. A., Chan, K. T., Haddock, A., Yassine, J., & Clifton, J. (2017). Mindfulness-based interventions with youth: A comprehensive meta-analysis of group-design studies. Journal of School Psychology, 63, 77–103. doi:10.1016/j.jsp.2017.03.006
Levin, M. E., Hildebrandt, M. J., Lillis, J., & Hayes, S. C. (2012). The impact of treatment components suggested by the psychological flexibility model: A meta-analysis of laboratory-based component studies. Behavior Therapy, 43, 741–756. doi:10.1016/j.beth.2012.05.003
Levin, M. E., & Villatte, M. (2016). The role of experimental psychopathology and laboratory-based intervention studies on contextual behavioral science. In R. D. Zettle, S. C. Hayes, D. Barnes-Holmes, & A. Biglan (Eds.), The Wiley Handbook of Contextual Behavioral
Science (pp. 347–364). West Sussex, UK: Wiley.
Lyubomirsky, S., & Lepper, H. S. (1999). A measure of subjective happiness: Preliminary reliability and construct validation. Social Indicators Research, 46, 137–155. doi:10.1023/A:1006824100041
Mueller, R. O., & Hancock, G. R. (2008). Best practices in structural equation modeling. In J. Osborne (Ed.), Best practices in quantitative methods (pp. 488–508). Thousand Oaks, CA: Sage. doi:10.4135/9781412995627.d38
Power, P., Barnes-Holmes, D., Barnes-Holmes, Y., & Stewart, I. (2009). The implicit relational assessment procedure (IRAP) as a measure of implicit relative preferences: A first study. The Psychological Record, 59, 621–640.
Renshaw, T. L. (2017). Screening for psychological inflexibility: Initial validation of the Avoidance and Fusion Questionnaire for Youth as a school mental health screener. Journal of Psychoeducational Assessment, 35, 482–493. doi:10.1177/0734282916644096
Renshaw, T. L., Bolognino, S. J., Roberson, A. J., Upton, S. R., & Hammons, K. N. (2017). Using acceptance and commitment therapy to support bereaved students. In J. A. Brown & S. R. Jimerson (Eds.), Supporting bereaved students at school (pp. 223–235). New York, NY: Oxford.
Snyder, C. R., Harris, C., Anderson, J. R., Holleran, S. A., Irving, L. M., Sigmon, S. T., . . . Harney, P. (1991). The will and the ways: Development and validation of an individual-differences measure of hope. Journal of Personality and Social Psychology, 60, 570–585. doi:10.1037/0022-3514.60.4.570
Venta, A., Sharp, C., & Hart, J. (2012). The relation between anxiety disorder and experiential avoidance in inpatient adolescents. Psychological Assessment, 24, 240–248. doi:10.1037/a0025362
Watson, D., Clark, L. A., & Tellegen, A. (1988). Development and validation of brief measures of positive and negative affect: The PANAS scales. Journal of Personality and Social Psychology, 54, 1063–1070. doi:10.1037/0022-3514.54.6.1063
Zhang, C., Chung, P., Si, G., & Liu, J. D. (2014). Psychometric properties of the Acceptance and Action Questionnaire-II for Chinese college students and elite Chinese athletes. Measurement and Evaluation in Counseling and Development, 47, 256–270. doi:10.1177/0748175614538064

Table 1
Observed Descriptive Statistics for All Psychological Inflexibility Measures

Measure    M       SD      IQR   Skewness   Kurtosis   α
AAQ-II     21.65    9.67    14     .50        –.44     .91
AFQ-Y8      9.14    6.22     9     .87         .34     .82
AFQ-Y17    23.33   12.23    18     .52        –.12     .89

Note. AAQ-II = Acceptance and Action Questionnaire–II; AFQ-Y8 = Avoidance and Fusion Questionnaire, 8-item version; AFQ-Y17 = Avoidance and Fusion Questionnaire, 17-item version.

Table 2
Correlations Between Measures of Psychological Inflexibility and Concurrent Validity Variables

Variable    AAQ-II    AFQ-Y8    AFQ-Y17
AAQ-II      –         .79**     .78**
AFQ-Y8      .79**     –         .92**
AFQ-Y17     .78**     .92**     –
BAI         .63**     .60**     .57**
BDI-2       .69**     .69**     .66**
NAS         .65**     .62**     .62**
SHS         –.61**    –.58**    –.53**
AHS         –.40**    –.42**    –.36**
PAS         –.34**    –.32**    –.27**
OS–KIMS     .23**     .23**     .26**
DS–KIMS     –.35**    –.32**    –.30**
AAS–KIMS    –.35**    –.33**    –.31**
AJS–KIMS    –.69**    –.65**    –.66**

Note. AAQ-II = Acceptance and Action Questionnaire–II; AFQ-Y8 = Avoidance and Fusion Questionnaire, 8-item version; AFQ-Y17 = Avoidance and Fusion Questionnaire, 17-item version; BAI = Beck Anxiety Inventory; BDI-2 = Beck Depression Inventory–2; NAS = Negative Affect Scale; SHS = Subjective Happiness Scale; AHS = Adult Hope Scale; PAS = Positive Affect Scale; KIMS = Kentucky Inventory of Mindfulness Skills; OS–KIMS = Observing Scale of the KIMS; DS–KIMS = Describing Scale of the KIMS; AAS–KIMS = Acting with Awareness Scale of the KIMS; AJS–KIMS = Accepting without Judgment Scale of the KIMS.
** p < .01
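The disattenuated coefficients reported in the Results section follow from the classical correction for attenuation, which divides an observed correlation by the square root of the product of the two scales' reliability estimates. A minimal sketch, using hypothetical inputs rather than the exact values from the study:

```python
import math

def disattenuate(r_xy, rel_x, rel_y):
    """Correct an observed correlation for unreliability:
    r_true = r_xy / sqrt(rel_x * rel_y)."""
    return r_xy / math.sqrt(rel_x * rel_y)

# Hypothetical example: observed r = .75 between two scales with
# reliability estimates of .85 and .80.
print(round(disattenuate(.75, .85, .80), 2))  # -> 0.91
```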

Table 3
Hierarchical Linear Regressions for Negative Mental Health Outcomes

Outcome   Step     Predictor   β          R2    ∆R2
BAI       Step 1   AAQ-II      .63**      .40   –
          Step 2   AAQ-II      .43**
                   AFQ-Y8      .25**      .43   .03
          Step 3   AAQ-II      .43**
                   AFQ-Y8      .26**
                   AFQ-Y17     –.01 NS    .43   .00
BDI-2     Step 1   AAQ-II      .69**      .48   –
          Step 2   AAQ-II      .38**
                   AFQ-Y8      .39**      .53   .06
          Step 3   AAQ-II      .39**
                   AFQ-Y8      .42**
                   AFQ-Y17     –.03 NS    .53   .00
NAS       Step 1   AAQ-II      .65**      .42   –
          Step 2   AAQ-II      .40**
                   AFQ-Y8      .31**      .45   .04
          Step 3   AAQ-II      .38**
                   AFQ-Y8      .20**
                   AFQ-Y17     .13 NS     .45   .00

Note. AAQ-II = Acceptance and Action Questionnaire–II; AFQ-Y8 = Avoidance and Fusion Questionnaire, 8-item version; AFQ-Y17 = Avoidance and Fusion Questionnaire, 17-item version; BAI = Beck Anxiety Inventory; BDI-2 = Beck Depression Inventory–2; NAS = Negative Affect Scale. NS = non-significant, p > .05.
** p < .01
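The hierarchical models in Tables 3 and 4 amount to fitting a sequence of nested OLS regressions and tracking the change in R2 at each step. The sketch below illustrates that logic on simulated data (the study itself used SPSS; all variables and coefficients here are synthetic).

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(42)
n = 797
# Synthetic stand-ins: two correlated 'inflexibility' predictors and an outcome.
aaq = rng.normal(size=n)
afq8 = .8 * aaq + .6 * rng.normal(size=n)
outcome = .5 * aaq + .3 * afq8 + rng.normal(size=n)

r2_prev = 0.0
steps = [("Step 1", [aaq]), ("Step 2", [aaq, afq8])]
for label, predictors in steps:
    X = sm.add_constant(np.column_stack(predictors))
    fit = sm.OLS(outcome, X).fit()
    print(f"{label}: R2 = {fit.rsquared:.2f}, delta R2 = {fit.rsquared - r2_prev:.2f}")
    r2_prev = fit.rsquared
```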

Table 4
Hierarchical Linear Regressions for Desirable Concurrent Outcomes

Outcome   Step     Predictor   β          R2    ∆R2
SHS       Step 1   AAQ-II      –.63**     .40   –
          Step 2   AAQ-II      –.40**
                   AFQ-Y8      –.26**     .43   .03
          Step 3   AAQ-II      –.43**
                   AFQ-Y8      –.42**
                   AFQ-Y17     .19**      .43   .00
AHS       Step 1   AAQ-II      –.40**     .16   –
          Step 2   AAQ-II      –.18**
                   AFQ-Y8      –.28**     .19   .03
          Step 3   AAQ-II      –.23**
                   AFQ-Y8      –.50**
                   AFQ-Y17     .28**      .20   .01
PAS       Step 1   AAQ-II      –.34**     .11   –
          Step 2   AAQ-II      –.24**
                   AFQ-Y8      –.14**     .12   .01
          Step 3   AAQ-II      –.28**
                   AFQ-Y8      –.35**
                   AFQ-Y17     .27**      .13   .01

Note. AAQ-II = Acceptance and Action Questionnaire–II; AFQ-Y8 = Avoidance and Fusion Questionnaire, 8-item version; AFQ-Y17 = Avoidance and Fusion Questionnaire, 17-item version; SHS = Subjective Happiness Scale; AHS = Adult Hope Scale; PAS = Positive Affect Scale.
** p < .01

Highlights

- Investigated the psychometric validity of responses to the AAQ-II, AFQ-Y8, and AFQ-Y17
- Responses to the AAQ-II and AFQ-Y8 demonstrated comparably good structural validity
- Responses to the AFQ-Y17 showed poorer structural validity
- Scores from all measures showed concurrent validity with mental health indicators
- Potential incremental validity is demonstrated for some but not all measures