Letter to the Editor
Rebuttal to Lyons & Israel
Introduction

Our original publication – "The Treatment Outcome Package (TOP): A multi-dimensional level of care matrix for child welfare" (Kraus, Baxter, Alexander, & Bentley, 2015) – has sparked a spirited debate on tool construction. Lyons and Israel (this volume) take issue with four of our conclusions as they relate to CANS: validity, measurement of functioning, measurement of meaningful change, and item specificity.¹ We respond to each criticism and contrast our review with the standards we used to develop the Treatment Outcome Package (TOP). We obviously think we have developed a better tool (in terms of validity, ease of use, and utility), and our analysis should be read in that context.

¹ This is a revised response letter from Lyons and Israel. The first letter was posted by this journal on its website and downloaded by a small number of readers. This rebuttal responds only to the issues raised in the current version of the publication.

1. Validity

Of primary concern, Lyons and Israel contest the quote from our original article: "there is little validity data available" on CANS (p. 173). They cite seven publications purported to document the tool's validity. We disagree. The American Psychological Association (2014) publishes standards that tool developers (especially psychologists) should follow. As reviewed in detail below, none of the Lyons-cited articles have a primary purpose of documenting these APA-defined psychometric properties. For example, none of the articles document how CANS correlates with gold standards or other established measures of the underlying CANS domains (i.e., concurrent validity).

1.1. Concurrent validity

We could find no peer-reviewed publications of CANS on this topic. However, Lyons has assessed concurrent validity between CANS and CAFAS in an APA poster presentation (Dilly & Lyons, 2003). The authors appear to conflate small but significant correlations with validity. Only four of hundreds of CANS constructs document some validity in this publication:

• School/Day Care Functioning (r = 0.63 with CAFAS School/Work Performance)
• Substance Abuse (r = 0.72 with CAFAS Substance Use)
• Danger to Self (r = 0.63 with CAFAS Self-Harm Behavior)
• Psychosis (r = 0.55 with CAFAS Thinking)
The other CANS items demonstrated poor correlations with CAFAS items. For example, the next highest correlation (r = 0.40) was between Oppositional Behavior and CAFAS Home Performance, meaning that only 16% of the variance was shared in common between these two constructs. The average reported correlation was only 0.17. Given these poor results, it is doubtful that Lyons can conclude that CANS has documented concurrent validity. Other concurrent validity testing is warranted.

By contrast, we developed and refined TOP items over twenty years of research and development, adding, modifying, and deleting items until all types of validity were established. Concurrent validity of the fourth and current iteration of TOP has been established in peer-reviewed publications (Kraus, Seligman, & Jordan, 2005; Baxter et al., 2016) comparing TOP, with excellent results, to the:

• Child Behavior Checklist (CBCL; Achenbach & Rescorla, 2001);
• Strengths and Difficulties Questionnaire (SDQ; Goodman, 1997);
• Beck Depression Inventory (BDI; Beck, Steer, & Garbin, 1988);
• Brief Symptom Inventory (BSI; Derogatis, 1975);
• MMPI-2 (Hathaway & McKinley, 1989);
• BASIS-32 (Eisen, Grob, & Klein, 1986);
• SF-36 (Ware & Sherbourne, 1992).
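As an aside for readers less familiar with this convention: the proportion of variance two measures share is the square of their correlation, so even seemingly moderate correlations imply little overlap. A brief worked example using the figures reported above:

\[
r = 0.40 \;\Rightarrow\; r^2 = 0.16 \quad (16\% \text{ shared variance}), \qquad
r = 0.17 \;\Rightarrow\; r^2 \approx 0.03 \quad (\approx 3\% \text{ shared variance}).
\]

By this yardstick, even the strongest reported CANS–CAFAS correlation (r = 0.72) leaves roughly half the variance unexplained (1 − 0.72² ≈ 0.48).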
1.2. Construct validity

Construct validity is the degree to which a test measures what it claims, or purports, to be measuring (APA, 2014). We argue that this is the most important type of validity for an assessment as it relates to level of care placement decisions. Child welfare caseworkers use a tool such as TOP or CANS to assess a child's need for a costly and potentially disruptive level-of-care placement. Caseworkers need to know that the constructs they think they are measuring are meaningful and valid (e.g., dangerousness to self or others).

A prototypical way of establishing a tool's construct validity is through confirmatory factor analysis (Brown, 2006). In 2008, Lyons agreed, acknowledging that "future research should establish the … construct validity of the CANS scales used here through exploratory and confirmatory factor analysis" (Sieracki, Leon, Miller, & Lyons, 2008, p. 806). He conducted such a study and presented the findings to the Midwestern Psychological Association; however, the results call into question the validity of CANS (Miller, Leon, & Lyons, 2007). Lyons lists three relevant conclusions:

• Equal weighting of CANS items (a central tenet of communimetric theory) was unsupported, yet this scoring methodology is still used today;
• The CANS domains identified in CANS manuals (e.g., risk behaviors and traumatic stress) did not document validity – "Lyons' six-factor model did not fit well" (p. 2) – yet Lyons still promotes the use of these constructs;
• A CANS scoring system that met construct validity standards could not be derived (i.e., GFI was only 0.81; TLI was not reported).
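For context – and not as a description of the specific models fit in Miller et al. (2007) – the Tucker–Lewis Index is conventionally computed from the chi-square statistics of the tested model and a null (independence) model:

\[
\mathrm{TLI} = \frac{\chi^2_{null}/df_{null} \;-\; \chi^2_{model}/df_{model}}{\chi^2_{null}/df_{null} \;-\; 1}.
\]

Values of roughly 0.90–0.95 or higher on indices such as the TLI and GFI are the commonly cited benchmarks for acceptable fit (cf., Brown, 2006), which is why a GFI of 0.81 falls short of conventional standards.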
Given these poor results, CANS should be revised to meet the standards that Lyons acknowledged are important. By contrast, all twenty domains measured by the various age versions of TOP have excellent construct validity, exceeding the standards that Lyons tried to achieve with CANS (Kraus, Boswell, Wright, Castonguay, & Pincus, 2010; Kraus et al., 2005).

1.3. Face validity

Face validity is the degree to which a psychological assessment appears effective in terms of its stated aims. Here we partially agree with Lyons: almost all CANS items are important constructs to measure. We disagree, however, that they can be validly assessed with just one nominal scale per construct. We are not sure what Lyons means by "field validity," a term not defined by the APA but used in Lyons' writings (cf., Chor, McClelland, Weiner, Jordan, & Lyons, 2012); we assume it is related to face validity, or is simply shorthand for the fact that people are using the scale in practice.

We also question the face validity of CANS response choices, especially the score of one (1) on a four-point scale that ranges from zero to three. Since a CANS score of one (1) can hold three different meanings, as described below, we conclude that CANS response choices are nominal rather than ordinal. CANS ratings are supposed to present the "shared vision" of the case – a consolidation of the viewpoints of all stakeholders into one rating per item (Lyons, 2009a, 2009b, p. 4). CANS users are to include all relevant parties (child, treatment teams, family, etc.) in arriving at a shared consensus for each of a hundred-plus ratings. "When agreements are not resolvable, the '1' rating (indicate watchful waiting/prevention) is recommended" (Obeid & Lyons, 2011, p. 76). This rating procedure may introduce serious error into CANS domains and total scores. A mid-point score of '1' on CANS appears to have three meanings that cannot be differentiated in research and jurisdictional analytic studies:

• A mild sub-clinical problem; or
• We cannot all agree on a score, so we'll agree to disagree and accept a score in the middle of the scale; or
• We don't have enough information yet, and need to 'watchfully wait' for more data.
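To illustrate the potential distortion with deliberately hypothetical numbers: suppose a child's total score based on clinical severity alone would be 14, but raters invoke the second or third meaning of the '1' rating on three items that would otherwise be scored zero. The total becomes

\[
14 + 3 \times 1 = 17,
\]

inflating the score by three points with no change in the child's actual needs.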
These errors may arbitrarily miscue the final score by a value of 1.0 each time a rater chooses the second or third meaning of the score. When CANS total scores are in the 16–17 range (cf., Lyons, Woltman, Martinovich, & Hancock, 2009), these errors of measurement probably distort CANS profiles. Relatively healthy children could quickly look just as disturbed as those in residential care. On the other hand, children with serious problems (e.g., those who are actively suicidal) may have raters who disagree as to whether the "real" score was a two (2) or a three (3). These children could look much healthier on CANS than they really are. Research on this unique scoring procedure is warranted.

1.4. Predictive validity

If a tool is able to reliably predict some other criterion, it is said to have predictive validity (APA, 2014). For example, do SAT tests predict future academic achievement in college? For this discussion, if an assessment tool can help caseworkers predict the level of care in which a child has the best chance of making long-term, sustainable progress in well-being, it would have predictive validity. We have documented TOP's ability to:
• predict the future performance of providers (Kraus et al., 2016); • predict child welfare placement instability (Alexander et al., submitted for publication); • identify rapid responders to treatment (Nordberg, Castonguay, Fisher, Boswell, & Kraus, 2014); • predict future congregate care placements (independent, proprietary, third-party analyses include: Beacon Health Strategies, 2008; Kelly, O'Donnell, Pelletier, & Simmons, 2008; Stelk & Berger, 2009).
The seven studies cited by Lyons and Israel are offered as evidence that "individual CANS items and composite scores have repeatedly been shown to predict behavioral health, child welfare, and juvenile justices outcomes of clinical and functional importance" (p. 1). We see little evidence within these articles to support this conclusion, and none of these articles appear to have a primary or secondary aim of establishing CANS validity:

• One (Ellis et al., 2012) is unrelated to this LOC debate, as it reports on an entirely different version of CANS than that used to make level of care recommendations (cf., Chor, McClelland, Weiner, Jordan, & Lyons, 2014; Chor et al., 2012).

• Another (Yampolskaya, Armstrong, & Vargo, 2007) is unrelated to any version of CANS or its items or composite scores. No CANS scores were used in their analyses to determine the factors associated with negative child welfare outcomes. Instead, "Based on the assessment conducted using CANS-MH and collateral information available, case workers obtain the information regarding the child's physical health and presence of emotional and behavioral problems" (pp. 1356–7). In other words, based on this chart review, two binary variables were created to represent the presence or absence of emotional and behavioral problems. As such, this study cannot be cited to support the conclusion Lyons and Israel ask readers to accept.

• Several cited studies simply look at how CANS items relate to each other during the same time period and are therefore unrelated to validity (Cordell, Snowden, & Hosier, 2016; Dunleavy & Leon, 2011). For example, in the second study the authors concluded that "difference scores, rather than Time 1 CANS-MH scores, stood out as the primary predictors of resolution" of antisocial behaviors (p. 2352). Resolution was defined as a reduction of CANS antisocial scores and was associated with reduction of other CANS scores.

• Similarly, another (Weiner, Schneider, & Lyons, 2009) documented that intake scores on CANS were correlated with change scores on the same construct (i.e., traumatic stress symptoms). This is not an established form of validity. If, by contrast, Lyons could establish that a CANS outcome score on a single construct could be predicted by intake scores from different CANS items or constructs, we would agree that this shows some predictive validity. However, even here, CANS publications suggest CANS performs poorly. For example, CANS initial scores are associated with only 5% of the variance in CANS follow-up scores (Sieracki et al., 2008). An additional 1% of the outcome variance is explained by the provider. These statistics are very low compared to the variance explained by other tools (cf., Wampold & Brown, 2005; Wampold & Imel, 2015; Saxon & Barkham, 2012). For example, the Adult TOP, which is used with older youth and parents, can account for more than 53% of outcome variance based on intake client scores (40.16%) and the provider associated with the treatment (12.93%), with additional variance explained by changes in the client's life stress and medical issues (Kraus et al., 2016); see the sketch following this list for how such variance components are conventionally defined. We contend that this is an important differentiator and related to the LOC debate at hand. Explaining outcome variance is directly related to the predictive power of an outcome tool to help establish the right level of care.

• Although the purpose of another study (Lyons, Griffin, Quintenz, Jenuwine, & Shasha, 2003) was related to whether linking children to community services was associated with better outcomes, the
partial logistic regression results reported in Table 3 (p. 1632) might have provided some evidence of CANS predictive validity. However, there are insufficient data to evaluate the contribution of CANS to the model. The most significant factor in the prediction appears to be a non-CANS item: "(for) youths who were provided with direct services from the liaison for more than one month, the re-arrest rate was only 29 percent," compared to 42% for the overall sample. Large improvement scores (not initial functioning on CANS) were also listed as accounting for a large percentage of the variance.

• Lyons and Israel cite one study that is directly related to the LOC debate at hand (Chor et al., 2014). Chor and colleagues conclude that when the CANS LOC algorithm was followed by the placement team (the concordant group), children documented the most improvement. This is a promising finding. However, since the CANS LOC algorithm does not predict concordance, we are not sure how this study provides evidence of predictive validity. What does appear clear is that when the placement team decided to ignore the LOC algorithm and place children in a higher level of care (e.g., residential), the team noted concerns about the child that were reflected in these children's higher overall CANS scores but not in the LOC algorithm. These more troubled children had more variability in outcomes and were likely harder to treat (e.g., the finding might be an artifact of non-risk-adjusted data). By contrast, children who were placed in a lower level of care showed poorer outcomes on CANS. However, since CANS rates children on a four-point scale, these low-scoring children at intake had few domains on which to document real improvement.
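For readers who want the arithmetic behind the variance-explained comparisons above, a conventional (and here deliberately simplified) two-level formulation attributes outcome variance to intake scores and to providers; the cited studies may have used more elaborate models:

\[
y_{ij} = \beta_0 + \beta_1 x_{ij} + u_j + \varepsilon_{ij}, \qquad u_j \sim N(0, \tau^2), \quad \varepsilon_{ij} \sim N(0, \sigma^2),
\]

where \(y_{ij}\) is the outcome score of child \(i\) treated by provider \(j\) and \(x_{ij}\) is the intake score. The share of variance attributable to providers is the intraclass correlation \(\tau^2/(\tau^2 + \sigma^2)\), and the share attributable to intake scores is the \(R^2\) associated with \(x_{ij}\). On this accounting, the figures above correspond to roughly 5% + 1% = 6% for CANS versus 40.16% + 12.93% ≈ 53% for the Adult TOP.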
1.5. Reliability

A frequently cited study on CANS validity (cf., Lyons, McClelland, & Jordan, 2010) is in fact a reliability study (Anderson, Lyons, Giles, Price, & Estes, 2003). Lyons has also elevated the importance of inter-rater reliability in tool construction over other reliability constructs such as inter-item consistency (Lyons, 2009a, 2009b). Anderson and colleagues report the inter-rater reliability of CANS when used as a chart-review tool by a psychologist or other trained expert. This study is offered by Lyons as an important study setting the stage for the CANS LOC algorithm (Chor et al., 2014). However, CANS is not used as a chart-review tool in the CANS LOC application in child welfare.

We therefore see several problems in citing Anderson as evidence of either CANS reliability or validity. Most importantly, we can find no studies (published or otherwise) on the reliability of CANS when used in child welfare as a team-decision-making tool, where the expert completing CANS is to solicit and negotiate a consensus rating from all participants (cf., Lyons, 2009a, 2009b, p. 4; Obeid & Lyons, 2011, p. 76). This approach is quite different from a chart-review process. This calls into question the CANS training process, which simulates the chart-review process and not the recommended team-decision-making process: "Every CANS user must have at least a bachelor's degree, complete a CANS training module based on case vignettes/records, and meet at least 0.70 reliability for annual re-certification" (Chor et al., 2014, p. 3). Indeed, Lyons chides users who do not administer the CANS according to his team-decision-making requirement: "It is always possible that someone who completes the measure fails to actually involve others in its [completion. As such, one should] ensure that all parties expect to see it and participate in its production" (Obeid & Lyons, 2011, p. 76). We conclude that the reliability of the child-welfare administration of CANS that reflects the consensus rating of all stakeholders has not been, and should be, studied.
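The certification quote above does not specify how the 0.70 reliability statistic is computed. For categorical ratings such as CANS items, one standard inter-rater index is Cohen's kappa, which corrects raw agreement for chance:

\[
\kappa = \frac{p_o - p_e}{1 - p_e},
\]

where \(p_o\) is the observed proportion of agreement between two raters and \(p_e\) is the agreement expected by chance. Whatever index is used, it has been estimated for the vignette/chart-review administration, not for the consensus-based team administration used in child welfare.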
In summary, Lyons and Israel complain that we falsely concluded that CANS has little validity data available in the published literature. We find no articles, published or otherwise, that establish CANS' validity according to the standards of the American Psychological Association.

2. Current functioning

Lyons and Israel argue that we falsely concluded that "CANS scores do not reflect a child's current level of functioning," yet they proceed to tell the reader that this is precisely how one should complete certain CANS items. They write: "CANS defines a person's functioning without interventions in place." This means that each rater is required to rate children as if they were in a hypothetical situation – what the child would look like now, with no supports or treatment. We are aware of no other tool used in health or welfare that asks raters to assess children by using their professional judgment to predict a child's behavior without supports or treatment. We also find no guidance on how the other raters (e.g., foster parents and the child him/herself) should participate in such a process. Caseworkers and clinicians do not have crystal balls. No one knows how a child would currently look deprived of his or her medication, treatments, or supports.

Similarly, we find no guidance in CANS manuals or training material that tells raters how far one should proceed in stripping away the various onion layers of treatments and supports that children receive. Is it only medications? Is it only medications and residential/institutional care? Is it all of the above as well as any individual or family psychological treatment? Does this include pastoral counseling or guidance counseling in school? Does it include a warm and supportive pediatrician? Does it include the welcome hug of a friend or one's prized transitional object? Does it mean we are to complete CANS as if there are no child welfare supports like caseworkers or foster parents in place? Wherever this onion-layer peeling stops, Lyons should provide evidence that the process yields valid and reliable results.

Lyons has called residential care a "transformational offering" (Lyons, 2014) and has championed the residential cause as the editor of the American Association of Children's Residential Centers' journal. However, we don't see how the image of residential care is improved when residential providers are encouraged to ignore how symptoms improved while children are in their facilities and, instead, to assume the child will return to his or her violent past when discharged unless the condition is "cured." Instead, we suggest that providers look at 30-, 60-, 90-day and one-year follow-up results to see which services lead to the best long-term outcomes. Let's identify these transformational offerings. These discoveries will help the children who need our services the most.

3. Norms

Lyons and Israel cry foul that we chide CANS for not having general population norms (either regional or national). They claim that CANS is unique in that it uses "non-arbitrary anchors" and does not require such psychometric data. Even if this were true for individual items that are purported to be scored based on clinical judgment as to whether treatment or immediate intervention is required, it is clearly not an accurate statement for domain scores or total scores. For example, there are nine items on the CANS-Comprehensive Behavior and Emotion subscale, resulting in scores that range from zero to twenty-seven (Praed Foundation website). In Wisconsin, three additional items were added to this subscale (Somatization, Behavioral regression, and Affect dysregulation; wcwpds.wisc.edu), yielding scores that range from zero to thirty-six.

These are arbitrary scores, the interpretation of which could be augmented by norms. Both the American Psychological Association and the Society for Psychotherapy Research recommend that outcome tools provide general population norms in order to help users differentiate between normal and abnormal scores (Horowitz, Lambert, & Strupp, 1997). We agree with Lyons and Israel that regional and frequently updated norms would improve the interpretation of both the TOP and CANS.
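A sketch of what general population norms make possible, using the standard psychometric convention rather than any procedure specific to TOP or CANS: a raw domain score \(x\) is converted to a z- or T-score against the general population mean \(\mu\) and standard deviation \(\sigma\):

\[
z = \frac{x - \mu}{\sigma}, \qquad T = 50 + 10z,
\]

so a caseworker can immediately see whether, say, a hypothetical Behavior and Emotion subscale score of 12 is typical or extreme relative to children in the general population.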
4. Ceiling effects and sensitivity to change

Lyons and Israel attempt to disprove our criticism that individual constructs like depression on CANS have serious ceiling effects by claiming that CANS is used as an outcome measure. We don't see the relationship. Each individual item on CANS (like depression) is measured by one single four-point Likert scale. We believe that it is rarely a good idea to collapse a continuous variable into an ordinal construct, as CANS does with each item. Lyons himself has hypothesized that this limits his tool's sensitivity to change: "because of the item construction, the CANS is likely less sensitive to change" (Lyons, 2004, p. 473). Sensitivity to change is arguably the most important psychometric quality of an outcome tool. Without a sensitive tool, good treatment and client progress are harder to document.

Instead of responding to our concern about individual items, Lyons and Israel answer with an unrelated discussion of total scores. But even here, CANS has limitations. For example, analysis of Massachusetts CANS scores (http://www.mass.gov/eohhs/docs/masshealth/cbhi) suggests that on nearly every CANS construct (e.g., adjustment to trauma) 20–40% of children are at the ceiling. This means that it is impossible to assess whether these children are getting worse in care. In order to show change, when little might exist, Lyons sometimes multiplies average CANS scores by 10 (cf., Lyons, 2009a, 2009b). Sometimes CANS items are averaged and sometimes they are totaled. Sometimes these different, ad hoc scoring procedures appear within the same publication (cf., Figs. II vs. III in Lyons et al., 2009). If there were not endless variants of CANS scales, this limitation might not matter. However, the lack of standardized scoring procedures makes CANS publications hard, if not impossible, to compare. Chor et al. (2012), for example, only report what they call "standardized scores." This ill-defined score transformation does not aid the analyses, but it does potentially mask the results so that other researchers will struggle to replicate the findings.

5. Development of TOP

Finally, Lyons and Israel impugn our ethics by asking why we did not make explicit our company affiliations. This charge is false: the authors' affiliations are disclosed on the first page of the primary article as well as this one. Lyons and Israel are also incorrect about independent research regarding TOP. For example, an independent group translated TOP into Norwegian and conducted concurrent and construct validity tests on a patient sample in Norway (Nordberg, Moltu, & Rabu, 2015). We have even given our competitors (like Dr. Lyons in Oklahoma) access to TOP data. For the adult population, competing researchers at Harvard University took a random sample of in-patient and out-patient clients and attempted to find a better/simpler factor structure for TOP. Instead, they confirmed the strong construct validity of TOP and its invariance across levels of care (Blais, Sinclair, & Shorey, 2007). We are proud of our extensive academic collaborations, which include independent researchers at Penn State, SUNY Albany, University of Massachusetts, Boston University, Adelphi, U Penn, Stanford, and many others.
We developed TOP to meet the standards set forth by key members of the American Psychological Association and the Society for Psychotherapy Research at their Core Battery Conference (Horowitz et al., 1997). As it relates to this reply-and-rebuttal debate on level of care tools, we believe TOP provides an alternative to the CANS approach. A few examples are noteworthy.
5.1. Multiple perspectives

The TOP approach to gathering information is quite different from that of CANS. The TOP allows respondents (whether they be caseworkers, placement families, biological families, or the youth him/herself) to provide independent and private assessments of the child's behavior. Rather than hiding disagreements and assuming there is a shared consensus, the TOP encourages each rater to accurately inform child welfare as to how they see the child, who may be behaving very differently in different contexts (school, church, home …). With little effort, teams can focus discussions on areas where there is wide disagreement or a consensus that there is a major problem. Foster parents, who often feel intimidated in meetings, may feel that they have a platform from which their voice can be heard. This approach also gives youth an opportunity to privately and independently self-report emotional problems of which others, including caseworkers, are frequently unaware. Indeed, our own research has found that 57% of a sample of adolescents who self-reported suicidal ideation on the TOP had not reported this to their caseworker, clinician, or any other adult rater.

An additional benefit of the multi-rater approach is that it can help minimize biases that occur when reporting healthcare outcomes. When outcome data are used for accountability or rate-setting goals, providers face a clear conflict of interest. A residential provider, for example, may have a financial conflict and feel the need to justify keeping a child in care in order to keep a bed filled.

5.2. Central database

Lyons and Israel tell a chilling story of a boy who was discharged from residential care and killed his grandparents. However, we believe that the solution to preventing death is not to lock up such children for long periods of time, or to make hypothetical evaluations of their functioning without treatment and supports in place. Instead, we believe the best solution is an empirical one. When is it safe to discharge children who have been violent? What supports do they need to succeed in a family-like setting? Let us not guess that these children will always be violent; let us discover, together, what works. Kids Insight centrally processes all TOP data for child welfare and juvenile justice and links these data to claims, SACWIS, and other databases to help answer these critical questions – questions that can be addressed only with large samples and longitudinal data that follow children post-discharge. For this, we believe multiple perspectives are essential: they help to control for the bias of different raters and allow us to establish which rater, or combination of raters, best predicts the outcomes that we all want for these children, including long-term outcomes like adoption.

5.3. Services

Lyons and Israel misunderstand why users pay Kids Insight a fee and who owns and controls TOP. The TOP is free to use. Dr. Kraus donated it to Kids Insight, which is a 501(c)(3) public charity. Support from the Annie E. Casey Foundation and the Duke Endowment has subsidized the creation of a data collection infrastructure that makes administering TOPs and processing results as easy as possible. For example, the web service reminds caseworkers when TOPs are to be completed, and with the click of a button every rater can be sent a link reminding them to complete a TOP. Multi-rater reports are returned instantly for the caseworker's review.
The TOP can be completed on-line, on a phone or PDA, or on paper with results faxed to Kids Insight for immediate processing and scoring.
Implied by Lyons and Israel's criticism is an undisclosed profit motive. Fees paid to Kids Insight are not royalty fees that flow to the authors. Instead, jurisdictions pay a subsidized fee for this web service and for the help of seven full-service support teams:

• Computer Support Team – customizing applications for jurisdictions and supporting installations
• Implementation and Practice Support Team – on-site training, video, knowledge management help
• Customer Service Team (toll free) – humans stay with users until problems are solved
• Jurisdiction Management Report Team – standard and customized reports to senior managers
• Research Team – publishing results and creating industry benchmarks
• Data Analysis Team – documenting the outcomes and building customized algorithms
• Financial Analysis Team – documenting fiscal savings of the investment
5.4. Alerts

Real-time reporting of results includes access to a growing list of TOP alerts, notifying caseworkers, supervisors, administrators, and clinicians that a high-risk event or negative outcome has occurred or will soon occur. By clinician and caseworker report, this process has prevented countless suicides, homicides, runaways, and other disruptions. The TOP LOC algorithm publication that sparked this current debate with Lyons and Israel is an example of a TOP alert.

The TOP is also able to identify the best providers for each child. Analysis of published adult TOP data (Kraus et al., 2016) suggests that effect size can be quadrupled by matching clients with best-matched providers. Once there is enough data on a jurisdiction (which may take 6–18 months), we supplement each TOP report with a list of best-matched providers. Led by research teams at the University of Massachusetts and the State University of New York, we are testing this process empirically. This is the first referral system in any field of health care with enough scientific support to receive federal funding for a randomized clinical trial (PCORI ID#: 150328573).

In conclusion, we believe a modified version of CANS – modified to meet the construct validity standards that all authors in this debate agree are important – will be an important contribution to the field.

Disclosures

Dr. David Kraus is the original owner and developer of TOP. In the field of child welfare, which was the focus of this publication, he licensed TOP for worldwide, perpetual use for a nominal fee of $10 to a 501(c)(3) public charity doing business as Kids Insight. Dr. Kraus is compensated by Kids Insight for consulting services provided to Kids Insight, and is employed by Outcome Referrals, Inc., a company which provides data analytical services to Kids Insight at below market rates.

References

Achenbach, T. M., & Rescorla, L. A. (2001). Manual for the ASEBA school-age forms & profiles. Burlington, VT: University of Vermont, Research Center for Children, Youth, & Families.
Alexander, P. C., Bentley, J. H., Tracy, A. J., Kraus, D. R., Baxter, E. E., Boswell, J. F., & Castonguay, L. G. (2016). Placement disruptions and the predictive validity of the treatment outcome package in child welfare. Submitted for publication.
American Psychological Association (2014). The standards for educational and psychological testing.
Anderson, R. L., Lyons, J. S., Giles, D. M., Price, J. A., & Estes, G. (2003). Examining the reliability of the child and adolescent needs and strengths-mental health (CANS-MH) scale from two perspectives: A comparison of clinician and researcher ratings. Journal of Child and Family Studies, 12, 279–289.
Baxter, E. E., Alexander, P. C., Kraus, D. R., Bentley, J. H., Boswell, J. F., & Castonguay, L. G. (2016). Concurrent validation of the child and adolescent versions of the treatment outcome package (TOP). Journal of Child and Family Studies.
Beck, A. T., Steer, R. A., & Garbin, M. G. (1988). Psychometric properties of the Beck depression inventory: Twenty-five years of evaluation. Clinical Psychology Review, 8(1), 77–100.
Blais, M. A., Sinclair, S. J., & Shorey, H. (2007). A psychometric evaluation of the treatment outcome package – TOP. Massachusetts General Hospital Psychological Evaluation and Research Laboratory.
Brown, T. A. (2006). Confirmatory factor analysis for applied research. New York: Guilford Press.
Chor, K. H. B., McClelland, G. M., Weiner, D. A., Jordan, N., & Lyons, J. S. (2012). Predicting outcomes of children in residential treatment: A comparison of a decision support algorithm and a multidisciplinary team decision model. Children and Youth Services Review, 34, 2345–2352.
Chor, K. H. B., McClelland, G. M., Weiner, D. A., Jordan, N., & Lyons, J. S. (2014). Out-of-home placement decision-making and outcomes in child welfare: A longitudinal study. Administration and Policy in Mental Health. http://dx.doi.org/10.1007/s10488-014-0545-5
Cordell, K., Snowden, L., & Hosier, L. (2016). Patterns and priorities of service need identified through the Child and Adolescent Needs and Strengths (CANS) assessment. Children and Youth Services Review, 60, 129–135.
Derogatis, L. R. (1975). Brief symptom inventory. Baltimore, MD: Clinical Psychometric Research.
Dilly, J. B., & Lyons, J. S. (2003). The validity of the child and adolescent needs and strengths assessment. Poster presented at the American Psychological Association, Toronto, Ontario.
Dunleavy, A. M., & Leon, S. C. (2011). Predictors for resolution of antisocial behavior among foster care youth receiving community-based services. Children and Youth Services Review, 33, 2347–2354.
Eisen, S. V., Grob, M. C., & Klein, A. A. (1986). BASIS: The development of a self-report measure of psychiatric inpatient evaluation. The Psychiatric Hospital, 17(4), 165–171.
Ellis, B. H., Fogler, J., Hansen, S., Forbes, P., Navalta, C. P., & Saxe, G. (2012). Trauma systems therapy: 15-month outcomes and the importance of effecting environmental change. Psychological Trauma: Theory, Research, Practice, and Policy. http://dx.doi.org/10.1037/a0025192
Goodman, R. (1997). The strengths and difficulties questionnaire: A research note. Journal of Child Psychology and Psychiatry, 38, 581–586.
Hathaway, S. R., & McKinley, J. C. (1989). Manual for administration and scoring. Minneapolis, MN: University of Minnesota Press.
Horowitz, L. M., Lambert, M. J., & Strupp, H. H. (Eds.). (1997). Measuring patient change in mood, anxiety, and personality disorders: Toward a core battery. Washington, DC: American Psychological Association Press.
Kelly, K., O'Donnell, R., Pelletier, A., & Simmons, J. (2008). Behavioral health outcomes program update. Blue Cross and Blue Shield of Massachusetts.
Kraus, D. R., Baxter, E. E., Alexander, P. C., & Bentley, J. H. (2015). The Treatment Outcome Package (TOP): A multi-dimensional level of care matrix for child welfare. Children and Youth Services Review, 57, 171–178.
Kraus, D. R., Bentley, J. H., Alexander, P. C., Boswell, J. F., Constantino, M. J., Baxter, E. E., & Castonguay, L. G. (2016, February 15). Predicting therapist effectiveness from their own practice-based evidence. Journal of Consulting and Clinical Psychology. http://dx.doi.org/10.1037/ccp0000083. Advance online publication.
Kraus, D. R., Boswell, J. F., Wright, A., Castonguay, L. G., & Pincus, A. L. (2010). Factor structure of the treatment outcome package for children. Journal of Clinical Psychology, 66, 1–14.
Kraus, D. R., Seligman, D. A., & Jordan, J. R. (2005). Validation of a behavioral health treatment outcome and assessment tool designed for naturalistic settings: The treatment outcome package. Journal of Clinical Psychology, 61, 285–314.
Lyons, J. S. (2004). Redressing the emperor: Improving the children's public mental health system. Westport, CT: Praeger Publishing.
Lyons, J. S. (2009a). Knowledge creation through total clinical outcomes management: A practice-based evidence solution to address some of the challenges of knowledge translation. Journal of the Canadian Academy of Child and Adolescent Psychiatry, 18, 37–45.
Lyons, J. S. (2009b). Communimetrics: A communication theory of measurement in human services. New York: Springer.
Lyons, J. S., Griffin, G., Quintenz, S., Jenuwine, M., & Shasha, M. (2003). Clinical and forensic outcomes from the Illinois mental health juvenile justice initiative. Psychiatric Services, 54(12), 1629–1634.
Lyons, J. S., McClelland, G., & Jordan, N. (2010). Fire setting behavior in a child welfare system: Prevalence, characteristics and co-occurring needs. Journal of Child and Family Studies, 19, 720–727.
Lyons, J. S., Woltman, H., Martinovich, Z., & Hancock, B. (2009). An outcomes perspective on residential treatment. Residential Treatment for Children & Youth, 26, 115–123.
Miller, A. S., Leon, S. C., & Lyons, J. S. (2007). The child and adolescent needs and strengths scale: Factor analytic investigations. Paper presented at the Midwestern Psychological Association annual meeting.
Nordberg, S. S., Castonguay, L. G., Fisher, A. J., Boswell, J. F., & Kraus, D. R. (2014). Validating the rapid responder construct within a practice research network. Journal of Clinical Psychology, 1–18.
Nordberg, S. S., Moltu, C., & Rabu, M. (2015). Norwegian translation and validation of a routine outcome monitoring measure: The treatment outcome package. Nordic Psychology. http://dx.doi.org/10.1080/19012276.2015.1071204
Obeid, N., & Lyons, J. S. (2011). Pre-measurement triangulation: Considerations for program evaluation in human services. Canadian Journal of Program Evaluation, 25, 59–82.
Saxon, D., & Barkham, M. (2012). Patterns of therapist variability: Therapist effects and the contribution of patient severity and risk. Journal of Consulting and Clinical Psychology, 80, 535–546.
Sieracki, J. H., Leon, S. C., Miller, S. A., & Lyons, J. S. (2008). Individual and provider effects on mental health outcomes in child welfare: A three level growth curve approach. Children and Youth Services Review, 30, 800–808.
Stelk, W., & Berger, M. (2009, June). Predictive modeling: Using TOP clinical domain items to identify adult Medicaid recipients at risk for high utilization of behavioral health services in a managed care provider network. Paper presented at the 40th annual meeting of the Society for Psychotherapy Research, Santiago, Chile.
Wampold, B. E., & Brown, G. S. (2005). Estimating variability in outcomes attributable to therapists: A naturalistic study of outcomes in managed care. Journal of Consulting and Clinical Psychology, 73, 914–923.
Wampold, B. E., & Imel, Z. E. (2015). The great psychotherapy debate: The evidence for what makes psychotherapy work (2nd ed.). New York, NY: Routledge.
Ware, J. E., & Sherbourne, C. D. (1992). The MOS 36-item short-form health survey (SF-36): I. Conceptual framework and item selection. Medical Care, 30, 473–483.
Weiner, D. A., Schneider, A., & Lyons, J. S. (2009). Evidence-based treatments for trauma among culturally diverse foster care youth: Treatment retention and outcomes. Children and Youth Services Review, 31(11), 1199–1205.
Yampolskaya, S., Armstrong, M. I., & Vargo, A. C. (2007). Factors associated with exiting and reentry into out-of-home care under community-based care in Florida. Children and Youth Services Review, 29, 1352–1367.
David R. Kraus, PhD
Kids Insight, Brookline, MA, United States
Outcome Referrals, Inc., Framingham, MA, United States
E-mail address: [email protected]

27 May 2016