Bipolar diagnosis in China: Evaluating diagnostic confidence using the Bipolarity Index


Author’s Accepted Manuscript

Bipolar diagnosis in China: Evaluating diagnostic confidence using the Bipolarity Index

Yantao Ma, Huimin Gao, Xin Yu, Tianmei Si, Gang Wang, Riru Fang, Zhening Liu, Jing Sun, Haichen Yang, Xueyi Wang, Jing Li, Yonghua Zhang, Gary Sachs

www.elsevier.com/locate/jad

PII: S0165-0327(16)30192-6
DOI: http://dx.doi.org/10.1016/j.jad.2016.05.039
Reference: JAD8253

To appear in: Journal of Affective Disorders

Received date: 8 February 2016
Revised date: 13 April 2016
Accepted date: 21 May 2016

Cite this article as: Yantao Ma, Huimin Gao, Xin Yu, Tianmei Si, Gang Wang, Riru Fang, Zhening Liu, Jing Sun, Haichen Yang, Xueyi Wang, Jing Li, Yonghua Zhang and Gary Sachs, Bipolar diagnosis in China: Evaluating diagnostic confidence using the Bipolarity Index, Journal of Affective Disorders, http://dx.doi.org/10.1016/j.jad.2016.05.039

This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting galley proof before it is published in its final citable form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

Yantao Ma 1,2,3 MD, Huimin Gao 1,2,3 MD, Xin Yu 1,2,3,* MD, Tianmei Si 1,2,3 MD, Gang Wang 4 MD, Riru Fang 5 MD, Zhening Liu 6 MD, Jing Sun 7 MD, Haichen Yang 8 MD, Xueyi Wang 9 MD, Jing Li 10 MD, Yonghua Zhang 11 MD, Gary Sachs 12,* MD

1 Peking University Sixth Hospital,
2 Peking University Institute of Mental Health,
3 Key Laboratory of Mental Health, Ministry of Health, Beijing, China;
4 Beijing An-ding Hospital, Beijing, China;
5 Shanghai Mental Health Center, Shanghai, China;
6 Hunan Xiangya Second Hospital, Changsha, Hunan province, China;
7 Nanjing Brain Hospital, Nanjing, Jiangsu province, China;
8 Shenzhen Kangning Hospital, Shenzhen, Guangdong province, China;
9 Hebei Medical University First Hospital, Shi Jiazhuang, Hebei province, China;
10 Sichuan University Huaxi Hospital Psychology Center, Chengdu, Sichuan province, China;
11 Hangzhou Seventh Hospital, Hangzhou, Jiangsu province, China;
12 Harvard University Massachusetts General Hospital, Boston, MA, USA


* Corresponding authors: Xin Yu, 1 Peking University Sixth Hospital, 2 Peking University Institute of Mental Health, 3 Key Laboratory of Mental Health, Ministry of Health (Peking University), No 51 Hai Dian Qu Hua Yuan Bei Rd., Beijing, P.R. China 100191. Tel/fax: 8610-82801999.

* Corresponding author: Gary Sachs, Massachusetts General Hospital, 50 Staniford Street, Suite 580, Boston, MA 02114, USA. Tel: 617/934-6757 fax: 617/726-6768.

Abstract

Background: Diagnosis of bipolar disorder is inherently difficult. The goal of this study was to examine the utility and psychometric properties of the Bipolarity Index (BPx) in a population of patients treated in China.

Methods: At nine Chinese health facilities participating in CAFÉ-BD, clinicians completed a standardized affective disorder evaluation for consecutive patients (N=615) with a clinical diagnosis of MDD or BPD and scored the Bipolarity Index. The investigators constructed ROC curves to determine the optimal cut-off points to discriminate subjects in three clinical diagnostic groups: bipolar disorder (BPD), major depressive disorder (MDD) and healthy (no psychiatric diagnosis) controls (HC). This study is registered with ClinicalTrials.gov, number NCT02015143.

Results: 1) The cut-off score between the MDD and BPD groups was 42.0, with a sensitivity of 0.957 and specificity of 0.881 (Z = 63.064, P < 0.001); the cut-off score between the MDD and BPD II groups was 34.0, with a sensitivity of 0.810 and specificity of 0.855 (Z = 20.174, P < 0.001); and the cut-off score between the BPD II and BPD I groups was 57.0, with a sensitivity of 0.680 and specificity of 0.772 (Z = 9.636, P < 0.001). 2) Five domains contributed to the discrimination results. State-related domains (episode characteristics and course of illness) made greater contributions than trait-related domains (age of onset, family history, and treatment response).

Limitations: The data are purely descriptive. The BPD II sample and the family history dataset were small.

Conclusions: Our findings indicate good reliability and validity for the Chinese version of the BPx, which encourages its use as a measure of diagnostic confidence for bipolar spectrum disorders. Further prospective study is necessary to determine whether the BPx is useful in identifying subgroups among MDD subjects at high risk for conversion to BPD.

Keywords: bipolar disorder; unipolar depression; bipolarity index; dimensional measurement; diagnosis

Introduction

The diagnosis of bipolar disorder is inherently difficult and its true prevalence remains a matter of scholarly debate. For patients seeking clinical services, however, the lack of a reliable diagnosis is an undeniably urgent problem. Typically, management of this need relies on the inconsistent application of screening instruments in combination with varying approaches to clinical assessment. Studies in China report varied prevalence or screening rates of BPD I and II. While the variance in rates likely reflects the use of different assessment instruments (Ma et al., 2013; Merikangas et al., 2011), clinicians seek a sound basis for their diagnosis and treatment plans. Little guidance is available, however, on best practice for diagnosis of lifetime conditions like bipolar disorder.

In 1970, Robins and Guze proposed validating psychiatric disorders based on five dimensions of illness: signs and symptoms, age of onset, course of illness, response to treatment and family history. A group of BPD experts (Sachs, 2004; Aiken et al., 2015) designed the Bipolarity Index (BPx) based on the Robins and Guze approach as a measure of diagnostic confidence. Unlike the Mood Disorder Questionnaire (Hirschfeld et al., 2000), the Hypomania Checklist (Angst et al., 2005), and the Bipolar Spectrum Diagnostic Scale (Ghaemi et al., 2005), the Bipolarity Index is neither a screening instrument nor a self-report measure. It provides a systematic process which clinicians can apply to their full assessment to judge the confidence of a bipolar diagnosis. For each of the five dimensions, the clinician obtains a score by mapping all available historical and current information about the patient onto a hierarchical scale based on the presence of elements considered characteristic of the classic (Kraepelinian) conceptualization of what is now called bipolar disorder.

The Bipolarity Index was initially validated on the same sample used to validate the MDQ and has been used in STEP-BD and 11 randomized clinical trials (please see the attached document of BPx use and translation). Psychometric results from routine clinical practice have been published from the United States (Aiken et al., 2015) and Russia (Mosolov et al., 2014).

This study reports the psychometric properties of the BPx in a Chinese population and examines the contribution of the different dimensions to identifying bipolar spectrum disorders.

Patients and methods

Subjects

Study sample: This sample of convenience consisted of consecutive outpatients and inpatients with affective disorders enrolled in the Comprehensive Assessment and Follow-up Descriptive Study on Bipolar Disorder (CAFÉ-BD) and healthy controls recruited from respondents to flyers distributed near the participating health centers. CAFÉ-BD is a collaborative study of nine health centers, which agreed to implement a set of standardized intake procedures. CAFÉ-BD includes six psychiatric hospitals (Sixth Hospital of Peking University, Beijing Anding Hospital, Shanghai Mental Health Center, Seventh Hospital in Hangzhou, Nanjing Brain Hospital, and Shenzhen Kangning Hospital) and the mental health departments of three general hospitals (First Hospital of Hebei Medical University, Psychology Center of Huaxi Hospital, Sichuan University, and Hunan Xiangya Second Hospital). The CAFÉ-BD study is registered at http://ClinicalTrials.gov (Identifier: NCT02015143). The study obtained ethical approval from the Sixth Hospital of Peking University (ethics approval [2011] # 37) and agreement from the ethics committees of all the other participating centers.

Sample accession and assessment procedure: The MDD and BPD subjects were outpatients or inpatients at a health facility affiliated with CAFÉ-BD. Healthy control subjects were recruited from respondents to flyers distributed near the participating health centers.

All subjects signed written informed consent and were initially evaluated by CAFÉ-BD-trained researchers using the MINI. During a subsequent visit, a different CAFÉ-BD investigator independently completed the ADE and scored the Bipolarity Index. The MINI diagnosis was used to assign patients to groups based on the presence or absence of a mood disorder.

Inclusion and exclusion criteria:

A. Inclusion criteria for mood disorder (MDD and BPD) patients
1) Aged 18-65 years
2) Met the DSM-IV diagnostic criteria for major depressive disorder (in any current mood state)
3) Met the DSM-IV diagnostic criteria for bipolar disorder (in any current mood state)

B. Exclusion criteria for mood disorder patients
1) Currently or previously diagnosed with schizophrenia, schizophreniform disorder, or schizoaffective disorder
2) Currently or previously diagnosed with delirium, dementia, other cognitive disorders, mental retardation (IQ < 80) or mental disorders caused by physical illness
3) Had a history of, or current, serious physical illness (such as coronary heart disease, diabetes, chronic kidney disease, active thyroid disease, or disease of the nervous system) that could be the basis for a secondary mood disorder diagnosis
4) Had undergone electroconvulsive therapy within 4 weeks before enrollment

C. Inclusion criteria for healthy controls (HC)
1) The control group subject matched a case group subject for gender, age (difference within 2 years), and level of education
2) Aged 18-65 years
3) Ability to understand the consent form and willingness to cooperate with the assessment process

D. Exclusion criteria for HC
1) Met diagnostic criteria in DSM-IV for any Axis I psychiatric disorder, including mental retardation
2) Use of antipsychotics or antidepressants, or inappropriate use of prescribed medications
3) Had a serious physical illness, including coronary heart disease, diabetes, chronic kidney disease, any active thyroid disease, or disease of the nervous system

MDD and BPD patients were recruited consecutively and classified according to their MINI diagnosis. Each site enrolled at least 62 patients (BPD N=37, MDD N=25). HC subjects were recruited after enrolment of the mood disorder sample and matched by gender, age and education to MDD and BPD subjects. Each site enrolled at least 45 HC (23 matched to BPD patients and 22 matched to MDD patients).

Sample size estimation: A diagnostic instrument is generally considered useful when the area under the curve (AUC) of the receiver-operating characteristic (ROC) curve is greater than 0.8 (Yan, 2010). The calculation assumed the same number of patients in each case group. A minimum of 203 patients was required for each of the MDD and BPD groups, and 405 subjects for the HC group, based on the sample size required for diagnostic trials as calculated using PASS software with type I error (α) and type II error (β) probabilities of 0.05. Based on our previous experience with the BRIDGE-China study, we presumed that 20% of the sample in such a multi-site study would have unusable information, so we enlarged the case sample to 244 MDD patients and 244 BPD patients. We also anticipated that the BPD II subsample might be too small (about 10% of BPD patients in a previous study), so we enlarged BPD enrolment from 244 to 320 to better test the instrument's performance in this subgroup.
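The enlargement step can be checked with simple arithmetic. The sketch below assumes that the 20% allowance was applied by multiplying the minimum of 203 by 1.2 and rounding up; the manuscript does not state the rounding rule, and the function name `inflate` is only illustrative.

```python
def inflate(n_required: int, extra_percent: int) -> int:
    """Enlarge a required sample size by extra_percent, rounding up.

    Integer arithmetic is used so the result does not depend on
    floating-point rounding.
    """
    return (n_required * (100 + extra_percent) + 99) // 100


# Minimum of 203 cases per group, enlarged by an assumed 20% allowance.
cases_per_group = inflate(203, 20)
print(cases_per_group)

# If about 10% of BPD patients are BPD II (figure quoted from a previous
# study), the expected BPD II count under each enrolment target is:
for bpd_target in (244, 320):
    print(bpd_target, bpd_target * 10 // 100)
```

Under these assumptions the enlarged case sample comes out at 244 per group, matching the figure reported above.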

Clinical diagnosis and measurement instrument

Diagnostic instrument: Mini-International Neuropsychiatric Interview (MINI): The MINI is a brief structured interview for Axis I psychiatric disorders described in DSM-IV and ICD-10. Studies have compared the reliability and validity of the MINI with the SCID-P and CIDI. The results show that the Chinese version of the MINI has good reliability and validity (Si et al., 2009).

Measurement instrument: Bipolarity Index (BPx): The Chinese revision of this domain-based scale was authorized in writing by the original instrument’s author, Gary Sachs. Our group was the first to translate the BPx into Chinese. The process included forward and back translation as well as local clinical review. The original translation into Mandarin was undertaken by two attending psychiatrists who were also proficient in English. Their initial translation included a year of modification and polishing between 2010 and 2011. Two clinical experts then proofread the translation. Finally, two other professionals, previously unacquainted with the scale, completed a back-translation, which was compared with the original version. Both the Chinese and the English versions were sent to Gary Sachs for review (Li et al., 2014).

We then used the Delphi method to evaluate the reasonableness of the BPx: the coefficient of variation for each item was < 0.25, and the coordination coefficient was 0.264 (χ2 = 118.39, P < 0.05) (Li et al., 2014). For the five domains (episode characteristics, age of onset (1st affective episode), course of illness/associated features, response to treatment, and family history), the coefficient of variation for each item was < 0.25, and the coordination coefficient was 0.223 (χ2 = 100.02, P < 0.05) (Li et al., 2014).

The 2002 version of the Affective Disorder Evaluation (ADE), developed by Gary Sachs (Sachs, 2004), was used to collect clinical data.

Data processing and statistical analysis

Data entry and organization: Microsoft SQL Server 2005 Express Edition was used to establish an online database; general patient information, the MINI diagnosis, the standardized clinical data collected with the ADE, and the BPx score for each subject were entered into the database. The internal logic of the data content was verified using a computer program, and 10% of the data were randomly selected for checking and correction. The Statistical Package for the Social Sciences (SPSS 19.0) and MedCalc 12.0 were used to analyse the data. Our data analysis also included the 87 retested subjects reported in Li’s paper (2014).

Statistical analysis method: All tests were two-sided, and a difference was considered significant if P ≤ 0.05.

Quantitative data that follow a normal distribution are described as the mean ± standard deviation. Data that do not follow a normal distribution are described as the median (Q25, Q75). Qualitative and rank data are described as frequency (structure proportion), with the statistics and specific P values listed. When comparing the three groups, normality and homogeneity of variance tests were conducted on the quantitative data, and a single-factor analysis of variance (ANOVA) was conducted if the three groups had homogeneous variance. Nonparametric tests for multiple independent samples were conducted if the three groups showed nonhomogeneous variance. A chi-square test was conducted to compare dichotomous variables between two groups. The Kruskal-Wallis test was used to analyse differences in the distribution of samples in each group, and the Brown-Mood median test was used to compare the medians. Cronbach’s α coefficient was used to evaluate the overall internal consistency of the measurement form (Cronbach, 1951). The ROC curve was used to calculate the cut-off score, sensitivity, and specificity of the BPx score, and the Z test was used to test whether the AUC differed significantly from 0.5. The contribution of the various dimensions was evaluated with discriminant analysis.
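To illustrate how a cut-off score, sensitivity, and specificity can be read off an ROC analysis, the following is a minimal sketch in plain Python. It assumes the optimal cut-off is the one maximizing Youden's J (sensitivity + specificity - 1); the manuscript does not state which criterion the statistical software applied, and the scores below are hypothetical toy values, not study data.

```python
def sens_spec(scores_pos, scores_neg, cutoff):
    """Sensitivity and specificity when 'score >= cutoff' is called positive."""
    true_pos = sum(s >= cutoff for s in scores_pos)
    true_neg = sum(s < cutoff for s in scores_neg)
    return true_pos / len(scores_pos), true_neg / len(scores_neg)


def best_cutoff(scores_pos, scores_neg):
    """Return (cutoff, sensitivity, specificity) maximizing Youden's J.

    Every observed score is tried as a candidate cut-off; ties keep the
    lowest candidate because only a strictly larger J replaces the best.
    """
    best = None
    for c in sorted(set(scores_pos) | set(scores_neg)):
        sens, spec = sens_spec(scores_pos, scores_neg, c)
        j = sens + spec - 1
        if best is None or j > best[0]:
            best = (j, c, sens, spec)
    return best[1], best[2], best[3]


# Hypothetical BPx-like scores (assumption for illustration only).
bpd_scores = [40, 45, 50, 55, 60, 70]
mdd_scores = [10, 20, 25, 30, 35, 45]

cutoff, sens, spec = best_cutoff(bpd_scores, mdd_scores)
print(cutoff, round(sens, 3), round(spec, 3))
```

Other criteria (for example, the point closest to the top-left corner of the ROC plot) could select a different cut-off on the same data.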

Quality control

In the present study, researchers who used the MINI and the ADE measurement forms were rigorously trained and achieved a satisfactory level on the post-training agreement assessment (kappa values were greater than 0.85). Two rounds of on-site quality control were conducted during the project to spot-check the completeness and logic of the information. Trained staff entered all data, with 10% of the data randomly selected for checking.

Results

The screening measurement took place between January 2012 and March 2013. Significant differences in demographic data existed between case groups. BPD subjects tended to be younger and to have a higher education level, and they were more likely to be single (Table 1).

Course of illness

The quartiles for disease duration were 1, 4 and 10 years for the MDD group; 1, 6 and 10 years for the BPD II group; and 3, 7 and 13 years for the BPD I group. There was no significant difference between the BPD II and BPD I groups (Z = -1.919, P = 0.055).

Instrument reliability

Internal reliability: For the entire form, the mean correlation coefficient among the five dimensions was 0.418; Cronbach’s α coefficient was 0.770, and the standardized Cronbach’s α coefficient was 0.811.

Split-half reliability: The Cronbach’s α values of the two halves were 0.637 and 0.604; the correlation coefficient was 0.860.

External reliability: The r-values of test-retest reliability were between 0.932 and 0.986 (MDD N = 40; BPD N = 47; Table 2).
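For readers unfamiliar with the internal-consistency statistic, Cronbach's α can be computed from per-domain scores as sketched below. This is a generic illustration of the standard formula, α = k/(k-1) × (1 - Σ item variances / variance of total score), not the SPSS procedure used in the study; the five-domain scores are invented for the example.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for k items.

    items: list of k lists, one per item (here: one per BPx domain),
    each holding one score per subject in the same subject order.
    """
    k = len(items)
    n_subjects = len(items[0])
    # Total score per subject across all items.
    totals = [sum(item[i] for item in items) for i in range(n_subjects)]
    item_variance_sum = sum(variance(item) for item in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


# Hypothetical domain scores for six subjects (not study data).
domains = [
    [20, 15, 5, 10, 20, 0],    # episode characteristics
    [15, 10, 5, 10, 20, 5],    # age of onset
    [10, 15, 5, 5, 15, 0],     # course of illness
    [10, 10, 0, 5, 20, 0],     # treatment response
    [5, 0, 0, 0, 10, 0],       # family history
]
print(round(cronbach_alpha(domains), 3))
```

Values near 1 indicate that the domains rise and fall together across subjects; a value of 0 indicates no shared variance.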


ROC curve for the instrument

HC vs. case group: The cut-off score was 2.0, the sensitivity was 0.998, the specificity was 0.978, and AUC = 0.997 (Z = 416.084, P < 0.001).
BPD vs. MDD group: The cut-off score was 42.0, the sensitivity was 0.957, the specificity was 0.881, and AUC = 0.964 (Z = 63.064, P < 0.001).
BPD II vs. MDD group: The cut-off score was 34.0, the sensitivity was 0.810, the specificity was 0.855, and AUC = 0.91 (Z = 20.174, P < 0.001).
BPD II vs. BPD I group: The cut-off score was 57.0, the sensitivity was 0.680, the specificity was 0.772, and AUC = 0.780 (Z = 9.636, P < 0.001).

The BPx data are consistent with a spectrum concept of bipolar disorder: the positive predictive value (PPV) goes down when using higher cut-off scores, and the negative predictive value (NPV) goes up when higher cut-off scores are used for classification (Table 3).

Scores of each domain of the BPx were counted. Data that do not follow a normal distribution are described with the median and quartiles (Q25, Q75) (Table 4).

Structure proportion of five domains

Structure proportion refers to the percentage of the total Bipolarity Index score contributed by each of the five domains. The structure proportion of each domain = domain score (Table 4) / BPx score (Table 4) × 100%. The score proportion of episode characteristics (EC) increased from the MDD to the BPD I group, and that of age of onset (AOO) decreased. The proportion of treatment response (TR) was lower in the MDD group. The median proportion of family history (FH) in all groups was 0 (Figure 1).
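The structure proportion formula can be sketched for a single set of domain scores as follows. The domain scores are hypothetical, "COI" is shorthand introduced here for course of illness, and the handling of a zero total is an assumption made only so the function is defined for every input.

```python
def structure_proportions(domain_scores):
    """Percentage of the total BPx score contributed by each domain.

    domain_scores: dict mapping domain name -> score.
    Returns a dict with the same keys; all zeros if the total is 0.
    """
    total = sum(domain_scores.values())
    if total == 0:
        return {name: 0.0 for name in domain_scores}
    return {name: 100.0 * score / total for name, score in domain_scores.items()}


# Hypothetical scores for one subject (not taken from Table 4).
example = {"EC": 15, "AOO": 10, "COI": 5, "TR": 10, "FH": 0}
proportions = structure_proportions(example)
for name, pct in proportions.items():
    print(f"{name}: {pct:.1f}%")
```

Because the proportions are relative to each subject's own total, a domain can hold a large share of a low total score, which is one reading of why age of onset accounts for a larger share in the MDD group.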


Discriminant analysis

The two best discriminant functions were examined. Their coefficients are shown below: a negative coefficient represents a negative correlation with the classification results, a positive coefficient represents a positive correlation with the classification results, and the absolute value of the coefficient indicates its degree of impact on the classification results. In function 1, the dimensions of episode characteristics, treatment response, and course of illness made the greatest contributions. In function 2, the dimensions of age of onset and treatment response made the greatest contributions.

Function 1 = 0.900 × Episode characteristics - 0.001 × Age of onset + 0.107 × Course of illness + 0.198 × Treatment response + 0.034 × Family history.

Function 2 = -0.292 × Episode characteristics + 0.662 × Age of onset - 0.349 × Course of illness + 0.621 × Treatment response + 0.284 × Family history.

Function 1 could explain 99.5% of the discrimination results (eigenvalue = 0.526), and function 2 could explain 0.5% of the discrimination results (eigenvalue = 0.026). The P values of both functions were smaller than 0.05 on Wilks’ lambda tests (for functions 1 through 2, Wilks’ lambda = 0.149, χ2 = 1159.663; for function 2, Wilks’ lambda = 0.975, χ2 = 15.430). Both functions were statistically significant.
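As an illustration of how the two discriminant functions combine the five domains, the sketch below evaluates them for one input vector. It assumes the reported coefficients are standardized coefficients applied to standardized (z-scored) domain scores with no constant term; the manuscript does not say whether they are standardized or raw, so the numeric outputs should not be read as clinically meaningful.

```python
# Order of domains in every tuple below.
DOMAINS = (
    "episode_characteristics",
    "age_of_onset",
    "course_of_illness",
    "treatment_response",
    "family_history",
)

# Coefficients as reported for the two discriminant functions.
FUNCTION_1 = (0.900, -0.001, 0.107, 0.198, 0.034)
FUNCTION_2 = (-0.292, 0.662, -0.349, 0.621, 0.284)


def discriminant_scores(z_scores):
    """Return (function 1, function 2) for one subject.

    z_scores: five standardized domain scores in DOMAINS order
    (standardization is an assumption of this sketch).
    """
    f1 = sum(c * z for c, z in zip(FUNCTION_1, z_scores))
    f2 = sum(c * z for c, z in zip(FUNCTION_2, z_scores))
    return f1, f2


# Hypothetical subject one standard deviation above the mean on
# episode characteristics only.
f1, f2 = discriminant_scores((1.0, 0.0, 0.0, 0.0, 0.0))
print(round(f1, 3), round(f2, 3))
```

With this input the first function moves by its episode-characteristics coefficient alone, which makes the dominance of that domain in function 1 easy to see.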

Structure matrix

The structure matrix gives the correlation coefficients between the dimensions and the discriminant functions; for each dimension, the largest absolute correlation coefficient identifies the function with which that dimension is most strongly associated. The structure matrix values for function 1 were: 0.974 for episode characteristics, 0.289 for course of illness, 0.13 for age of onset, 0.46 for treatment response, and 0.06 for family history. For function 2, the values were: -0.1 for episode characteristics, -0.3 for course of illness, 0.659 for age of onset, 0.550 for treatment response, and 0.336 for family history. The results indicated that episode characteristics and course of illness had the greatest absolute correlation with discriminant function 1, whereas age of onset, treatment response, and family history had the greatest absolute correlation with discriminant function 2.

Cross-validation of discriminant functions

Cross-validation of the discriminant functions showed that 88.1% of the cases in the cross-validation groups were correctly classified: 94.1% of the MDD cases, 67.1% of the BPD II cases, and 88.6% of the BPD I cases. This indicates that the discriminant functions had a high recognition rate (Table 5).

Discussion

In light of the difficulties inherent to diagnosing lifetime disorders such as bipolar disorder, we explored the psychometric properties of the BPx for patients treated in China. This study demonstrates that the Chinese version of the BPx is valid and reliable for the diagnosis of BPD in Chinese patients, with a Cronbach’s α coefficient of 0.77 for internal consistency, a split-half correlation of 0.86, and test-retest reliability above 0.9.

The BPx score showed higher sensitivity and specificity than reported by other investigators for the MDQ (sensitivity = 0.66, specificity = 0.88) (Yang et al., 2014) and the HCL-32 (sensitivity = 0.77, specificity = 0.62) (Chou et al., 2012). In this study, the AUC of every ROC curve was approximately 0.8 or greater, and the cut-off scores had sensitivities of approximately 0.7 or greater and specificities greater than 0.7, confirming that these cut-off scores produce acceptable results.

Previous studies have shown that the AUC of the HCL-32 was 0.69 for BPD II identification (Chu et al., 2010); the AUC, sensitivity, and specificity of the Bipolar Spectrum Diagnostic Scale (BSDS) were 0.78, 0.79 and 0.93, respectively, for identifying BPD II in a Chinese affective disorder population (Lee et al., 2013). The sensitivity and specificity of the BPx for distinguishing BPD II from MDD were significantly higher than those of the HCL-32 and comparable to those of the BSDS, while the BPx’s AUC was significantly greater than that of the BSDS; therefore, the BPx may outperform the HCL-32 and the BSDS for distinguishing BPD II from MDD.

Any apparent advantage of the BPx over the MDQ, HCL-32, and BSDS may, however, be explained by the well-recognized greater sensitivity of clinician observation relative to self-report for recognition of symptoms associated with mood elevation. Furthermore, unlike screening instruments, the BPx takes into account all available information. Confident comparison cannot be made between these instruments, since the results are likely confounded by many differences between the samples as well as important aspects of the methodology.

In terms of BPD II and BPD I identification, the AUC of the MDQ was 0.60 (Yang et al., 2014), and the AUC of the HCL-32 was 0.57 (Chu et al., 2010); the sensitivity of the BSDS was 0.69 in a Spanish study (Yang et al., 2014); and studies in the United States showed a sensitivity of 0.75 (Lish et al., 1994) and a specificity of 0.85 (Lish et al., 1994; Yang et al., 2014) in distinguishing the two groups. In this study, the BPx showed sensitivity, specificity, and AUC that were higher than those of the MDQ and HCL-32 but lower than those of the BSDS for distinguishing BPD II from BPD I.

Results from this study show the BPx to be robust for distinguishing bipolar spectrum disorder from MDD. The clinical diagnosis of BPD II is less reliable than the diagnosis of BPD I, because the diagnosis of (even current) hypomania is far less reliable than that of mania. There is therefore no gold standard for establishing a diagnosis of BPD II, and it is not surprising that the BPx is less robust in distinguishing it from BPD I or from MDD.

An interesting question relates to those subjects currently diagnosed with MDD by the MINI who have BPx scores in the range suggestive of BPD II (between the cut-offs of 34 and 42). We expect these subjects to be at higher risk of conversion to BPD. CAFÉ-BD aims to compare the rate of MDD-to-BPD conversion between MDD groups with higher vs. lower BPx scores based on prospective assessments.
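The score range discussed here can be expressed as a simple banding rule. The sketch below uses the two cut-offs reported in this study (34 for MDD vs. BPD II and 42 for MDD vs. BPD); treating scores at a cut-off as belonging to the higher band is an assumption, as are the band labels, and the function is an illustration rather than a validated decision rule.

```python
MDD_VS_BPD2_CUTOFF = 34.0  # reported cut-off, MDD vs. BPD II
MDD_VS_BPD_CUTOFF = 42.0   # reported cut-off, MDD vs. BPD


def bpx_band(score):
    """Place a BPx score in one of three bands defined by the two cut-offs.

    Boundary handling (score equal to a cut-off goes to the higher band)
    is an assumption of this sketch.
    """
    if score < MDD_VS_BPD2_CUTOFF:
        return "below BPD II cut-off"
    if score < MDD_VS_BPD_CUTOFF:
        return "BPD II-suggestive range"
    return "at or above MDD/BPD cut-off"


# Hypothetical BPx scores for subjects with a MINI diagnosis of MDD.
for score in (20, 34, 38, 42, 60):
    print(score, bpx_band(score))
```

In a prospective design, subjects with an MDD diagnosis falling in the middle band would form the higher-score group whose conversion rate is of interest.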

Contribution of dimensional approach to bipolarity recognition

The treatment response domain contributed significantly less in the MDD group than in the BPD groups, since that domain mainly evaluates the effect of mood stabilizer treatment. A low frequency of treatment with mood stabilizers in the MDD group is understandable.

The score proportion of age of onset was highest in the MDD group and significantly different from that in the BPD groups. This result is consistent with literature suggesting that early-onset MDD is a risk factor for evolving into BPD (Azorin et al., 2013; Post et al., 2014; Topor et al., 2013). In our next follow-up study we will further examine the correlation between early-onset MDD and the risk of reclassification to a diagnosis of BPD.

The family history domain contributed only a small proportion of the BPx score in the case groups and showed no significant difference among groups. Between 3.5% and 11.3% of the patients had BPD-positive first-degree relatives (FDR): 11.3% of the BPD group and 3.5% of the MDD group. More than 60% of the patients scored 0 on the family history domain, indicating a reporting flaw or a floor effect. The BRIDGE study previously reported that 19.1% of the BPD sample in China (N=105) had positive FDR compared with 7.6% of the MDD group (N=622) (Ma et al., 2013). Such a discrepancy in the rate of positive FDR with BPD is difficult to explain. The variance in positive FDR is unlikely to be a true difference between the samples, and may reflect reporting biases related to stigma, different diagnostic systems used in prior generations, or reluctance of family members to share information about psychiatric diagnosis.

Besides the sampling bias and method differences between the two studies, we note the literature showing higher rates of FDR than we observed (Perlis et al., 2006) and the correlation of family history with time of onset (Post et al., 2014; Althoff et al., 2005). Thus we interpret the median scores of 0 for family history (Table 4) as indicating one or more types of bias operating to produce under-reporting of family history in all the groups.

A truly comprehensive model is not yet available for BPD or any other mental disorder.

Both of the discriminant functions in our study were significant, indicating that the selection of the five domains was reasonable. The domains showed significantly different impacts on the recognition results and can be divided into two categories according to the structure matrix results: i) episode characteristics and course of illness and ii) age of onset, family history, and treatment response. The episode characteristics and course of illness domains are associated with symptoms and match well with current diagnostic criteria such as DSM-IV. They constitute a state-related category, which carries most of the weight in the first discriminant function, the function that explains nearly all of the discrimination. Thus the state-related category plays a key role in bipolar diagnosis. In addition, according to the second discriminant function, the domains of age of onset, family history and treatment response constitute a trait-related category, meaning that they represent lifetime characteristics of a BPD diagnosis. Although this function explains only 0.5% of the discrimination in this cross-sectional study, our next longitudinal study may be able to test its predictive value. Østergaard et al. found that 7.1% of MDD patients were diagnosed with BPD within 12 years, and the risk factors closely related to the diagnosis change included early onset, treatment resistance, and positive family history (Østergaard et al., 2014). Thesing et al. also reported that age at onset can define distinct BPD phenotypes (Thesing et al., 2015). These findings suggest that the trait-related domains should be investigated in follow-up work.

Limitations

There are important pragmatic limitations on the use of instruments like the BPx. Longitudinal aspects of the illness, such as course and response to treatment, may be absent or highly uncertain for months or years after the patient's initial presentation. Further limitations include variability in the information available for assessment and the absence of a consensus methodology for weighing the body of available data.

There are several important limitations of this study. The results are purely descriptive and based on review of subject-reported medical history collected without collateral informants. The analysis is therefore prone to bias due to the influence of recall, stigma and expectation. The BPD II sample was relatively small and probably suffers from the weakness that affects all studies of BPD II: uncertain boundaries and overlap with MDD and BPD I.

In conclusion, we have shown that the Chinese version of the BPx is a reliable diagnostic aid. i) It can be used clinically to improve diagnostic accuracy. ii) All five domains of the BPx contributed to identification of BPD. iii) BPx scores for MDD and BPD had different structures, and state-associated domains made greater contributions to the diagnosis of BPD in this cross-sectional study. The trait-related domains should be investigated in future studies.

Role of funding source

The funding source had no involvement in this study.

Disclosures and Funding Sources

This study was supported by grants from the capital health research and development of special (2011-4024-01), the capital characteristics of clinical application research (Z121107001012040) by the Beijing Municipal Science and Technology Commission, the Janssen Research Foundation, and a research grant from AstraZeneca China.

Conflicts of interest

Yantao Ma, Huimin Gao, Xin Yu, Tianmei Si, Gang Wang, Yiru Fang, Zhening Liu, Jing Sun, Haichen Yang, Xueyi Wang, Jing Li, and Yonghua Zhang declare no conflict of interest. Gary Sachs: Bracket: employee; Massachusetts General Hospital: employee; Astra-Zeneca: consultant; Merck: consultant; Otsuka: consultant; Pfizer: consultant; Sunovion: consultant, speaker/advisory board; Takeda: consultant, speaker/advisory board; Teva: consultant, speaker/advisory board. Stock shareholder: Amyris, Express Scripts, Collaborative Care Initiative. Copyright holder: Bipolarity Index.

Acknowledgments

None.

References Aiken, C.B., Weisler, R.H., Sachs, G. S., 2015. The Bipolarity Index: a clinician-rated measure of diagnostic confidence. J. Affect. Disord. 177:59-64. http://dx.doi.org/10.1016/

j.jad.2015.02.004 Althoff, R.R., Faraone, S.V., Rettew, D.C., Morley, C.P., Hudziak, J.J., 2005. Family, twin, adoption, and molecular genetic studies of juvenile bipolar disorder. Bipolar Disord. 7(6), 598-609. doi:10.1111/j.1399-5618.2005.00268.x. Angst, J., Adolfsson, R., Bennazzi, F., Gamma, A., Hantouche, E., et al., 2005. The HCL-32: Towards a self-assessment tool for hypomanic symptoms in outpatients. J. Affect. Disord. 88: 217–233. DOI: http://dx.doi.org/10.1016/j.jad.2005.05.011 Azorin, J.M., Bellivier, F., Kaladjian, A., Adida, M., Belzeaux, R., Fakra, E., Hantouche, E., Lancrenon, S., Golmard, J.L., 2013. Characteristics and profiles of bipolar I patients according to age-at-onset: findings from an admixture analysis. J. Affect. Disord. 150(3), 993-1000. Chou, C.C., Lee, I.H., Yeh, T.L., Chen, K.C., Chen, P.S., Chen, W.T.et al.. 2012. Comparison of the validity of the Chinese versions of the Hypomania Symptom Checklist-32 (HCL-32) and Mood Disorder Questionnaire (MDQ) for the detection of 20

bipolar disorder in medicated patients with major depressive disorder. Int. J. Psychiatry Clin. Pract. 16(2), 132-137. doi:10.3109/13651501.2011.644563. Chu, H., Lin, C.J., Chiang, K.J., Chen, C.H., Lu, R.B., Chou, K.R., 2010. Psychometric properties of the Chinese version of the bipolar Spectrum diagnostic scale. J. Clin. Nurs. 19(19-20), 2787-2794. doi:10.1111/j.1365-2702.2010.03390.x. Cronbach, L.J., 1951. Coefficient alpha and the internal structure of tests. Psychpmetrika. 297–334 Ghaemi, S.N., Miller, C.J., Berv, D.A., Klugman, J., Rosenquist, K.J., Pies, R.W., 2005. Sensitivity and specificity of a new bipolar spectrum diagnostic scale. J. Affect. Disord. 84 (2-3): 273–7. doi:10.1016/S0165-0327(03)00196-4. Hirschfeld, R.M., Williams, J.B., Spitzer, R.L., Calabrese, J.R., Flynn, L., et al., 2000. Development and validation of a screening instrument for bipolar spectrum disorder: the Mood Disorder Questionnaire. Am J Psychiatry 157: 1873–5. doi: 10.1176/appi.ajp.157.11.1873 Läge, D., Egli, S., Riedel, M., Strauss, A., Möller, H.J., 2011. Combining the categorical and the dimensional perspective in a diagnostic map of psychotic disorders. Eur. Arch. Psychiatry Clin. Neurosci. 261(1), 3-10. doi:10.1007/s00406-010-0125-y. Lee, D., Cha, B., Park, C.S., Kim, B.J., Lee, C.S., Lee, S., 2013. Usefulness of the combined application of the mood disorder questionnaire and bipolar Spectrum diagnostic scale in screening for bipolar disorder. Compr. Psychiatry 54(4), 334-340. doi:10.1016/j.comppsych.2012.10.002. Li, Z., Ma, Y.T., Yu, X., Dang, W.M., 2014. Investigation of applicability of bipolar index assessment form using Delphi method. Chin. J. Nerv. Ment. Dis. 40, 102-105. (In Chinese) 21

Lish, J.D., Dime-Meenan, S., Whybrow, P.C., Price, R.A., Hirschfeld, R.M., 1994. The National depressive and manic-depressive association (DMDA) survey of bipolar members. J. Affect. Disord. 31(4), 281-294. Ma, Y.T., Yu, X., Wei, J., Zheng, Y., Zhang, J.P., Mei, Q.Y., Zhang, X.B., Liu, T.B., Miao, G.D., Gao, C.G., Meng, H.Q., Xu, X.F., Tian, H.J., Sun, X.L., Liu, Y., Chen, Z.Y., Wu, W.Y., Jiang, K.D., Ji, J.L., Wang, G.J., Lin, L., 2013. Recognition validity of bipolarity specifier for bipolar disorders among patients with major depressive episode: BRIDGE-China. Chin. J. Psychiatry 46(5), 271-276. Merikangas, K.R., Jin, R., He, J.P., Kessler, R.C., Lee, S., Sampson, N.A.,Viana, M.C., Andrade, L.H., Hu, C., Karam, E.G., Ladea, M., Medina-Mora, M.E., Ono, Y., Posada-Villa, J., Sagar, R., Wells, J.E., Zarkov, Z., 2011. Prevalence and correlates of bipolar spectrum disorder in the world mental health survey initiative. Arch. Gen. Psychiatry 68(3), 241-251. doi:10.1001/archgenpsychiatry.2011.12. Mosolov, S., Ushkalova, A., Kostukova, E., Shafarenko, A., Alfimov, P., Kostyukova, A., Angst, J., 2014. Bipolar II disorder in patients with a current diagnosis of recurrent depression. Bipolar Disorders.16 (4), 389–399, 2014. doi: 10.1111/bdi.12192 Østergaard, S.D., Straszek, S., Petrides, G., Skadhede, S., Jensen, S.O., Munk-Jørgensen, P.,Nielsen, J., 2014. Risk factors for conversion from unipolar psychotic depression to bipolar disorder. Bipolar Disord. 16(2), 180-189. doi:10.1111/bdi.12152. Perlis, R.H., Brown, E., Baker, R.W., Nierenberg, A.A., 2006. Clinical features of bipolar depression versus major depressive disorder in large multicenter trials. Am. J. Psychiatry.163:225-31. doi:10.1176/appi.ajp.163.2.225

22

Post, R.M., Leverich, G.S., Kupka, R., Keck, P. Jr, McElroy, S., Altshuler, L.,Frye, M.A., Luckenbaugh, D.A., Rowe, M., Grunze, H., Suppes, T., Nolen, W.A., 2014. Increased parental history of bipolar disorder in the United States: association with early age of onset. Acta Psychiatr. Scand. 129(5), 375-382. doi:10.1111/acps.12208. Sachs, G.S., 2004. Strategies for improving treatment of bipolar disorder: integration of measurement and management. Acta Psychiatr. Scand. Suppl. 422(422), 7-17. doi:10.1111/j.1600-0447.2004.00409.x. Si, T.M., Shu, L., Dang, W.M., Se, Y.A., Chen, J.X., Dong, W.T., Kong, Q.M., Zhang, W.H., 2009. Evaluation of the reliability and validity of the Chinese version of the MiniInternational neuropsychiatric interview in patients with mental disorders. Chin. Ment. Health J. 23(7), 493-503. Thesing, C.S., Stek, M.L., Grootheest, D.S., Van de ven, P.M., Beekman, A.T., Kupka, R.W., Comijs, H.C., Dols, A., 2015. Childhood abuse, family history and stressors in older patients with bipolar disorder in relation to age at onset. J. Affect. Disord. 184:249255. http://dx.doi.org/10.1016/ j.jad.2015.05.066

Topor, D.R., Swenson, L., Hunt, J.I., Birmaher, B., Strober, M., Yen, S.,Hoeppner, B.B., Case, B.G., Hower, H., Weinstock, L.M., Ryan, N., Goldstein, B., Goldstein, T., Gill, M.K., Axelson, D., Keller, M., 2013. Manic symptoms in youth with bipolar disorder: factor analysis by age of symptom onset and current age. J. Affect. Disord. 145(3), 409-412. doi:10.1016/j.jad.2012.06.024. Yan, H., 2010, Medical Statistics, second ed. People’s Medical Publishing House, Beijing. Yang, H.C., Liu, T.B., Rong, H., Bi, J.Q., Ji, E.N., Peng, H.J.,Wang, X.P., Fang, Y.R., Yuan, C.M., Si, T.M., Lu, Z., Hu, J., Chen, Z.Y., Huang, Y., Sun, J., Li, H.C., Hu, C., 23

Zhang, J.B., Li, L.J., 2014. Evaluation of mood disorder questionnaire (MDQ) in patients with mood disorders: a multicenter trial across China. PLOS ONE 9(4), e91895. doi:10.1371/journal.pone.0091895.

Fig. 1. Proportional structure of the five BPx domains. *p < 0.05, **p < 0.01. EC, episode characteristics; AOO, age of onset; C, course of illness; TR, treatment response; FH, family history.

Table 1. Demographic data

Characteristic                      MDD (n=255)    BPD II (n=79)  BPD I (n=281)  HC (n=355)     χ2/Z    p
Age, median (Q25, Q75)              35 (28, 49)    30 (23, 37)    31 (24, 40)    33 (27, 43)    34.7    <0.001
Gender                                                                                          4.33    0.228
  Male, n (%)                       113 (44.3)     44 (55.7)      124 (44.1)     172 (48.5)
  Female, n (%)                     142 (55.7)     35 (44.3)      157 (55.9)     183 (51.5)
Education                                                                                       25.16   <0.001
  Elementary school or lower, n (%) 12 (4.7)       2 (2.5)        6 (2.1)        12 (3.4)
  Middle school, n (%)              53 (20.8)      14 (17.7)      53 (18.9)      42 (11.8)
  High school, n (%)                55 (21.6)      19 (24.1)      90 (32.0)      70 (19.7)
  Associate degree, n (%)           40 (15.7)      11 (13.9)      53 (18.9)      66 (18.6)
  Bachelor, n (%)                   82 (32.2)      29 (36.7)      61 (21.7)      136 (38.3)
  Master, n (%)                     10 (3.9)       3 (3.8)        15 (5.3)       28 (7.9)
Marital status                                                                                  17.12   0.001**
  Single, n (%)                     74 (29.0)      42 (53.2)      122 (43.4)     112 (31.5)
  Married, n (%)                    169 (66.3)     34 (43.0)      136 (48.4)     238 (67.0)
  Divorced / Widowed, n (%)         12 (4.7)       3 (3.8)        23 (8.2)       5 (1.4)
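
As a check on Table 1, the gender comparison can be recomputed from the reported counts. The sketch below assumes the χ2 statistic is an ordinary Pearson chi-square on the 2 × 4 table of gender by group (the manuscript does not state the test variant). It uses the closed-form upper-tail probability for 3 degrees of freedom so that no statistics library is needed. The function and variable names are illustrative, not taken from the study.

```python
import math

# Gender counts from Table 1; columns are MDD, BPD II, BPD I, HC.
observed = [
    [113, 44, 124, 172],  # male
    [142, 35, 157, 183],  # female
]


def pearson_chi_square(table):
    """Pearson chi-square statistic for a contingency table (list of rows)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (obs - expected) ** 2 / expected
    return stat


def chi_square_sf_df3(x):
    """Upper-tail probability of a chi-square variable with 3 degrees of freedom."""
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)


# (2 - 1) * (4 - 1) = 3 degrees of freedom for this table.
chi2 = pearson_chi_square(observed)
p_value = chi_square_sf_df3(chi2)
print(f"chi2 = {chi2:.2f}, p = {p_value:.3f}")
```

Under these assumptions the result lands close to the reported χ2 = 4.33 and p = 0.228; differences in the last digit are rounding.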

Table 2. Re-test reliability of BPx and five domains

Domain                    R        p**
Episode characteristics   0.984    <0.001
Age of onset              0.984    <0.001
Course of illness         0.932    <0.001
Treatment response        0.934    <0.001
Family history            0.945    <0.001
Total BPx score           0.986    <0.001

*p < 0.05, **p < 0.01

Table 3. Predictive values of BPx cut-off scores

Sample                  PPV (≥34)   NPV (≥34)   PPV (≥42)   NPV (≥42)   PPV (≥57)   NPV (≥57)
Full sample (n=970)     0.939       0.947       0.883       0.990       0.583       0.997
Mood disorder (n=615)   0.939       0.855       0.883       0.957       0.583       0.992
BP only (n=360)         0.978       0.202       0.943       0.329       0.683       0.797

PPV: positive predictive value; NPV: negative predictive value
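
Predictive values like those in Table 3 follow from a 2 × 2 cross-tabulation of the cut-off decision against the reference diagnosis. The sketch below shows the arithmetic at a single cut-off. The scores and diagnoses in it are invented for illustration and are not study data, and the function name is hypothetical.

```python
def predictive_values_at_cutoff(scores, is_bipolar, cutoff):
    """Return (PPV, NPV) when a score >= cutoff counts as a positive screen."""
    tp = fp = tn = fn = 0
    for score, bipolar in zip(scores, is_bipolar):
        if score >= cutoff:
            if bipolar:
                tp += 1
            else:
                fp += 1
        else:
            if bipolar:
                fn += 1
            else:
                tn += 1
    # PPV: share of positive screens that are true cases.
    ppv = tp / (tp + fp) if (tp + fp) else float("nan")
    # NPV: share of negative screens that are true non-cases.
    npv = tn / (tn + fn) if (tn + fn) else float("nan")
    return ppv, npv


# Hypothetical BPx totals and reference diagnoses (NOT study data).
example_scores = [65, 50, 40, 30, 22, 58, 45, 10]
example_is_bipolar = [True, True, True, False, False, True, False, False]

ppv, npv = predictive_values_at_cutoff(example_scores, example_is_bipolar, cutoff=42)
print(f"PPV = {ppv:.2f}, NPV = {npv:.2f}")
```

Both values depend on how common the condition is in the sample being screened, which is why Table 3 reports them separately for each sample.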

Table 4. BPx domain and total scores in groups based on MINI diagnosis

BPx domain                MDD (n=255)    BD-II (n=79)   BD-I (n=281)   HC (n=355)
Episode characteristics   2 (2, 2)       10 (10, 10)    20 (20, 20)    0 (0, 0)
Age of onset              15 (10, 15)    15 (15, 20)    15 (15, 20)    0 (0, 0)
Course of illness         5 (0, 5)       5 (5, 10)      10 (5, 20)     0 (0, 0)
Treatment response        0 (0, 0)       15 (0, 15)     15 (15, 20)    0 (0, 0)
Family history            0 (0, 2)       0 (0, 5)       0 (0, 5)       0 (0, 0)
Total BPx score           22 (15, 27)    50 (35, 55)    65 (55, 72)    0 (0, 0)

Data that do not follow a normal distribution are described with median and quartiles (Q25, Q75). Q: quartile.

Table 5. Classification results (a, b, c)

Analysis             Measure  Group   Predicted MDD    Predicted BD-II  Predicted BD-I   Total
Initial              Count    MDD     240              12               3                255
Initial              Count    BD-II   10               53               16               79
Initial              Count    BD-I    2                30               249              281
Initial              %        MDD     94.1             4.7              1.2              100.0
Initial              %        BD-II   12.7             67.1             20.3             100.0
Initial              %        BD-I    0.7              10.7             88.6             100.0
Cross-validated (a)  Count    MDD     240              12               3                255
Cross-validated (a)  Count    BD-II   10               52               17               79
Cross-validated (a)  Count    BD-I    2                30               249              281
Cross-validated (a)  %        MDD     94.1             4.7              1.2              100.0
Cross-validated (a)  %        BD-II   12.7             65.8             21.5             100.0
Cross-validated (a)  %        BD-I    0.7              10.7             88.6             100.0

a. Cross-validation was conducted only for cases included in the analysis; in the cross-validation, each case was classified using the function derived from all cases other than the one being classified.
b. Eighty-eight percent of the cases in the initial groups were correctly classified.
c. Eighty-eight percent of the cases in the cross-validation groups were correctly classified.
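
The row percentages in Table 5 and the overall rates in footnotes b and c can be recomputed from the counts. The sketch below does so in plain Python. The counts are copied from the table, while the variable and function names are illustrative.

```python
# Rows: original group; columns: predicted group. Order is MDD, BD-II, BD-I.
initial = [
    [240, 12, 3],
    [10, 53, 16],
    [2, 30, 249],
]
cross_validated = [
    [240, 12, 3],
    [10, 52, 17],
    [2, 30, 249],
]


def row_percentages(matrix):
    """Percentage of each original group assigned to each predicted group."""
    return [[round(100 * count / sum(row), 1) for count in row] for row in matrix]


def overall_accuracy(matrix):
    """Percentage of all cases on the diagonal, i.e. correctly classified."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return 100 * correct / total


initial_pct = row_percentages(initial)
cross_validated_pct = row_percentages(cross_validated)
initial_acc = overall_accuracy(initial)
cross_validated_acc = overall_accuracy(cross_validated)
print(f"initial: {initial_acc:.1f}%, cross-validated: {cross_validated_acc:.1f}%")
```

On these counts the initial and cross-validated overall rates come out at about 88.1% and 88.0%, consistent with the 88% stated in footnotes b and c.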
