A retrospective analysis to identify the factors affecting infection in patients undergoing chemotherapy


European Journal of Oncology Nursing xxx (2015) 1–7

Contents lists available at ScienceDirect

European Journal of Oncology Nursing journal homepage: www.elsevier.com/locate/ejon

Ji Hyun Park a, Hyeon-Young Kim b, Hanna Lee c, Eun Kyoung Yun d,*

a Graduate School of Public Policy & Civic Engagement, Kyung Hee University, 26 Kyunghee-daero, Dongdaemun-gu, Seoul 130-701, Republic of Korea
b Department of Nursing, Shinhan University, 30, Beolmadeul-ro 40beon-gil, Dongducheon-si, Gyeonggi-do 483-777, Republic of Korea
c College of Nursing Science, Kyung Hee University, 26 Kyunghee-daero, Dongdaemun-gu, Seoul 130-701, Republic of Korea
d College of Nursing Science and East-West Nursing Research Institute, Kyung Hee University, 26 Kyungheedae-ro, Dongdaemun-gu, Seoul 130-701, Republic of Korea

Abstract

Purpose: This study compares the performance of logistic regression and decision tree analysis for assessing the risk factors for infection in cancer patients undergoing chemotherapy.

Method: The subjects were 732 cancer patients who were receiving chemotherapy at K university hospital in Seoul, Korea. The data were collected between March 2011 and February 2013 and were processed for descriptive analysis, logistic regression and decision tree analysis using the IBM SPSS Statistics 19 and Modeler 15.1 programs.

Results: The most common risk factors for infection in cancer patients receiving chemotherapy were alkylating agents, vinca alkaloids and underlying diabetes mellitus. The logistic regression model achieved a sensitivity of 66.7% and a specificity of 88.9%. The decision tree analysis achieved a sensitivity of 55.0% and a specificity of 89.0%. The overall classification accuracy was 88.0% for the logistic regression and 87.2% for the decision tree analysis.

Conclusions: The logistic regression analysis showed a higher degree of sensitivity and classification accuracy. Therefore, logistic regression analysis is concluded to be the more effective and useful method for establishing an infection prediction model for patients undergoing chemotherapy.

Keywords: Chemotherapy; Infection; Logistic regression; Decision tree

© 2015 Elsevier Ltd. All rights reserved.

* Corresponding author. College of Nursing Science, Kyung Hee University, 26 Kyunghee-daero, Dongdaemun-gu, Seoul 130-701, Republic of Korea. Tel.: +82 2 961 2348; fax: +82 2 961 9398. E-mail address: [email protected] (E.K. Yun).

Introduction

Cancer patients undergoing chemotherapy are highly sensitive to almost any type of infection. This is particularly true for patients with neutropenia. Infections in cancer patients are associated with significant morbidity and mortality (Vento and Cainelli, 2003). In the past, between 6 and 30% of patients undergoing chemotherapy have died due to infections during the course of their treatment (Kern, 2006). Patients with infections may also experience physical and mental stress and delays in their cancer treatments. Therefore, accurate identification of the risk factors for infection is essential for the management and follow-up of cancer patients undergoing chemotherapy.

Previous studies have identified a variety of infection risk factors that are associated with chemotherapy. These risk factors include age, general condition, diabetes mellitus (DM), heart disease, kidney disease, the administration of specific cytotoxic chemotherapy drugs (including cyclophosphamide, doxorubicin, vincristine, mitomycin-C and etoposide), serum albumin under 35 g/L and elevated lactate dehydrogenase (LDH) (Intragumtornchai et al., 2000; Kim et al., 2000; Kloess et al., 1999; Lyman et al., 2003; Morrison et al., 2001; Pan et al., 2012; Voog et al., 2000; Yasufuku et al., 2013). In addition to the identification of numerous risk factors for infection, these studies have commonly reported correlations between two or more variables. However, most of the previous studies have taken a clinical approach, with a primary focus on identifying the routes of microbial invasion and the bacterial strains involved. Little research has been conducted using an integrated approach that includes a consideration of host, agent and environment characteristics.

Classifying related factors and predicting the outcomes of variables are the most challenging tasks for developing data-mining applications in medical research (Kurt et al., 2008). In recent

http://dx.doi.org/10.1016/j.ejon.2015.03.006 1462-3889/© 2015 Elsevier Ltd. All rights reserved.

Please cite this article in press as: Park, J.H., et al., A retrospective analysis to identify the factors affecting infection in patients undergoing chemotherapy, European Journal of Oncology Nursing (2015), http://dx.doi.org/10.1016/j.ejon.2015.03.006


years, various techniques for classification, comparison or prediction and several methods of data mining have been applied in biomedical studies. However, few studies have compared the performance of these various methods, particularly in relation to predicting the incidence of infection in chemotherapy patients (Delen et al., 2005; El-Solh et al., 1999; Li et al., 2012; Samanta et al., 2009; Ture et al., 2005). Logistic regression modeling has been widely used for the analysis of multivariate data involving binary responses (Bensic et al., 2005). The chi-square automatic interaction detector (CHAID), which is one form of decision tree analysis, has proved useful for solving classification problems in complex data sets. This process of analysis closely resembles medical reasoning and can help to structure understanding and enable prediction (Kurt et al., 2008). However, these various analytical methods may produce different results and thus cause uncertainty. Therefore, it is worthwhile to find the most appropriate and effective model to use for the prediction of infection in chemotherapy patients.

This study compares the performance of two classification techniques (logistic regression and decision tree analysis) for predicting the incidence of infection in cancer patients undergoing chemotherapy. The aims of this study were to describe the infection occurrence among patients receiving chemotherapy and to conduct a secondary analysis of these data by exploring the use of data-mining techniques to identify the factors that are most highly associated with infection among these patients.

Patients and methods

From March 2011 to February 2013, 866 cancer patients received chemotherapy at K university hospital in Seoul, Korea. Of these patients, 134 were excluded from the study for a variety of reasons. This was a convenience sample, chosen because it was of adequate size to explore a reasonable number of explanatory variables using multivariable analysis. Accordingly, the sample size was sufficient to support the development of models for a decision tree analysis.

There were three main types of patients that were excluded from the study: (1) patients who had incomplete information, including a lack of relevant clinical data; (2) patients diagnosed with an infection caused by an operation; and (3) patients who had received trans-catheter arterial chemoembolization for hepatocellular carcinoma. Patients with postoperative infections were excluded because these could have caused ambiguity in the analysis of the outcome of interest, i.e., infection occurring during chemotherapy. Patients who had received trans-catheter arterial chemoembolization for hepatocellular carcinoma were excluded because doxorubicin is used during trans-catheter arterial chemoembolization. This retrospective study therefore included 732 patients, of whom 97 had at least one secondary infection-related diagnosis (such as pneumonia, sepsis or bacteremia) during their chemotherapy treatment. This study was approved by the Institutional Review Board of K university hospital (No. 2013-01-117-003).

For the purposes of this study, a patient with an infection was defined as one who had at least one secondary infection diagnosis during their chemotherapy treatment at K university hospital in Seoul, Korea. Following the opinions of two hemato-oncologists, the infections considered as part of this study were limited to pneumonia, sepsis and bacteremia because these are the most important infections with respect to the clinical outcome of patients receiving chemotherapy treatment.

The variables used in this study were chosen based on the epidemiology of these diseases and the epidemiological triangle of agent, host and environment (Mausner and Kramer, 1985). This approach sees disease as the product of an interaction between an agent, a host

and the environment. Of the various variables extracted from electronic medical records, patients' socio-demographic characteristics were regarded as the host's characteristics and their clinical characteristics were classified as either agent or environmental characteristics. The socio-demographic characteristics listed in this study were gender, age, education and a history of smoking and/or drinking alcohol. The environmental factors included were the type of insurance, the hospital ward to which the patient was admitted (oncology or the general ward) and the place where the chemotherapy was administered (in an out-patient clinic or in the hospital). The clinical characteristics included cancer diagnosis, stage, operation history, underlying diseases, chemotherapy regimen and neutropenia. In terms of the chemotherapy regimen, the anti-cancer agents were classified into one of 10 regimens based on their biochemical mechanisms: 1) alkylating agents, such as Carboplatin, Cyclophosphamide, Melphalan, Oxaliplatin, Ifosfamide and Cisplatin; 2) antimetabolites, such as Capecitabine, 5-FU, Gemcitabine, Methotrexate, UFT and Pemetrexed; 3) anthracyclines (antitumor antibiotics), such as Mitomycin, Doxorubicin and Epirubicin; 4) plant alkaloids (Camptothecin), such as Irinotecan and Topotecan; 5) plant alkaloids (Epipodophyllotoxin), such as Etoposide; 6) plant alkaloids (Taxane), such as Docetaxel and Paclitaxel; 7) plant alkaloids (Vinca alkaloid), such as Vincristine and Vinorelbine; 8) miscellaneous biologic response modifiers, such as Thalidomide (oral); 9) monoclonal antibodies (Moab), such as Bevacizumab, Cetuximab, Trastuzumab and Rituximab; and 10) small molecule inhibitors, such as Bortezomib, Sorafenib, Sunitinib, Dasatinib, Gefitinib, Erlotinib, Lapatinib and Imatinib. Each of the above-mentioned factors was cross-tabulated for the group with infections and the group without infections.
Following the chi-square test or Fisher's exact test, logistic regression and decision tree (CHAID) analyses were performed to determine the factors involved in infections or chemotherapy treatment delays. Logistic regression is a common regression method for modeling a dependent variable that takes two or more categorical values. We conducted a univariate analysis on each independent variable before performing the logistic regression to prevent a distortion of the estimated results that might have been caused by categorical independent variables or multicollinearity. Then, we examined the correlations between the variables and conducted a multivariable logistic regression analysis using suitable variables. The logistic regression model for predicting the occurrence of infection was developed using backward variable elimination in the clinical baseline sample. The initial model contained the eight predictor variables for general and environmental characteristics listed in Table 1, and the six predictor variables for clinical characteristics listed in Table 2. Variables were sequentially eliminated until all of those retained in the final model were significant with P < 0.05. The predictive accuracy of the resultant model was assessed in the infection follow-up validation sample. The analysis generated regression coefficients, odds ratios, confidence intervals and Nagelkerke R² and Hosmer–Lemeshow goodness-of-fit chi-square values. The Nagelkerke R² attempts to quantify the proportion of explained variance in the logistic regression model, similar to the R² in linear regression, although the variation in a logistic regression model must be defined differently. The CHAID analysis was run in duplicate with a minimum of four subjects per parent node and three subjects per child node, and with the significance level for merging, splitting and the P value set at 0.05.
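The Nagelkerke R² described above can be written down compactly. The following is a minimal sketch, not the SPSS implementation used in the study: it assumes the standard definition (the Cox and Snell R² divided by its maximum attainable value), and the function name and the log-likelihood values in the usage lines are hypothetical.

```python
import math


def nagelkerke_r2(ll_null: float, ll_model: float, n: int) -> float:
    """Nagelkerke pseudo-R² from two log-likelihoods.

    ll_null  -- log-likelihood of the intercept-only model
    ll_model -- log-likelihood of the fitted model
    n        -- number of observations
    """
    # Cox and Snell R²: 1 - (L_null / L_model)^(2/n), written with log-likelihoods.
    cox_snell = 1.0 - math.exp(2.0 * (ll_null - ll_model) / n)
    # Upper bound of the Cox and Snell R², reached when the model fits perfectly.
    max_cox_snell = 1.0 - math.exp(2.0 * ll_null / n)
    return cox_snell / max_cox_snell


# Hypothetical log-likelihoods, for illustration only (not taken from the study).
r2_no_gain = nagelkerke_r2(ll_null=-100.0, ll_model=-100.0, n=200)
r2_partial = nagelkerke_r2(ll_null=-100.0, ll_model=-80.0, n=200)
r2_perfect = nagelkerke_r2(ll_null=-100.0, ll_model=0.0, n=200)
```

The rescaling is what lets the statistic reach 1 for a perfect fit, which the unscaled Cox and Snell value cannot do for a binary outcome.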
The IBM SPSS Statistics 19 and Modeler 15.1 (IBM, Seoul, Korea) programs were used to conduct these statistical analyses. Values are expressed as mean ± SD, and group comparisons are reported using the χ² statistic or Fisher's exact test.
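To make the χ² comparison concrete, the sketch below computes Pearson's chi-square statistic (without continuity correction) for a contingency table in plain Python. It is an illustration rather than the SPSS procedure used in the study; the function name is hypothetical, and the counts are the gender rows reported in Table 1.

```python
def pearson_chi_square(table):
    """Pearson chi-square statistic for an r x c table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of row and column variables.
            expected = row_totals[i] * col_totals[j] / grand_total
            chi2 += (observed - expected) ** 2 / expected
    return chi2


# Gender by occurrence of infection (Table 1): columns are [No, Yes].
gender_table = [
    [326, 55],  # Male
    [309, 42],  # Female
]
chi2_gender = pearson_chi_square(gender_table)
```

Rounded to three decimals, `chi2_gender` agrees with the value of 0.970 listed for gender in Table 1.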


Table 1. General and environmental characteristics by occurrence of infection (n = 732).

Characteristics   Categories           n     %      Occurrence of infection, n (%)    χ²      p
                                                    No (n = 635)    Yes (n = 97)
Gender            Male                 381   52.0   326 (51.3)      55 (56.7)         0.970   .329
                  Female               351   48.0   309 (48.7)      42 (43.3)
Age               <50                  134   18.3   123 (19.4)      11 (11.3)         6.380   .173
                  50-59                196   26.8   171 (26.9)      25 (25.8)
                  60-69                190   26.0   166 (26.1)      24 (24.7)
                  70-79                180   24.6   149 (23.5)      31 (32.0)
                  ≥80                  32    4.4    26 (4.1)        6 (6.2)
Education         ≤ Middle school      278   38.0   235 (37.0)      43 (44.3)         2.362   .307
                  High school          283   38.7   247 (38.9)      36 (37.1)
                  ≥ College            171   23.4   153 (24.1)      18 (18.6)
Smoke             Yes                  167   22.8   141 (22.2)      26 (26.8)         1.011   .363
                  None                 565   77.2   494 (77.8)      71 (73.2)
Drink             Yes                  175   23.9   149 (23.5)      26 (26.8)         0.516   .523
                  None                 557   76.1   489 (76.5)      71 (73.2)
Insurance         Health insurance     686   93.8   595 (93.7)      91 (93.8)         2.021   .364
                  Lower income group   12    1.6    9 (1.4)         3 (3.1)
                  Medical care 1st     34    4.6    31 (4.9)        3 (3.1)
Administration    Oncology ward        216   29.5   181 (28.5)      35 (36.1)         2.713   .258
                  General ward         356   48.6   311 (49.0)      45 (46.4)
                  OPD                  160   21.9   143 (22.5)      17 (17.5)
Injection place   OPD                  185   25.3   158 (24.9)      27 (27.8)         0.619   .734
                  Admission            401   54.8   348 (54.8)      53 (54.6)
                  OPD/Adm              146   19.9   129 (20.3)      17 (17.5)

Differences with P < 0.05 were considered statistically significant (Table 3). Among tree-based methods, a random forest model can show large differences in predictive performance when it is applied with random sampling or to small samples. We therefore used a single decision tree model. We thought that a comparison of the performance of these two data-mining techniques was more reasonable than judging their performance based on just one reference. Thus we compared the accuracy and predictability of the decision tree model with the accuracy and predictability of the logistic regression analysis, which is the traditional method used for prediction analysis.

Results

The study included 732 patients (381 male and 351 female) with a mean age of 61.8 ± 12.1 years. Table 1 shows the influence of the demographic factors on infection rates during chemotherapy. Out of the 732 patients, 97 (13.25%) exhibited infections. The chi-square analysis did not show any significant effects of any of the demographic factors on infection rates during chemotherapy treatment.

Among the clinical variables listed in the records, underlying disease was clearly associated with infection during chemotherapy (χ² = 11.312, p = .010; Table 2). There were statistically significant differences in the use of alkylating agents (χ² = 3.989, p = .047), vinca alkaloids (χ² = 12.989, p = .001), miscellaneous biologic response modifiers (χ² = 9.571, p = .018) and small molecule inhibitors (χ² = 5.279, p = .035).

To determine the factors affecting infection and delayed chemotherapy, a logistic regression analysis was performed. Significant differences in rates of infection were found depending on the place where chemotherapy was administered, the ward in which the patient was hospitalized during cancer treatment, the existence of underlying diseases and the chemotherapy regimen used. The logistic regression analysis also showed that the main risk factors for infection during chemotherapy were underlying DM

disease [odds ratio (OR) 2.44, 95% confidence interval (CI) 1.18 to 5.05, p = 0.016], HTN with DM (OR 2.716, 95% CI 1.22 to 6.07, p = 0.015), the use of alkylating agents (OR 1.91, 95% CI 1.16 to 3.14, p = 0.011), vinca alkaloids (OR 3.81, 95% CI 1.75 to 8.30, p = 0.001), miscellaneous biologic response modifiers (OR 14.06, 95% CI 2.26 to 87.61, p = 0.005) and small molecule inhibitors (OR 2.65, 95% CI 1.46 to 4.81, p = 0.001). Similarly, the decision tree analysis showed a significantly higher risk of infection in patients whose chemotherapy treatments included vinca alkaloids. In addition, the decision tree analysis identified underlying DM disease and an alkylating agent regimen as the greatest risk factors for infections in patients undergoing chemotherapy. The decision tree analysis showed three major paths for predicting the risk of infection. Fig. 1 shows the 4-level CHAID tree with nine nodes, of which five are terminal nodes. Four major predictor variables showed sufficient significance to be included in this model: vinca alkaloids, underlying disease, anthracyclines and an alkylating agent regimen. The first level of the tree was split into two initial branches according to whether vinca alkaloids were used. The overall prevalence of infection in the chemotherapy patients was 13.3%, but this rate was elevated to 34.4% in patients who received vinca alkaloids. As seen in the second level of the tree, underlying disease was the next predictor variable for each of the splits from the first level. In the subset of subjects with DM or a combination of DM and hypertension, the likelihood of infection was 25.3%. The use of anthracycline was the most prominent variable in the third level of the tree and was associated with a 3.5% prevalence of infection. The terminal nodes of the tree were split according to whether an alkylating agent regimen was administered. This factor elevated the likelihood of infection in chemotherapy patients to 15.6%.

To test the predictive discrimination of the whole model, we divided the data from the 732 patients into training data and testing data, using boosting with a balance mode applied to the infection and non-infection groups. The sensitivity, specificity and accuracy of the two models were compared and the results of this analysis are shown in Table 4.
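The text does not say how the balance mode resamples the two groups. As a rough illustration only, the sketch below assumes simple random oversampling of the smaller class until both classes are the same size; this is a swapped-in technique, not the Modeler procedure. The function name, the seed and the class counts (318 non-infection and 48 infection records, chosen to be in line with the training totals in Table 4) are assumptions.

```python
import random
from collections import Counter


def oversample_to_balance(labels, seed=0):
    """Return record indices in which every class is as frequent as the largest one.

    Smaller classes are padded by sampling their own indices with replacement.
    """
    rng = random.Random(seed)
    indices_by_class = {}
    for index, label in enumerate(labels):
        indices_by_class.setdefault(label, []).append(index)
    target_size = max(len(indices) for indices in indices_by_class.values())
    balanced = []
    for indices in indices_by_class.values():
        balanced.extend(indices)  # keep every original record once
        balanced.extend(rng.choices(indices, k=target_size - len(indices)))
    rng.shuffle(balanced)
    return balanced


# Hypothetical training labels: 0 = no infection, 1 = infection.
labels = [0] * 318 + [1] * 48
balanced_indices = oversample_to_balance(labels, seed=42)
balanced_counts = Counter(labels[i] for i in balanced_indices)
```

Under this assumption, accuracy on the balanced training data is not directly comparable with accuracy on unbalanced testing data, because the class proportions differ.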


Table 2. Clinical characteristics by occurrence of infection (n = 732).

Characteristics      Categories                 n     %      Occurrence of infection, n (%)    χ²       p
                                                             No (n = 635)    Yes (n = 97)
Diagnosis            Hepatocellular carcinoma   108   14.8   90 (14.2)       18 (18.6)         11.440   .781
                     Lung ca                    116   15.3   100 (15.7)      16 (16.5)
                     Breast ca                  140   19.1   124 (19.5)      16 (16.5)
                     Colon ca                   99    13.5   89 (13.9)       11 (11.3)
                     Pancreas ca                28    3.8    23 (3.6)        5 (5.2)
                     Gallbladder ca             6     0.8    5 (0.8)         1 (1.0)
                     Advanced gastric ca        111   15.2   99 (15.6)       12 (12.4)
                     Rectal ca                  55    7.5    43 (6.8)        12 (12.4)
                     Multiple myeloma           9     1.2    8 (1.3)         1 (1.0)
                     Endometrium ca             10    1.4    10 (1.6)        0
                     AOV                        3     0.4    3 (0.5)         0
                     CML                        7     1.0    5 (0.8)         2 (2.1)
                     Head & Neck ca             26    3.6    24 (3.8)        2 (2.1)
                     RCC                        2     0.3    2 (0.3)         0
                     Thymus                     1     0.1    1 (0.2)         0
                     Bladder ca                 1     0.1    1 (0.2)         2 (2.7)
                     Sarcoma                    10    1.4    9 (1.4)         1 (1.0)
Stage                1                          83    11.3   72 (11.3)       11 (11.3)         0.833    .842
                     2                          170   23.2   150 (23.6)      20 (20.6)
                     3                          198   27.0   173 (27.2)      25 (25.8)
                     4                          281   38.4   240 (37.8)      41 (42.3)
Operation history    Yes                        415   56.7   367 (57.8)      48 (49.5)         2.367    .125
                     None                       317   43.3   268 (42.2)      49 (50.5)
Neutropenia          Yes                        116   15.8   101 (15.9)      15 (15.5)         0.012    1.00
                     None                       616   84.2   534 (84.1)      82 (84.5)
Underlying disease   HTN                        89    12.2   78 (12.3)       11 (11.3)         11.312   .010**
                     DM                         45    6.1    34 (5.4)        11 (11.3)
                     HTN, DM                    34    4.6    25 (3.9)        9 (9.3)
                     None                       564   77.0   498 (78.4)      66 (68.0)
Chemotherapy         1 (Yes)                    430   58.7   364 (57.3)      66 (68.0)         3.989    .047*
                     1 (No)                     302   41.3   271 (42.7)      31 (93.2)
                     2 (Yes)                    452   61.7   391 (61.6)      61 (62.9)         .061     .824
                     2 (No)                     280   38.3   244 (38.4)      36 (37.1)
                     3 (Yes)                    133   18.2   122 (19.2)      11 (11.3)         3.507    .066
                     3 (No)                     599   81.8   513 (80.8)      86 (88.7)
                     4 (Yes)                    91    12.4   74 (11.7)       17 (17.5)         2.665    .135
                     4 (No)                     641   87.6   561 (88.3)      80 (82.5)
                     5 (Yes)                    13    1.8    9 (1.4)         4 (4.1)           3.553    .080
                     5 (No)                     719   98.2   626 (98.6)      93 (95.9)
                     6 (Yes)                    124   16.9   107 (16.9)      17 (17.5)         .027     .885
                     6 (No)                     608   83.1   528 (83.1)      80 (82.5)
                     7 (Yes)                    32    4.4    21 (3.3)        11 (11.3)         12.989   .001**
                     7 (No)                     700   98.6   614 (96.7)      86 (88.7)
                     8 (Yes)                    5     0.7    2 (0.3)         3 (3.1)           9.571    .018*
                     8 (No)                     727   99.3   633 (99.7)      94 (96.9)
                     9 (Yes)                    51    7.0    47 (7.4)        4 (4.1)           1.395    .290
                     9 (No)                     681   93.0   588 (92.6)      93 (95.9)
                     10 (Yes)                   97    13.3   77 (12.1)       20 (20.6)         5.279    .035*
                     10 (No)                    635   86.7   558 (87.9)      77 (79.4)

AOV = Ampulla of Vater; CML = Chronic myeloid leukemia; RCC = Renal cell carcinoma.
* p < .05. ** p < .01. *** p < .001.
1. Alkylating agent: Carboplatin, Cyclophosphamide, Melphalan, Oxaliplatin, Ifosfamide, Cisplatin.
2. Antimetabolite: Capecitabine, 5-FU, Gemcitabine, Methotrexate, UFT, Pemetrexed (Alimta).
3. Anthracycline (Antitumor antibiotic): Mitomycin, Doxorubicin, Epirubicin.
4. Plant alkaloid (Camptothecin): Irinotecan, Topotecan.
5. Plant alkaloid (Epipodophyllotoxin): Etoposide.
6. Plant alkaloid (Taxane): Docetaxel, Paclitaxel.
7. Plant alkaloid (Vinca alkaloid): Vincristine, Vinorelbine.
8. Miscellaneous biologic response modifiers (BRMs): Thalidomide (oral).
9. Monoclonal antibody (Moab): Bevacizumab, Cetuximab, Trastuzumab, Rituximab.
10. Small molecule inhibitor: Bortezomib, Sorafenib, Sunitinib, Dasatinib, Gefitinib, Erlotinib, Lapatinib, Imatinib.

The logistic regression of the testing data showed 88.9% specificity and 66.7% sensitivity. The decision tree for the testing data showed a specificity and sensitivity of 89.0% and 55.0%, respectively. The overall classification accuracy on the testing data was 88.0% for the logistic regression and 87.2% for the decision tree analysis. The logistic regression of the training data showed 74.5% specificity and 73.7% sensitivity. The decision tree for the training data showed a specificity and sensitivity of 90.8% and 73.2%, respectively. As for the overall classification accuracy, the logistic regression of the


Table 3. Risk factors for infection as assessed by multivariable logistic regression (n = 732).

Characteristics      Categories                    B       OR      95% CI          p
Underlying disease   HTN                           0.062   1.064   0.53 to 2.10    .858
                     DM                            0.892   2.441   1.18 to 5.05    .016
                     HTN & DM                      0.999   2.716   1.22 to 6.07    .015
Anti-cancer agent    Alkylating agent              0.65    1.91    1.16 to 3.14    .011
                     Vinca alkaloid                1.34    3.81    1.75 to 8.30    .001
                     Biologic response modifiers   2.64    14.06   2.26 to 87.61   .005
                     Small molecule inhibitor      0.97    2.65    1.46 to 4.81    .001

OR = odds ratio, 95% CI = 95% confidence interval.
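The columns of Table 3 are linked by standard identities, which the following sketch checks numerically. It assumes that each OR is exp(B), that the intervals are Wald intervals on the log-odds scale (B ± 1.96 × SE) and that the p values are two-sided Wald tests; the source does not state these details, so they are assumptions, and the function names are hypothetical. Small discrepancies are expected because the table values are rounded.

```python
import math

# (B, OR, CI lower, CI upper, p) as printed in Table 3.
TABLE_3 = {
    "HTN": (0.062, 1.064, 0.53, 2.10, 0.858),
    "DM": (0.892, 2.441, 1.18, 5.05, 0.016),
    "HTN & DM": (0.999, 2.716, 1.22, 6.07, 0.015),
    "Alkylating agent": (0.65, 1.91, 1.16, 3.14, 0.011),
    "Vinca alkaloid": (1.34, 3.81, 1.75, 8.30, 0.001),
    "Biologic response modifiers": (2.64, 14.06, 2.26, 87.61, 0.005),
    "Small molecule inhibitor": (0.97, 2.65, 1.46, 4.81, 0.001),
}


def odds_ratio(b):
    """Odds ratio implied by a logistic regression coefficient."""
    return math.exp(b)


def wald_p_from_ci(b, lower, upper, z_crit=1.96):
    """Two-sided Wald p value, with the standard error recovered from a 95% CI."""
    standard_error = (math.log(upper) - math.log(lower)) / (2.0 * z_crit)
    z = b / standard_error
    # For a standard normal Z, P(|Z| > |z|) = erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2.0))


recomputed = {
    name: (odds_ratio(b), wald_p_from_ci(b, lower, upper))
    for name, (b, _or, lower, upper, _p) in TABLE_3.items()
}
```

For example, the DM row gives an odds ratio of about 2.44 and a p value of about .016, both in line with the printed values.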

training data achieved 74.1% and the decision tree analysis achieved 79.5%. The Nagelkerke R² value for the risk of infection prediction model was 0.094, and the Hosmer–Lemeshow goodness-of-fit chi-square value was 0.967.

Discussion

The occurrence of unexpected infection frustrates the treatment of cancer patients undergoing chemotherapy. Infections lengthen hospital stays and increase medical expenses. They also cause delays in chemotherapy treatment, which can significantly affect the prognosis of the patient and even result in death. Therefore, the identification of risk factors for infection in cancer patients receiving chemotherapy is an important issue that needs to be addressed to improve the effectiveness of treatment regimes.

Our findings showed that there were no significant differences in infection risk according to the patients' general characteristics. Lyman et al. (2003) reported that patients aged over 65 had a higher risk of infection, but our study did not show similar results related to age. One potential cause for this difference in results may be the difference in the ages of the subjects. The mean age of the subjects in our study was 61.8 years and the mean age of those who experienced infections was 63.9 years. For those without infections, the mean age was 61.5 years. Further research is needed to investigate the influence of age on patients who suffer from infection during chemotherapy.

Previous studies have reported that patients with underlying diseases, such as DM or heart and kidney disease, are at high risk of infection during chemotherapy (Morrison et al., 2001; Yasufuku et al., 2013). In this study, DM was confirmed as a major risk factor for infection. Pneumonia in patients with DM can cause bacteremia and patients with this combination of conditions have high mortality rates (Koziel and Koziel, 1995). With DM as an underlying disease, the onset of complications such as pneumonia and bacteremia is common in chemotherapy patients.
Therefore, DM should be seen as one of the major influencing factors for infection among chemotherapy patients. The findings of this study were not consistent with previous research concerning the influence of neutropenia and neutropenic periods on infection in patients undergoing chemotherapy (Kim et al., 2000; Lyman et al., 2005). This difference in results may reflect the specific varieties of cancers and treatments found in our sample. Chemotherapy treatments vary for different underlying cancers and the use of strong cytotoxic anti-cancer drugs tends to increase the potential for neutropenia. Further research will be necessary to investigate this issue through studies involving statistical controls for underlying cancers. The main factors affecting infection rates as identified by the logistic regression analysis were the place of treatment, hospital ward type, underlying diseases (including DM and hypertension)


and anti-cancer agents. The main factors affecting infection as identified by the decision tree analysis were underlying DM disease and anti-cancer agents. Among the various chemotherapy agents, both the logistic regression and the decision tree analysis indicated that the use of alkylating agents and vinca alkaloids influenced infection rates. In previous studies, the administration of anthracyclines, taxanes, alkylators, topoisomerase inhibitors or combinations of more than three kinds of anti-cancer agents has been shown to be a significant predictor of infection (Lyman et al., 2005). Cytotoxic agents such as cyclophosphamide, doxorubicin, vincristine, etoposide and mitomycin-C tend to cause bone marrow suppression and a poor general condition, factors that may increase the risk of infection (Kloess et al., 1999). Cytotoxic anti-cancer chemotherapy administered with alkylating agents and vinca alkaloids can therefore be expected to increase the risk of infection. The logistic regression model showed a better overall classification accuracy (88.0%) than the proposed CHAID model. The predictive performance of the decision tree analysis was lower than that of the logistic regression, as is shown in the comparison of the results (Table 4). The sensitivity to infection and accuracy of the classification provided by the logistic regression analysis offer a firm basis for predicting the infection risks of cancer patients receiving chemotherapy. The CHAID and logistic regression models provide a comprehensive analytic framework to inform the optimal design of clinical guidelines and health policies for the prevention and management of infection in patients undergoing chemotherapy. However, each classification technique may have the potential to complement existing statistical models and to contribute to the interpretation and prediction of risk in computerized decision support systems (Kurt et al., 2008).
The results of this study identified the major environmental factors affecting the occurrence of infection and the biological properties of the main types of infection affecting chemotherapy patients. Therefore, this study has demonstrated that a multidisciplinary approach is needed for chemotherapy management and intervention. The comparison of factors influencing the risk of infection based on two methods illustrates the accuracy of the outcomes of this study and that the use of two classification methods can help increase the accuracy of the predictions regarding the occurrence of infections in chemotherapy patients. However, we also wish to emphasize the limitations of this study. First, the cancer types, stages of the patients and treatments applied to the subjects in this study were too numerous and varied to allow for definitive conclusions to be drawn about these variables. This study also had the limitations inherent in a retrospective study. Prospective research and long-term studies are needed to generalize these findings. This study has provided practical examples of comparisons between various predictive models. The test results of the models showed that the logistic regression model performed better than the decision tree analysis. The results also suggested that alkylating agents, vinca alkaloids and DM can be used as reliable indicators to predict the incidence of infections. This knowledge can help to develop patient-centered care planning for cancer patients during chemotherapy. In addition, the results of this study emphasize the importance of the management-based approach to cancer patients receiving chemotherapy. Data-mining techniques have highlighted several factors that are associated with infection from the perspective of the agent, host and environment. These results show the importance of knowledge and education for medical personnel in the place where chemotherapy is being administered and for patients' self-management. 
Accordingly, to minimize the occurrence of


Fig. 1. Decision tree.

infection in cancer patients receiving chemotherapy, patients who are being treated as outpatients, who have diabetes or who are being administered an anti-cancer agent such as an alkylating agent or vinca alkaloid will need continuous monitoring and additional management education programs.

The identification and understanding of the factors affecting infections in patients undergoing chemotherapy may provide insights that could help to develop nursing interventions to enhance the quality of nursing care and patient well-being. However, this study has limitations in representing environmental characteristics. Given that the retrospective data used in this study were collected from a single hospital, the generalizability of the findings of this study is limited. Further research is necessary to extend and supplement our understanding of the risk factors for infection in patients undergoing chemotherapy identified in this study.

Table 4 Comparison of the performance of models.

| Data classification | Group | LR: Non | LR: With | LR: Total | LR: Predicted ratio | DT: Non | DT: With | DT: Total | DT: Predicted ratio |
|---|---|---|---|---|---|---|---|---|---|
| Testing data | Non-infection | 312 | 39 | 351 | 88.9% (Specificity) | 308 | 38 | 346 | 89.0% (Specificity) |
| Testing data | With infection | 5 | 10 | 15 | 66.7% (Sensitivity) | 9 | 11 | 20 | 55.0% (Sensitivity) |
| Testing data | Total | 317 | 49 | 366 | 88.0% (Accuracy) | 317 | 49 | 366 | 87.2% (Accuracy) |
| Training data | Non-infection | 232 | 79 | 311 | 74.5% (Specificity) | 208 | 21 | 229 | 90.8% (Specificity) |
| Training data | With infection | 86 | 241 | 327 | 73.7% (Sensitivity) | 110 | 301 | 411 | 73.2% (Sensitivity) |
| Training data | Total | 318 | 320 | 638 | 74.1% (Accuracy) | 318 | 322 | 640 | 79.5% (Accuracy) |

LR: logistic regression; DT: decision tree.

Conflicts of interest

The authors have no competing interests to declare.

Please cite this article in press as: Park, J.H., et al., A retrospective analysis to identify the factors affecting infection in patients undergoing chemotherapy, European Journal of Oncology Nursing (2015), http://dx.doi.org/10.1016/j.ejon.2015.03.006