Ecological Indicators 84 (2018) 748–752
Ecological Indicators journal homepage: www.elsevier.com/locate/ecolind
Review
Ecological environmental early-warning model for strategic emerging industries in China based on logistic regression
Li-yan Sun a,c,⁎, Cheng-lin Miao a,d, Li Yang b,e
a School of Economics and Management, Anhui University of Science and Technology, Huainan City, Anhui Province, 232001, China
b School of Humanities and Social Sciences, Anhui University of Science and Technology, Huainan City, Anhui Province, 232001, China
c Ecological Environment Management, Ecological Economics, Innovation Management, China
d Innovation Management, Ecological Economics, China
e Evaluation Theory and Method, China
ARTICLE INFO
Keywords: Ecological environment; Early-warning model; Logistic regression; Strategic emerging industries

ABSTRACT
Ecological environmental early-warning can give timely warning of the ecosystem degradation and the deterioration of environmental quality caused by industrial development activities. Taking the sustainable development of strategic emerging industries as a starting point, this paper analyzes the impact of industrial activities on the ecological environment and constructs an ecological environmental early-warning indicator system. The paper defines the research objects and selects sixty listed companies as training samples and another sixty as testing samples. Through normal distribution tests and factor analysis, five indicators are chosen. The logistic regression early-warning model is constructed from the two indicators that pass the significance test. The results of the empirical analysis show that the early-warning model gives a satisfactory warning of ecological environmental status and an effective judgment on ecological environmental problems. The research can provide a basis for the sustainable and coordinated development of strategic emerging industries and the ecological environment.
⁎ Corresponding author at: School of Economics and Management, Anhui University of Science and Technology, Huainan City, Anhui Province, 232001, China.
E-mail addresses: [email protected] (L.-y. Sun), [email protected] (C.-l. Miao), [email protected] (L. Yang).
http://dx.doi.org/10.1016/j.ecolind.2017.09.036
Received 5 June 2017; Received in revised form 15 September 2017; Accepted 20 September 2017; Available online 30 September 2017
1470-160X/© 2017 Elsevier Ltd. All rights reserved.

1. Introduction

Since the international financial crisis in 2008, major developed countries have proposed accelerating the development of strategic emerging industries in order to stay at the forefront of science and technology and seize the commanding heights of future global competition. China has also released the “Decision of the State Council on speeding up the cultivation and development of strategic emerging industries”, the “13th Five-Year national strategic emerging industry development plan” and other policies to speed up the cultivation and development of these industries. China's support for strategic emerging industries focuses mainly on breakthroughs in new technology and its industrialization, especially on production chains that serve as pillars of local economic development. Strategic emerging industries play a key role in promoting the construction of ecological civilization, a green China and the sustainable development of the economy and society (Gosens and Lu, 2013; Tseng et al., 2013; Shi and Lai, 2013).

At present, the development of strategic emerging industries relies on high input and neglects the carrying capacity of the environment, resources, ecology and other natural systems. This has caused a series of ecological problems, such as excessive destruction of the ecological environment and irrational exploitation and utilization of resources, and the fragile ecological environment has become a bottleneck for the healthy development of industry (Di Battista et al., 2016; Boxall et al., 2014). Ecological environmental problems should be evaluated comprehensively and objectively in order to identify existing problems and weak links; an ecological environmental early-warning system is then built to analyze and diagnose these problems and to issue warnings. The results can provide a basis for the sustainable and coordinated development of strategic emerging industries and the ecological environment, and can help to improve the environmental management system, modernize governance, and realize the goals of energy saving, emission reduction and low-carbon economic development.

An early-warning system of the ecological environment is built to ensure the coordination of the economic, social and ecological environment systems as strategic emerging industries develop (Abramic et al., 2015). The system is formed by the mutual connection and interaction of its elements. The ecological environmental status can be monitored and detected in the process of sustainable development of strategic emerging industries (Miao et al., 2016; Drayson et al., 2015), and unstable operating states and unusual phenomena that can cause serious ecological environmental problems are identified, evaluated and analyzed (Smeaton et al., 2014; Yu-qiu and Yang, 2013). A warning is given as soon as the ecological environmental status reaches the preset warning limits (Butnariu and Avasilcai, 2014; Vuorinen et al., 2015), and effective measures are then adopted in time (Folkert, 2016).

Strategic emerging industries in China are taken as the research objects. Heavy polluting industries are taken as the abnormal companies, in which serious ecological environmental problems have occurred, and light polluting industries are chosen as the normal companies according to the principle of same number, same type and similar asset size. The ecological environmental early-warning indicators are screened through the normality test and factor analysis, and the early-warning model is constructed by the logistic regression method. Finally, predictive tests on the training samples and the testing samples are carried out to verify the validity of the early-warning model.

2. Early-warning model and indicator system

2.1. Logistic regression early-warning model

Logistic regression is mainly applied to binary and ordinal response variables; its goal is to find the conditional probability and thereby judge the status and operating risk of the observed object (Stoklosa et al., 2016; Asif and McHale, 2016). It is built on the cumulative probability function and requires neither that the independent variables follow a multivariate normal distribution nor that the two groups have equal covariance (Motrenko et al., 2014; Geng and Sakhanenko, 2016). Its parameters are estimated by the maximum likelihood method, and the probability of the response variable is then calculated. If the probability is greater than the preset cut point, the industry is judged to have a serious ecological environmental problem. Let $X_i$ be the variable of early-warning indicator $i$; formula (1) gives the regression relationship between $X_i$ and the probability $P_i$ that an ecological environmental problem occurs.

$$P_i = \frac{\exp(\alpha + \sum \beta_i X_i)}{1 + \exp(\alpha + \sum \beta_i X_i)} \tag{1}$$

Assuming

$$Y_i = \alpha + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_i X_i = \alpha + \sum \beta_i X_i \tag{2}$$

In formula (2), $Y_i$ denotes the total discrimination value, which reflects the quantitative characteristics of the $i$th indicator; $\beta_i$ is the weight, which reflects the degree of relevance of the independent variable $X_i$; $\alpha$ is a constant. Formula (1) can therefore be rewritten as formula (3):

$$P_i = \frac{\exp(Y_i)}{1 + \exp(Y_i)} \tag{3}$$

Rearranging formula (3) gives formula (4):

$$\exp(Y_i) = \frac{P_i}{1 - P_i} \tag{4}$$

Taking the natural logarithm of formula (4) gives formula (5):

$$Y_i = \ln\left(\frac{P_i}{1 - P_i}\right) \tag{5}$$

In formula (5), $P_i$ is the probability of danger, which is calculated on the basis of the linear regression model. Combining formulas (2) and (5) gives formula (6):

$$\ln\left(\frac{P_i}{1 - P_i}\right) = \alpha + \sum \beta_i X_i \tag{6}$$

The value of $Y_i = \ln\left(\frac{P_i}{1 - P_i}\right)$ can be calculated by substituting the selected variables into the regression equation, and the value of $P_i$ follows from it. The threshold value is set as the warning line: if $P_i$ is greater than the threshold, the ecological environmental problem is judged to occur; otherwise it is judged not to occur. In this way the category of each sample can be determined. The curve of the logistic regression model is sigmoidal; its maximum value approaches 1 and its minimum value approaches 0. Logistic regression is simple and convenient and imposes no specific requirements on the distribution of the variables. Therefore, the ecological environmental early-warning model in this paper is constructed by the logistic regression method.

2.2. Early-warning indicator system

The ecological environmental early-warning model is constructed from sample indicator data. All data come from listed companies of strategic emerging industries on China's Shanghai and Shenzhen A-share stock markets in 2016. Thirty heavy polluting industries are selected as the study samples. According to the principle of same number, same type and similar asset size, thirty light polluting industries are selected as the paired samples. These sixty industries form the training samples. To test the model's predictive ability, sixty listed companies of strategic emerging industries from the same period, comprising thirty heavy polluting industries and thirty light polluting industries, are selected as the testing samples on the same principle. The early-warning model is tested on the training samples and the testing samples separately to analyze its accuracy rate and error rate.

Correctly selecting representative, important indexes and constructing the early-warning index system is the key to analyzing the ecological environment status and the changing trend of strategic emerging industries. The ecological environment condition evaluation index system and the calculation method of each index are stipulated in China's Technical Criterion for Ecosystem Status Evaluation. Taking into account that the ultimate goal is to maximize comprehensive benefits under the policy of energy saving and emission reduction, the ecological environmental early-warning indicator system is constructed according to documents such as the Technical Criterion for Ecosystem Status Evaluation and the National ecological demonstration area assessment index, and to sources such as the China Ecological Safety Supervision Bureau and the Environment Protection Bureau, all of which list ecological environment indicators. The indicator system is presented in Table 1.

Table 1
Ecological environmental early-warning indicator system.

| Target layer | Level indicators | Secondary indicators |
|---|---|---|
| Ecological environmental early-warning indicator system | Industry environmental construction B1 | Cleaner new production ratio X1 |
| | | Resource reuse rate X2 |
| | | Environmental pollution treatment intensity X3 |
| | Material production B2 | Production material utilization degree X4 |
| | | Social productivity X5 |
| | | Total emissions of environmental pollutants X6 |
| | Green economic efficiency B3 | Industry GDP X7 |
| | | Environmental protection benefits X8 |
| | | Resource utilization benefit X9 |
| | Environmental carrying capacity B4 | Population carrying capacity of the environment X10 |
| | | New energy absorption capacity X11 |
| | | Environmental resource load force X12 |
| | | Occupancy rate of ecological resources X13 |
| | Industry green technology R & D B5 | Number of patents in force X14 |
| | | R & D personnel X15 |
| | | Expenditure for technology acquisition and renovation X16 |

Note: The data come from the China statistical yearbook, the China energy statistical yearbook and the statistical yearbook of China's industrial economy.

3. Ecological environmental early-warning model

Normality tests, factor analysis and logistic regression analysis are carried out on the ecological environmental early-warning indicators with the statistical software SPSS, and the early-warning model is constructed.

3.1. Data factor analysis

Factor analysis of the sixty training samples in 2016 is carried out with SPSS.

(1) Factor analysis test. To check the suitability of the data for factor analysis, the KMO test and Bartlett's test are applied to the sample data; the results are shown in Table 2.

Table 2
Kaiser-Meyer-Olkin and Bartlett's Test.

| Test | Statistic | Value |
|---|---|---|
| Kaiser-Meyer-Olkin Measure of Sampling Adequacy | | 0.520 |
| Bartlett's Test of Sphericity | Approx. Chi-Square | 513.026 |
| | df | 120 |
| | Sig. | 0.000 |

The KMO measure is 0.520, which is greater than 0.5, and the significance probability of the χ2 statistic of Bartlett's test is 0.000, far below the significance level of 0.05. Both results indicate that the sample data are suitable for factor analysis.

(2) Principal component eigenvalues and variance contribution rates. Factor analysis of the sample data on the sixteen indicators yields sixteen eigenvalues. The seven factors whose eigenvalues are greater than 1 are retained for the next step; their cumulative contribution rate reaches 81.490%, as shown in Table 3. These seven factors thus contain 81.490% of the information in the original sixteen indicators and can basically reflect the differences among them.

(3) Explaining the factor variables. To explain the initial factors clearly, the maximum variance (varimax) orthogonal rotation method is adopted to find a suitable interpretation for each factor and to avoid multicollinearity between variables. The rotated factor loading matrix is shown in Table 4.

Table 4
The orthogonal rotation of factor loadings matrix.

| Indicator | Component 1 | Component 2 | Component 3 | Component 4 | Component 5 | Component 6 | Component 7 |
|---|---|---|---|---|---|---|---|
| X1 | 0.228 | 0.583 | −0.012 | 0.444 | −0.119 | −0.335 | 0.173 |
| X2 | 0.020 | 0.049 | 0.018 | 0.075 | −0.011 | 0.854 | 0.040 |
| X3 | −0.498 | 0.018 | 0.145 | 0.387 | −0.189 | −0.486 | −0.124 |
| X4 | 0.174 | 0.132 | 0.034 | 0.855 | −0.123 | 0.126 | −0.044 |
| X5 | 0.101 | 0.402 | 0.156 | −0.109 | −0.697 | −0.276 | 0.100 |
| X6 | −0.157 | −0.365 | 0.083 | −0.709 | −0.157 | 0.075 | −0.242 |
| X7 | −0.031 | 0.814 | −0.032 | 0.067 | 0.106 | 0.216 | −0.141 |
| X8 | 0.055 | 0.081 | −0.961 | −0.015 | 0.097 | 0.040 | −0.023 |
| X9 | 0.155 | 0.261 | 0.020 | −0.221 | 0.736 | −0.123 | 0.208 |
| X10 | −0.105 | −0.093 | 0.027 | 0.095 | −0.118 | 0.065 | 0.918 |
| X11 | 0.046 | 0.024 | 0.972 | −0.021 | 0.063 | 0.023 | 0.000 |
| X12 | −0.109 | 0.809 | −0.031 | 0.216 | −0.020 | −0.055 | −0.034 |
| X13 | −0.214 | 0.041 | 0.044 | 0.037 | 0.761 | 0.002 | −0.334 |
| X14 | 0.935 | −0.098 | −0.043 | 0.011 | −0.122 | 0.000 | −0.001 |
| X15 | 0.953 | 0.080 | 0.033 | 0.133 | 0.009 | −0.005 | −0.035 |
| X16 | 0.925 | −0.016 | 0.031 | 0.240 | −0.050 | 0.041 | −0.072 |
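One way to read a rotated loading matrix such as Table 4 is to look, component by component, for the indicator with the largest absolute loading. The following is only a minimal sketch of that reading, not the SPSS procedure used in the paper: the "largest absolute loading" rule is an assumption, and the names `loadings`, `indicators` and `top_indicator` are illustrative. The numbers are copied from Table 4.

```python
# Rotated loadings from Table 4. Each list holds one component's loadings
# for the indicators X1..X16, in that order.
loadings = {
    1: [0.228, 0.020, -0.498, 0.174, 0.101, -0.157, -0.031, 0.055,
        0.155, -0.105, 0.046, -0.109, -0.214, 0.935, 0.953, 0.925],
    2: [0.583, 0.049, 0.018, 0.132, 0.402, -0.365, 0.814, 0.081,
        0.261, -0.093, 0.024, 0.809, 0.041, -0.098, 0.080, -0.016],
    3: [-0.012, 0.018, 0.145, 0.034, 0.156, 0.083, -0.032, -0.961,
        0.020, 0.027, 0.972, -0.031, 0.044, -0.043, 0.033, 0.031],
    4: [0.444, 0.075, 0.387, 0.855, -0.109, -0.709, 0.067, -0.015,
        -0.221, 0.095, -0.021, 0.216, 0.037, 0.011, 0.133, 0.240],
    5: [-0.119, -0.011, -0.189, -0.123, -0.697, -0.157, 0.106, 0.097,
        0.736, -0.118, 0.063, -0.020, 0.761, -0.122, 0.009, -0.050],
    6: [-0.335, 0.854, -0.486, 0.126, -0.276, 0.075, 0.216, 0.040,
        -0.123, 0.065, 0.023, -0.055, 0.002, 0.000, -0.005, 0.041],
    7: [0.173, 0.040, -0.124, -0.044, 0.100, -0.242, -0.141, -0.023,
        0.208, 0.918, 0.000, -0.034, -0.334, -0.001, -0.035, -0.072],
}
indicators = [f"X{k}" for k in range(1, 17)]


def top_indicator(column):
    """Return the (indicator, loading) pair with the largest absolute loading."""
    return max(zip(indicators, column), key=lambda pair: abs(pair[1]))


# Assumed rule (not stated in the paper): one representative indicator per component.
top = {component: top_indicator(column) for component, column in loadings.items()}

for component, (name, value) in top.items():
    print(f"component {component}: {name} ({value:+.3f})")
```

Under this assumed rule, components 1, 2, 3, 4 and 6 point to X15, X7, X11, X4 and X2, the five indicators that the text carries forward, while components 5 and 7 point to X13 and X10.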
Table 4 shows that the factor loading of resource reuse rate X2, which reflects the profitability of optimal resource consumption, is as high as 0.854; the loading of production material utilization degree X4, which reflects the utilization of unit production materials, is 0.855; the loading of industry GDP X7, which reflects industry benefit, is 0.814; the loading of new energy absorption capacity X11, which reflects the utilization and consumption intensity of new energy, is the largest at 0.972; and the loading of expenditure for technology acquisition and renovation X15, which reflects the influence level of industry green technology R & D, is 0.953. Therefore, X2, X4, X7, X11 and X15 are chosen as the initial variables for building the ecological environmental early-warning model.

Table 3
Principal component eigenvalue and variance contribution rate.

| Component | Initial Eigenvalues: Total | Initial: % of Variance | Initial: Cumulative % | Extraction Sums of Squared Loadings: Total | Extraction: % of Variance | Extraction: Cumulative % | Rotation Sums of Squared Loadings: Total | Rotation: % of Variance | Rotation: Cumulative % |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 3.411 | 21.317 | 21.317 | 3.411 | 21.317 | 21.317 | 3.104 | 19.397 | 19.397 |
| 2 | 2.551 | 15.946 | 37.263 | 2.551 | 15.946 | 37.263 | 2.074 | 12.960 | 32.357 |
| 3 | 2.084 | 13.028 | 50.291 | 2.084 | 13.028 | 50.291 | 1.931 | 12.066 | 44.423 |
| 4 | 1.739 | 10.867 | 61.157 | 1.739 | 10.867 | 61.157 | 1.785 | 11.158 | 55.581 |
| 5 | 1.178 | 7.365 | 68.522 | 1.178 | 7.365 | 68.522 | 1.753 | 10.958 | 66.539 |
| 6 | 1.068 | 6.674 | 75.196 | 1.068 | 6.674 | 75.196 | 1.248 | 7.801 | 74.340 |
| 7 | 1.007 | 6.294 | 81.490 | 1.007 | 6.294 | 81.490 | 1.144 | 7.150 | 81.490 |
| 8 | 0.731 | 4.571 | 86.061 | | | | | | |
| 9 | 0.562 | 3.515 | 89.577 | | | | | | |
| 10 | 0.487 | 3.046 | 92.623 | | | | | | |
| 11 | 0.417 | 2.603 | 95.226 | | | | | | |
| 12 | 0.326 | 2.038 | 97.264 | | | | | | |
| 13 | 0.214 | 1.340 | 98.604 | | | | | | |
| 14 | 0.111 | 0.692 | 99.296 | | | | | | |
| 15 | 0.085 | 0.531 | 99.827 | | | | | | |
| 16 | 0.028 | 0.173 | 100.000 | | | | | | |
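The eigenvalue rule used in step (2) can be checked against the initial eigenvalues reported in Table 3. The sketch below is illustrative only: it assumes the factor analysis was run on the correlation matrix of the sixteen standardized indicators, so that the total variance equals 16, and the variable names are not from the paper.

```python
# Initial eigenvalues as reported in Table 3 (components 1..16, descending).
eigenvalues = [3.411, 2.551, 2.084, 1.739, 1.178, 1.068, 1.007, 0.731,
               0.562, 0.487, 0.417, 0.326, 0.214, 0.111, 0.085, 0.028]

# Assumption: correlation-matrix extraction, so total variance = number of indicators.
total_variance = len(eigenvalues)

# Eigenvalue-greater-than-one rule: keep components whose eigenvalue exceeds 1.
retained = [ev for ev in eigenvalues if ev > 1.0]

# Percentage of variance per component. Because the eigenvalues are sorted in
# descending order, the retained components are the first len(retained) entries.
pct_variance = [100.0 * ev / total_variance for ev in eigenvalues]
cumulative_retained = sum(pct_variance[:len(retained)])

print(len(retained), round(cumulative_retained, 1))
```

With the rounded eigenvalues from Table 3, seven components are retained and their cumulative share is about 81.5%, in line with the 81.490% reported in the text.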
3.2. Early-warning model construction

Before the early-warning model is constructed, the discriminant critical point must be determined. The logistic regression early-warning model has no optimal split point, so the split point is chosen according to the specific objectives of the model user. For any critical point, every model is subject to two types of error: class I error, in which a heavy polluting company is mistaken for a light polluting company, and class II error, in which a light polluting company is mistaken for a heavy polluting company. When class I error decreases, class II error increases, and the costs of the two types of error are the same. Most research sets the critical point at 0.5: if the probability of an event calculated by the model is greater than or equal to 0.5, the event is judged to have happened; otherwise it is judged not to have occurred. All samples in this paper are paired, so 0.5 is selected as the critical point.

Using the statistical analysis software SPSS, logistic regression analysis is carried out on the five selected variables X2, X4, X7, X11 and X15 with the forward stepwise method; the results are shown in Table 5. The significance values in Table 5 are all below 0.05, so the overall model is significant. Table 6 shows that the Cox & Snell R Square is 0.341 and the Nagelkerke R Square is 0.455; about 60% of the explained variable can be explained by the model, and the model's goodness of fit is relatively high. Table 7 shows that two variables pass the significance test at the 0.05 level and are chosen as the variables of the final model, while the other three variables fail the test. The final logistic early-warning model is therefore constructed from only two variables, X4 and X7.

Table 5
Model Coefficients’ Significance Test.

| Step | | Chi-square | df | Sig. |
|---|---|---|---|---|
| Step 1 | Step | 14.812 | 1 | 0.000 |
| | Block | 14.812 | 1 | 0.000 |
| | Model | 14.812 | 1 | 0.000 |
| Step 2 | Step | 10.209 | 1 | 0.001 |
| | Block | 25.020 | 2 | 0.000 |
| | Model | 25.020 | 2 | 0.000 |

Note: Chi-square is the chi-square value, df is the degrees of freedom, Sig. is the significance test value.

Table 6
Model Summary table.

| Step | -2 Log likelihood | Cox & Snell R Square | Nagelkerke R Square |
|---|---|---|---|
| 1 | 68.366a | 0.219 | 0.292 |
| 2 | 58.157a | 0.341 | 0.455 |

a Estimation terminated at iteration number 5 because parameter estimates changed by less than 0.001.

Table 7
The chosen variables in the Equation.

| Step | Variable | B | S.E. | Wald | df | Sig. | Exp(B) |
|---|---|---|---|---|---|---|---|
| Step 1a | X4 | −0.015 | 0.005 | 8.682 | 1 | 0.003 | 0.985 |
| | Constant | 1.669 | 0.594 | 7.884 | 1 | 0.005 | 5.305 |
| Step 2b | X4 | −0.013 | 0.005 | 6.272 | 1 | 0.012 | 0.987 |
| | X7 | −0.026 | 0.010 | 7.151 | 1 | 0.007 | 0.974 |
| | Constant | 3.053 | 0.854 | 12.788 | 1 | 0.000 | 21.172 |

a Variable(s) entered on step 1: X4.
b Variable(s) entered on step 2: X7.

According to formula (2), $Y_i = \alpha + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_i X_i = \alpha + \sum \beta_i X_i$, the following formula is obtained:

$$Y_i = 3.053 - 0.013 X_4 - 0.026 X_7 \tag{7}$$

Substituting formula (7) into formula (3), $P_i = \frac{\exp(Y_i)}{1 + \exp(Y_i)}$, gives the ecological environmental early-warning model, formula (8):

$$P_i = \frac{\exp(3.053 - 0.013 X_4 - 0.026 X_7)}{1 + \exp(3.053 - 0.013 X_4 - 0.026 X_7)} \tag{8}$$

X4 is production material utilization degree, which reflects the utilization of unit production materials; X7 is industry GDP, which reflects the sustainable development efficiency of industry. Table 8 shows that the early-warning model achieves a good discriminant accuracy rate. The overall accuracy rate is 75.0%; the accuracy rate for light pollution is 70.0% (error rate 30.0%), and the accuracy rate for heavy pollution is 80.0% (error rate 20.0%).

Table 8
Classification forecast Table.

| Step | Observed: crisis occurs or not | Predicted: 0 | Predicted: 1 | Percentage Correct |
|---|---|---|---|---|
| Step 1 | 0 | 21 | 9 | 70.0 |
| | 1 | 9 | 21 | 70.0 |
| | Overall Percentage | | | 70.0 |
| Step 2 | 0 | 21 | 9 | 70.0 |
| | 1 | 6 | 24 | 80.0 |
| | Overall Percentage | | | 75.0 |

a The cut value is 0.500.

3.3. Test of early-warning model

The early-warning model is tested on the sixty training samples. If the value of $P_i$ is greater than or equal to 0.5, the research object is judged to fall into heavy pollution; if it is less than 0.5, the research object is judged not to fall into heavy pollution. The results are shown in Table 9. Five light polluting companies are misjudged as heavy pollution, an error rate of 16.67%, and four heavy polluting companies are misjudged as light pollution, an error rate of 13.33%. The total accuracy rate is 85%. Because the early-warning model is fitted on the training sample data, the training sample test may overestimate its early-warning accuracy. To test the accuracy further, sixty testing samples that were not used in the modeling process are classified with the same critical point of 0.5; the results are shown in Table 10. The accuracy rate is 75%, which indicates that the logistic regression early-warning model has good early-warning ability in practical applications.

4. Conclusion

The results of ecological environmental early-warning are used to
evaluate and predict the impact of the activities of strategic emerging industries on ecosystems, and to support decision-making on the rational use of resources, the improvement of the ecological environment and nature conservation as these industries drive economic development. The research objects are first defined in this paper, and sixty listed companies of strategic emerging industries, comprising thirty heavy polluting industries and thirty light polluting industries, are selected as the training samples. To test the model's predictive ability, another sixty industries from the same period are selected as the testing samples on the principle of same number, same type and similar asset size. Normal distribution tests are carried out on the early-warning indicator data to determine whether they are suitable for factor analysis. Then the logistic early-warning model is built from two variables, X4 and X7, which are screened by the forward stepwise method. Finally, the empirical analysis, in which the model is tested separately on the sixty training samples and the sixty testing samples, shows that the ecological environmental early-warning model predicts well.

In short, the rapid development of strategic emerging industries causes increasingly serious population, resource and environmental problems. Monitoring physical, chemical and biological indicators alone is not enough to understand the quality of the ecological environment. This paper analyzes the relationship between industrial development and the ecological environment and constructs an ecological environmental early-warning model that can provide an effective basis for evaluating the quality of the ecological environment and for protecting and restoring it.

Acknowledgements

The paper is supported by the National Natural Science Foundation of China (71503003, 71704002, 51774013); Anhui Province Philosophy and Social Science Planning Foundation of China (AHSKY2015D78, AHSKQ2016D26); Anhui Province Soft Science Foundation of China (1502052055); Humanities and Social Sciences Foundation of Education Department, Anhui Province, China (SK2016A0291); Anhui Province Natural Science Foundation of China (1508085QG147, 1708085QG166).

References

Abramic, Andrej, Martinez-Alzamora, Nieves, del Rio Rams, Julio Gonzalez, et al., 2015. Coastal waters environmental monitoring supported by river basin pluviometry and offshore wave data. Mar. Pollut. Bull. 92 (1–2), 80–89.
Asif, Muhammad, McHale, Ian G., 2016. In-play forecasting of win probability in one-day international cricket: a dynamic logistic regression model. Int. J. Forecast. 32 (1), 34–43.
Boxall, A.B.A., Keller, V.D.J., Straub, J.O., et al., 2014. Exploiting monitoring data in environmental exposure modelling and risk assessment of pharmaceuticals. Environ. Int. 73, 176–185.
Butnariu, Anca, Avasilcai, Silvia, 2014. Research on the possibility to apply ecological footprint as environmental performance indicator for the textile industry. Procedia – Soc. Behav. Sci. 124, 344–350.
Di Battista, Tonio, Fortuna, Francesca, Maturo, Fabrizio, 2016. Environmental monitoring through functional biodiversity tools. Ecol. Indic. 60, 237–247.
Drayson, Katherine, Wood, Graham, Thompson, Stewart, 2015. Assessing the quality of the ecological component of English Environmental Statements. J. Environ. Manage. 160, 241–253.
Folkert, de Jong, 2016. Ecological knowledge and North Sea environmental policies. Environ. Sci. Policy 55 (3), 449–455.
Geng, Pei, Sakhanenko, Lyudmila, 2016. Parameter estimation for the logistic regression model under case-control study. Stat. Probab. Lett. 109, 168–177.
Gosens, Jorrit, Lu, Yonglong, 2013. From lagging to leading? Technological innovation systems in emerging economies and the case of Chinese wind power. Energy Policy 60 (9), 234–250.
Miao, Cheng-lin, Sun, Li-yan, Yang, Li, 2016. The studies of ecological environmental quality assessment in Anhui province based on ecological footprint. Ecol. Indic. 60, 879–883.
Motrenko, Anastasiya, Strijov, Vadim, Weber, Gerhard-Wilhelm, 2014. Sample size determination for logistic regression. J. Comput. Appl. Math. 255, 743–752.
Shi, Qian, Lai, Xiaodong, 2013. Identifying the underpin of green and low carbon technology innovation research: a literature review from 1994 to 2010. Technol. Forecast. Soc. Change 80 (5), 839–864.
Smeaton, Alan F., O’Connor, Edel, Regan, Fiona, 2014. Multimedia information retrieval and environmental monitoring: shared perspectives on data fusion. Ecol. Inf. 23, 118–125.
Stoklosa, Jakub, Huang, Yih-Huei, Furlan, Elise, Hwang, Wen-Han, 2016. On quadratic logistic regression models when predictor variables are subject to measurement error. Comput. Stat. Data Anal. 95, 109–121.
Tseng, Ming-Lang, Wang, Ray, Chiu, Anthony S.F., Geng, Yong, Lin, Yuan Hsu, 2013. Improving performance of green innovation practices under uncertainty. J. Clean. Prod. 40 (2), 71–82.
Vuorinen, Ilppo, Hanninen, Jari, Rajasilta, Marjut, et al., 2015. Scenario simulations of future salinity and ecological consequences in the Baltic Sea and adjacent North Sea areas-implications for environmental monitoring. Ecol. Indic. 50, 196–205.
Yu-qiu, C.A.I., Yang, Xin., 2013. The studies of agricultural ecological environmental quality assessment problem. Ecol. Econ. 2, 174–177.

Table 9
The test results of training samples.

| Observed: crisis occurs or not | Predicted: No | Predicted: Yes | Accuracy rate (%) |
|---|---|---|---|
| No | 25 | 5 | 83.33 |
| Yes | 4 | 26 | 86.67 |
| Total | 29 | 31 | 85.00 |

Table 10
The test results of testing samples.

| Observed: crisis occurs or not | Predicted: No | Predicted: Yes | Accuracy rate (%) |
|---|---|---|---|
| No | 22 | 8 | 73.33 |
| Yes | 7 | 23 | 76.67 |
| Total | 29 | 31 | 75.00 |
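As a hedged illustration of how the early-warning rule in formula (8) and the accuracy rates in Tables 9 and 10 fit together, the sketch below evaluates the fitted model at the 0.5 critical point and recomputes the rates from the reported counts. The coefficients and counts are taken from the paper; the function names and the input values passed to them are hypothetical, since the paper does not list individual sample values or the units of X4 and X7.

```python
import math

# Coefficients of formula (7), i.e. step 2 of Table 7.
ALPHA, BETA_X4, BETA_X7 = 3.053, -0.013, -0.026
CUT = 0.5  # critical point used in the paper


def warning_probability(x4, x7):
    """Formula (8): probability of a serious ecological environmental problem."""
    y = ALPHA + BETA_X4 * x4 + BETA_X7 * x7
    # 1 / (1 + exp(-y)) is algebraically equal to exp(y) / (1 + exp(y)).
    return 1.0 / (1.0 + math.exp(-y))


def is_heavy_pollution(x4, x7, cut=CUT):
    """Classify as heavy pollution when the probability reaches the cut value."""
    return warning_probability(x4, x7) >= cut


def accuracy_rates(light_ok, light_wrong, heavy_wrong, heavy_ok):
    """Return (light, heavy, overall) accuracy in percent from observed counts."""
    light = 100.0 * light_ok / (light_ok + light_wrong)
    heavy = 100.0 * heavy_ok / (heavy_wrong + heavy_ok)
    total = light_ok + light_wrong + heavy_wrong + heavy_ok
    overall = 100.0 * (light_ok + heavy_ok) / total
    return light, heavy, overall


# Counts from Table 9 (training samples) and Table 10 (testing samples).
training = accuracy_rates(light_ok=25, light_wrong=5, heavy_wrong=4, heavy_ok=26)
testing = accuracy_rates(light_ok=22, light_wrong=8, heavy_wrong=7, heavy_ok=23)

print([round(r, 2) for r in training])
print([round(r, 2) for r in testing])

# Hypothetical indicator values, for illustration only.
print(is_heavy_pollution(x4=300.0, x7=0.0))
```

Because both coefficients are negative, larger values of X4 and X7 lower the estimated probability, so a company is flagged only while $3.053 - 0.013X_4 - 0.026X_7 \geq 0$.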