Research on regional differences and influencing factors of green technology innovation efficiency of China’s high-tech industry


Journal Pre-proof Research on regional differences and influencing factors of green technology innovation efficiency of China’s high-tech industry Chunyang Liu, Xingyu Gao, Wanli Ma, Xiangtuo Chen

PII: S0377-0427(19)30602-8
DOI: https://doi.org/10.1016/j.cam.2019.112597
Reference: CAM 112597

To appear in: Journal of Computational and Applied Mathematics

Received date: 25 March 2019
Revised date: 1 September 2019

Please cite this article as: C. Liu, X. Gao, W. Ma et al., Research on regional differences and influencing factors of green technology innovation efficiency of China’s high-tech industry, Journal of Computational and Applied Mathematics (2019), doi: https://doi.org/10.1016/j.cam.2019.112597.

This is a PDF file of an article that has undergone enhancements after acceptance, such as the addition of a cover page and metadata, and formatting for readability, but it is not yet the definitive version of record. This version will undergo additional copyediting, typesetting and review before it is published in its final form, but we are providing this version to give early visibility of the article. Please note that, during the production process, errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

© 2019 Elsevier B.V. All rights reserved.


Research on Regional Differences and Influencing Factors of Green Technology Innovation Efficiency of China's High-tech Industry

Chunyang Liu^{1,a}, Xingyu Gao^{1,b}, Wanli Ma^{1,c,*}, Xiangtuo Chen^{2,d}

1 School of Business, Shandong University, Weihai 264209, CHINA

2 Laboratory MICS, CentraleSupélec, Paris-Saclay University, Gif sur Yvette 91190, FRANCE


* Corresponding author: [email protected]


Note: This paper is supported by the Graduate Science Research Fund of Shandong University Business School.


Abstract
Using K-means clustering analysis, this paper divides the regions of China into four clusters according to differences in the development level of their high-tech industries between 2008 and 2016. Considering "environmental pollution" and "innovation failure", an improved SBM-DEA efficiency measurement model is constructed to measure the green technology innovation efficiency of China's high-tech industry clusters. Lasso regression is used to screen the factors affecting the green technology innovation efficiency of the high-tech industry in each cluster. On this basis, quantile regression is used to study the degree of influence, and the regional differences, of the influencing factors on the green innovation efficiency of the high-tech industry at different quantiles. A DEA-Tobit model is used as a robustness test. The results show that the factors that significantly affect the green innovation efficiency of the high-tech industry differ across clusters, and that the degree of influence of each factor also differs across quantiles of innovation efficiency. Combining the empirical results with the actual situation of high-tech industries in the various regions, corresponding policy recommendations are put forward.

Keywords: High-tech Industry; Innovation Efficiency; SBM-DEA; K-means Clustering Analysis; Lasso Regression; Quantile Regression; DEA-Tobit Model

1. Introduction
The high-tech industry plays an important role among China's strategic emerging industries. As China's economic development has entered a "new normal", how to optimize the allocation of innovation resources and improve the innovation efficiency of enterprises is one of the important issues facing China's implementation of its innovation-driven strategy and its construction of an innovation-oriented country. In recent years, China's rapidly developing high-tech industry has made a great contribution to the country's economic development. However, there are also many problems. For example, the development of high-tech industries in China is unbalanced among regions, and the factors influencing the innovation level of regional high-tech industries differ. Because high-tech industries are technology- and knowledge-intensive and consume few resources, technology innovation efficiency plays a more important guiding role in industrial development and determines the overall development level of high-tech industries in the various regions (Meng et al., 2019). Therefore, it is of great theoretical and practical significance to further enrich the means of measuring and evaluating innovation efficiency in high-tech industries and to analyze the key factors affecting innovation efficiency in the various regions.

The research objectives of this paper are reflected in three aspects. Firstly, this study takes "environmental pollution" and "innovation failure" as the unexpected outputs of green innovation efficiency, and an SBM-DEA model is constructed to measure and analyze the green technology innovation efficiency of high-tech industries in each cluster. Secondly, the factors influencing the green innovation efficiency of high-tech industries in each cluster are selected by Lasso regression. Finally, quantile regression and the DEA-Tobit method are used to study how the degree of influence of each factor on the green innovation efficiency of the high-tech industry differs across quantiles. On the basis of the empirical results and in combination with China's national conditions, this paper provides policy suggestions for deepening the reform of China's regional high-tech industry and realizing innovation-driven, efficient development.

2. Literature Review

2.1 Research on the measurement of innovation efficiency
In recent years, scholars have deepened their research on the innovation efficiency of high-tech industries, and their perspectives and methods have gradually expanded. Data envelopment analysis (DEA) is a common method for measuring innovation efficiency. Carayannis et al. (2016) compared the innovation efficiency of 185 regions in 23 European countries with a multi-objective DEA model, and pointed out that innovation efficiency differed between regions and between innovation stages. Kaya Samut and Cafrı (2016) used DEA to measure the innovation efficiency values of hospitals in 29 OECD countries between 2000 and 2010, and then applied a panel Tobit model to determine the environmental factors affecting hospital efficiency scores; by decomposing the Malmquist productivity index, they also analyzed the change in the decomposed efficiency values. Lafarga and Balderrama (2015) used the DEA method to calculate the overall efficiency, patent production efficiency and scientific paper production efficiency of 32 Mexican states. Yeung and Azevedo (2011) employed data envelopment analysis to measure the efficiency of Brazilian state courts.

2.2 Research on the influencing factors of innovation efficiency

Scholars at home and abroad have studied the influencing factors of the innovation efficiency of high-tech industries extensively, from different perspectives and with different methods, and have achieved fruitful results. Some studies show that factor market distortion can inhibit innovation activities in high-tech industries (Ji and Dou 2016; Li et al. 2017, 2018; Gao et al. 2019; Shen et al. 2018). Fang and Chiu (2017) show that industry-university-research cooperation is an effective way to improve innovation performance. Kalapouti et al. (2017) found that patent applications, development level, employment level and technological diversity have an impact on innovation efficiency. Hong et al. (2016) found that government subsidies had a negative impact on the innovation efficiency of high-tech industries, while private R&D funds significantly promoted it. Castro and Gregorio (2015) argue that acquiring knowledge through the Internet and from the outside world is very important for sustained and stable innovation.

2.3 Study on the method of influencing factor selection

Many scholars use Lasso regression, a machine learning method, to select the factors influencing a variable (Liu and Du 2012; Fang et al. 2014; Mansiaux and Carrat 2014; Pereira et al. 2016). The advantage of the Lasso method is that, by adding a penalty term, it shrinks the regression coefficients of insignificant variables directly to zero, thereby eliminating weak variables. Lasso regression modeling can be used regardless of the nature of the target dependent variable. The basic idea of Lasso regression is to minimize the sum of squared residuals subject to the constraint that the sum of the absolute values of the regression coefficients is less than a constant, so that some coefficients are shrunk exactly to zero and the corresponding variables are deleted, which realizes variable selection (Tibshirani, 1996). Matsui and Hidetoshi (2018) estimated the parameters of a logistic regression model through the sparse group Lasso-type penalty, and then selected the optimal parameters of the model under model selection criteria. Some scholars use quantile regression to select variables (Alhamzawi and Yu 2012; Jiang et al. 2014; Fan 2015). Some studies combine Lasso with quantile regression to estimate and select parameters. Hashem et al. (2016) used a group Lasso penalty to estimate the parameters of the quantile regression model in the binary response classification problem. Xie and Xu (2014) introduced the sparse group Lasso technique to construct an algorithm for feature selection on uncertain data. Benoit et al. (2016) proposed a Bayesian hierarchical model to select and estimate variables in binary quantile regression, and gave the corresponding Lasso procedure.

The existing literature lacks in-depth research on the scientific measurement of the green innovation efficiency of the high-tech industry, and the selection of influencing factors also lacks comprehensiveness and innovation. Building on the existing literature, this paper adopts an SBM-DEA model that considers "environmental pollution" and "innovation failure" to measure green innovation efficiency, and uses Lasso regression to screen the variables that affect green innovation efficiency. On this basis, a DEA-Tobit model and panel quantile regression are used to measure the direction and degree of the impact of the variables on green innovation efficiency. Comparing the parameter estimates of the two regression models makes the results more convincing and further verifies the accuracy of the variables selected by Lasso.

3. Model and Algorithm

3.1. SBM-DEA efficiency measurement model
Data Envelopment Analysis (DEA) is a nonparametric technical efficiency analysis method used to measure the relative efficiency of a research object, called the decision-making unit (DMU). In the DEA theoretical model, the number of DMUs can be infinite.

Most traditional DEA models are radial and oriented measures. Tone (2001) constructed a non-radial and non-oriented SBM-DEA model, which fully considers the slack of inputs and outputs. Suppose there are $n$ decision-making units in a production system. Each decision-making unit has three vectors: input $x \in R^{m}$, expected output $y^{g} \in R^{s_1}$ and unexpected output $y^{b} \in R^{s_2}$. The matrices $X$, $Y^{g}$ and $Y^{b}$ are defined as follows:

$$X = [x_1, \ldots, x_n] \in R^{m \times n}, \quad Y^{g} = [y_1^{g}, \ldots, y_n^{g}] \in R^{s_1 \times n}, \quad Y^{b} = [y_1^{b}, \ldots, y_n^{b}] \in R^{s_2 \times n},$$

where $x_i > 0$, $y_i^{g} > 0$ and $y_i^{b} > 0$ $(i = 1, 2, \ldots, n)$. The SBM-DEA model with unexpected outputs can then be expressed as:

$$\rho^{*} = \min \frac{1 - \dfrac{1}{m} \sum_{i=1}^{m} \dfrac{s_i^{-}}{x_{i0}}}{1 + \dfrac{1}{s_1 + s_2} \left( \sum_{r=1}^{s_1} \dfrac{s_r^{g}}{y_{r0}^{g}} + \sum_{r=1}^{s_2} \dfrac{s_r^{b}}{y_{r0}^{b}} \right)} \quad \text{s.t.} \quad \begin{cases} x_0 = X\lambda + s^{-} \\ y_0^{g} = Y^{g}\lambda - s^{g} \\ y_0^{b} = Y^{b}\lambda + s^{b} \\ s^{-} \ge 0,\ s^{g} \ge 0,\ s^{b} \ge 0,\ \lambda \ge 0 \end{cases} \tag{1}$$

In formula (1), $s^{-}$, $s^{g}$ and $s^{b}$ indicate the slacks of the input, the expected output and the unexpected output, respectively, and $\lambda$ is a weight vector. The objective function $\rho^{*}$ is strictly monotonically decreasing in $s^{-}$, $s^{g}$ and $s^{b}$, and $0 < \rho^{*} \le 1$. When $\rho^{*} = 1$, the slacks $s^{-}$, $s^{g}$ and $s^{b}$ are all zero and the decision-making unit is efficient; when $\rho^{*} < 1$, there is redundancy in the decision-making unit, and its efficiency can be improved by optimizing the allocation.

3.2. K-means Algorithm
K-means clustering is a partition-based clustering analysis method and one of the most classical clustering methods. Its basic principle is that, given the number of classes $k$, the algorithm divides the data set into $k$ categories $C = \{C_1, C_2, \ldots, C_k\}$ and then iterates until the objective function reaches its minimum, at which point the final clustering result is obtained. The objective function is

$$E = \sum_{i=1}^{k} \sum_{x \in C_i} \lVert x - u_i \rVert^{2} \tag{2}$$

In formula (2), $E$ is the sum of squared errors of all clustering objects, $x$ is a clustering object, and $u_i$ is the mean of all data in class $C_i$, namely the cluster center (Wang et al., 2019).

3.3. Lasso Regression Algorithm

Generally, we consider a linear regression problem with $p$ variables and $n$ observations of the following form:

$$Y = \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_p x_p + \varepsilon = X\beta + \varepsilon \tag{3}$$

In formula (3), $Y \in R^{n}$ is the explained variable; $X = (x_1, x_2, \ldots, x_p)$ is the $n \times p$ matrix of explanatory variables; $\varepsilon \sim N(0, \sigma^{2} I_n)$ is the vector of stochastic errors; and $\beta$ is the coefficient vector of the regression model. Under the least squares criterion, the coefficient vector is estimated by minimizing the sum of squared errors:

$$\hat{\beta} = \arg\min_{\beta \in R^{p}} \lVert Y - X\beta \rVert_2^{2} \tag{4}$$

However, when the variables are multicollinear, ordinary least squares (OLS) regression performs poorly in terms of prediction quality and model complexity. Tibshirani (1996) first proposed the Lasso regression method, which is a shrinkage estimation method. It shrinks the regression coefficients of some insignificant variables to 0 by introducing a penalty function into the regression model, and it can be used to address the multicollinearity problem. The idea of Lasso regression is to minimize the residual sum of squares while imposing a constraint on the L1 norm of the regression coefficients. It is written in the following penalized form:

$$\hat{\beta}^{lasso} = \arg\min_{\beta \in R^{p}} \frac{1}{n} \lVert Y - X\beta \rVert_2^{2} + \lambda \lVert \beta \rVert_1, \quad \lambda \ge 0. \tag{5}$$

In formula (5), $\lambda$ is called the penalty parameter. It not only shrinks the coefficients $\beta$, which reduces the variance of the estimate, but also performs automatic variable selection by estimating certain components of the coefficient vector $\beta$ as exactly 0.
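The shrink-to-zero behaviour of the penalized form in formula (5) can be illustrated with a small sketch. It is not the estimation routine used in the paper: cyclic coordinate descent is one common way to solve this problem and is assumed here for illustration, and the function names and the toy data are hypothetical.

```python
def soft_threshold(a, lam):
    """Soft-thresholding operator: sign(a) * max(|a| - lam, 0)."""
    if a > lam:
        return a - lam
    if a < -lam:
        return a + lam
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_iter=100):
    """Minimise (1/n) * ||y - X b||^2 + lam * ||b||_1 by cyclic coordinate descent.

    X is a list of n rows with p entries each; y is a list of n responses.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0  # (2/n) * x_j . (residual computed without feature j)
            z = 0.0    # (2/n) * x_j . x_j
            for i in range(n):
                fit_without_j = sum(X[i][k] * beta[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - fit_without_j)
                z += X[i][j] ** 2
            rho *= 2.0 / n
            z *= 2.0 / n
            beta[j] = soft_threshold(rho, lam) / z if z > 0 else 0.0
    return beta


# Hypothetical toy data: two orthogonal regressors with y = 2*x1 + 0.1*x2.
X_demo = [[1.0, 1.0], [1.0, -1.0], [-1.0, 1.0], [-1.0, -1.0]]
y_demo = [2.1, 1.9, -1.9, -2.1]
beta_ols = lasso_coordinate_descent(X_demo, y_demo, lam=0.0)    # no penalty
beta_lasso = lasso_coordinate_descent(X_demo, y_demo, lam=0.5)  # with penalty
```

With the penalty switched off the sketch recovers the least squares coefficients of formula (4); with a positive penalty the strong coefficient is shrunk and the weak one is set exactly to zero, which is the variable selection effect described above.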

3.4. The Quantile Regression Method
In this paper, the quantile regression method is used to analyze how the degree of influence of the various factors changes when the green innovation efficiency of the high-tech industry is at different levels. The estimated coefficient of a quantile regression represents the marginal effect of an independent variable on the dependent variable at a specific quantile. It can therefore fully reflect the characteristics of the conditional distribution of the dependent variable, in particular by describing local information of the distribution function effectively, and it avoids one-sided judgments on research issues based only on the "average" influence. Let the distribution function of the random variable $Y$ be as follows:

$$F(y) = P(Y \le y) \tag{6}$$

Given equation (6), the $\tau$ $(0 < \tau < 1)$ quantile function of $y$ may be defined as:

$$Q(\tau) = \inf\{y : F(y) \ge \tau\} \tag{7}$$

In equation (7), $0 < \tau < 1$ represents the proportion of data below the regression line or regression plane. The characteristic of the quantile function is that a proportion $\tau$ of the distribution of the dependent variable $y$ lies below $Q(\tau)$ and a proportion $(1 - \tau)$ lies above $Q(\tau)$; the distribution of $y$ is thus divided by $\tau$. To solve the quantile regression, the check function $\rho_\tau(\cdot)$ is defined as:

$$\rho_\tau(\mu) = \begin{cases} \tau \mu, & Y_i \ge X_i'\beta \\ (\tau - 1)\mu, & Y_i < X_i'\beta \end{cases} \tag{8}$$

In equation (8), $\mu$ is the parameter reflecting the probability density function, and $\rho_\tau(\cdot)$ describes how the sample points of $y$ lying below and above the $\tau$ quantile are weighted. Suppose the quantile regression model is as follows:

$$\hat{y}_Q = \alpha_Q + \beta_Q x \tag{9}$$

For the model in equation (9), quantile regression of $y$ finds the parameters for which the sum of absolute deviations of $y$ at the $Q$ quantile is minimal:

$$\min \sum_{i} \left| y_i - \alpha_Q - \beta_Q X_i \right| \tag{10}$$

In equation (10), for simplicity, $\mu = 1$ can be assumed in the specific estimation process, so for any $\tau$ quantile regression, parameter estimation minimizes the weighted sum of absolute errors:

$$\hat{\beta}(\tau) = \arg\min_{\beta} \left[ \sum_{i:\, Y_i \ge X_i'\beta} \tau \left| Y_i - X_i'\beta \right| + \sum_{i:\, Y_i < X_i'\beta} (1 - \tau) \left| Y_i - X_i'\beta \right| \right] \tag{11}$$
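The asymmetric weighting in equation (11) can be checked on the simplest possible case, a model with a constant only. The sketch below is an illustration under that assumption, not the panel quantile regression estimated in the paper; the function names and the five sample values are hypothetical, and the residual is written as `u`.

```python
def check_loss(u, tau):
    """Check function: weight tau for residuals u >= 0, (1 - tau) for u < 0."""
    return tau * u if u >= 0 else (tau - 1.0) * u


def quantile_objective(y, q, tau):
    """Sum of check losses when every observation is fitted by the constant q."""
    return sum(check_loss(yi - q, tau) for yi in y)


def constant_quantile_fit(y, tau):
    """Minimise the objective over the observed values.

    The objective is piecewise linear and convex in q, so a minimiser is
    always attained at one of the data points.
    """
    return min(sorted(y), key=lambda q: quantile_objective(y, q, tau))


# Hypothetical sample with one large outlier.
y_demo = [1.0, 2.0, 3.0, 4.0, 100.0]
median_fit = constant_quantile_fit(y_demo, 0.5)  # central quantile
upper_fit = constant_quantile_fit(y_demo, 0.9)   # upper-tail quantile
```

For this sample the fit at the 0.5 quantile is the median and is unaffected by the outlier, while the fit at the 0.9 quantile moves to the largest observation. This is the sense in which different values of the quantile give different parameter estimates.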

In equation (11), $Y_i$ and $X_i'$ represent the dependent variable and the vector of independent variables respectively, and $\tau$ is the quantile being estimated. When $\tau$ takes different values on (0, 1), different parameter estimates are obtained.

3.5. System Framework

This paper takes "Research on Regional Differences and Influencing Factors of Green Technology Innovation Efficiency of China's High-tech Industry Based on Machine Learning" as its topic. It covers the measurement and regional differences of the innovation efficiency of the high-tech industry based on machine learning, the analysis of the influencing factors of that efficiency based on machine learning, and the differentiated realization paths and institutional arrangements for improving the efficiency of green innovation in high-tech industries. The system framework of this paper is shown in Figure 1.

Figure 1. System framework of this paper

4. Data and Variables

The study sample includes 30 provincial administrative regions in mainland China (excluding Tibet due to limited data), totaling 270 observations. The data are from the China Statistical Yearbook, China High-tech Industry Statistical Yearbook, China Science and Technology Statistical Yearbook and China Financial Yearbook from 2009 to 2016.

4.1. Green Innovation Efficiency Index System

The efficiency is measured by the SBM-DEA method in the MaxDEA software. The setting and data processing of each indicator are described as follows.

The input of innovation activities is considered from two perspectives. From the perspective of human capital investment, the full-time equivalent of R&D personnel in high-tech industries is adopted, which better measures the human investment and actual working time of R&D personnel in innovation activities. From the perspective of capital investment, the internal R&D expenditure of high-tech industries is used; an R&D price index is constructed so that it represents the actual R&D expenditure in each region (Lyu and Li, 2016).

The expected output of innovation activities is also considered from two perspectives. From the perspective of knowledge and technology, the number of domestic patent applications is selected to approximate the output of knowledge and technology. From the perspective of products, new product sales revenue reflects the result of innovation from the output dimension of technology transformation. Taking 2008 as the base period, the sales revenue of new products is deflated by the producer price index.

The unexpected output of innovation activities is considered from the perspective of environmental pollution. Industrial waste water and waste gas emissions are selected as unexpected outputs, which makes the calculated technological innovation efficiency more realistic.

4.2. Other Impact Factors
The influencing factors selected in this paper are as follows:

(1) Green Financial Resources Allocation Efficiency (FE)

The allocation efficiency of green financial resources in high-tech industries is constructed as follows. According to the relevant literature and theories, R&D investment can dramatically improve the level of innovation (Li and Zhang, 2014), so government funds and enterprises' self-raised R&D funds are selected as input indicators, which fully ensures the capital supply for developing innovation.

The selected expected output indexes include the sales revenue of new products and the number of new product projects. These indexes reflect the efficiency of financial resource allocation from the output dimension of financial resource transformation. The non-performing loan ratio of commercial banks is regarded as the indicator of unexpected output. The non-performing loan ratio of the benchmark period, 2008, is set as 1, and the ratio between the non-performing loan amount of the current year and that of the previous year is taken as the non-performing loan ratio of the current year.

(2) Degree of Opening-up (DO). By participating in global competition, enterprises can find their own shortcomings and make use of their comparative advantages, so as to improve technological innovation. In this paper, the ratio of the export delivery value of high-tech industries to main business income is selected as the indicator of opening-up.

(3) New Product Demand (NPD). The market orientation formed by users' demand for new products can guide enterprises' behavior to a certain extent, so that enterprises can improve the efficiency of technological innovation through targeted organizational management and R&D learning. The ratio of new product sales revenue to the main business revenue of high-tech industries is used to measure the demand for new products in different regions.

(4) Financial Factor Market Distortion (FD). The degree of distortion of the financial factor market in each region is measured by the relative difference between the degree of financial marketization in that region and the benchmark degree of financial marketization, that is:

$$DIST_{it} = \frac{\max(MARK_{it}) - MARK_{it}}{\max(MARK_{it})} \tag{12}$$

In formula (12), $i$ and $t$ represent region and year respectively. $DIST_{it}$ is the degree of distortion of the financial factor market; $MARK_{it}$ is the financial marketization index; $\max(MARK_{it})$ is the maximum value of the financial marketization index in the sample; and $DIST_{it}$ lies between 0 and 1. This method reflects not only the relative differences in financial factor market distortion between regions, but also the inter-temporal changes of such differences.

(5) Level of Government Support (GS). Government support plays a very important role in improving the innovation level of the high-tech industry, and is mainly reflected in fiscal subsidy policies and preferential tax policies. This paper takes the share of government funds in the sum of government funds and enterprise funds within the R&D expenditure of the high-tech industry as the index of the degree of government support.

(6) Transfer of Knowledge (DKTD, IKTD). The degree of knowledge transfer reflects the flow of innovation resources such as technology between regions. By the source of the knowledge transferred, it can be classified into the domestic knowledge transfer degree and the international knowledge transfer degree, which respectively reflect the flow of innovative resources within the region and other regions, and from abroad. In this paper, the domestic knowledge transfer degree (DKTD) is measured by the proportion of the regional technology market's technology transfer (contract amount) in regional GDP, while the international knowledge transfer degree (IKTD) is measured by the proportion of the foreign technology transfer contract amount in regional GDP.

(7) Financial Support (FS). Investment in science and technology finance and the support of venture capital can promote technological innovation. This paper uses the proportion of bank loans in the total amount of science and technology funds to represent financial support.

(8) Industry-university-research cooperation (IURC). Enterprises, universities and research institutions are the innovation subjects and basic forces of innovation in the regional innovation system (Li and Fu, 2014), and the interaction among the three has an important impact on innovation performance (Lundvall, 1988; Edquist et al., 2002). This paper measures it by the proportion of enterprise funds in the total amount of science and technology funds raised by universities and R&D institutions.

(9) Foreign direct investment (FDI). Foreign direct investment brings capital, advanced technology, equipment and knowledge, advanced management experience and technical talent to the region. However, foreign investors' advantages in capital, technology and government policies can squeeze the market share of local enterprises. In this paper, foreign direct investment is measured by the proportion of the total investment of foreign-funded enterprises in the gross regional product.

(10) Information Infrastructure Development (IID). Information technology is diffused through the construction of information infrastructure, which promotes the penetration and integration between high-tech and non-high-tech industries and thus improves the efficiency of innovation. The proportion of total postal and telecommunications business in regional GDP is adopted to reflect the development level of information infrastructure.

(11) Regional Economic Development (lnGDP). The real GDP of a region reflects not only its economic performance, but also its economic strength and wealth. Taking the logarithm of real GDP makes the data more stable.
Table 1. Descriptive Statistics of Variables

Variable   Sample Size   Maximum   Minimum   Mean    Standard deviation
IV         270           1         0.011     0.540   0.258
FE         270           1         0.543     0.864   0.147
DO         270           3.344     0         0.350   0.463
NPD        270           0.622     0.001     0.224   0.148
FD         270           0.746     0         0.380   0.178
GS         270           0.613     0.002     0.128   0.113
DKTD       270           0.068     0.001     0.010   0.009
IKTD       270           0.045     0         0.003   0.006
FS         270           0.135     0.005     0.036   0.020
IURC       270           0.321     0.013     0.135   0.068
FDI        270           4.466     0.047     0.347   0.447
IID        270           0.972     0.003     0.145   0.274
lnGDP      270           10.611    6.890     9.075   0.849

5. Empirical Analysis
5.1. Clustering Region Division

As is well known, regional imbalance is one of the most important characteristics of development in developing countries (Wei, 1999; Hastie and Efron, 2013). In order to further analyze how China's regional green technological innovation efficiency changes over time and space, K-means clustering is first carried out to divide the 30 provinces into 4 groups according to the development level of their high-tech industries (Zou, 2017). These groups correspond to high, medium, medium-low and low levels and are numbered as clusters 1, 2, 3 and 4. The clustering results are shown in Table 2.

Table 2. Clustering Division of Chinese Provinces

Cluster 1   Tianjin, Shanghai, Jiangsu, Guangdong, Chongqing
Cluster 2   Beijing, Zhejiang, Anhui, Shandong, Henan, Sichuan
Cluster 3   Hebei, Jilin, Fujian, Jiangxi, Hunan, Hainan, Shaanxi, Qinghai, Ningxia, Xinjiang
Cluster 4   Shanxi, Inner Mongolia, Liaoning, Heilongjiang, Guangxi, Guizhou, Yunnan, Gansu
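The partitioning procedure of Section 3.2 can be sketched in a few lines. This is a minimal illustration of the standard iteration (assign each object to the nearest center, then recompute the centers) that lowers the objective $E$ of formula (2); it is not the configuration used to obtain the clusters above. The features, the initial centers and all names in the code are assumptions made for illustration, and the toy data are one-dimensional.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def centroid(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(len(points[0])))


def kmeans(points, initial_centers, max_iter=100):
    """Alternate assignment and center updates until the centers stop moving."""
    centers = [tuple(c) for c in initial_centers]
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point joins the cluster of its nearest center.
        labels = [
            min(range(len(centers)), key=lambda j: squared_distance(p, centers[j]))
            for p in points
        ]
        # Update step: each center becomes the mean of its members.
        new_centers = []
        for j, old in enumerate(centers):
            members = [p for p, lab in zip(points, labels) if lab == j]
            new_centers.append(centroid(members) if members else old)
        if new_centers == centers:
            break
        centers = new_centers
    # Objective E of formula (2): total squared distance to the assigned center.
    sse = sum(squared_distance(p, centers[lab]) for p, lab in zip(points, labels))
    return centers, labels, sse


# Hypothetical one-dimensional development scores for five regions.
points_demo = [(1.0,), (2.0,), (9.0,), (10.0,), (11.0,)]
centers_demo, labels_demo, sse_demo = kmeans(points_demo, [(1.0,), (2.0,)])
```

Starting from two poorly placed centers, the sketch settles on the two visibly separate groups after a few iterations. In practice the result depends on the initial centers, so several starts are usually compared.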

Based on the data in Table 2, the following conclusions can be drawn. From 2008 to 2016, Jiangsu and Guangdong ranked in the first cluster, and their efficiency of technological innovation was higher than the national average. Beijing, Shanghai, Zhejiang, Shandong, Henan, Sichuan and Tianjin are ranked in the second cluster, and their technological innovation efficiency is at the national average level. Anhui, Hainan, Chongqing, Qinghai, Ningxia and Xinjiang lie in the third cluster and have a higher technological innovation efficiency than the national average. In the fourth cluster, the efficiency of technological innovation is lower than the national average. Cluster 1 gathers a large number of excellent manufacturing enterprises from home and abroad, which are the pioneers of the green innovation industry. The provinces in cluster 2 are also large economies, but they are not the best soil for high-tech industry because their per capita resource possession is not as large as that of cluster 1. Cluster 4 is mainly concentrated in the inland and northeast regions, which have always been China's heavy industry base and where high-tech industries are relatively underdeveloped. The clustering results in this paper are not consistent with the geographical division, which indicates that the traditional method of dividing China's regions by geographical location is not fully applicable to the study of technological innovation efficiency in China. To test the rationality of the clustering results, an analysis of variance (ANOVA) is conducted on them to obtain the intra-class and inter-class differences, as shown in Table 3.

Table 3. Analysis of Variance

Total    Inter-Class   Intra-Class   Inter-Class Percentage
1.363    1.258         0.105         92.30%
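The percentage in the last column of the analysis-of-variance table above is the inter-class difference divided by the total difference. The sketch below shows that arithmetic, together with a between/within decomposition on made-up groups. It assumes the "differences" are sums of squared deviations, which the text does not state explicitly; the function name and the toy groups are hypothetical.

```python
def variance_decomposition(groups):
    """Return (total, between, within) sums of squared deviations."""
    values = [v for g in groups for v in g]
    grand_mean = sum(values) / len(values)
    total = sum((v - grand_mean) ** 2 for v in values)
    between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    within = sum((v - sum(g) / len(g)) ** 2 for g in groups for v in g)
    return total, between, within


# Hypothetical toy groups: total = between + within holds by construction.
total_demo, between_demo, within_demo = variance_decomposition(
    [[1.0, 2.0, 3.0], [7.0, 8.0, 9.0]]
)

# Share implied by the figures reported in the table: 1.258 out of 1.363.
reported_share = round(100 * 1.258 / 1.363, 2)
```

A share close to 100% means that most of the variation lies between clusters rather than inside them, which is the property a useful grouping should have.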

As can be seen from Table 3, the inter-class difference is 1.258 and the intra-class difference is 0.105. That is to say, the inter-class difference in the green technology innovation efficiency of the high-tech industry explains 92.30% of the total regional difference between 2008 and 2016. In other words, the differences in the green technology innovation efficiency of high-tech industries across China's regions are mainly caused by differences between clusters. Therefore, when analyzing the influencing factors of green innovation efficiency in China, it is necessary to classify regions in order to understand how the impact of the various factors on the green innovation efficiency of high-tech industries differs across clusters.

5.2. Measurement of Green Technology Innovation Efficiency
As can be seen from Table 4, the overall green innovation efficiency of most provinces has maintained a fluctuating growth trend, and the national average green innovation efficiency increased from 0.422 in 2008 to 0.636 in 2016. The technological innovation efficiency of Liaoning, Heilongjiang, Anhui, Jiangxi, Shandong, Hubei and Hunan shows a steady upward trend; that of Hebei, Inner Mongolia, Jilin, Shanghai, Zhejiang, Fujian, Guangxi and Guizhou fluctuates, with little difference between the initial and final values; and that of Yunnan shows a fluctuating downward trend. The average regional innovation efficiency of Beijing, Tianjin, Guangdong, Anhui and other provinces is above 0.70, so these can be regarded as regions with high innovation efficiency. Meanwhile, the average regional innovation efficiency of Shanxi, Liaoning, Jiangxi, Guangxi, Hainan, Guizhou, Gansu, Qinghai, Ningxia and other provinces is below 0.45, so these can be regarded as regions with insufficient innovation efficiency. The average level of green innovation efficiency of high-tech industries decreases successively from cluster 1 to cluster 4; the average efficiency of clusters 1 and 2 is higher than the national level, and that of clusters 3 and 4 is clearly lower than the national level. This shows that the green innovation efficiency level of the high-tech industry is closely related to the development level of the regional high-tech industry.
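To make the efficiency scores easier to interpret, the sketch below evaluates the SBM objective of formula (1) for one decision-making unit once its slacks are known. It does not solve the optimization problem, which the paper does in MaxDEA; the function name and all numbers are hypothetical. The toy unit has two inputs, two expected outputs and two unexpected outputs, mirroring the indicator system of Section 4.1.

```python
def sbm_score(x, y_good, y_bad, s_minus, s_good, s_bad):
    """Evaluate the ratio in formula (1) for given slacks.

    x, y_good, y_bad: observed inputs, expected outputs, unexpected outputs.
    s_minus, s_good, s_bad: the corresponding slack vectors (all >= 0).
    """
    m = len(x)
    input_term = sum(s / xi for s, xi in zip(s_minus, x)) / m
    output_term = (
        sum(s / y for s, y in zip(s_good, y_good))
        + sum(s / y for s, y in zip(s_bad, y_bad))
    ) / (len(y_good) + len(y_bad))
    return (1.0 - input_term) / (1.0 + output_term)


# Hypothetical unit: 20% excess in both inputs, a shortfall in one expected
# output, and 25% excess in both unexpected outputs.
rho_demo = sbm_score(
    x=[10.0, 20.0], y_good=[5.0, 10.0], y_bad=[4.0, 8.0],
    s_minus=[2.0, 4.0], s_good=[1.0, 0.0], s_bad=[1.0, 2.0],
)
# A unit with no slack at all is fully efficient.
rho_efficient = sbm_score(
    x=[10.0, 20.0], y_good=[5.0, 10.0], y_bad=[4.0, 8.0],
    s_minus=[0.0, 0.0], s_good=[0.0, 0.0], s_bad=[0.0, 0.0],
)
```

A score of 1 corresponds to zero slack in every input and output, and any input excess, expected-output shortfall or unexpected-output excess pushes the score below 1.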

Table 4. Green Innovation Efficiency of China's High-tech Industries, 2008-2016

| Province | 2008 | 2009 | 2010 | 2011 | 2012 | 2013 | 2014 | 2015 | 2016 | Mean |
|---|---|---|---|---|---|---|---|---|---|---|
| Beijing | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 |
| Tianjin | 0.625 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 0.958 |
| Hebei | 0.228 | 0.180 | 0.177 | 0.131 | 0.208 | 0.236 | 0.256 | 0.204 | 0.267 | 0.210 |
| Shanxi | 1.000 | 0.204 | 0.399 | 0.231 | 0.286 | 0.312 | 0.337 | 0.335 | 0.100 | 0.356 |
| Inner Mongolia | 0.000 | 0.334 | 0.260 | 0.159 | 0.247 | 0.216 | 0.206 | 0.201 | 0.265 | 0.210 |
| Liaoning | 0.192 | 0.275 | 0.254 | 0.223 | 0.247 | 0.344 | 0.357 | 0.428 | 0.631 | 0.328 |
| Jilin | 0.364 | 0.356 | 0.285 | 0.235 | 0.329 | 0.523 | 0.406 | 0.278 | 0.365 | 0.349 |
| Heilongjiang | 0.107 | 0.084 | 0.082 | 0.134 | 0.177 | 0.152 | 0.215 | 0.223 | 0.308 | 0.165 |
| Shanghai | 1.000 | 0.468 | 0.430 | 0.455 | 0.400 | 0.480 | 0.529 | 0.504 | 0.583 | 0.539 |
| Jiangsu | 1.000 | 0.300 | 0.366 | 0.626 | 0.566 | 0.549 | 0.726 | 0.725 | 1.000 | 0.651 |
| Zhejiang | 0.303 | 0.373 | 0.355 | 0.353 | 0.433 | 1.000 | 0.496 | 0.591 | 0.510 | 0.491 |
| Anhui | 0.145 | 0.443 | 0.275 | 0.427 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 0.699 |
| Fujian | 1.000 | 0.356 | 0.361 | 0.446 | 0.416 | 0.390 | 0.380 | 0.447 | 0.539 | 0.482 |
| Jiangxi | 0.134 | 0.147 | 0.196 | 0.166 | 0.250 | 0.357 | 0.489 | 0.584 | 0.795 | 0.347 |
| Shandong | 0.314 | 0.325 | 0.406 | 0.434 | 0.350 | 0.371 | 0.404 | 0.491 | 1.000 | 0.455 |
| Henan | 0.300 | 0.297 | 1.000 | 0.349 | 0.398 | 1.000 | 1.000 | 1.000 | 1.000 | 0.705 |
| Hubei | 0.177 | 0.236 | 0.264 | 0.195 | 0.221 | 0.275 | 0.294 | 0.374 | 0.544 | 0.287 |
| Hunan | 0.226 | 0.317 | 0.417 | 0.635 | 0.532 | 0.565 | 0.531 | 0.489 | 1.000 | 0.524 |
| Guangdong | 1.000 | 1.000 | 1.000 | 1.000 | 0.535 | 0.664 | 0.689 | 0.702 | 1.000 | 0.843 |
| Guangxi | 0.255 | 0.248 | 0.236 | 0.116 | 0.316 | 0.440 | 0.367 | 0.408 | 0.394 | 0.309 |
| Hainan | 0.061 | 0.234 | 0.106 | 0.346 | 1.000 | 0.262 | 0.268 | 0.205 | 0.100 | 0.287 |
| Chongqing | 0.267 | 0.388 | 0.372 | 1.000 | 0.578 | 0.569 | 0.719 | 1.000 | 0.886 | 0.642 |
| Sichuan | 0.184 | 0.255 | 0.229 | 0.247 | 1.000 | 0.502 | 1.000 | 1.000 | 1.000 | 0.602 |
| Guizhou | 1.000 | 0.256 | 0.220 | 0.409 | 0.378 | 0.232 | 0.318 | 0.288 | 0.442 | 0.394 |
| Yunnan | 0.326 | 1.000 | 0.334 | 0.280 | 0.316 | 0.335 | 0.360 | 0.309 | 0.203 | 0.385 |
| Shaanxi | 0.139 | 0.137 | 0.155 | 0.143 | 0.143 | 0.172 | 0.192 | 0.197 | 0.240 | 0.169 |
| Gansu | 0.137 | 0.210 | 0.233 | 0.344 | 0.408 | 0.369 | 0.433 | 0.413 | 0.388 | 0.326 |
| Qinghai | 0.019 | 0.016 | 1.000 | 1.000 | 0.096 | 0.095 | 0.203 | 0.380 | 1.000 | 0.423 |
| Ningxia | 0.153 | 0.270 | 0.252 | 0.318 | 0.563 | 1.000 | 0.274 | 0.255 | 0.534 | 0.402 |
| Xinjiang | 1.000 | 1.000 | 0.194 | 0.088 | 0.123 | 1.000 | 0.426 | 1.000 | 1.000 | 0.648 |
| Nationwide | 0.422 | 0.390 | 0.395 | 0.416 | 0.451 | 0.514 | 0.496 | 0.534 | 0.636 | 0.473 |
| Cluster 1 | 0.778 | 0.631 | 0.634 | 0.816 | 0.616 | 0.652 | 0.733 | 0.786 | 0.894 | 0.727 |
| Cluster 2 | 0.374 | 0.449 | 0.544 | 0.468 | 0.697 | 0.812 | 0.817 | 0.847 | 0.918 | 0.659 |
| Cluster 3 | 0.332 | 0.301 | 0.314 | 0.351 | 0.366 | 0.460 | 0.343 | 0.404 | 0.584 | 0.384 |
| Cluster 4 | 0.377 | 0.326 | 0.252 | 0.237 | 0.297 | 0.300 | 0.324 | 0.326 | 0.341 | 0.309 |

5.3. Lasso Regression

According to the level of the penalty coefficient λ, the algorithm sets the regression coefficients of "unimportant" variables to 0, thus achieving variable selection. The whole country is taken as an example here; variable selection for each cluster is similar. The components of the nationwide coefficient vector are shown in Diagram 2. As the value on the horizontal axis of Diagram 2 increases from left to right, more and more components become non-zero, which means that more and more variables are selected. The relative variation of the Cp value is shown in Diagram 3. Cp is minimized when 4 factors are kept, and the penalty parameter at this point equals 0.23478.

Diagram 2. Regression Coefficient Variation

Diagram 3. Cp Criterion for λ Selection
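The selection step described above can be illustrated with a minimal sketch. It is not the authors' code or data: the data are synthetic, the factor indices are hypothetical, the Lasso is solved by plain coordinate descent on the objective (1/2n)·||y − Xb||² + λ·||b||₁, and Cp is computed in the common form RSS/σ² − n + 2·df, with σ² estimated from the unpenalised fit. The names `lasso_cd`, `lambdas` and `selected` are introduced here only for illustration.

```python
import random

def soft_threshold(z, gamma):
    """Soft-thresholding operator used by the Lasso coordinate update."""
    if z > gamma:
        return z - gamma
    if z < -gamma:
        return z + gamma
    return 0.0

def lasso_cd(X, y, lam, n_iter=100):
    """Coordinate descent for (1/2n)*||y - X b||^2 + lam*||b||_1 (no intercept)."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # current residual y - X b
    col_ms = [sum(row[j] ** 2 for row in X) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            # Correlation of column j with the partial residual (b_j removed).
            rho = sum(X[i][j] * (resid[i] + X[i][j] * beta[j]) for i in range(n)) / n
            new_bj = soft_threshold(rho, lam) / col_ms[j]
            step = new_bj - beta[j]
            if step != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * step
                beta[j] = new_bj
    rss = sum(r * r for r in resid)
    return beta, rss

# Synthetic data (hypothetical): 8 candidate factors, only the first 3 matter.
random.seed(0)
n, p = 80, 8
true_beta = [1.5, -2.0, 0.8, 0.0, 0.0, 0.0, 0.0, 0.0]
X = [[random.gauss(0.0, 1.0) for _ in range(p)] for _ in range(n)]
y = [sum(b * x for b, x in zip(true_beta, row)) + random.gauss(0.0, 0.5) for row in X]

# Noise variance from the unpenalised fit (lam = 0), as needed by Cp.
_, rss_full = lasso_cd(X, y, 0.0)
sigma2 = rss_full / (n - p)

lambdas = [0.01, 0.02, 0.05, 0.1, 0.2, 0.3, 0.5, 0.8, 1.2]
results = []
for lam in lambdas:
    beta, rss = lasso_cd(X, y, lam)
    df = sum(1 for b in beta if b != 0.0)  # number of selected variables
    cp = rss / sigma2 - n + 2 * df
    results.append((cp, lam, beta))

# Keep the penalty with the smallest Cp and report which factors survive.
best_cp, best_lam, best_beta = min(results, key=lambda r: r[0])
selected = [j for j, b in enumerate(best_beta) if b != 0.0]
print("lambda with smallest Cp:", best_lam)
print("selected factor indices:", selected)
```

Larger penalties zero out more coefficients but inflate the residual sum of squares, so Cp trades fit against the number of retained factors; the paper's reported optimum (4 factors at a penalty of 0.23478) is the analogous minimum on the real data.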

The Lasso regression method is used to analyze the factors affecting the green innovation efficiency of the high-tech industry. The coefficients of the optimal variables selected by the Lasso regression model in each region are shown in Table 5.

Table 5. Lasso Regression Coefficient of Each Region

| Variables | Nationwide | Cluster 1 | Cluster 2 | Cluster 3 | Cluster 4 |
|---|---|---|---|---|---|
| FE | 3.75E-01 | 0 | 0 | 0 | 0 |
| DO | 0 | 1.53E-07 | 0 | -2.48E-08 | 0 |
| NPD | 1.06E+00 | 7.43E-05 | 9.60E-05 | 1.47E-05 | 2.37E-08 |
| FD | -2.34E-08 | 0 | 0 | -4.22E-04 | -3.20E+02 |
| GS | -6.33E-03 | -5.78E-04 | -5.33E-03 | -8.92E-04 | -7.88E-03 |
| DKTD | 0 | 8.77E-05 | 0 | 0 | 0 |
| IKDT | 0 | -1.89E-05 | 0 | 0 | 0 |
| FS | 0 | 0 | 4.12E-05 | 0 | 0 |
| IURC | 0 | 0 | 0 | 0 | 0 |
| FDI | 0 | 0 | 0 | 0 | 0 |
| IID | 0 | 0 | 0 | 0 | 0 |
| lnGDP | 0 | 0 | 3.26E-03 | 5.42E-07 | 6.03E-04 |

The results show that the allocation efficiency of green financial resources noticeably promotes the green innovation efficiency of high-tech industries only at the national level. Opening up has a clear promoting effect on green innovation efficiency in cluster 1, while the opposite effect appears in cluster 3. The demand for new products has a positive impact on green innovation efficiency in all regions. The distortion of the financial factor market has an obvious negative impact on green innovation efficiency nationwide and in clusters 3 and 4. Government support is not conducive to the improvement of green innovation efficiency in any region. Knowledge transfer only affects the green innovation efficiency of high-tech industries in cluster 1: domestic knowledge transfer has a positive impact, while foreign knowledge transfer has the opposite effect. Financial support is selected only in cluster 2, where its coefficient in Table 5 is positive. Regional economic development has promoted the improvement of green innovation efficiency in clusters 2, 3 and 4.

5.4. Quantile Regression and Tobit Regression

Using quantile regression and panel Tobit regression, the variables selected by Lasso regression are analyzed empirically. This section further examines the degree and differences of the influence of each factor at different quantiles of the green innovation efficiency of the high-tech industry, and reveals how each factor affects the conditional distribution of technological innovation efficiency in China. The quantile regression and panel Tobit regression results for each region are shown in Table 6.
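To make "different quantiles" concrete, the following minimal sketch shows the check (pinball) loss that quantile regression minimises, in the intercept-only case where the minimiser is simply the sample quantile. It is an illustration under stated assumptions, not the authors' estimation: the nine values are province means taken from the Mean column of Table 4 (Beijing to Shanghai), no regressors are included, and the names `check_loss` and `total_loss` are introduced here. With regressors, the same loss is minimised over the regression coefficients, usually by linear programming.

```python
def check_loss(u, tau):
    """Quantile-regression check (pinball) loss rho_tau(u) = u * (tau - 1[u < 0])."""
    return u * (tau - (1.0 if u < 0 else 0.0))

def total_loss(q, ys, tau):
    """Total check loss of predicting the constant q for every observation."""
    return sum(check_loss(y - q, tau) for y in ys)

# Mean efficiencies of nine provinces as listed in Table 4 (Beijing to Shanghai).
ys = [1.000, 0.958, 0.210, 0.356, 0.210, 0.328, 0.349, 0.165, 0.539]

# For each quantile level, pick the observed value that minimises the loss.
best = {}
for tau in (0.25, 0.5, 0.75):
    best[tau] = min(ys, key=lambda q: total_loss(q, ys, tau))
    print(tau, best[tau])
```

For these nine values the minimisers are 0.210, 0.349 and 0.539 at the 0.25, 0.5 and 0.75 levels. Low quantiles weight over-predictions more heavily and high quantiles weight under-predictions more heavily, which is why a factor's estimated coefficient can differ between low-efficiency and high-efficiency regions.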

Table 6. The Quantile Regression and Panel Tobit Regression Results of Each Region

| Region | Factors | Quantile 0.25 | Quantile 0.5 | Quantile 0.75 | Tobit Model |
|---|---|---|---|---|---|
| Nationwide | FE | 0.031 (0.090) | 0.158 (0.119) | 0.259* (0.156) | 0.196** (0.099) |
| Nationwide | GS | -0.305*** (0.116) | -0.518*** (0.154) | -0.683*** (0.201) | -0.511*** (0.128) |
| Nationwide | NPD | 0.414*** (0.094) | 0.639*** (0.124) | 0.723*** (0.163) | 0.557*** (0.104) |
| Nationwide | FD | -0.697*** (0.076) | -0.511*** (0.101) | -0.256* (0.132) | -0.444*** (0.084) |
| Cluster 1 | DO | 0.401* (0.216) | 0.445*** (0.163) | 0.449 (0.169) | 0.492*** (0.141) |
| Cluster 1 | GS | -1.007*** (0.466) | -1.044*** (0.352) | -1.127*** (0.365) | -1.162*** (0.301) |
| Cluster 1 | NPD | 0.170*** (0.287) | 0.829*** (0.217) | 0.883*** (0.225) | 1.009*** (0.187) |
| Cluster 1 | DKTD | 18.038** (7.882) | 12.710** (5.944) | 9.043 (6.172) | 19.984*** (5.670) |
| Cluster 1 | IKDT | -16.042** (6.236) | -4.804 (4.703) | -5.300 (4.883) | -8.738* (4.612) |
| Cluster 2 | GS | -0.275 (1.053) | -1.942*** (0.641) | -1.881*** (0.598) | -1.416** (0.625) |
| Cluster 2 | NPD | 0.487*** (0.322) | 0.548*** (0.196) | 0.532*** (0.183) | 0.751*** (0.189) |
| Cluster 2 | FS | 2.560 (2.231) | 0.872 (1.357) | 1.216 (1.267) | 2.383* (1.391) |
| Cluster 2 | lnGDP | -0.099 (0.172) | 0.399*** (0.105) | 0.363*** (0.098) | 0.307*** (0.123) |
| Cluster 3 | DO | -0.060* (0.031) | -0.080* (0.045) | -0.058 (0.061) | -0.099** (0.038) |
| Cluster 3 | GS | -0.459** (0.177) | -0.742*** (0.255) | -1.098*** (0.350) | -0.688*** (0.214) |
| Cluster 3 | NPD | 0.233 (0.141) | 0.358* (0.204) | 0.293 (0.279) | 0.328* (0.172) |
| Cluster 3 | FD | -0.195 (0.230) | 0.356 (0.332) | 0.796* (0.455) | 0.567* (0.280) |
| Cluster 3 | lnGDP | 0.031 (0.038) | 0.113** (0.055) | 0.121 (0.076) | 0.068 (0.047) |
| Cluster 4 | GS | -0.457*** (0.139) | -0.459*** (0.101) | -0.159*** (0.181) | -0.578*** (0.110) |
| Cluster 4 | NPD | 1.082*** (0.279) | 1.092*** (0.203) | 1.367*** (0.363) | 1.158*** (0.221) |
| Cluster 4 | FD | -1.047*** (0.136) | -0.915*** (0.231) | -0.867** (0.411) | -0.673*** (0.251) |
| Cluster 4 | lnGDP | 0.106* (0.062) | 0.126*** (0.045) | 0.198** (0.080) | 0.137*** (0.049) |
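The Tobit column can be motivated with a small sketch. Many efficiency scores in Table 4 sit exactly at the upper bound 1.000, so a model that treats them as censored observations of a latent efficiency is a natural alternative to ordinary least squares. The example below is only an illustration under loud assumptions: it uses an intercept-only Tobit model censored from above at 1, fitted by a coarse grid search to Guangdong's 2008-2016 scores from Table 4. The authors' actual specification (censoring limits, panel effects, regressors, estimator) is not stated in this section, and the names `tobit_loglik`, `grid_mu` and `grid_sigma` are introduced here.

```python
import math

def norm_logpdf(z):
    """Log density of the standard normal distribution."""
    return -0.5 * z * z - 0.5 * math.log(2.0 * math.pi)

def norm_sf(z):
    """Upper tail 1 - Phi(z) of the standard normal distribution."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

def tobit_loglik(mu, sigma, ys, upper=1.0):
    """Log-likelihood of an intercept-only Tobit model censored from above at `upper`."""
    ll = 0.0
    for y in ys:
        if y >= upper:
            # Censored score: only know the latent value is at least `upper`.
            ll += math.log(norm_sf((upper - mu) / sigma))
        else:
            # Uncensored score: ordinary normal density.
            ll += norm_logpdf((y - mu) / sigma) - math.log(sigma)
    return ll

# Guangdong, 2008-2016, as listed in Table 4: five of nine scores sit at the cap.
ys = [1.000, 1.000, 1.000, 1.000, 0.535, 0.664, 0.689, 0.702, 1.000]
naive_mean = sum(ys) / len(ys)

# Coarse grid search for the maximum-likelihood latent mean and spread.
grid_mu = [0.50 + 0.01 * k for k in range(101)]    # 0.50 .. 1.50
grid_sigma = [0.05 + 0.01 * k for k in range(96)]  # 0.05 .. 1.00
ll_hat, mu_hat, sigma_hat = max(
    (tobit_loglik(m, s, ys), m, s) for m in grid_mu for s in grid_sigma
)
print("naive mean:", round(naive_mean, 3))
print("Tobit latent mean:", round(mu_hat, 2), "sigma:", round(sigma_hat, 2))
```

The naive mean reproduces the 0.843 reported for Guangdong in Table 4, while the censored likelihood places the latent mean higher, because scores recorded as 1.000 are treated as "at least 1" rather than "exactly 1".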

The allocation efficiency of green financial resources (FE) plays a notable role in promoting the green innovation efficiency of high-tech industries. Improving the allocation efficiency of financial resources directly affects the allocation efficiency of production factors, raises the TFP and innovation efficiency of high-tech industries, and pushes China's economic development onto the path of the new normal.

Opening up (DO) promotes the green innovation efficiency of the high-tech industry in cluster 1, while it has the opposite effect in cluster 3. In both clusters, the effect is more obvious in regions with higher innovation efficiency. Opening up is conducive to breaking the monopoly of the state-owned sector, enhancing the innovation vitality of small and medium-sized enterprises in market competition, and stimulating enterprises to increase their knowledge reserves through market introduction, learning, communication and so on.

New product demand (NPD) of the regional high-tech industry has a significant positive influence on green innovation efficiency, and the higher the innovation efficiency of the region, the greater the influence. The higher the demand of consumers for new products, the higher the price of new products will be. To some extent, this strengthens the incentive of producers to invest in innovation and improves the innovation efficiency of enterprises.

The distortion of the financial factor market (FD) has notable negative effects on the green innovation efficiency of high-tech industries nationwide and in clusters 3 and 4. Nationwide and in cluster 4, this negative effect is more significant in regions with lower efficiency, while in cluster 3 the pattern is the opposite. The distortion of the financial factor market adversely affects the allocation of innovative capital and the initiative of enterprises to innovate, which leads to a loss of innovation efficiency.

Government support (GS) has an obvious negative impact on the improvement of green innovation efficiency, and the higher the innovation efficiency is, the greater the impact is. Government financial support can ease the problem of insufficient financing for enterprise innovation and research and development activities, thus helping enterprises carry out innovation activities efficiently. However, government investment may also crowd out enterprises' own investment in R&D (the "crowding-out effect"). Government funds then leave enterprises' own investment in R&D relatively insufficient, which inhibits enterprise innovation to some extent.

Knowledge transfer (DKTD, IKTD) only has an impact on innovation efficiency in cluster 1, and the impact is more significant in regions with low innovation efficiency. Domestic knowledge transfer (DKTD) promotes the improvement of innovation efficiency, while foreign knowledge transfer (IKTD) has the opposite effect. Enterprises in different regions of China have realized knowledge sharing through knowledge transfer, which makes innovation activities more effective and promotes the improvement of innovation efficiency. However, the crowding effect of foreign investors on the local market inhibits the improvement of innovation efficiency.

Financial support (FS) promotes the innovation efficiency of cluster 2. Finance is the core of the modern economy, and scientific and technological innovation and industrialization need financial support.

Regional economic development (lnGDP) shows a positive correlation with the innovation efficiency of high-tech industries, which is consistent with China's national conditions.

6. Discussion and Suggestions

The contribution and originality of this paper are mainly reflected in the following aspects. A scientific system is constructed to measure the green innovation efficiency of high-tech industries. Then, through Lasso regression, quantile regression and the DEA-Tobit method, this paper comprehensively and systematically studies the regional differences in the influencing factors of green innovation efficiency in the high-tech industry. It provides theoretical support for deepening reform and realizing innovation-driven, efficient development of regional high-tech industries in China.

However, the traditional panel regression assumes that individuals are independent of each other and fails to consider the correlation between regions and the spatial spillover effect, which may leave the conclusions on green innovation efficiency incomplete. In addition, when considering the influencing factors of green innovation efficiency of high-tech industries, some variables, such as social system and culture, are omitted because they are difficult to measure, and the errors caused by their omission need to be discussed further in future studies. Finally, owing to the length and research direction of the paper, the decomposition of the innovation efficiency of the high-tech industry is not analyzed and the sub-industries of the high-tech industry are not distinguished; both need to be enriched and improved in later research.

Based on the research results, the following policy suggestions are proposed. Different regions differ greatly in resource endowment, economic development level, economic development speed and policy environment, which leads to different innovation efficiency. Therefore, regions should make full use of their respective advantages, strengthen communication and mutual learning, and promote the effective use and allocation of innovation resources. When local governments formulate policies to promote innovation in high-tech industries, they need to adapt measures to local conditions and focus on promoting the development of factor markets.
