Ranking of environmental heat stressors for dairy cows using machine learning algorithms


Computers and Electronics in Agriculture 168 (2020) 105124

Contents lists available at ScienceDirect

Computers and Electronics in Agriculture journal homepage: www.elsevier.com/locate/compag




Michael T. Gorczyca a,b,*, Kifle G. Gebremedhin a

a Department of Biological and Environmental Engineering, Cornell University, Ithaca, NY 14853, United States
b Decisive Analytics Corporation, Arlington, VA 22202, United States

ARTICLE INFO

Keywords: Heat stress; Dairy cows; Machine learning; Random forest; Neural network

ABSTRACT

Heat stress is harmful to the health and productivity of dairy cows. While several models have been developed to assess heat stress conditions for dairy cows, many of these models assume a particular relationship, chosen by the researcher, between environmental conditions and their physiological responses to heat stress. These assumptions may not accurately represent the underlying effects. This study uses machine learning algorithms to evaluate how environmental heat stressors (air temperature, AT; relative humidity, RH; solar radiation, SR; and wind speed, U) influence physiological responses (respiration rate, RR; skin temperature, ST; and vaginal temperature, VT) of dairy cows. The advantage of this approach is that many machine learning algorithms automatically consider nonlinearity in data, which removes subjectivity from researchers choosing the relationship between the predictor and response variables. Four algorithms were considered in this study: penalized linear regression, random forests, gradient boosted machines, and neural networks. Nonlinear machine learning algorithms (random forests and neural networks) were consistently the most accurate in predicting the three physiological responses. The root mean squared error, RMSE, for RR was 9.695 respirations per minute, which was obtained with a random forest model; RMSE for ST was 0.334 °C, which was obtained with a random forest model; and RMSE for VT was 0.434 °C, which was obtained with a neural network model. Air temperature had the highest ranking for effect on RR, ST, and VT. Wind speed and its interaction terms displayed much lower effects as environmental heat stressors. Ranking of environmental heat stressors could help farmers make evidence-based interventions before anticipated stressful environmental conditions occur. Early intervention could improve animal health, which would increase production and reduce the costs associated with treating heat stressed livestock.

1. Introduction

Heat stress is a serious issue in the dairy industry, as it negatively affects the productivity and health of dairy cows (Barash et al., 2001; West et al., 2003; Bohmanova et al., 2007). To aid in detecting heat stress in dairy cows, several researchers have developed heat stress indices, which provide relationships between physiological responses (respiration rate, RR; skin temperature, ST; and vaginal temperature, VT) and environmental heat stressors (air temperature, AT; relative humidity, RH; solar radiation, SR; and wind speed, U). The underlying issue with the heat stress indices developed thus far is that they assume the relationships between environmental heat stressors and physiological responses to be linear (Bouraoui et al., 2002; Dikmen and Hansen, 2009) with a limited order of interaction terms (Schoen, 2005; Dikmen and Hansen, 2009; Wang et al., 2018a), nonlinear with a specified functional form (Tao and Xin, 2003), or additive (Yano et al., 2014). In reality, the relationships among the variables are unknown, complex, and nonlinear (Hastie et al., 2003). In other words, assumptions of linearity, nonlinearity with a specified functional form, or additivity for environmental heat stressors may be too restrictive, and models built from these assumptions may not correctly represent how environmental heat stressors affect the physiological responses of dairy cows. Furthermore, an improved understanding of the relationships in heat-stress data would help farmers make evidence-based decisions before an anticipated problem arises. Early intervention could improve animal health, which would increase production and reduce the costs associated with treating heat stressed livestock. The objectives of this study are (1) to compare the performance of different machine learning models when predicting physiological

Corresponding author at: Department of Biological and Environmental Engineering, Cornell University, Ithaca, NY 14853, United States. E-mail address: [email protected] (M.T. Gorczyca).

https://doi.org/10.1016/j.compag.2019.105124 Received 20 August 2019; Received in revised form 21 October 2019; Accepted 27 November 2019 0168-1699/ © 2019 Elsevier B.V. All rights reserved.


responses (RR, ST, and VT) of dairy cows and (2) to use these machine learning models to rank the influence of the environmental heat stressors (AT, RH, SR, and U) on each physiological response. The advantage of using machine learning models in this setting is that they are data driven in determining the nonlinear relationships in the data, which removes any potential bias in assuming the relationships between the environmental heat stressors and the physiological responses. As a result, when a dataset consists of a large, diverse population of dairy cows in different environmental conditions, the resulting model can accurately represent the relationships between environmental heat stressors and physiological responses; otherwise, the relationships learned may only be accurate for dairy cows in similar environmental conditions. While this is not the first study to use machine learning approaches to predict physiological responses of dairy cows from environmental heat stressors (Brown-Brandl et al., 2005), and machine learning approaches have recently been extended to other species of livestock, such as piglets (Gorczyca et al., 2018), to the best of our knowledge, this is the first study in which machine learning models are used to rank the effects of environmental variables on physiological responses of dairy cows.

Table 1
Summary statistics of the dataset (Chen et al., 2015) prior to data normalization. Number of samples = 486. Air temperature is denoted by AT; relative humidity by RH; solar radiation by SR; wind speed by U; respiration rate by RR; skin temperature by ST; and vaginal temperature by VT.

Heat stressor            Mean     Standard deviation   Minimum value   Maximum value
AT (°C)                  31.6     3.54                 21.2            39.5
RH (%)                   30.8     7.0                  17.0            50.0
SR (W/m²)                799.5    84.8                 194.0           981.0
U (m/s)                  1.7      0.9                  0.0             4.8

Physiological measure    Mean     Standard deviation   Minimum value   Maximum value
RR (breaths per min)     94.6     17.3                 48.0            140.0
ST (°C)                  37.6     0.9                  35.0            40.0
VT (°C)                  39.5     0.6                  38.4            41.1

2.2. Model development

2.2.1. Machine learning model development

In order to develop a ranking system for environmental heat stressors, it was necessary to develop models that predict physiological responses from environmental data. The machine learning algorithms considered for developing these models were penalized linear regression with the elastic net penalty (Zou and Hastie, 2005), random forests (Breiman, 2001), gradient boosted machines (Friedman, 2001), and neural networks (Goodfellow et al., 2016). These algorithms are discussed in detail in Gorczyca et al. (2018). In addition, two models were developed from least squares linear regression: one where all the environmental heat stressors and their interaction terms were input variables, and another where input variables were selected with forward selection (Hastie et al., 2003). Models were developed with the R statistical software (R Core Team, 2017) using the h2o package (H2O.ai Team, 2017). Computations were performed on a desktop computer (Dell Precision T3600 Workstation with 500 GB Seagate ST500DM002 HDD, 64 GB of DDR4 RAM, and 8-Core 2.20 GHz Xeon E5-2660 processors). Each machine learning algorithm considered has hyperparameters, which influence how a model learns relationships from data. Table 2 provides a summary of the hyperparameter space considered for optimization. To develop machine learning models for this study, a random search for hyperparameter optimization was performed (Bergstra and Bengio, 2012). For the penalized linear regression models, random forests, and gradient boosted machines, 1,000 random search iterations were performed for each of these algorithms. For neural networks, 2,000 random search iterations were performed, as neural networks have a larger number of hyperparameters (5,000 models were developed overall).
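The random search described above can be sketched in a few lines of Python. This is an illustration only and is not the authors' R/h2o code: the search space, its ranges, and the helper names (`SEARCH_SPACE`, `sample_configuration`, `random_search`, `toy_validation_error`) are hypothetical, and the toy error function stands in for training a model and scoring it on the validation dataset.

```python
import random

# Hypothetical search space for a random forest. The ranges are illustrative
# assumptions and are NOT the values listed in Table 2.
SEARCH_SPACE = {
    "ntrees": lambda rng: rng.randint(50, 500),        # uniform discrete
    "max_depth": lambda rng: rng.randint(2, 20),       # uniform discrete
    "sample_rate": lambda rng: rng.uniform(0.5, 1.0),  # uniform continuous
}


def sample_configuration(rng):
    """Draw one configuration, sampling each hyperparameter independently."""
    return {name: draw(rng) for name, draw in SEARCH_SPACE.items()}


def random_search(evaluate, n_iterations, seed=0):
    """Return the sampled configuration with the lowest validation error."""
    rng = random.Random(seed)
    best_config, best_error = None, float("inf")
    for _ in range(n_iterations):
        config = sample_configuration(rng)
        error = evaluate(config)  # e.g. RMSE on the validation dataset
        if error < best_error:
            best_config, best_error = config, error
    return best_config, best_error


def toy_validation_error(config):
    """Stand-in for fitting a model and computing its validation RMSE."""
    return abs(config["max_depth"] - 10) + abs(config["sample_rate"] - 0.8)


best_config, best_error = random_search(toy_validation_error, n_iterations=1000)
```

Unlike a grid search, each iteration samples every hyperparameter anew, so the number of iterations can be chosen independently of the number of hyperparameters.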

2. Materials and methods

2.1. Dataset overview

The dataset used for this study was obtained from a study conducted at the University of California-Davis dairy facility with 19 Holstein-Friesian cows (Chen et al., 2015). Environmental heat stressors and physiological responses were recorded every 5 min while the cows were restrained outdoors for 1 h each day in headlocks at an unshaded, clean feed bunk. The entire experimental trial lasted for 21 days. The environmental heat stressors were recorded with a portable weather station (WS-16; Novalynx Corp., Auburn, CA). For the physiological responses, RR was measured by timing flank movements and converting to breaths per minute; ST was recorded using loggers (Thermochron iButton DS1921H, Embedded Data Systems, Lawrenceburg, KY, USA) taped to shaved skin on the “side”, “upper leg”, “lower leg”, and “shoulder”; and VT was measured using vaginally indwelling loggers (Minilog12-TX, Vemco Ltd., Bedford, NS, Canada). The skin temperatures measured from the side, upper leg, and lower leg were averaged to define ST for this study. The shoulder temperature was not included in the averaging because the shoulder is directly exposed to solar radiation (Chen et al., 2015; Wang et al., 2018a). We note that this dataset initially consisted of 3,591 measurements, of which 486 measurements corresponded to a control group (the dairy cows were exposed to only environmental heat stressors when recording physiological responses), and the remaining measurements corresponded to experimental groups (the dairy cows were exposed to environmental heat stressors as well as sprinklers that sprayed water on the cows at different flow rates and droplet sizes when recording physiological responses). 
As the physiological responses of the experimental group were influenced by sprinkler exposure, and this study is concerned only with the assessment of the effect of environmental heat stressors on physiological responses, we used the 486 measurements from the control group for this study. Table 1 shows the summary statistics of the dataset used in this study. Once the measurements associated with sprinkler usage were removed, the dataset was augmented to include interaction terms between the environmental heat stressors. The environmental heat stressors and their interaction terms were then normalized to account for different magnitudes of measurement because different scales of measurement may distort the significance of each heat stressor and interaction term in the ranking system (Jolliffe, 2002).
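The augmentation and normalization steps can be illustrated with a short Python sketch. The text does not state which normalization was applied, so the z-score scaling below is an assumption; the function names are hypothetical, and the three example rows reuse the mean, minimum, and maximum values from Table 1 purely as placeholder inputs, not as actual measurements.

```python
from itertools import combinations
from statistics import mean, pstdev

STRESSORS = ["AT", "RH", "SR", "U"]

# Placeholder rows (AT, RH, SR, U) built from the Table 1 mean, minimum and
# maximum values; they are not real samples from the dataset.
example_rows = [
    (31.6, 30.8, 799.5, 1.7),
    (21.2, 17.0, 194.0, 0.0),
    (39.5, 50.0, 981.0, 4.8),
]


def add_interactions(rows, names):
    """Append every pairwise product (e.g. AT x RH) to each row."""
    pairs = list(combinations(range(len(names)), 2))
    new_names = list(names) + [f"{names[i]} x {names[j]}" for i, j in pairs]
    new_rows = [list(r) + [r[i] * r[j] for i, j in pairs] for r in rows]
    return new_rows, new_names


def standardize(rows):
    """Scale each column to zero mean and unit standard deviation (assumed z-score)."""
    columns = list(zip(*rows))
    means = [mean(c) for c in columns]
    sds = [pstdev(c) for c in columns]
    return [
        [(v - m) / s if s > 0 else 0.0 for v, m, s in zip(r, means, sds)]
        for r in rows
    ]


augmented_rows, augmented_names = add_interactions(example_rows, STRESSORS)
normalized_rows = standardize(augmented_rows)
```

Four stressors give six pairwise interaction terms, so each row ends up with ten input variables, which matches the ten columns of Table 6.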

2.3. Model selection and assessment

A total of 15,000 models were developed, 5,000 for each physiological response. Every model was developed from the same training dataset, which consisted of 291 random samples (60%) from the entire dataset. An out-of-sample validation dataset consisting of 97 random samples (20%) was used to select a model developed from each machine learning algorithm that had the best performance metrics. Model selection for each machine learning algorithm was based on which model had the lowest root mean squared error (RMSE; Hastie et al., 2003) on the validation dataset. Finally, an out-of-sample test dataset consisting of 98 random samples (20%) was used to gather the performance metrics of these selected models. The performance metrics gathered on the test dataset are RMSE, mean absolute error (MAE), and coefficient of determination (R²). For RMSE and MAE, smaller values indicate better predictive performance. For R², larger values indicate


Table 2 Hyperparameter space configured for model development.

Table 4
Model performance on the test dataset when predicting skin temperature. Each performance metric is first displayed to the number of significant digits in the measurement, followed in parentheses by the performance metric to 3 significant digits and its standard error. The best performance metric attained is shown in bold. RMSE represents root mean squared error, MAE is mean absolute error, R² is coefficient of determination, LM is linear regression, LM-FS is linear regression with forward selection, PLM is linear regression with elastic net penalty, RF is random forest, GBM is gradient boosted machine, and NN is neural network.

Skin temperature
Model    RMSE (°C)             MAE (°C)              R²
LM       0.4 (0.371; 0.023)    0.3 (0.293; 0.023)    0.8 (0.818; 0.030)
LM-FS    0.4 (0.371; 0.024)    0.3 (0.293; 0.023)    0.8 (0.817; 0.031)
PLM      0.4 (0.374; 0.024)    0.3 (0.295; 0.022)    0.8 (0.816; 0.032)
RF       0.3 (0.334; 0.028)    0.3 (0.261; 0.022)    0.9 (0.854; 0.031)
GBM      0.4 (0.371; 0.029)    0.3 (0.299; 0.023)    0.8 (0.824; 0.035)
NN       0.3 (0.345; 0.022)    0.3 (0.276; 0.021)    0.8 (0.844; 0.028)
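The three performance metrics reported in these tables can be computed as in the following Python sketch. The article does not state which definition of R² was used, so the form 1 - SS_res/SS_tot is an assumption here, and the observed and predicted skin temperatures are made-up values for illustration only.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error: penalizes large errors more than small ones."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mae(y_true, y_pred):
    """Mean absolute error: average size of the prediction error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true, y_pred):
    """Coefficient of determination, assumed here to be 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Made-up skin temperatures in °C (not values from the dataset).
observed = [37.0, 37.5, 38.0, 38.5]
predicted = [37.2, 37.4, 38.1, 38.2]

metrics = {
    "RMSE": rmse(observed, predicted),
    "MAE": mae(observed, predicted),
    "R2": r_squared(observed, predicted),
}
```

RMSE and MAE are expressed in the units of the response (°C or breaths per minute), whereas R² is unitless, which is why the tables report units only for the first two columns.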

Table 5
Model performance on the test dataset when predicting vaginal temperature. Each performance metric is first displayed to the number of significant digits in the measurement, followed in parentheses by the performance metric to 3 significant digits and its standard error. The best performance metric attained is shown in bold. RMSE represents root mean squared error, MAE is mean absolute error, R² is coefficient of determination, LM is linear regression, LM-FS is linear regression with forward selection, PLM is linear regression with elastic net penalty, RF is random forest, GBM is gradient boosted machine, and NN is neural network.

(a) (a, b) denotes uniform continuous distribution from a to b, d(a, b) denotes uniform discrete distribution from a to b.
(b) MNL: minimum number of observations in a leaf.
(c) NVS: number of variables used in each split.
(d) If the number of hidden layers is greater than 1, the upper bound of the sampling distribution was changed to 1.000 to decrease memory constraints.
(e) From Srivastava et al. (2014).
(f) Hyperparameters from the AdaDelta optimizer (Zeiler, 2012).

Table 3 (respiration rate)
Model    RMSE (breaths per min)    MAE (breaths per min)    R²
LM       10.5 (10.501; 0.622)      8.7 (8.728; 0.583)       0.5 (0.539; 0.067)
LM-FS    10.5 (10.503; 0.635)      8.7 (8.697; 0.610)       0.5 (0.538; 0.067)
PLM      10.6 (10.621; 0.647)      8.8 (8.797; 0.592)       0.5 (0.535; 0.064)
RF       9.7 (9.695; 0.717)        7.6 (7.616; 0.590)       0.6 (0.610; 0.065)
GBM      11.0 (10.973; 0.880)      8.5 (8.529; 0.684)       0.5 (0.523; 0.082)
NN       10.1 (10.097; 0.722)      8.1 (8.074; 0.621)       0.6 (0.587; 0.064)

Table 5 (vaginal temperature)
Model    RMSE (°C)             MAE (°C)              R²
LM       0.5 (0.462; 0.033)    0.4 (0.368; 0.029)    0.4 (0.403; 0.067)
LM-FS    0.5 (0.462; 0.032)    0.4 (0.368; 0.028)    0.4 (0.403; 0.063)
PLM      0.5 (0.462; 0.033)    0.4 (0.369; 0.028)    0.4 (0.403; 0.065)
RF       0.4 (0.444; 0.039)    0.3 (0.332; 0.029)    0.5 (0.459; 0.083)
GBM      0.5 (0.456; 0.051)    0.3 (0.325; 0.032)    0.5 (0.460; 0.095)
NN       0.4 (0.434; 0.030)    0.4 (0.351; 0.025)    0.5 (0.472; 0.058)

$$\mathrm{rank}(w_j) = \frac{|w_j|}{\max_{i \in \{1, \dots, p\}} |w_i|}$$

where, wj represents the weight of the jth input variable. The ranking procedure for neural networks is an extension of the ranking procedure for penalized linear regression, which is the summation of all the weights for each input variable divided by the summation of the absolute value of all the weights. This procedure is described in detail by Gedeon (1997), and can be used in the h2o software (H2O.ai Team, 2017). If the best performing model is a random forest or a gradient boosted machine, then the ranking is the amount each input variable improved the mean squared error (MSE) of the model divided by the total amount every input variable improved the MSE (Breiman, 2001). The rankings for the models developed from the random forest and gradient boosted machine algorithms are also gathered using the h2o software (H2O.ai Team, 2017). To provide a baseline comparison to these machine learning models, the model coefficients from the two

Respiration rate RMSE (breaths per. min)

RMSE (°C)

better predictive performance. The model with the lowest RMSE on each test dataset was used to rank the effects of each environmental heat stressor on a physiological response. If the best performing model was a penalized linear regression model, then the ranking is the magnitude of a weight for an input variable divided by the largest weight in magnitude for all the input variables. For a dataset with p input variables, the rank of the jth input variable is defined as

Table 3
Model performance on the test dataset when predicting respiration rate. Each performance metric is first displayed to the number of significant digits in the measurement, followed in parentheses by the performance metric to 3 significant digits and its standard error. The best performance metric attained is shown in bold. RMSE represents root mean squared error, MAE is mean absolute error, R² is coefficient of determination, LM is linear regression, LM-FS is linear regression with forward selection, PLM is linear regression with elastic net penalty, RF is random forest, GBM is gradient boosted machine, and NN is neural network.


Fig. 1. Predictions of respiration rate, RR (a); skin temperature, ST (b); and vaginal temperature, VT (c) from the best performing models (random forest models for RR and ST; a neural network model for VT) on out-of-sample data from the test dataset. These figures show that the predictions of these models are highly correlated.

Fig. 2. Ranking of the effects of environmental heat stressors and their interaction terms on respiration rate (a); skin temperature (b); and vaginal temperature (c). AT represents air temperature, RH is relative humidity, SR is solar radiation, and U is wind speed. An (x) represents interaction between two parameters.
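The two ranking rules described in Section 2.3 can be sketched in Python as follows. This is an illustration under stated assumptions and does not reproduce Fig. 2, whose rankings came from the best performing random forest and neural network models through the h2o software. The function names are hypothetical; the first example applies the weight-based rule to the LM respiration-rate coefficients from Table 6 only as convenient input, and the error-reduction numbers in the second example are invented.

```python
def rank_by_weight(weights):
    """Linear-model rule: |w_j| divided by the largest |w_i| over all inputs."""
    largest = max(abs(w) for w in weights.values())
    return {name: abs(w) / largest for name, w in weights.items()}


def rank_by_error_reduction(improvements):
    """Tree-ensemble rule: each input's share of the total MSE improvement."""
    total = sum(improvements.values())
    return {name: value / total for name, value in improvements.items()}


# LM coefficients for respiration rate, copied from Table 6 as example input.
lm_rr_weights = {
    "AT": 0.69, "RH": -12.90, "SR": -9.38, "U": 1.33,
    "AT x RH": 2.78, "AT x SR": 12.84, "AT x U": 8.94,
    "RH x SR": 7.80, "RH x U": 4.84, "SR x U": -17.78,
}
weight_ranks = rank_by_weight(lm_rr_weights)

# Invented MSE improvements per input, for illustration only.
hypothetical_improvements = {"AT": 6.0, "RH": 3.0, "U": 1.0}
tree_ranks = rank_by_error_reduction(hypothetical_improvements)
```

With the weight-based rule the top-ranked input always scores 1 and the others are expressed relative to it, while the error-reduction rule produces shares that sum to 1, so scores from the two rules are not directly comparable.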


Table 6
Regression coefficients for environmental heat stressors (AT denotes air temperature; RH denotes relative humidity; SR denotes solar radiation; and U denotes wind speed) and their interaction terms (× denotes an interaction) from linear regression models that utilized every variable as an input (LM) or utilized forward selection to determine input variables (LM-FS). A (*) indicates that the variable was not selected by the forward selection algorithm.

Respiration rate
Model    AT       RH        SR       U        AT × RH   AT × SR   AT × U   RH × SR   RH × U   SR × U
LM       0.69     −12.90    −9.38    1.33     2.78      12.84     8.94     7.80      4.84     −17.78
LM-FS    11.04    −3.86     0.00*    −2.16    3.80      0.56      0.00*    0.00*     0.00*    0.00*

Skin temperature
Model    AT       RH        SR       U        AT × RH   AT × SR   AT × U   RH × SR   RH × U   SR × U
LM       0.01     −0.64     −0.54    −0.13    0.34      0.81      0.49     0.25      −0.09    −0.42
LM-FS    −0.04    −0.81     −0.75    −0.09    0.40      0.00*     0.00*    0.32      0.00*    0.00*

Vaginal temperature
Model    AT       RH        SR       U        AT × RH   AT × SR   AT × U   RH × SR   RH × U   SR × U
LM       −0.88    −1.19     −1.22    −1.17    0.15      1.44      0.82     0.74      0.48     −0.16
LM-FS    −0.88    −1.22     −1.27    −1.31    0.15      1.45      0.80     0.78      0.49     0.00*

Fig. 3. Partial dependence plots showing the effect of varying air temperature (a), relative humidity (b), solar radiation (c), and wind speed (d) on each model when predicting respiration rate. A partial dependence plot shows how the predicted physiological response changes when one environmental heat stressor is varied while the remaining environmental heat stressors are kept fixed at their mean values (interaction terms reflect changes in the varied environmental heat stressor). LM denotes linear regression model, LM-FS denotes linear regression model that uses forward selection to determine input variables, PLM denotes penalized linear model, RF denotes random forest, GBM denotes gradient boosted machine, and NN denotes neural network.

U = [0, 5] m/s, AT = [21, 40] °C, or SR = [194, 981] W/m2, keeping the remaining environmental heat stressors at their mean values (the first and second values within the brackets denote the minimum and maximum values in a range, respectively). The interaction terms were the interactions between these artificially constructed variables.
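One way to build such an artificial dataset is sketched below in Python. The means and ranges come from Table 1 and the text above; the number of grid points, the function names, and the toy prediction function are assumptions made for illustration, and the sketch omits the normalization step that a fitted model would require.

```python
from itertools import combinations

# Mean values from Table 1 and the ranges quoted in the text.
MEANS = {"AT": 31.6, "RH": 30.8, "SR": 799.5, "U": 1.7}
RANGES = {
    "AT": (21.0, 40.0),
    "RH": (17.0, 50.0),
    "SR": (194.0, 981.0),
    "U": (0.0, 5.0),
}


def artificial_dataset(varied, n_points=50):
    """Vary one stressor over its range; hold the others at their means."""
    low, high = RANGES[varied]
    step = (high - low) / (n_points - 1)
    rows = []
    for k in range(n_points):
        row = dict(MEANS)
        row[varied] = low + k * step
        # Interaction terms are recomputed from the artificial values.
        for a, b in combinations(MEANS, 2):
            row[f"{a} x {b}"] = row[a] * row[b]
        rows.append(row)
    return rows


def partial_dependence(predict, varied, n_points=50):
    """Pair each value of the varied stressor with the model prediction."""
    return [(row[varied], predict(row)) for row in artificial_dataset(varied, n_points)]


def toy_predict(row):
    """Hypothetical stand-in for a fitted model; not a result from the study."""
    return 39.0 + 0.05 * (row["AT"] - MEANS["AT"])


curve = partial_dependence(toy_predict, "AT", n_points=5)
```

Plotting the first element of each pair against the second for every model gives one panel of a partial dependence figure; repeating this for each stressor and each response yields the 12 artificial datasets mentioned in the text.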

linear regression models (the model developed from all input variables and the model where the input variables are selected with forward selection) are also reported. The relationships learned by each model were further assessed with partial dependence plots (Friedman, 2001) from four artificial datasets for each physiological response (12 artificial datasets in total). Each dataset was designed to test how the models would perform under different conditions. Each dataset had either RH = [17, 50] (%),


Fig. 4. Partial dependence plots showing the effect of varying air temperature (a), relative humidity (b), solar radiation (c), and wind speed (d) on each model when predicting skin temperature. A partial dependence plot shows how the predicted physiological response changes when one environmental heat stressor is varied while the remaining environmental heat stressors are kept fixed at their mean values (interaction terms reflect changes in the varied environmental heat stressor). LM denotes linear regression model, LM-FS denotes linear regression model that uses forward selection to determine input variables, PLM denotes penalized linear model, RF denotes random forest, GBM denotes gradient boosted machine, and NN denotes neural network.

3. Results and discussion

guarantee that a linear regression model would be as good a predictor as a machine learning model. We note that in the context of this study, one must consider model performance as well as the measurement error of the procedure used for recording each physiological response. For example, the differences in model performance for ST and VT are within the measurement errors of the tools used to record them (Table 4, Table 5). This indicates that the differences in model performance may not matter if the models were used for predicting these physiological responses on a farm. However, the difference in model performance for RR may be beyond measurement error (RR was measured from human observation), so there may be a notable difference in performance if the models were used for predicting RR on a farm (Table 3).

3.1. Model performance

Tables 3–5 show the performance metrics of the selected model from each machine learning algorithm on the test dataset for RR, ST, and VT, respectively (the model from each machine learning algorithm that minimized RMSE on the validation dataset). Random forest models had the best performance metrics for predicting RR and ST, and a neural network model had the best performance metrics for predicting VT. These models also achieved higher R² values than those currently reported in the literature (Dikmen and Hansen, 2009; Wang et al., 2018a, 2018b), as displayed in the scatterplots of the prediction outputs from the selected models on the test datasets (Fig. 1). While this is evidence of the strength of nonlinear machine learning algorithms, it does not mean that nonlinear algorithms will consistently attain the best performance metrics for predicting heat stress. The underlying reasoning for this is that the complexity of relationships between variables in a dataset is not understood unless a variety of machine learning models are developed and assessed, even if the relationships in a dataset appear similar to those in another dataset (Wolpert and Macready, 1997). If a dataset does not contain much complexity, then simple modelling procedures such as linear regression will demonstrate strong predictive performance (MacKay, 2003). However, as data become more complex, simple modelling procedures would require extensive feature engineering (transforming input variables to improve model performance) to capture the nonlinear relationships in data. Employing feature engineering does not, however,

3.2. Ranking of environmental heat stressors and relationship evaluation

Interaction terms are commonly used for developing heat stress indices, and ranking these interaction terms can provide better understanding of how environmental heat stressors affect physiological responses (Schoen, 2005; Dikmen and Hansen, 2009; Wang et al., 2018b). The ranking of the environmental heat stressors and their interaction terms for each physiological response is given in Fig. 2. These rankings indicate that air temperature (AT) has the highest effect on each physiological response. Wind speed (U) and its interaction terms had much lower effects on the physiological responses. Table 6 indicates that, if one were to analyze the weights of a linear regression model, then the interaction term between air temperature and solar radiation would have the largest effect on vaginal


Fig. 5. Partial dependence plots showing the effect of varying air temperature (a), relative humidity (b), solar radiation (c), and wind speed (d) on each model when predicting vaginal temperature. A partial dependence plot shows how the predicted physiological response changes when one environmental heat stressor is varied while the remaining environmental heat stressors are kept fixed at their mean values (interaction terms reflect changes in the varied environmental heat stressor). LM denotes linear regression model, LM-FS denotes linear regression model that uses forward selection to determine input variables, PLM denotes penalized linear model, RF denotes random forest, GBM denotes gradient boosted machine, and NN denotes neural network.

this study is notably smaller than the sample size of datasets used in contemporary machine learning studies for finance (Heaton et al., 2016), healthcare (Manogaran and Lopez, 2016), and cybersecurity (Xin et al., 2018). As a consequence, the relationships between environmental heat stressors and physiological responses for dairy cows that these models learned may be inaccurate when applied to dairy cows that were not in the data sample. However, if this model development procedure is repeated on a larger dataset with more dairy cows, then it would be expected that the relationships learned by the best performing model on out-of-sample data and its ranking would better represent the underlying relationship between environmental heat stressors and physiological responses.

temperature and skin temperature, while the interaction term between solar radiation and wind speed would have the largest effect on respiration rate. These linear models generally place large weights on wind speed and its interaction terms as well as on solar radiation. This is inconsistent with results from previous studies (Dikmen and Hansen, 2009; Wang et al., 2018a). The dairy cows in this study also received limited exposure to solar radiation (Chen et al., 2015). These inconsistencies are demonstrated quantitatively in Figs. 3–5, where the machine learning models are robust to changes in solar radiation and wind speed, whereas the linear models are not. For example, the linear models indicate that changes in wind speed alone would have a large effect on vaginal temperature, while the machine learning algorithms do not indicate such an effect (Fig. 5d). These figures also indicate that the linear models consistently predicted physiological responses that are inconsistent with the physiological responses observed (the linear models predicted physiological responses beyond those observed in the study; Table 1). The machine learning models consistently provided realistic prediction outputs. It must be emphasized that the ranking developed in this study is valid only for dairy cows in similar environmental conditions where similar equipment is used for measurement. Specifically, the cows in this study were exposed to high air temperature and low wind speed. These cows also received little exposure to solar radiation, and there may be error in solar radiation measurement as the tape used for holding the solar radiation data loggers to the shaved skin could absorb some solar radiation (Chen et al., 2015). 
Because of these environmental exposures and data acquisition procedures, the level of effects may be different if the dairy cows were instead exposed to high wind speed (Fournel et al., 2017), if the dairy cows were instead exposed to intense solar radiation (Herbut et al., 2018), or if a different method was used to gather data for model development. It must also be recognized that the sample size of the dataset used in

3.3. Strengths and limitations of machine learning models

The application of machine learning models to predict the physiological responses of dairy cows from environmental heat stressors is not novel (Brown-Brandl et al., 2005). What is new in this study is that machine learning is used to rank the effects of environmental heat stressors on physiological responses. The benefit of this approach is that nonlinear machine learning algorithms have greater expressive power in their ability to describe the underlying relationships in data than linear regression models, nonlinear regression models where the functional form is specified, or additive models (MacKay, 2003; Hastie et al., 2003; Goodfellow et al., 2016). As a result, machine learning models are able to consider the effect of every environmental heat stressor and their interaction terms when developing the ranking without sacrificing model performance. Previous temperature-humidity index (THI) models did not consider such a combination of input variables because they were developed from linear, nonlinear, or additive regression, which do not possess the same degree of expressive power (Collier et al., 1981; Orihuela, 2000; Tao and Xin, 2003; Dikmen


and Hansen, 2009; Wang et al., 2018b). The main drawback of using nonlinear machine learning models is computational time. On average, it took approximately 0.6 s to develop each penalized linear model, 3.6 s to develop each random forest, 5.4 s to develop each gradient boosted machine, and 8 s to develop each neural network (as the number of hidden layers and neurons in each hidden layer increases, the computational time increases). These computational times indicate that, if a computer with similar hardware and a dataset are available, a similar ranking can be developed within 2 days. The computational time would, however, be significantly increased if the hardware has weaker processing power and less memory.
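As a rough check on the 2-day figure, the average times above can be combined with the number of models per algorithm from Section 2.2.1. The Python sketch below assumes that the models are trained one after another and that the reported averages apply to all three physiological responses; the variable names are hypothetical.

```python
# Average seconds per model, as reported in the text.
SECONDS_PER_MODEL = {"PLM": 0.6, "RF": 3.6, "GBM": 5.4, "NN": 8.0}
# Random search iterations per algorithm for one physiological response.
MODELS_PER_RESPONSE = {"PLM": 1000, "RF": 1000, "GBM": 1000, "NN": 2000}
N_RESPONSES = 3  # RR, ST, and VT

seconds_per_response = sum(
    SECONDS_PER_MODEL[name] * MODELS_PER_RESPONSE[name] for name in SECONDS_PER_MODEL
)
# 600 + 3,600 + 5,400 + 16,000 = 25,600 s per response
total_hours = N_RESPONSES * seconds_per_response / 3600
# 76,800 s in total, i.e. about 21.3 hours of sequential training
```

Under these assumptions model training alone takes a little under one day, which leaves room within the stated 2 days for data preparation, model evaluation, and the ranking itself.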

4. Conclusions

The following conclusions can be drawn from this study:

(1) Machine learning models (penalized linear regression with elastic net penalty, random forests, gradient boosted machines, and neural networks) were used to predict respiration rate, skin temperature, and vaginal temperature of dairy cows.
(2) The effects of environmental stressors (air temperature, relative humidity, solar radiation and wind speed) on physiological responses (respiration rate, skin temperature, and vaginal temperature) were ranked. The result showed that air temperature had the highest effect on all of the physiological responses considered in this study. Wind speed had minimal effects as a heat stressor.
(3) Future work in this area may include development of similar models but from larger datasets with multiple species of livestock, and their application to different farms.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgements

This study is funded by USDA (United States Department of Agriculture, Washington, DC) as part of the W-3173 Hatch/Regional Project through Cornell University.

References

Barash, H., Silanikove, N., Shamay, A., Ezra, A., 2001. Interrelationships among ambient temperature, day length and milk yield in dairy cows under a Mediterranean climate. J. Dairy Sci. 84, 2314–2320. https://doi.org/10.3168/jds.S0022-0302(01)74679-6.
Bergstra, J., Bengio, Y., 2012. Random search for hyper-parameter optimization. J. Mach. Learn. Res. 13, 281–305.
Breiman, L., 2001. Random forests. Mach. Learn. 45, 5–32. https://doi.org/10.1023/A:1010933404324.
Brown-Brandl, T.M., Jones, D.D., Woldt, W.E., 2005. Evaluating modelling techniques for cattle heat stress prediction. Biosys. Eng. 91, 513–524.
Bohmanova, J., Misztal, I., Cole, J.B., 2007. Temperature-humidity indices as indicators of milk production losses due to heat stress. J. Dairy. Sci. 90, 1947–1956. https://doi.org/10.3168/jds.2006-513.
Bouraoui, R., Lahmar, M., Majdoub, A., Djemali, M., Belyea, R., 2002. The relationship of temperature-humidity index with milk production of dairy cows in a Mediterranean climate. Animal Res. 51, 479–491.
Chen, J.M., Schütz, K.E., Tucker, C.B., 2015. Cooling cows efficiently with sprinklers: physiological responses to water spray. J. Dairy Sci. 98, 6925–6938.
Collier, R.J., Eley, R.M., Sharma, A.K., Pereira, R.M., Buffington, D.E., 1981. Shade management in subtropical environment for milk yield and composition in Holstein and Jersey cows. J. Dairy Sci. 64, 844–849.
Dikmen, S., Hansen, P.J., 2009. Is the temperature-humidity index the best indicator of heat stress in lactating dairy cows in a subtropical environment? J. Dairy Sci. 92, 109–116.
Fournel, S., Ouellet, V., Charbonneau, E., 2017. Practices for alleviating heat stress of dairy cows in humid continental climates: A literature review. Animals 7.
Friedman, J., 2001. Greedy function approximation: A gradient boosting machine. Ann. Statist. 29, 1189–1232.
Gedeon, T., 1997. Data mining of inputs: analysing magnitude and functional measures. Int. J. Neural. Syst. 8 (2), 209–218. https://doi.org/10.1142/s0129065797000227.
Goodfellow, I., Bengio, Y., Courville, A., 2016. Deep Learning. MIT Press, Massachusetts.
Gorczyca, M.T., Milan, H.F.M., Maia, A.S.C., Gebremedhin, K.G., 2018. Machine learning algorithms to predict core, skin, and hair-coat temperatures of piglets. Comp. Elec. Ag. 151, 286–294.
H2O.ai team, 2017. h2o: R Interface for H2O, version 3.16.0.2.
Hastie, T., Tibshirani, R., Friedman, J., 2003. The Elements of Statistical Learning. Springer, New York.
Heaton, J.B., Polson, N.G., Witte, J.H., 2016. Deep learning in finance. arXiv:1602.06561.
Herbut, P., Angrecka, S., Walczak, J., 2018. Environmental parameters to assessing of heat stress in dairy cattle—a review. Int. J. Biometeorol. 62, 2089–2097.
Jolliffe, I.T., 2002. Principal Component Analysis. Springer, New York.
MacKay, D.J.C., 2003. Information Theory, Inference and Learning Algorithms. Cambridge University Press, Cambridge, England.
Manogaran, G., Lopez, D., 2016. A survey of big data architectures and machine learning algorithms in healthcare. Int. J. Biomed. Eng. Technol. 23, 1–27.
Orihuela, A., 2000. Some factors affecting the behavioral manifestation of oestrus in cattle: A review. Appl. Anim. Behav. Sci. 70, 1–16.
R Core Team, 2017. R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria.
Schoen, C., 2005. A new empirical model of the temperature–humidity index. J. Appl. Meteorol. 44, 1413–1420.
Srivastava, N., Hinton, G., Krizhevsky, A., Sutskever, I., Salakhutdinov, R.R., 2014. Dropout: A simple way to prevent neural networks from overfitting. J. Mach. Learn. Res. 15, 1929–1958.
Tao, X., Xin, H., 2003. Acute synergistic effects of air temperature, humidity and velocity on homeostasis of market–size broilers. Trans. ASAE 46, 491–497.
Wang, X., Bjerg, B.S., Choi, C.Y., Zong, C., Zhang, G., 2018a. A review and quantitative assessment of cattle-related thermal indices. J. Therm. Bio. 77, 24–37.
Wang, X., Gao, H., Gebremedhin, K., Bjerg, B.S., Van Os, J., Tucker, C.B., Zhang, G., 2018b. A predictive model of equivalent temperature index for dairy cattle (ETIC). J. Therm. Bio. 76, 165–170.
West, J.W., Mullinix, B.G., Bernard, J.K., 2003. Effects of hot, humid weather on milk temperature, dry matter intake, and milk yield of lactating dairy cows. J. Dairy. Sci. 86, 232–242. https://doi.org/10.3168/jds.S0022-0302(03)73602-9.
Wolpert, D.H., Macready, W.G., 1997. No free lunch theorems for optimization. IEEE Trans. Evo. Comp.
Xin, Y., Kong, L., Liu, Z., Chen, Y., Li, Y., Zhu, H., Wang, C., 2018. Machine learning and deep learning methods for cybersecurity. IEEE Access 6, 35365–35381.
Yano, M., Shimadzu, H., Endo, T., 2014. Modelling Temperature Effects on Milk Production: A Study on Holstein Cows at a Japanese Farm. Springer Plus, pp. 3.
Zeiler, M.D., 2012. Adadelta: An adaptive learning rate method. arXiv:1212.5701.
Zou, H., Hastie, T., 2005. Regularization and variable selection via the elastic net. J. Stat. Soc. Ser. B 67, 301–320.
