Image motion feature extraction for recognition of aggressive behaviors among group-housed pigs


Computers and Electronics in Agriculture 142 (2017) 380–387


Original papers

Chen Chen a,*, Weixing Zhu a,*, Changhua Ma a, Yizheng Guo a,b, Weijia Huang a,c, Chengzhi Ruan a,d

a School of Electrical and Information Engineering, Jiangsu University, Zhenjiang 212013, Jiangsu, China
b Nanjing Normal University Taizhou College, Taizhou 225300, Jiangsu, China
c School of Electronic Information, Jiangsu University of Science and Technology, Zhenjiang 212003, Jiangsu, China
d School of Mechanical and Electrical Engineering, Wuyi University, Wuyishan 354300, Fujian, China

Article info

Article history: Received 10 March 2017 Received in revised form 24 June 2017 Accepted 8 September 2017

Keywords: Target location Key frame sequence extraction Hierarchical clustering Aggressive behavior recognition Animal welfare

Abstract

The aim of this study is to develop a computer vision-based method to automatically detect aggressive behaviors among pigs. Ten repetitions of the same experiment were performed. In each experiment, 7 piglets from three litters in two pigpens were mixed and captured on video for a total of 6 h. From these videos, the first 3 h after mixing were recorded as a training set, and the 3 h after 24 h were recorded as a validation set. Connected area and adhesion index were used to locate the aggressive pigs and to extract key frame sequences. The two pigs in aggression were regarded as one whole rectangle according to their characteristics of continuous, large-proportion adhesion. The acceleration feature was extracted by analyzing the displacement changes of the four sides of this rectangle between adjacent frames, and hierarchical clustering was used to calculate its thresholds. Based on this feature, recognition rules for medium and high aggression were designed. Testing on 10 groups of pigs, the accuracy of recognizing medium aggression was 95.82% with a sensitivity of 90.57% and a specificity of 96.95%, and the accuracy of recognizing high aggression was 97.04% with a sensitivity of 92.54% and a specificity of 97.38%. These results indicate that acceleration can be used to recognize pigs' aggressive behaviors.

© 2017 Elsevier B.V. All rights reserved.

* Corresponding authors. E-mail addresses: [email protected] (C. Chen), [email protected] (W. Zhu).
https://doi.org/10.1016/j.compag.2017.09.013

1. Introduction

Since group-housed pigs are confronted with limited space, a poor environment, low-fiber diets and repeated changes of group composition under intensive farming conditions, they express higher levels of aggression than they do in the natural environment (Stukenborg et al., 2012). Aggression usually occurs at artificial pigpen allocation after weaning and at the beginning of the fattening period: newly mixed pigs frequently attack each other for up to 2 days, until a new hierarchy is established (Keeling and Gonyou, 2001). Aggression among pigs can cause skin trauma, infection and even fatal injuries (Turner et al., 2006). Injured pigs have more difficulty taking in food, so their growth rate drops, which reduces pork production (Stookey and Gonyou, 1994). Additionally, the stress produced by aggression reduces the reproductive performance of the surrounding sows (Kongsted, 2004). Therefore, aggressive behaviors are regarded as one of the most important health, welfare and economic problems in modern

production systems (D'Eath and Turner, 2009). Currently, the recognition of aggression among pigs depends mainly on manual observation and video surveillance; these means are time-consuming, laborious and lag behind events, making real-time aggression detection difficult to achieve on large-scale farms. Using computer vision technology to recognize aggressive behaviors can improve the efficiency of recognition, increase animal welfare and reduce the economic losses of farms (Faucitano, 2001; Bracke et al., 2002).

Pig aggression is a complex interactive behavior with continuous, large-proportion adhesion of pig bodies; it can last from a few seconds to a few minutes (McGlone, 1985). Animal mating shows a similar phenomenon of continuous adhesion, and computer vision-based mating recognition has been achieved mainly by analyzing the shape of the animals in mating. For instance, Tsai and Huang (2014) used the length of the circumscribed rectangle of cattle in mating as the feature: when this length stayed at about 2 times the cattle length for about 2 s and then changed to about 1.5 times the cattle length for about 2 s, the process was recognized as a mating event. Nasirahmadi et al. (2016) used the pixel area of the fitted ellipse of pigs in mating as the feature: when this ellipse area


changed to 1.3–2 times the pig-body area, the behavior was recognized as mating. Compared with mating behavior, although the geometric shape and displacement of the two pigs in high and medium aggression change abruptly, the pigs always maintain adhesion or a very small distance. Thus, in this paper the aggressive pigs are regarded as a whole for motion analysis.

Recently, computer vision technology has been widely used for animal behavior analysis, such as discrimination of pig comfort behaviors (Shao and Xin, 2008), detection of pigs drinking water (Kashiha et al., 2013) and recognition of pig tripping and stepping behavior (Gronskyte et al., 2015). However, computer vision-based research on recognizing pig aggressive behaviors remains scarce. To detect pig aggression using motion history images (MHI), the moving pixels of all individuals were extracted as the mean intensity, the ratios of moving pixels to all pixels of pig bodies were extracted as the occupation index, and linear discriminant analysis (LDA) was used to classify these two categories of features, recognizing aggression with an accuracy of 89% (Viazzi et al., 2014). To classify aggression, the average, maximum, minimum, sum and deviation of the occupation index were extracted as features, and a multilayer feed-forward neural network was trained on them to classify high and medium aggression with an average accuracy of 99.2% (Oczak et al., 2014). In these methods, the number or proportion of moving pixels of all individuals was selected as the feature; since such features also contain the moving pixels of non-aggressive individuals, they increase the amount of feature data and the computational load of the algorithm. Additionally, using the mean over a period of time as the feature loses the aggression details of each frame within that period.

Hence, the objective of this paper is to develop a computer vision-based method that further separates the aggressive pigs from all moving individuals and automatically recognizes aggressive behaviors by analyzing their acceleration between adjacent frames. Connected area and adhesion index are used to locate the aggressive pigs and to extract key frame sequences. The diagonal length of the circumscribed rectangle of the aggressive pigs in the former frame is used to predict the aggression range in the latter frame, achieving continuous tracking of the aggressive pigs. The aggressive pigs are regarded as an entirety whose motion between adjacent frames yields the acceleration feature, and hierarchical clustering is used to calculate the acceleration thresholds. Based on this feature, recognition rules for medium and high aggression are designed. Accuracy, sensitivity and specificity are used to evaluate the effectiveness of the method.

2. Materials and methods

2.1. Experimental setup

2.1.1. Video acquisition
The videos used in this study were collected from pig farms of the Danyang Rongxin Nongmu Development Co., Ltd., the experimental base for the key disciplines of Agricultural Electrification and Automation of Jiangsu University. The pigs were monitored in a reconstructed experimental pigsty, 1 m high, 3.5 m long and 3 m wide. A camera was mounted above the pigsty at a height of 3 m relative to the ground: a Point Grey FL3-U388S2C-C industrial camera (Sony Exmor sensor; Point Grey Research, Canada) with a Kowa LM6NCL 6.0 mm lens (Kowa Company Ltd., Japan). It captured top-view RGB colour images of the group-housed pigs at a resolution of 1760 × 1840 pixels. The camera was connected to a computer running the Point Grey FlyCap2 software, and the videos were recorded in MJPEG. The computer processor was an Intel(R) Core(TM) i7-2670QM CPU @ 2.2 GHz with 8 GB of physical memory, running Microsoft Windows 7 Ultimate.

Ten repetitive experiments were conducted between 16 June 2015 and 4 September 2016. In each experiment, 7 piglets with an average weight of 24 kg were mixed from three litters in two pigpens after weaning. Videos were captured for the first 3 h (08:00–11:00) after the groups were established and then for 3 h after approximately 24 h. The rationale for this acquisition scheme is that mixed pigs show the most violent aggressive behavior during the first 3 h after mixing and continue to attack within 2 days (Erhard et al., 1997; Spoolder et al., 2000), so sufficient data can be collected to meet the needs of the research.

2.1.2. Data labelling
To evaluate the proposed algorithm, the frames with aggression in the videos were labelled for comparison with the recognition results of the algorithm. Aggressive and non-aggressive behaviors are very close and difficult to distinguish at the early stage of pig aggression, and initial aggression does not cause much harm to the pigs; therefore, only high and medium aggression were studied in this paper (Jensen and Yngvesson, 1998). Following veterinary experts in morphology, head-to-head knocking, head-to-body knocking, parallel pressing, inverse parallel pressing and similar behavior processes were recorded as medium aggression, while neck biting, body biting, ear biting and similar behavior processes were recorded as high aggression (Gonyou, 2001; O'Connell et al., 2005). These two categories of aggression were manually labelled frame by frame in the recorded videos using the "Labelling Tool" software developed in Matlab (R2012a, The MathWorks Inc., MA). Labelling the 60 h of video took about 145 person-hours.

2.1.3. Dataset allocation
The first 3 h of video after mixing were recorded as the training set, and the 3 h of video after 24 h were recorded as the validation set. The test set was built from the videos of the 10 pig groups using the same sampling method. Table 1 shows the datasets of the first pig group: the number (N) and the minimum, maximum and mean duration of each category of episode.

2.2. Algorithm

2.2.1. Key frame sequence extraction
To remove some of the frames without aggression and to reduce the amount of computation using key frame technology (Wang et al., 2015), the behavior characteristics at the beginning, during and at the end of aggression were analyzed to extract key frame sequences that may contain aggression. The specific steps are as follows:

(1) Image preprocessing. The pigpen scenes include light changes, the influence of ground water stains, urine stains, manure and other sundries, varying colors of foreground objects, and the pigs' slow movement patterns. After comparing the segmentation effects of combining image enhancement (e.g. wavelet enhancement, histogram equalization, pseudo-color transformation) with commonly used threshold segmentation (e.g. Otsu's method, maximum entropy, percentage threshold), histogram equalization and percentage threshold segmentation were chosen to initially segment the images, with the results shown in Fig. 1(c) (Gonzalez and Woods, 2001). Since the wall around the image is outside the pigs' activity range and the position of the feeder is fixed, the pixels outside the red box and inside the blue box were set to zero to remove the wall and feeder (Lao et al., 2016). The



Table 1
Dataset allocation of the first pig group.

Episode category   | Training set: N | Min duration/s | Max duration/s | Mean duration/s | Frame number | Validation set: N | Frame number
High aggression    | 15              | 7              | 216            | 22.3            | 1340         | 11                | 945
Medium aggression  | 47              | 5              | 190            | 17.5            | 3284         | 34                | 2387
Non-aggression     | 63              | 9              | 316            | 153.1           | 38,576       | 46                | 39,868

Fig. 1. Extraction process of key frame sequence: (a) original frame, (b) histogram equalization, (c) percentage threshold segmentation, (d) results of image preprocessing, (e) determination of the first frame and initial extraction of aggressive pigs, (f) setting of aggression range in second segmentation, (g) results of second segmentation, (h) precise extraction and initial location of aggressive pigs in the first frame, (i) prediction of aggression range for latter frame, (j) setting of aggression region in latter frame, (k) results of segmentation in latter frame, and (l) extraction results of aggressive pigs in latter frame.

"open" operation with fixed disc structuring elements was used to remove isolated noise points and to break the adhesion between pigs and background while keeping the shape of the pigs. Because a connected domain with fewer pixels than the pig standard area cannot be a foreground object, connected-domain labelling was used to remove the iron bars, ground regions, etc. (Guo et al., 2014), and the "holes" inside some pig individuals were filled. The mathematical morphological processing mainly involves the strel, imopen, bwlabel, regionprops, bwmorph and imfill operators in Matlab (Gonzalez and Woods, 2001). The preprocessing results are shown in Fig. 1(d): some slow-moving or stationary pigs around the pigpen were not segmented completely, while the segmentation of fast-moving pigs was more complete but showed smears on the ears, legs, tail, etc.
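The percentage-threshold step can be sketched in plain Python; this is a stand-in for the Matlab pipeline, assuming a grayscale frame stored as a list of pixel rows (0–255), and the function names are illustrative rather than the authors' code.

```python
# Percentage (percentile) threshold segmentation: pick the threshold so
# that roughly `fg_fraction` of the pixels (here: the brightest, i.e.
# the pigs) end up in the foreground mask. Pure-Python sketch.

def percentage_threshold(gray, fg_fraction):
    """Return the gray level above which about fg_fraction of pixels lie."""
    flat = sorted(p for row in gray for p in row)
    cut = int(len(flat) * (1.0 - fg_fraction))
    cut = min(max(cut, 0), len(flat) - 1)
    return flat[cut]

def segment(gray, fg_fraction=0.3):
    """Binary mask: 1 for foreground (pig), 0 for background."""
    t = percentage_threshold(gray, fg_fraction)
    return [[1 if p >= t else 0 for p in row] for row in gray]

# Toy 3x4 frame: four bright "pig" pixels on a dark floor
frame = [
    [10, 12, 11, 200],
    [ 9, 210, 205, 14],
    [13, 208, 11, 10],
]
mask = segment(frame, fg_fraction=4 / 12)
```

In the paper's pipeline this mask would then be cleaned with morphological opening and connected-domain filtering before locating the pigs.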

(2) Determination of the first frame and initial location of aggressive pigs. As the number of pigs in the limited space increases, adhesion becomes more and more frequent, which makes extracting individual pigs difficult. In order to precisely locate the aggressive pigs in the pigpen, only 7 pigs were studied at present, so that the non-aggressive pigs had enough moving space when aggression occurred. Observation of the labelled frame sequences with aggression showed that the non-aggressive pigs begin to flee at the moment aggression occurs due to the stress reaction; because of the limited space, this leads, for a short time, to more than 3 non-aggressive pigs gathering together while the remaining non-aggressive ones scatter at the edge of the pigpen, as shown in Fig. 1(a). At the beginning of aggression, a frame with only one pair of the two adhesive pigs always


can be found. Even if more than 2 pairs of adhesive pigs existed simultaneously in one frame, they soon reduced to a single pair. This provided the conditions for determining the first frame and locating the aggressive pigs. Based on this property, a frame with only one connected area between 1.7 and 2.3 times the pig standard area was defined as the first frame of a key frame sequence. This connected domain was used to initially extract the aggressive pigs in the first frame, as shown in Fig. 1(e). In order to remove the smears of the aggressive pigs within a smallest region, the circumscribed rectangle of the extracted aggressive pigs was used as the region of interest for a second segmentation (Fig. 1(f)). Histogram equalization and maximum-entropy threshold segmentation were applied within this region of interest (Fig. 1(g)). The above morphological processing was then used to precisely extract and initially locate the aggressive pigs, and the smears were reduced (Fig. 1(h)).

(3) Continuous extraction of aggressive pigs. The diagonal length of the circumscribed rectangle of the two aggressive pigs extracted in the former frame was used to predict the aggression range in the latter frame. Due to the uncertain direction of aggression, the aggression region was set as a circle with radius r, as shown in Fig. 1(i) (Guo et al., 2015). Statistical analysis of all frames with aggression showed that the aggression radius between adjacent frames is not more than 0.6 times the diagonal length of the circumscribed rectangle. Hence, the radius r is calculated by Eq. (1), where points (x_min, y_min) and (x_max, y_max) are the two endpoints of the rectangle diagonal of the two aggressive pigs, and o is the center of the rectangle and also the center of the circle. The circumscribed rectangle is enclosed by four straight lines (x = x_min, x = x_max, y = y_min and y = y_max). Here x_min is the minimum horizontal coordinate among all pixels of the two aggressive pigs, y_min the minimum longitudinal coordinate, x_max the maximum horizontal coordinate and y_max the maximum longitudinal coordinate.

r = 0.6 √[(x_min − x_max)² + (y_min − y_max)²]    (1)
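Eq. (1) and the circular aggression region can be written directly as code; this is a sketch following the paper's notation, with the membership helper (`in_aggression_region`) being our own addition.

```python
# Predict the next-frame search radius from the bounding-box diagonal
# of the two fighting pigs, then test whether a pixel falls inside the
# circular aggression region centred on the rectangle centre o.
import math

def aggression_radius(xmin, ymin, xmax, ymax, k=0.6):
    # k = 0.6 is the paper's empirical bound on the per-frame
    # aggression radius as a fraction of the rectangle diagonal
    return k * math.hypot(xmax - xmin, ymax - ymin)

def in_aggression_region(px, py, xmin, ymin, xmax, ymax):
    ox, oy = (xmin + xmax) / 2.0, (ymin + ymax) / 2.0
    r = aggression_radius(xmin, ymin, xmax, ymax)
    return math.hypot(px - ox, py - oy) <= r

# A 30x40 box has diagonal 50, so the predicted radius is 0.6 * 50 = 30
r = aggression_radius(0, 0, 30, 40)
```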

Starting from the frame after the first frame, histogram equalization and maximum-entropy threshold segmentation were used to segment the image in the aggression region (Fig. 1(j)). In order to remove the partial bodies of non-aggressive pigs from this circular region (Fig. 1(k)), connected domains larger than 0.9 times and smaller than 2.3 times the pig standard area were detected to extract the aggressive pigs, with or without adhesion, in the aggression region, as shown in Fig. 1(l). Using the above steps, predicting the aggression range and extracting the aggressive objects were performed for each subsequent frame to continuously extract the aggressive pigs.

(4) Determination of the last frame. When pigs stand too close, they respond by attacking or escaping within 5 s. With the original frame rate of 4 frames s⁻¹, 20 continuous frames without adhesion are regarded as the end of aggression (Turner et al., 2009). Thus, these 20 continuous frames were removed, and the frame preceding them was defined as the last frame of the key frame sequence.

(5) Optimization of key frame sequences. According to the characteristic of large-proportion adhesion in pig aggression, the proportion of frames with adhesion among the total frames of a key frame sequence was defined as the adhesion index P_ad in Eq. (2), where n_ad is the number of frames with adhesion in a key frame sequence and n_sum is the


number of total frames in this key frame sequence. The range of the adhesion index [R_min, R_max] was set to further remove the episodes without aggression. Additionally, since the minimum duration of aggression considered in this paper is 5 s, n_sum was required to be greater than 16 frames (4 s).

P_ad = n_ad / n_sum    (2)
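The adhesion index of Eq. (2) and the optimization rule of step (5) can be sketched as follows; the per-frame boolean input format and the function names are our assumptions, not the paper's code.

```python
# Adhesion index (Eq. (2)) and the key-frame-sequence filter:
# keep a candidate sequence only if it is long enough and a large
# enough proportion of its frames show pig-body adhesion.

def adhesion_index(adhesion_flags):
    """P_ad = n_ad / n_sum, with one boolean per frame."""
    return sum(adhesion_flags) / len(adhesion_flags)

def keep_sequence(adhesion_flags, r_min=0.8, r_max=1.0, min_frames=16):
    # min_frames = 16 frames = 4 s at 4 frames/s (the paper's setting)
    if len(adhesion_flags) < min_frames:
        return False
    return r_min <= adhesion_index(adhesion_flags) <= r_max

seq = [True] * 18 + [False] * 2  # 20 frames, 90% with adhesion
```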

2.2.2. Feature extraction
Because the acceleration in aggression is larger than in other behaviors, the acceleration was selected as the feature for recognizing aggressive behaviors. In this paper the aggressive pigs are regarded as one whole rectangle, and the acceleration of this entirety between adjacent frames is used as the feature, as shown in Fig. 2, where the two red rectangle boxes f_i(x, y) and f_{i+1}(x, y) respectively represent the pig entirety in the ith and (i+1)th frames, and L_{n,i} and L_{n,i+1} represent the four sides of the rectangle box in those frames. On the one hand, although the two aggressive individuals may be momentarily at rest relative to each other, they always maintain rapid movement relative to the ground; between adjacent frames, the velocity of the circumscribed rectangle of the two aggressive pigs represents their velocity relative to the ground rather than their relative velocity. On the other hand, in very few labelled episodes with aggression, the two pigs may be at rest relative to the ground for a short time or a few frames. After dividing an episode with aggression into many minimum recognition units, at most one minimum recognition unit in such an episode may be classified as non-aggression, which does not influence the recognition result for the whole episode. For instance, if an episode with aggression contains 10 minimum recognition units and only 1 is classified as non-aggression, while the 9 units before and after it are all classified as aggression, the real-time recognition of aggression in the whole episode is unaffected. Therefore, the acceleration obtained from this velocity has very strong discriminative power. Eq. (3) is used to calculate the displacement D_{n,i} of the nth side of the rectangle box between the ith and (i+1)th frames, where l_pigsty is the actual length of the floor region of the pigpen, l_image is its pixel length, i = 1, 2, …, 43,200 and n = 1, 2, 3, 4.

D_{1,i} = (l_pigsty / l_image) · |y_min(i+1) − y_min(i)|
D_{2,i} = (l_pigsty / l_image) · |x_min(i+1) − x_min(i)|
D_{3,i} = (l_pigsty / l_image) · |y_max(i+1) − y_max(i)|
D_{4,i} = (l_pigsty / l_image) · |x_max(i+1) − x_max(i)|    (3)

In order to quantify the displacement change of the aggressive pigs between adjacent frames, the sum of the velocities of the four sides of the rectangle is used to calculate the velocity of the pigs in Eq. (4), where

Fig. 2. Schematic diagram of the acceleration extraction between adjacent frames.



V_{1,i}, V_{2,i}, V_{3,i} and V_{4,i} are respectively the velocities of the entirety's four sides in the horizontal and vertical directions in the ith frame, and f is the original frame rate.

V_i = V_{1,i} + V_{2,i} + V_{3,i} + V_{4,i} = D_{1,i}/t + D_{2,i}/t + D_{3,i}/t + D_{4,i}/t = f (D_{1,i} + D_{2,i} + D_{3,i} + D_{4,i})    (4)

The velocity change of the rectangle between adjacent frames is used to calculate the acceleration of the pigs in Eq. (5), where 1/f is the time interval between adjacent frames.

a_i = dV/dt ≈ ΔV/Δt = (V_{i+1} − V_i) / (1/f) = f (V_{i+1} − V_i)    (5)
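Eqs. (3)–(5) can be implemented together; this sketch assumes the bounding boxes arrive as a list of (x_min, y_min, x_max, y_max) tuples, one per frame, which is our input convention rather than the paper's.

```python
# Per-frame side displacements (Eq. (3)), velocity sum (Eq. (4)) and
# acceleration between adjacent frames (Eq. (5)).
# scale = l_pigsty / l_image converts pixel lengths to metres.

def side_displacements(b0, b1, scale):
    (x0, y0, X0, Y0), (x1, y1, X1, Y1) = b0, b1
    d1 = scale * abs(y1 - y0)   # side y = y_min
    d2 = scale * abs(x1 - x0)   # side x = x_min
    d3 = scale * abs(Y1 - Y0)   # side y = y_max
    d4 = scale * abs(X1 - X0)   # side x = x_max
    return d1, d2, d3, d4

def velocities(boxes, scale, f=4.0):
    # Eq. (4): V_i = f * (D1 + D2 + D3 + D4)
    return [f * sum(side_displacements(boxes[i], boxes[i + 1], scale))
            for i in range(len(boxes) - 1)]

def accelerations(boxes, scale, f=4.0):
    # Eq. (5): a_i = f * (V_{i+1} - V_i)
    v = velocities(boxes, scale, f)
    return [f * (v[i + 1] - v[i]) for i in range(len(v) - 1)]

# Toy example: each side moves 1 px, then 3 px, with scale = 1
boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (4, 4, 14, 14)]
v = velocities(boxes, scale=1.0)      # [16.0, 48.0]
a = accelerations(boxes, scale=1.0)   # [128.0]
```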

2.2.3. Feature classification and threshold setting
The positive accelerations in aggression are more discriminative than those in other behaviors, while the negative accelerations in aggression are less discriminative; thus, only the positive accelerations were used to set the thresholds. According to the regular distribution and low dimensionality of the acceleration data, the hierarchical clustering functions in the Matlab toolbox were used to classify the accelerations in the following steps (Xu et al., 2016): (1) The zscore function was used to standardize the acceleration data; the pdist function was used to calculate the Euclidean distance between every two accelerations, returning a row vector with M(M−1)/2 elements; and the squareform function was used to generate a symmetric distance matrix. (2) The linkage function was used to generate an (M−1) × 3 cluster-tree matrix, in which the first two columns are index tags showing which two acceleration samples are clustered into one category and the third column is the shortest distance between the two samples. (3) The cophenet function was used to calculate the correlation coefficient c between the distances generated by the pdist and linkage functions. (4) The cluster function was used to classify the accelerations according to the shortest-distance (single linkage) method. In order to improve the accuracy of the thresholds, the mean of each category of accelerations was set as a threshold after removing the abnormal points from the data.

2.2.4. Aggressive behavior recognition
The proposed method mainly consists of the following four parts. Part 1 aims to design the classification rules for frames. Key frame sequence extraction and acceleration extraction were performed on all frames in the training set, and hierarchical clustering was used to classify the acceleration data and to obtain an acceleration threshold of medium aggression (a1) and an acceleration threshold of high aggression (a2).
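The clustering step can be sketched without the Matlab toolbox. For one-dimensional data, single-linkage ("shortest distance") clustering into k groups is equivalent to sorting the values and cutting at the k−1 largest gaps between neighbours; this is a simplified stand-in for the pdist/linkage/cluster pipeline, not the authors' code, and the toy data below are invented.

```python
# Single-linkage clustering of 1-D accelerations: the minimum spanning
# tree of points on a line is the chain of sorted neighbours, so cutting
# the k-1 largest neighbour gaps yields the k single-linkage clusters.

def single_linkage_1d(values, k):
    xs = sorted(values)
    # indices of neighbour gaps, ordered by gap size (ascending)
    gaps = sorted(range(len(xs) - 1), key=lambda i: xs[i + 1] - xs[i])
    cuts = sorted(gaps[-(k - 1):]) if k > 1 else []
    clusters, start = [], 0
    for c in cuts:
        clusters.append(xs[start:c + 1])
        start = c + 1
    clusters.append(xs[start:])
    return clusters

def cluster_means(clusters):
    # the paper sets each threshold to the mean of a cluster
    return [sum(c) / len(c) for c in clusters]

accels = [0.5, 0.52, 0.55, 1.1, 1.2, 1.15, 2.3, 2.4, 2.35]
groups = single_linkage_1d(accels, k=3)
means = cluster_means(groups)
```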
A frame that meets a ∈ [a1, a2) is classified as a frame with medium aggression; a frame that meets a ∈ [a2, +∞) is classified as a frame with high aggression; the remaining frames are classified as frames without aggression. Part 2 aims to classify the frames: key frame sequence extraction and acceleration extraction were also performed on all frames in the validation set, and using the designed classification rules the frames were divided into frames with medium aggression, frames with high aggression and frames without aggression. Part 3 aims to classify the minimum recognition units. In order to judge whether aggression exists in a video within a shortest

time or a minimum number of frames, the minimum recognition unit needs to be defined. Analysis of the accelerations in all key frame sequences showed that the acceleration follows a law of positive and negative alternation, as shown in Fig. 3. The maximum frame distance spanning consecutive negative, positive and negative accelerations is defined as the minimum recognition unit (MRU); this ensures that a positive acceleration must appear in every MRU used for recognition. In order to prevent false recognition caused by sudden accelerated motion without aggression (e.g. chase, exploration) and to determine the type of an MRU that contains frames of both aggression categories, an MRU that has more than a certain number (q) of frames with the same aggression category is classified as an MRU with that category of aggression; the remaining MRUs are classified as MRUs without aggression. Part 4 aims to evaluate the results of recognition. The numbers of true positive, false negative, false positive and true negative MRUs for recognizing high and medium aggression were counted separately, and Eq. (6) was used to calculate accuracy, sensitivity and specificity to comprehensively evaluate the proposed algorithm.

Accuracy = (Number of true positive and true negative MRUs / Total number of MRUs) × 100%
Sensitivity = (Number of true positive MRUs / Number of true positive and false negative MRUs) × 100%
Specificity = (Number of true negative MRUs / Number of false positive and true negative MRUs) × 100%    (6)
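The frame rules of Part 1, the MRU rule of Part 3 and the Eq. (6) metrics can be sketched together; this is a hedged Python sketch rather than the authors' Matlab code, the helper names are ours, and the tie-break order when both categories exceed q is our assumption.

```python
# Classify each frame by its acceleration against the thresholds a1, a2,
# then classify a minimum recognition unit (MRU) by requiring more than
# q frames of the same category; Eq. (6) metrics at the end.

def classify_frame(a, a1, a2):
    if a1 <= a < a2:
        return "medium"
    if a >= a2:
        return "high"
    return "none"

def classify_mru(accels, a1, a2, q=3):
    labels = [classify_frame(a, a1, a2) for a in accels]
    for category in ("high", "medium"):   # assumed precedence
        if labels.count(category) > q:    # "more than q frames"
            return category
    return "none"

def accuracy(tp, fn, fp, tn):
    return 100.0 * (tp + tn) / (tp + fn + fp + tn)

def sensitivity(tp, fn):
    return 100.0 * tp / (tp + fn)

def specificity(fp, tn):
    return 100.0 * tn / (fp + tn)

# Toy MRU with 4 medium-range accelerations (thresholds from the paper)
mru = [0.2, 1.3, 1.4, 1.2, 1.5, 0.1, 2.9, 0.3]
label = classify_mru(mru, a1=1.1892, a2=2.3635)
```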

3. Experiment results and analysis 3.1. Results in training set In the stage of extracting key frame sequences, the pig standard area was set to 61,700 pixels and the range of adhesion index was set as ½0:8; 1 according to the labelled frames with aggression in all the 60 h. The 80 episodes of key frame sequences (15,137 frames) were extracted from the training set (43,200 frames). Among them, the 62 episodes with aggression were extracted, and 71.43% episodes without aggression were removed. The results show that all the episodes with aggression can be extracted, and the not removed episodes without aggression are mainly caused by the frequent interactions with adhesion (e.g. chasing, intimacy, etc.) rather than aggression. In the stage of feature extraction, the parameter lpigsty =limage was set to 3.5/1385 m. Table 2 illustrates the extraction process of acceleration in the 10 continuous frames. Where ad = 1 represents adhesion, ad = 0 represents non-adhesion. In the stage of threshold solution, the 10,596 positive accelerations in the 80 episodes of key frame sequences were used to calculate the acceleration threshold. Set M = 10,596. c = 0.89 indicates the correlation between the distance in clustering tree and the actual distance generated by pdist function is relatively large. Table 3 illustrates the cluster results of accelerations. The data were divided into 5 categories. Obviously, the accelerations in category 3 and in category 5 only accounted for a small proportion, they were regarded as the abnormal points. The data in category 1 accounted for the maximum proportion but with the minimum mean, they responded to a large number of frames without aggression and the frames with lower velocity in some aggression. The means of accelerations in category 2 and in

Fig. 3. Schematic diagram of minimum recognition unit.

385

C. Chen et al. / Computers and Electronics in Agriculture 142 (2017) 380–387 Table 2 Extraction process of acceleration. Frame

ymin

xmin

ymax

xmax

D1/m

D2/m

D3/m

D4/m

V/m/s

a/m/s2

ad

1 2 3 4 5 6 7 8 9 10

249 251 258 263 215 231 254 255 236 240

799 762 766 781 793 761 759 757 773 740

781 779 762 775 785 779 778 779 756 764

1422 1392 1364 1367 1365 1363 1361 1367 1373 1376

0.0051 0.0177 0.0127 0.1214 0.0405 0.0582 0.0025 0.0481 0.0101 0.0531

0.0936 0.0101 0.0380 0.0304 0.0810 0.0051 0.0051 0.0405 0.0835 0.0228

0.0051 0.0430 0.0329 0.0253 0.0152 0.0025 0.0025 0.0582 0.0202 0.0430

0.0759 0.0708 0.0076 0.0051 0.0051 0.0051 0.0152 0.0152 0.0076 0.1822

0.7185 0.5667 0.3643 0.7286 0.5667 0.2834 0.1012 0.6477 0.4858 1.2043

0.6072 0.8096 1.4573 0.6477 1.1334 0.7286 2.1859 0.1619 0.7185 0.5768

Table 3
Cluster results of accelerations.

Category   Frame number   Mean (m/s²)   Percentage (%)
1          7015           0.5237        66.20
2          2141           1.1892        20.21
3           278           1.6826         2.62
4           995           2.3635         9.39
5           167           3.1764         1.58

The mean accelerations of category 2 and category 4 were 1.1892 m/s² and 2.3635 m/s², respectively, which is consistent with dividing pig aggression into two categories, medium and high (Oczak et al., 2014). Thus, 1.1892 m/s² was set as the acceleration threshold of medium aggression (a1) and 2.3635 m/s² as the acceleration threshold of high aggression (a2). The acceleration thresholds of the 10 pig groups were extracted separately: the mean and standard deviation of a1 were 1.1492 m/s² and 0.1101 m/s², and those of a2 were 2.4264 m/s² and 0.1242 m/s². These results show that different swinery samples have different acceleration thresholds, which depend mainly on the individual weight of the pigs, their age in weeks and their degree of adaptation to the new group composition.

3.2. Results in validation and test sets

In the validation set (43,200 frames), 58 episodes of key frame sequences (13,754 frames) were extracted. By analyzing all 138 (= 80 + 58) episodes of key frame sequences, the maximum frame distance (11 frames) was set as the MRU. When dividing MRUs, an MRU with fewer than 11 frames often remains at the end of a key frame sequence; it was regarded as a complete MRU. Thus, these 13,754 frames were divided into 1276 MRUs. Analysis of all key frame sequences showed that, owing to the shortness of an MRU and the continuity of aggression, frames of two different categories of aggression never both exceed 3 frames within one MRU, while the number of frames with the same category of aggression is always greater than 3. Thus, q was set to 3.

Table 4 gives the number of MRUs in the recognition results on the test set. The results show that the proposed algorithm can recognize medium and high aggression among many interfering behaviors, including sleeping, drinking, feeding, exploring and playing.

The main reasons for failure to recognize medium aggression are as follows:
(1) Too low a velocity in medium aggression leads the accelerations to satisfy a ∈ [a1, a2) in fewer than 3 frames of an MRU, so the MRU is falsely recognized as non-aggression.
(2) Too high a velocity in medium aggression leads the accelerations to satisfy a ∈ [a2, +∞) in more than 3 frames of an MRU, so the MRU is falsely recognized as high aggression.
(3) Too low a velocity in high aggression leads the accelerations to satisfy a ∈ [a1, a2) in more than 3 frames of an MRU, so the MRU is falsely recognized as medium aggression.
(4) Too high a velocity in non-aggression (e.g. exploring, sniffing, or knocking by frightened pigs) leads the accelerations to satisfy a ∈ [a1, a2) in more than 3 frames of an MRU, so the MRU is falsely recognized as medium aggression.

The main reasons for failure to recognize high aggression are as follows:
(1) Too low a velocity in high aggression leads the accelerations to satisfy a ∈ [a2, +∞) or a ∈ [a1, a2) in fewer than 3 frames of an MRU, so the MRU is falsely recognized as non-aggression.
(2) Too low a velocity in high aggression leads the accelerations to satisfy a ∈ [a1, a2) in more than 3 frames of an MRU, so the MRU is falsely recognized as medium aggression.
(3) Too high a velocity in medium aggression leads the accelerations to satisfy a ∈ [a2, +∞) in more than 3 frames of an MRU, so the MRU is falsely recognized as high aggression.
(4) Too high a velocity in non-aggression (e.g. accelerated motion with adhesion caused by scrambling, environmental change, lighting switches or strangers entering) leads the accelerations to satisfy a ∈ [a2, +∞) in more than 3 frames of an MRU, so the MRU is falsely recognized as high aggression.

Fig. 4 compares the accuracy, sensitivity and specificity of recognizing high and medium aggression in the 10 pig groups. In Fig. 4(a), the mean and standard deviation of the accuracy of recognizing

Table 4
Number of true positive (TP), false negative (FN), false positive (FP) and true negative (TN) MRUs in the results of recognizing high and medium aggression among 10 groups of pigs.

Group    TP MRU           FN MRU           FP MRU           TN MRU
number   High   Medium    High   Medium    High   Medium    High    Medium
1        82     213       6      25        30     25        1158    1013
2        85     187       8      19        36     34        1125    1014
3        73     202       5      26        32     31        1147    998
4        77     195       6      17        27     29        1131    1000
5        84     188       7      16        31     38        1132    1012
6        75     195       4      15        38     35        1134    1006
7        82     201       7      28        28     30        1132    990
8        76     196       6      25        25     28        1133    991
9        84     192       9      14        30     37        1129    1009
10       70     201       6      22        29     29        1135    988
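The MRU-level recognition rule described above can be sketched as follows: within each MRU, count the frames whose acceleration falls in [a1, a2) and in [a2, +∞), and label the MRU by whichever band exceeds q = 3 frames. This is a reconstruction from the rule description in the text, not the authors' code; the function name is ours, and the threshold values are the group-level ones from Table 3.

```python
A1 = 1.1892   # acceleration threshold of medium aggression a1 (m/s², Table 3)
A2 = 2.3635   # acceleration threshold of high aggression a2 (m/s², Table 3)
Q = 3         # minimum number of frames required to assert a category

def classify_mru(accelerations, a1=A1, a2=A2, q=Q):
    """Label one MRU from its per-frame accelerations (m/s²)."""
    medium_frames = sum(1 for a in accelerations if a1 <= a < a2)
    high_frames = sum(1 for a in accelerations if a >= a2)
    if high_frames > q:
        return "high aggression"
    if medium_frames > q:
        return "medium aggression"
    return "non-aggression"

print(classify_mru([0.4, 2.5, 2.6, 2.7, 2.9, 0.5, 0.3]))   # high aggression
print(classify_mru([1.2, 1.3, 1.5, 1.8, 0.6, 0.7, 0.4]))   # medium aggression
print(classify_mru([0.2, 0.3, 0.5, 1.2, 0.4, 0.6, 0.8]))   # non-aggression
```

Checking the high band first is consistent with the observation above that frames of two different aggression categories never both exceed q within one MRU.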


Fig. 4. Comparison of the evaluation parameters in the recognition results: (a) histogram of the evaluation parameters for recognizing high aggression, and (b) histogram of the evaluation parameters for recognizing medium aggression.

high aggression were 97.04% and 0.0031, the mean and standard deviation of its sensitivity were 92.54% and 0.0125, and the mean and standard deviation of its specificity were 97.38% and 0.0033. In Fig. 4(b), the mean and standard deviation of the accuracy of recognizing medium aggression were 95.82% and 0.0028, those of its sensitivity were 90.57% and 0.0192, and those of its specificity were 96.95% and 0.0038. The means of these three evaluation parameters are high across all 10 pig groups, which verifies the effectiveness of the proposed algorithm, and their standard deviations are all very small, which indicates a low dispersion of the parameters about their means and thus the robustness of the algorithm.
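The evaluation parameters above follow the standard confusion-matrix definitions, computed per group from the MRU counts in Table 4. A minimal sketch (the function name is ours), using the group 1 counts for high aggression:

```python
def evaluate(tp, fn, fp, tn):
    """Per-group evaluation parameters from MRU-level confusion counts."""
    total = tp + fn + fp + tn
    accuracy = (tp + tn) / total      # fraction of correctly classified MRUs
    sensitivity = tp / (tp + fn)      # fraction of aggressive MRUs detected
    specificity = tn / (tn + fp)      # fraction of non-aggressive MRUs rejected
    return accuracy, sensitivity, specificity

# Group 1, high aggression (Table 4): TP=82, FN=6, FP=30, TN=1158 (sums to 1276 MRUs)
acc, sen, spe = evaluate(82, 6, 30, 1158)
print(f"accuracy={acc:.4f}, sensitivity={sen:.4f}, specificity={spe:.4f}")
# accuracy=0.9718, sensitivity=0.9318, specificity=0.9747
```

Each group's 1276 MRUs yield separate confusion counts for high and for medium aggression, which is why Table 4 reports eight columns per group.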

4. Discussion

The computer vision-based algorithm proposed in this paper was used to recognize pigs' high and medium aggression continuously and automatically. Compared with previous studies, the innovations and advantages of this paper are as follows.

In the aspect of target tracking, previous studies took all moving pig individuals as the investigated objects (Oczak et al., 2014; Viazzi et al., 2014). In this paper, only the aggressive individuals are continuously extracted from the pig group, so the feature extracted from the aggressive pigs alone is more accurate; moreover, the amount of feature data is reduced, and the computational cost of the algorithm is therefore lower.

In the aspect of data optimization, the raw data were not screened in the previous literature, so training took longer (Oczak et al., 2014). In this paper, key frame technology is used to extract all episodes containing aggression and to remove as many episodes without aggression as possible. This reduces the interference from frames without aggression, the amount of data and the running time of the algorithm.

In the aspect of feature extraction, the acceleration of the circumscribed rectangle of the aggressive pigs between adjacent frames is extracted to recognize high and medium aggression. Since accelerated motion also appears in the aggressive or abnormal behaviors of many other animals, the acceleration feature is broadly applicable and can be used in research on other animals' behavior.

In the aspect of feature classification, previous studies classified features with abstract mathematical models such as linear discriminant analysis and neural networks (Viazzi et al., 2014; Oczak et al., 2014). In this paper, the acceleration is compared with its thresholds to classify the behavior. The resulting recognition rules are closer to the actual process of aggression, and the thresholds are extracted from actual acceleration values, so the method offers higher reliability and accuracy.

In the aspect of the MRU setting, a previous study used an MRU with a minimum of 7 s for recognizing aggression (Oczak et al., 2014); when the MRU is shorter than 7 s, the accuracy of recognition may fall because the details of each frame within those 7 s are not represented. The MRU used in this paper has a minimum of 2.75 s, which meets the requirements of short-time detection while presenting the acceleration detail of each frame within those 2.75 s.

In the aspect of potential application, the occurrence frequency of each category of aggression in a fixed period can be counted to study the pattern of pig aggression. Based on information such as the combination of aggressive pigs, the occurrence and duration of aggression and the number of frames with aggression, a comprehensive evaluation model of the algorithm, an estimation model of damage and a prediction model of aggression could be established to protect the health and welfare of pigs and the economic benefit of pig farms.

5. Conclusion

A computer vision-based method was used to automatically recognize aggressive behaviors among group-housed pigs. Connected area and adhesion index were used to locate aggressive pigs and to extract key frame sequences. The two aggressive pigs were regarded as an entirety to extract the acceleration of this entirety


between adjacent frames. The results show that the acceleration feature can be used to recognize pigs' high and medium aggression with high accuracy, sensitivity and specificity. The MRU was shortened to 2.75 s while still presenting the acceleration detail of each frame. In the future, counting the occurrence frequency of each category of aggression could be used to explore the pattern of aggression, to assess the level of damage and to determine when intervention against harmful behaviors is needed.

Acknowledgements

This work was part of a project funded by the "National Natural Science Foundation of China" (grant number: 31172243), the "Doctoral Program of the Ministry of Education of China" (grant number: 2010322711007), the "Priority Academic Program Development of Jiangsu Higher Education Institutions" and the "Graduate Student Scientific Research Innovation Projects of Jiangsu Ordinary University" (grant number: CXLX13_664).

References

Bracke, M.B., Metz, J.H., Spruijt, B.M., Schouten, W.G., 2002. Decision support system for overall welfare assessment in pregnant sows B: validation by expert opinion. J. Anim. Sci. 80 (7), 1835–1845.
D'Eath, R.B., Turner, S.P., 2009. The natural behaviour of the pig. In: Marchant-Forde, J.N. (Ed.), The Welfare of Pigs. Springer Science+Business Media B.V., Dordrecht, p. 13.
Erhard, H.W., Mendl, M., Ashley, D.D., 1997. Individual aggressiveness of pigs can be measured and used to reduce aggression after mixing. Appl. Anim. Behav. Sci. 54 (2), 137–151.
Faucitano, L., 2001. Causes of skin damage to pig carcasses. Can. J. Anim. Sci. 81 (1), 39–45.
Gonzalez, R.C., Woods, R.E., 2001. Digital Image Processing. Addison-Wesley Publishing Company Limited.
Gonyou, H.W., 2001. The social behaviour of pigs. In: Keeling, L.J., Gonyou, H.W. (Eds.), Social Behaviour in Farm Animals. CABI Publ., Wallingford, U.K., pp. 147–176.
Guo, Y.Z., Zhu, W.X., Jiao, P.P., Chen, J.L., 2014. Foreground detection of group-housed pigs based on the combination of Mixture of Gaussians using prediction mechanism and threshold segmentation. Biosyst. Eng. 125 (3), 98–104.
Gronskyte, R., Clemmensen, L.H., Hviid, M.S., Kulahci, M., 2015. Pig herd monitoring and undesirable tripping and stepping prevention. Comput. Electron. Agric. 119, 51–60.
Guo, Y.Z., Zhu, W.X., Jiao, P.P., Ma, C.H., Yang, J.J., 2015. Multi-object extraction from topview group-housed pig images based on adaptive partitioning and multilevel thresholding segmentation. Biosyst. Eng. 135, 54–60.
Jensen, P., Yngvesson, J., 1998. Aggression between unacquainted pigs—sequential assessment and effects of familiarity and weight. Appl. Anim. Behav. Sci. 58 (1), 49–61.

Keeling, L.J., Gonyou, H.W., 2001. Social Behavior in Farm Animals. CAB International, Wallingford, UK, pp. 147–176.
Kongsted, A.G., 2004. Stress and fear as possible mediators of reproduction problems in group housed sows: a review. Acta Agriculturae Scandinavica 54 (2), 58–66.
Kashiha, M., Bahr, C., Haredasht, S.A., Ott, S., Moons, C.P.H., Niewold, T.A., Ödberg, F.O., Berckmans, D., 2013. The automatic monitoring of pigs water use by cameras. Comput. Electron. Agric. 90, 164–169.
Lao, F., Brown-Brandl, T., Stinn, J.P., Liu, K., Teng, G., Xin, H., 2016. Automatic recognition of lactating sow behaviors through depth image processing. Comput. Electron. Agric. 125, 56–62.
McGlone, J.J., 1985. A quantitative ethogram of aggressive and submissive behaviours in recently regrouped pigs. J. Anim. Sci. 61 (3), 559–565.
Nasirahmadi, A., Hensel, O., Edwards, S.A., Sturm, B., 2016. Automatic detection of mounting behaviours among pigs using image analysis. Comput. Electron. Agric. 124, 295–302.
Oczak, M., Viazzi, S., Ismayilova, G., Sonoda, L.T., Roulston, N., Fels, M., Bahr, C., Hartung, J., Guarino, M., Berckmans, D., Vranken, E., 2014. Classification of aggressive behaviour in pigs by activity index and multilayer feed forward neural network. Biosyst. Eng. 119 (4), 89–97.
O'Connell, N.E., Beattie, V.E., Watt, D., 2005. Influence of regrouping strategy on performance, behaviour and carcass parameters in pigs. Livestock Prod. Sci. 97 (2), 107–115.
Stookey, J.M., Gonyou, H.W., 1994. The effects of regrouping on behavioral and production parameters in finishing swine. J. Anim. Sci. 72 (11), 2804–2811.
Spoolder, H.A.M., Edwards, S.A., Corning, S., 2000. Aggression among finishing pigs following mixing in kennelled and unkennelled accommodation. Livestock Prod. Sci. 63 (2), 121–129.
Shao, B., Xin, H.W., 2008. A real-time computer vision assessment and control of thermal comfort for group-housed pigs. Comput. Electron. Agric. 62 (1), 15–21.
Stukenborg, A., Traulsen, I., Puppe, B., Presuhn, U., Krieter, J., 2012. Agonistic behaviour after mixing in pigs under commercial farm conditions. Appl. Anim. Behav. Sci. 129 (1), 28–35.
Turner, S.P., Farnworth, M.J., White, I.M.S., Brotherstone, S., Mendl, M., Knap, P., Penny, P., Lawrence, A.B., 2006. The accumulation of skin lesions and their use as a predictor of individual aggressiveness in pigs. Appl. Anim. Behav. Sci. 96 (3), 245–259.
Turner, S.P., Roehe, R., D'Eath, R.B., Ison, S.H., Farish, M., Jack, M.C., Lundeheim, N., Rydhmer, L., Lawrence, A.B., 2009. Genetic validation of postmixing skin injuries in pigs as an indicator of aggressiveness and the relationship with injuries under more stable social conditions. J. Anim. Sci. 87 (10), 3076–3082.
Tsai, D.M., Huang, C.Y., 2014. A motion and image analysis method for automatic detection of estrus and mating behavior in cattle. Comput. Electron. Agric. 104, 25–31.
Viazzi, S., Ismayilova, G., Oczak, M., Sonoda, L.T., Fels, M., Guarino, M., Vranken, E., Hartung, J., Bahr, C., Berckmans, D., 2014. Image feature extraction for classification of aggressive interactions among pigs. Comput. Electron. Agric. 104 (2), 57–62.
Wang, Y.X., Sun, S.X., Ding, X.M., 2015. A self-adaptive weighted affinity propagation clustering for key frames extraction on human action recognition. J. Vis. Commun. Image Represent. 33, 193–202.
Xu, J., Wang, G.Y., Deng, W.H., 2016. DenPEHC: density peak based efficient hierarchical clustering. Inf. Sci. 373, 200–218.