1. Introduction
As living standards rise, the demand for poultry and eggs keeps growing. To meet this expanding market, cage-rearing technology for ducks, which can greatly improve production efficiency, has developed rapidly. Laying ducks were traditionally raised by pond stocking and free-range feeding. However, with growing attention to environmental quality and increasingly strict requirements on the quality of livestock and poultry products, the traditional mode causes many problems, such as pollution of the water bodies the ducks use, large land occupation, and extensive management, and it can no longer meet social requirements. Researchers' observations and experiments have shown that cage-rearing of laying ducks can overcome these problems [1,2,3]. It also improves feed utilization, shields the ducks from external environmental interference, supports year-round laying regardless of season, reduces egg damage and contamination, and makes egg production easier to record. Adopted by large-scale breeding enterprises, it offers high production efficiency and can help meet the large market demand of China's large population.
Although cage-rearing laying ducks indoors has significant technical advantages, some problems remain. In practice, trampling and wing spreading often occur in cages. When ducks are frightened or agitated, they show a stress response. In addition, because the cage space is narrow, they compete for space, trample, and dodge each other, and trampled ducks are injured, which affects their health. Discomfort from wounds also makes self-pecking and mutual pecking more frequent. Living under these conditions for a long time damages the feathers (as shown in Figure 1), which affects carcass quality after plucking.
Precision livestock farming (PLF) uses modern information technology to monitor individual animals or groups in real time, enabling precise management and optimizing the production performance of livestock and poultry [4]. Beyond collecting information, PLF mines it for what it reveals about animal health and adaptability to the rearing environment, providing low-cost, high-precision solutions for early disease warning and feedback regulation of the rearing environment. To observe and study the daily living state of poultry more conveniently, wearable devices have been introduced to transmit data. For example, RFID (radio frequency identification) sensors have been used to record the identity, weight, and daily exercise of chickens [5], and a wearable inertial measurement unit (IMU) [6] can detect an animal's amount of exercise, posture duration, and abnormal behavior to assess its health status.
As technology has developed, monitoring equipment has come to play an important role on farms. At first, surveillance video was mainly reviewed manually, which was inefficient, depended on the experience of staff, and lacked a common standard. With advances in image processing, machine vision has become widely used in animal behavior recognition. A common approach fits an ellipse model to the chicken's body [7] to distinguish some behavioral characteristics. In another approach, images are processed with clustering and dispersion descriptors and associated with the air temperature [8], so that grouping and dispersion behavior serve as an index of thermal comfort. Combined with infrared thermal imaging, the number of colored pixels in the temperature image can be counted to detect whether hens are present in a target area [9]. Infrared sensors can also be used to monitor the laying performance of hens in free-range systems [10]. Similarly, the scattered images can be divided by different temperatures and similar times [11].
The combination of machine learning and image processing has been widely used in agricultural production [12,13] and in the analysis of various animals, such as pigs [14,15], cattle [16], poultry, and insects [17]. In earlier work based on the support vector machine model, healthy and sick chickens were classified by extracting head features [18] and body features [19]. With the rise of deep learning, the weight of broilers has been predicted from depth images with a Bayesian artificial neural network [20], and health status has been analyzed from posture and movement characteristics [21]. Yolo models are also widely used for poultry target detection. For example, a YoloV3 model was used to detect six behaviors of laying hens in images [22] and to evaluate whether their behavior was abnormal from the number of fights. A YoloV4 target detection model was used to identify several daily behaviors of chickens from 4 months of continuous monitoring images, with a mean average precision (mAP) of 79.69%. To address the low recall for feather pecking behavior, a processing algorithm based on time series was proposed [23].
Research on behavior recognition and health detection of chickens is now fairly extensive and the technology is relatively mature. Related research on cage-reared waterfowl, however, is still insufficient. If they live in an uncomfortable environment, laying ducks lose their appetite, grow slowly, and have increased mortality. To improve their survival rate and productivity, especially in large-scale, high-density breeding, it is important to pay attention to the living conditions of the animals. Studying the behavior identification of cage-reared ducks with machine vision and deep learning helps us better understand their living habits. It provides theoretical support for the standardized production and technical promotion of cage-reared laying ducks and a basis for the later establishment of automated, intelligent duck houses.
2. Materials and Methods
2.1. Experimental Environment and Equipment
This experiment was carried out on a welfare laying duck farm in Hubei Province (ethical approval ID number: HZAUDU-2020-001). The laying ducks in the experiment lived in four-tier H-type stacked duck cages. Each cage measured 300 mm × 480 mm × 420 mm (length × width × height). Two ducks were raised in each cage (the average living space of each duck was 6.48 × 10⁷ mm³), and every four cages formed a group, with a total of 66 groups in each tier. There were five rows in each duck house, and about 21,120 ducks were raised in each building. The experiment site is shown in Figure 2.
The duck cages were arranged back-to-back, with water supply pipes running between the two rows of cages. Each cage had a nipple drinker, and water was supplied all day. Each cage had a push-in cage door, and a feed trough was installed in front (Figure 3). The ducks were fed three times a day, at 7:30 a.m., 11:00 a.m., and 4:00 p.m. The cage floor was inclined at 6° so that the eggs rolled automatically onto the conveyor belt, which was run at regular times every day to collect the eggs.
Laying ducks are sensitive, nervous, and quick to react. When staff passed by or stood in the duck house, the ducks usually showed a strong emergency response. To collect images of the ducks in both the emergency and non-emergency states for subsequent research, a camera was installed in the duck house. Since the duck house was lit by dim, yellow incandescent lamps, independent lighting equipment was added during image collection to obtain clearer images. The camera was installed 200 mm from the cage net, with the lens on the center line of the duck cage at an inclination of 5° (Figure 4). Video was recorded of the behavior of 264 groups, 528 laying ducks in total, to ensure that the method could adapt to the behavior of different ducks. The shooting time was from 7:30 a.m. to 7:30 p.m.
2.2. Data Collection and Labeling
We used the PotPlayer software to capture one frame every 100 ms from the video segments showing the behaviors of interest, generating a total of 5560 pictures. The training set and validation set samples were labeled with the software 'labelimg'. Three behaviors were labeled: neck extension, trample, and wing spread. The criteria for judging laying duck behavior are shown in Table 1.
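The 100 ms sampling step can be illustrated with a short sketch. The paper used PotPlayer for this, so the function below is not the authors' tool; the function name, the 25 fps frame rate, and the segment times are hypothetical values chosen only to show how a sampling interval maps to frame indices.

```python
def sample_frame_indices(start_s, end_s, fps, interval_ms=100):
    """Return the video frame indices sampled every interval_ms in [start_s, end_s)."""
    indices = []
    t_ms = int(round(start_s * 1000))
    end_ms = int(round(end_s * 1000))
    while t_ms < end_ms:
        # Integer arithmetic: index of the frame being displayed at time t_ms.
        indices.append(t_ms * int(fps) // 1000)
        t_ms += interval_ms
    return indices


# Hypothetical example: a 1 s behavior segment starting at 12 s in a 25 fps video.
frames = sample_frame_indices(12.0, 13.0, fps=25)
print(len(frames), frames[:3])
```

At 100 ms per sample, each second of selected video contributes ten pictures, so the 5560 pictures correspond to roughly 556 s of selected behavior segments.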
To improve the generalization ability of the model, the original data set was preprocessed. The data augmentation methods used in this paper included mosaic augmentation, image blurring, filtering, scaling, flipping, and color gamut transformation.
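Geometric augmentations such as flipping must transform the bounding boxes together with the image. The sketch below shows this for a horizontal flip, assuming the labels are stored in the normalized YOLO text format (class, x_center, y_center, width, height); the paper does not state the label format, and the function name and sample box are hypothetical.

```python
def hflip_yolo_labels(labels):
    """Mirror YOLO-format labels horizontally.

    Each label is (class_id, x_center, y_center, width, height), with all
    coordinates normalized to [0, 1]. Only x_center changes under a flip.
    """
    return [(c, 1.0 - x, y, w, h) for (c, x, y, w, h) in labels]


# Hypothetical box: class 0 centered at x = 0.25 on the left side of the image.
labels = [(0, 0.25, 0.5, 0.2, 0.4)]
flipped = hflip_yolo_labels(labels)
print(flipped)
```

Applying the flip twice returns the original labels, which is a convenient sanity check for any augmentation pipeline.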
2.3. YoloV5 Network Structure
In this work, the YoloV5 network was used to identify three emergency behaviors of cage-reared laying ducks. YoloV5 consists of a focus structure, CSPDarknet, and a PANet network (as shown in Figure 5).
Compared with YoloV3 and YoloV4, YoloV5 adds the focus structure to the backbone network. Its key part is the slicing operation, whose principle is shown in Figure 6. Slicing a 640 × 640 × 3 input image yields a 320 × 320 × 12 tensor, and the following 3 × 3 convolution turns it into a 320 × 320 × 32 feature map. This method reduces the information loss caused by down-sampling, although the amount of calculation and the number of parameters increase. As in YoloV4, the backbone network of YoloV5 uses CSPDarknet to extract image features for the later networks [24]. CSPDarknet reduces the parameters, calculation, and size of the model while maintaining operation speed and accuracy.
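The slicing operation can be sketched in plain Python on nested lists. This is an illustration and not the YoloV5 source code: the function name and the order in which the four sub-images are stacked are assumptions, and the 3 × 3 convolution that follows the slicing is omitted.

```python
def focus_slice(image):
    """Rearrange an H x W x C image (nested lists) into (H/2) x (W/2) x (4C).

    Every 2 x 2 pixel block is moved onto the channel axis, so the spatial
    size halves without discarding any pixel value. H and W must be even.
    """
    h, w = len(image), len(image[0])
    out = []
    for i in range(0, h, 2):
        row = []
        for j in range(0, w, 2):
            # Assumed stacking order: top-left, bottom-left, top-right, bottom-right.
            row.append(image[i][j] + image[i + 1][j]
                       + image[i][j + 1] + image[i + 1][j + 1])
        out.append(row)
    return out


# Toy 4 x 4 single-channel image whose pixel value encodes its position (10*row + col).
image = [[[10 * i + j] for j in range(4)] for i in range(4)]
sliced = focus_slice(image)
print(len(sliced), len(sliced[0]), len(sliced[0][0]))  # spatial size halves, channels x4
```

With a 640 × 640 × 3 input, the same rearrangement gives 320 × 320 × 12, since each of the 3 channels contributes 4 sub-images.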
The SPP (Spatial Pyramid Pooling) network [25] is used to increase the receptive field of the network. It pools at three different scales, 5 × 5, 9 × 9, and 13 × 13, to obtain richer features. As shown in Figure 7, we pooled the feature maps and combined the results to obtain a fixed-length output. The SPP network is not sensitive to the aspect ratio and size of the input image, so it improves scale invariance and reduces overfitting.
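A minimal sketch of the multi-scale pooling follows, assuming the variant commonly used in Yolo-style networks: stride-1 max pooling with padding, so that every branch keeps the input size and the branches can be concatenated with the input. The function names and the toy feature map are hypothetical, and a single channel is used for brevity.

```python
def max_pool_same(fmap, k):
    """Stride-1 max pooling over a k x k window; borders are clipped so the
    output has the same height and width as the input."""
    h, w, r = len(fmap), len(fmap[0]), k // 2
    return [[max(fmap[y][x]
                 for y in range(max(0, i - r), min(h, i + r + 1))
                 for x in range(max(0, j - r), min(w, j + r + 1)))
             for j in range(w)]
            for i in range(h)]


def spp(fmap, kernels=(5, 9, 13)):
    """Return the input map plus one pooled map per kernel size."""
    return [fmap] + [max_pool_same(fmap, k) for k in kernels]


# Toy 20 x 20 feature map with a single activation in the middle.
fmap = [[0.0] * 20 for _ in range(20)]
fmap[10][10] = 1.0
branches = spp(fmap)
print([int(sum(map(sum, b))) for b in branches])
```

Each larger kernel spreads the single activation over a wider neighborhood, which is how the block enlarges the receptive field without changing the spatial resolution.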
PANet (Path Aggregation Network) (as shown in Figure 8) is then used as the neck network because it preserves spatial information accurately. This network helps to locate pixels correctly and form a mask, making better use of the extracted features [26]. As the image passes through successive layers of the neural network, the complexity of the features increases and the spatial resolution decreases, so a pixel-level mask cannot be accurately recognized from high-level features alone. FPN (Feature Pyramid Networks) uses a top-down path to extract semantically rich features and combine them with accurate location information. When PANet is used in YoloV5, adjacent layers are concatenated rather than added, which improves the accuracy of prediction.
The head part uses the head network of YoloV3 to make predictions from the extracted features [27]. As shown in Figure 9, in the YoloV3 head a convolution with kernel_size = 3 integrates the features, while a convolution with kernel_size = 1 transforms the integrated features into prediction results and, finally, completes target detection.
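The size of the prediction tensors produced by such a head can be worked out with a short sketch. It assumes the usual Yolo conventions, which the paper does not state explicitly: three anchors per scale, strides of 8, 16, and 32, and five box terms (x, y, w, h, objectness) per anchor in addition to the class scores. The function name is hypothetical.

```python
def head_output_shapes(input_size=640, num_classes=3,
                       anchors_per_scale=3, strides=(8, 16, 32)):
    """Return (height, width, channels) of the prediction map at each scale."""
    # Each anchor predicts 4 box coordinates, 1 objectness score, and class scores.
    channels = anchors_per_scale * (5 + num_classes)
    return [(input_size // s, input_size // s, channels) for s in strides]


# Three duck behaviors (extension, trample, spread) on a 640 x 640 input.
shapes = head_output_shapes()
print(shapes)
```

Under these assumptions, the final 1 × 1 convolution at each scale outputs 24 channels for the three behavior classes.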
2.4. Data Set Training
The behavior detection data set of cage-reared ducks was divided into three categories: stretching the neck out of the cage, trampling, and wing spreading. To simplify labeling, the names were shortened to extension, trample, and spread. Each category included ducks of different appearance, different lighting conditions, and different camera positions. The data set was randomly shuffled and divided into a training set and a validation set at a ratio of 9:1, and each photo contained at least one behavior.
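The shuffle and 9:1 split can be sketched as follows. The file names, the seed, and the function name are hypothetical, and the example assumes that all 5560 pictures enter the split.

```python
import random


def split_dataset(items, train_ratio=0.9, seed=0):
    """Shuffle the items reproducibly and split them into (train, validation)."""
    items = list(items)
    random.Random(seed).shuffle(items)
    n_train = round(len(items) * train_ratio)
    return items[:n_train], items[n_train:]


# Hypothetical file names standing in for the 5560 labeled pictures.
names = [f"duck_{i:04d}.jpg" for i in range(5560)]
train, val = split_dataset(names)
print(len(train), len(val))
```

Fixing the seed keeps the split reproducible, so that models trained later are validated on the same held-out pictures.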
The experiment was implemented with the deep learning framework PyTorch on a Dell T5810 tower graphics workstation running the Windows 10 Professional operating system, with an Intel Xeon W-2145 processor at 3.70 GHz, 12 GB of memory, and an NVIDIA Tesla K80 graphics card.
Training ran for 100 epochs. For the first 50 epochs, the batch size was 8 and the learning rate was 0.001. From epoch 50 to epoch 100, the batch size was reduced to 4 and the learning rate was adjusted to 0.0001. The training loss curve of the model is shown in Figure 10. The loss curve converged well, which indicates that the training was effective.
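The two-stage schedule can be written down as a small lookup, shown here as a sketch. The function name is hypothetical, epochs are counted from 0, and the training set size of 5004 pictures is an assumption that follows from applying the 9:1 split to all 5560 pictures.

```python
import math


def hyperparams_for_epoch(epoch):
    """Return (batch_size, learning_rate) for a 0-based epoch index."""
    if not 0 <= epoch < 100:
        raise ValueError("epoch must be in [0, 100)")
    return (8, 1e-3) if epoch < 50 else (4, 1e-4)


# Assumed training set size: 90% of the 5560 pictures.
n_train = 5004
for epoch in (0, 49, 50, 99):
    batch_size, lr = hyperparams_for_epoch(epoch)
    steps = math.ceil(n_train / batch_size)
    print(epoch, batch_size, lr, steps)
```

Halving the batch size doubles the number of update steps per epoch, while the tenfold smaller learning rate makes each of those steps more conservative in the second stage.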
3. Results and Discussion
To evaluate how well the YoloV5 model detected the behavior of cage-reared ducks, this work chose precision, recall, and F1 score as indicators, calculated as follows:
\[ \mathrm{Precision} = \frac{TP}{TP + FP} \]
\[ \mathrm{Recall} = \frac{TP}{TP + FN} \]
\[ F1 = \frac{2 \times \mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}} \]
where TP is the number of behaviors in the tested pictures that were correctly recognized, FP is the number of irrelevant behaviors that were misrecognized as one of the classified behaviors, and FN is the number of behaviors that were misrecognized as another behavior. We calculated the precision, recall, and F1 score (see Table 2).
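The three indicators can be computed with a few lines of code. In the sketch below, the function names and the detection counts are hypothetical; the precision and recall pairs in the second part are the per-behavior values reported in the Conclusions of this paper.

```python
def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall, and F1 score from detection counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


def f1_from(precision, recall):
    """F1 score as the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts: 90 correct detections, 10 false alarms, 10 misses.
p, r, f1 = precision_recall_f1(tp=90, fp=10, fn=10)

# Precision and recall reported for extension, trample, and spread.
reported = {"extension": (0.9789, 0.9588),
            "trample": (0.9592, 0.9792),
            "spread": (0.9886, 0.9560)}
f1_scores = {name: round(f1_from(*pr), 2) for name, pr in reported.items()}
print(f1_scores)
```

For all three behaviors the harmonic mean rounds to 0.97, which matches the F1 score reported in the paper.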
The average precision was 98.2% for neck extension, 98.5% for trampling, and 98.6% for wing spread, giving a mean of 98.4% over the three behaviors. Table 2 shows that the recognition performance for the three behavior classes meets the requirements of detection accuracy. This provides a technical reference for the future deployment of caged duck behavior detection models on small mobile terminals and duck house inspection robots. Part of the behavior detection results are shown in Figure 11. The green box represents wing spreading behavior, the red box represents neck stretching behavior, and the blue box represents trampling behavior; the confidence follows the class name on each prediction box. In Figure 11, (a), (b), and (c) are the test results under normal light, (d), (e), and (f) are the test results under darker light, and (g), (h), and (i) are the results under bright light. Figure 11 shows that the behaviors of cage-reared ducks can be accurately framed under different light conditions.
3.1. Performance Comparison of Different Target Detection Algorithms
Target detection algorithms are developing very rapidly, and many detection methods are available. In this work, the Faster-RCNN and YoloV4 algorithms were selected for performance comparison. Their training steps were similar to those of YoloV5. For each algorithm, the model with the best training result was applied to detect the behavior of cage-reared ducks. The detection results are shown in Table 3 and Figure 12.
In addition, the test time of the three detection models was measured, as shown in Figure 13. The average speed of each method was expressed in FPS (frames per second). YoloV5 was the fastest (20.7 FPS) and can almost meet the requirement of real-time detection. The other two target detectors, YoloV4 and Faster RCNN, reached 11.8 FPS and 3.2 FPS, respectively.
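To make the speed figures easier to compare, the sketch below converts the reported FPS values into average time per frame and into the speed of YoloV5 relative to each detector. Only the three FPS values come from the paper; the variable names are illustrative.

```python
# Reported average detection speeds in frames per second.
fps = {"YoloV5": 20.7, "YoloV4": 11.8, "Faster RCNN": 3.2}

# Average processing time per frame in milliseconds.
latency_ms = {name: 1000.0 / value for name, value in fps.items()}

# Speed of YoloV5 relative to each detector.
speedup = {name: fps["YoloV5"] / value for name, value in fps.items()}

for name in fps:
    print(f"{name}: {latency_ms[name]:.1f} ms/frame, "
          f"YoloV5 is {speedup[name]:.2f}x as fast")
```

At about 48 ms per frame, YoloV5 is roughly 1.75 times as fast as YoloV4 and about 6.5 times as fast as Faster RCNN on this hardware.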
Table 3, Figure 12, and Figure 13 show that the YoloV5 target detection algorithm had clear advantages in detection performance.
3.2. Behavioral Comparison between Calm and Emergency States
To explore how the behavior of cage-reared ducks differs between the calm state and the emergency state, video was recorded both while people were walking in the duck house and after no one had walked through it for more than 20 min. In each case, 10 duck cages were randomly selected and each cage was recorded for 3 min. The occurrence time was determined by identifying the behaviors of neck extension, trampling, and wing spreading. The recorded times are shown in Figure 14 and Figure 15.
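One way to turn per-frame detections into an occurrence count and a total duration is sketched below. The paper does not describe how the timing statistics were derived from the detections, so the bout definition (a run of consecutive frames in which the behavior is detected), the function name, and the sample data are all assumptions.

```python
def summarize_behavior(flags, fps):
    """Summarize per-frame detections of one behavior.

    flags: sequence of booleans, True where the behavior was detected.
    fps:   number of analyzed frames per second.
    Returns (number_of_bouts, total_duration_in_seconds).
    """
    bouts = 0
    previous = False
    for current in flags:
        # A new bout starts whenever detection switches from off to on.
        if current and not previous:
            bouts += 1
        previous = current
    return bouts, sum(flags) / fps


# Hypothetical detections for 1 s of video analyzed at 10 frames per second.
flags = [False, True, True, False, False, True, True, True, False, True]
count, seconds = summarize_behavior(flags, fps=10)
print(count, seconds)
```

In practice, short gaps caused by missed detections would split one behavior into several bouts, so some temporal smoothing may be needed before counting.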
When the duck house was quiet and no one was walking through it, the ducks in the cages were calmer and less active. When the ducks were frightened and showed an emergency response, they became very active in the cages, and both the number and the total time of the behaviors were much higher than in the non-emergency state. On average, the occurrence time of neck extension, trampling, and wing spreading under emergency conditions was 4, 15.6, and 21.3 times that under non-emergency conditions, and the occurrence frequency was 3.1, 24.3, and 38 times that under non-emergency conditions.
Each radar chart in Figure 16 contains 10 feature points (numbered 1–10 in Figure 16). Figure 16a–c represent the occurrence time of neck extension behavior, trampling behavior, and wing spreading behavior, respectively. The interval between adjacent scale lines in the three charts is 11.3 units, 10 units, and 10 units, respectively. The outer, light area in each radar chart represents the occurrence time of the behavior under emergency, while the inner, dark area represents the occurrence time under non-emergency. Neck stretching occurred more frequently than trampling and wing spreading, both in the emergency state and when the duck house was quiet. In this work, there was a water supply pipe at the back of the cages, which left a certain space for a duck's head and neck to pass through. A duck stretches its neck through the cage when avoiding a disturbance, when reaching toward the other side to communicate with ducks in other cages, or out of curiosity, which is why the stretching behavior occurred more frequently in the identification statistics. Without an emergency response, stampede and wing spreading behaviors were very rare, usually not occurring at all or only 1–2 times, although the stampede behavior occasionally lasted for some time without a change of posture.
3.3. Comparison of Behavior under Emergency State of Different Feeding Density
To record the target behaviors of cage-reared ducks under different feeding densities, the experiment added two control groups, one duck per cage and three ducks per cage. The average living spaces per duck in the three groups (one, two, and three ducks per cage) were 6.48 × 10⁷ mm³, 3.24 × 10⁷ mm³, and 2.16 × 10⁷ mm³, respectively. Again, 10 duck cages were randomly selected. While the ducks were in an emergency state, each cage was recorded for 3 min; the behavior results are shown in Figure 17.
Figure 17a shows the recorded behavior times when one duck was raised per cage. Since no stampede behavior can occur with a single duck in the cage, it is recorded as 0 in the histogram. As the number of laying ducks per cage increased, the total time of emergency behavior gradually increased. Stampede behavior occurred more often when three laying ducks were raised in each cage and the living space was crowded.
3.4. Correlation Analysis of Different Behaviors
Observation showed that some behaviors of the ducks were correlated. As shown in Figure 18, when two laying ducks were raised in each cage, wing spreading was often accompanied by stampede behavior and neck extension behavior; the frequency area diagram also shows the cases in which only wing spreading occurred. Wing spreading occurred under both water feeding and dry feeding. Generally, ducks spread their wings only to stretch their bodies and express pleasure [28]; however, the sector diagram shows that wing spreading alone often accounted for only about one-third of the wing spreading of cage-reared ducks. When there was abnormal noise in the duck house or people went in and out, a duck would try to avoid the disturbance by moving away from the cage door, spreading its wings to run faster. When two or more ducks were raised in one cage, they would flap their wings when they collided or trampled each other. When ducks escaped to the rear corner, they would spread their wings to try to jump higher and do their best to extend their necks out of the cage. Wing spreading therefore occurred together with trampling and neck extension. The histogram in Figure 19 shows that some of the stampede behavior arose because one duck stepped on another to raise itself when stretching its neck out of the cage. The chart shows that the frequency of one behavior affected the others. Therefore, in the emergency state, the frequency of the three behaviors described in this paper increased significantly compared with the non-emergency state, which increased the possibility of injury and was not conducive to the promotion of welfare breeding.
3.5. Discussion
To realize automatic detection of cage-reared duck behavior in duck houses, this work compared the YoloV5, YoloV4, and Faster RCNN target detection algorithms in terms of precision, recall, F1 score, and detection speed, and selected YoloV5, which performed best, to detect the neck extension, wing spread, and trampling behaviors of cage-reared ducks. The images collected in this paper usually contained 1–2 complete cages, so the targets and their behavior characteristics were clearly visible in the field of view. The deep learning target detection network performed well and can meet the needs of real-time behavior detection. Based on the YoloV5 model, this paper compared behaviors in the emergency and non-emergency states, compared behaviors in the emergency state under different feeding densities, and analyzed the correlation between different behaviors. When ducks were frightened, they became very active in the cages. As they hid at the back of the cages, neck extension, wing spread, and trampling increased significantly. Compared with raising one duck per cage, a higher number of ducks per cage made stampede behavior more likely because of crowding and emotional interaction between ducks. During avoidance, wing spread usually occurred together with running, keeping balance, and neck extension. These behaviors have a negative impact on the ducks' health and on the appearance of their feathers. This further reflects the necessity of automatic detection of cage-reared duck behavior and the importance of unmanned breeding in duck houses.
4. Conclusions
Laying ducks are raised in multi-tier cages in a duck house, and it is difficult for workers to observe the state of ducks raised on tiers that are too high or too low. Cameras make it easier to cover positions that are hard for workers to reach, which reduces work intensity and improves work efficiency. This study shows that the behavior of cage-reared ducks can be recognized and analyzed by means of machine vision and deep learning algorithms. Among YoloV5, YoloV4, and Faster RCNN, the YoloV5 model performed best. For the three behaviors of neck extension (extension), trampling (trample), and wing spreading (spread), the precisions were 97.89%, 95.92%, and 98.86%, respectively, and the recalls were 95.88%, 97.92%, and 95.60%, respectively, while the F1 score was 0.97. The detection speed for cage-reared duck behavior was 20.7 FPS, which can meet the needs of real-time detection. On this basis, this paper recorded the frequency and time of the three behaviors of cage-reared ducks in different states. The comparison showed that, in the emergency state, ducks were very active in the cages, and neck extension, trampling, and wing spreading increased significantly during avoidance. As the number of ducks per cage increased, trampling behavior increased the most, followed by extension, while the amount of wing spreading behavior did not increase significantly. Future research can focus on improving the detection speed while maintaining the existing detection accuracy, so that the method can be applied in a real-time monitoring system. In addition, a target tracking module should be added to better analyze the behavior characteristics and health status of each laying duck when several ducks are raised in a single cage.
The behavior classification method for cage-reared ducks based on YoloV5 provides a reference for follow-up behavior research on cage-rearing technology for ducks and lays a foundation for the development of intelligent breeding. However, there are still more representative behaviors worth discussing, and much more exploratory work needs to be performed. Future work will draw on more disciplines and extend to other livestock to improve the scientific rigor and accuracy of the research.