Application of Deep Learning-Based Object Detection Techniques in Fish Aquaculture: A Review
Abstract
1. Introduction
2. Datasets and Image Preprocessing
2.1. Datasets
2.1.1. Public Datasets
- Fish4Knowledge: The Fish4Knowledge dataset was supported by the European Commission and developed jointly by a project team from the University of Edinburgh, Academia Sinica, and other groups, with the main aim of assisting marine ecosystem research. It contains about 700,000 10-min underwater video clips collected over five years of monitoring Taiwan's coral reefs, which can be used for fish identification, detection, and tracking in images and videos. However, the number of samples per fish category is unbalanced (the dataset covers over 3000 fish species).
- LifeCLEF2014: LifeCLEF2014 was built on the Fish4Knowledge dataset by a project team from the Universities of Catania and Edinburgh and contains about 1000 videos covering 10 fish species. Approximately 20,000 fish in the videos were labeled with their species. However, this dataset also suffers from an unbalanced distribution of fish across species.
- LifeCLEF2015: LifeCLEF2015 was also built on the Fish4Knowledge dataset by a project team from the Universities of Catania and Edinburgh and contains 93 underwater videos covering 15 fish species. It provides about 9000 annotations (bounding boxes and species) in the videos and 20,000 images with fewer labels. Compared with LifeCLEF2014, the images and videos in LifeCLEF2015 are noisier and blurrier, with poorer illumination.
- NOAA: The NOAA dataset was developed by the National Oceanic and Atmospheric Administration (NOAA) during rockfish surveys in the Southern California Bight. It was collected with a digital camera deployed on a remotely operated vehicle (ROV) and contains 929 images with 1005 annotations (locations and bounding rectangles). The challenges of the dataset include variations in the appearance and size of fish, small particles in the water, varying swimming speeds and directions, fish hidden behind rocks or in crevices, and indistinct fish-like objects.
- NCFM: The NCFM dataset comes from the worldwide "The Nature Conservancy Fisheries Monitoring" competition hosted by Kaggle and contains about 3777 fish images taken by cameras installed on different fishing boats. Light variation, complex backgrounds, and occlusion of fish make recognition on this dataset very challenging.
- ImageNet: ImageNet was initiated by Fei-Fei Li's team at Stanford University and contains over 14 million images. It is organized according to the WordNet hierarchy, in which each node is associated with hundreds or thousands of images.
2.1.2. On-Site Datasets
2.2. Image Preprocessing
- Image size transformation: Image size transformation (such as image cropping and resizing) is the most common image preprocessing method; it reduces computation and meets the input requirements of DNN models by adjusting images of different sizes to a uniform size [47].
- Image enhancement: Blurred and low-contrast images lose detail of the target. Image enhancement strategies such as linearization, contrast-limited adaptive histogram equalization (CLAHE), Retinex, and discrete wavelet transform (DWT) can recover high-quality images from low-quality data [29,49]. In addition, DL-based image enhancement approaches have received increasing attention in the aquatic field [50,51].
- Data augmentation: Data augmentation techniques expand the number of training samples, helping avoid overfitting of DNN models to small training sets. Data augmentation methods include rotation, cropping, flipping, and CutMix [52]. In recent years, generative adversarial networks (GAN), which can generate pseudo-images from input noise, have been widely used for data augmentation [53].
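As a minimal plain-Python sketch of two of these preprocessing steps (nearest-neighbour resizing to a fixed network input size and horizontal-flip augmentation; the function names and the list-of-rows grayscale image representation are illustrative, not from a specific library):

```python
def resize_nearest(img, out_h, out_w):
    """Nearest-neighbour resize of a grayscale image stored as a list of rows."""
    in_h, in_w = len(img), len(img[0])
    return [[img[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
            for r in range(out_h)]

def hflip(img):
    """Horizontal flip, one of the simplest data augmentations."""
    return [row[::-1] for row in img]

img = [[0, 1],
       [2, 3]]
fixed = resize_nearest(img, 4, 4)  # upscale 2x2 -> 4x4 to match a model input size
aug = hflip(img)                   # augmented copy: [[1, 0], [3, 2]]
```

Production pipelines would typically use OpenCV or Pillow for these operations, and CLAHE or Retinex for enhancement, rather than hand-written loops.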
3. Typical DL-Based Object Detection Algorithms
3.1. Two-Stage Object Detection Algorithms
- R-CNN: Girshick et al. proposed the R-CNN algorithm in 2014, introducing DL into object detection. R-CNN first uses selective search to generate region proposals from the input image, then feeds each region proposal into a convolutional neural network (CNN) to extract features. Finally, it classifies the features with an SVM and refines the boxes via bounding-box regression and greedy non-maximum suppression (NMS). Although R-CNN pushed object detection into the era of DNNs, it occupies substantial computing resources. Moreover, warping each region proposal to a fixed size before feature extraction easily causes loss of image information.
- Fast R-CNN: To speed up training and reduce the consumption of computing resources, Fast R-CNN inputs the whole image into a CNN for feature extraction and introduces a region of interest (ROI) pooling layer to rescale region proposal features of different sizes. However, the selective-search-based region proposal generation mechanism remains the bottleneck restricting further improvement of detection speed.
- Faster R-CNN: Faster R-CNN enables end-to-end detection by introducing a region proposal network (RPN) to replace selective search, significantly improving the generation speed of detection bounding boxes. Faster R-CNN exceeded Fast R-CNN for fish detection, with an accuracy of 82.7% on the ImageCLEF dataset [63,64]. In addition, compared with ZF Net and CNN-M, a Faster R-CNN based on VGG-16 achieved the best fish detection results, with a mean average precision (mAP) of 82.4% on underwater images obtained from remote underwater video stations [65].
- Mask R-CNN: Mask R-CNN extends Faster R-CNN by adding an object segmentation branch parallel to the object classification and bounding box regression branches, allowing a single network to perform object detection and instance segmentation simultaneously.
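The greedy NMS step used by these detectors to remove duplicate detections can be sketched in plain Python (boxes here are illustrative `[x1, y1, x2, y2, score]` lists, and the 0.5 overlap threshold is a common but arbitrary choice):

```python
def iou(a, b):
    """Intersection-over-union of two axis-aligned boxes [x1, y1, x2, y2]."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = ((a[2] - a[0]) * (a[3] - a[1]) +
             (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union

def nms(boxes, thresh=0.5):
    """Greedy non-maximum suppression over [x1, y1, x2, y2, score] boxes."""
    boxes = sorted(boxes, key=lambda b: b[4], reverse=True)
    kept = []
    while boxes:
        best = boxes.pop(0)          # keep the highest-scoring remaining box
        kept.append(best)
        boxes = [b for b in boxes if iou(best, b) < thresh]
    return kept

dets = [[0, 0, 10, 10, 0.9], [1, 1, 10, 10, 0.8], [20, 20, 30, 30, 0.7]]
survivors = nms(dets)  # the 0.8 box overlaps the 0.9 box (IoU = 0.81) and is dropped
```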
3.2. One-Stage Object Detection Algorithms
- YOLO: Redmon et al. proposed YOLO in 2016 [58], an end-to-end one-stage DNN algorithm. It divides the input image into S × S grids and performs classification and bounding box prediction on each grid cell, directly regressing the location and class of the target from the input image. YOLO achieved a fish detection accuracy of 93% at 16.7 frames per second (FPS) on the NOAA dataset, handling noisy, dim-light, and hazy underwater images, and outperformed HOG- and SVM-classifier-based algorithms [66].
- SSD: To overcome YOLO's low detection accuracy for small objects, Liu et al. proposed the SSD algorithm in 2016 [59]. SSD combines multi-scale feature maps with the anchor mechanism of Faster R-CNN and replaces YOLO's fully connected layers with convolutional layers, maintaining detection speed while improving detection accuracy.
- YOLOV2: Although YOLO achieves real-time object detection, it suffers from many localization errors. To obtain higher detection accuracy, YOLOV2 [60] introduces several new techniques on top of YOLOV1, including batch normalization, a high-resolution classifier, bounding box priors derived from K-Means clustering, and multi-scale training.
- YOLOV3: Redmon et al. used residual networks, a feature pyramid network (FPN), and binary cross-entropy loss to upgrade YOLOV2 to YOLOV3 [61], making it suitable for objects of multiple sizes. YOLOV3 achieved a mAP of 53.92% on an underwater dataset from marine and hydrokinetic (MHK) and hydropower sites, distinguishing bubbles, debris, and fish [67].
- YOLOV4: YOLOV4 [62] applies a new backbone network and combines spatial pyramid pooling and a path aggregation network (PAN) for feature fusion, achieving higher detection performance.
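YOLO's grid assignment, in which the cell containing an object's centre is responsible for predicting it, can be sketched as follows (plain Python; S = 7 matches the original YOLO configuration, and the normalised-coordinate convention and function names are illustrative):

```python
def responsible_cell(cx, cy, s=7):
    """(row, col) of the grid cell responsible for an object whose centre
    is (cx, cy) in normalised [0, 1) image coordinates."""
    return min(int(cy * s), s - 1), min(int(cx * s), s - 1)

def cell_offsets(cx, cy, s=7):
    """YOLO regresses the centre as an (x, y) offset within its cell, in [0, 1)."""
    row, col = responsible_cell(cx, cy, s)
    return cx * s - col, cy * s - row

row, col = responsible_cell(0.5, 0.5)  # the image centre falls in cell (3, 3)
dx, dy = cell_offsets(0.5, 0.5)        # offset (0.5, 0.5) inside that cell
```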
4. Application of DL-Based Object Detection Techniques in Fish Aquaculture
4.1. Fish Counting
4.1.1. Image-Based Fish Counting
4.1.2. Video-Based Fish Counting
4.2. Fish Body Length Measurement
4.2.1. Monocular Vision-Based Fish Body Length Measurement
4.2.2. Stereo Vision-Based Fish Body Length Measurement
4.3. Individual Fish Behavior Analysis
4.3.1. Image-Based Individual Fish Behavior Analysis
4.3.2. Video-Based Individual Fish Behavior Analysis
5. Performance Evaluation Metrics
6. Challenges and Future Perspectives
7. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Lauder, G.V. Fish Locomotion: Recent Advances and New Directions. Annu. Rev. Mar. Sci. 2015, 7, 521–545. [Google Scholar] [CrossRef] [PubMed]
- Monkman, G.G.; Hyder, K.; Kaiser, M.J.; Vidal, F.P. Application of machine vision systems in aquaculture with emphasis on fish: State-of-the-art and key issues. Rev. Aquac. 2017, 9, 369–387. [Google Scholar] [CrossRef]
- FAO. The State of World Fisheries and Aquaculture 2020: Sustainability in Action; FAO: Rome, Italy, 2020; p. 244. [Google Scholar]
- Bossier, P.; Ekasari, J. Biofloc technology application in aquaculture to support sustainable development goals. Microb. Biotechnol. 2017, 10, 1012–1016. [Google Scholar] [CrossRef] [PubMed]
- Zhao, S.; Zhang, S.; Liu, J.; Wang, H.; Zhu, J.; Li, D.; Zhao, R. Application of machine learning in intelligent fish aquaculture: A review. Aquaculture 2021, 540, 736724. [Google Scholar] [CrossRef]
- Yang, L.; Liu, Y.; Yu, H.; Fang, X.; Song, L.; Li, D.; Chen, Y. Computer Vision Models in Intelligent Aquaculture with Emphasis on Fish Detection and Behavior Analysis: A Review. Arch. Comput. Methods Eng. 2020, 28, 2785–2816. [Google Scholar] [CrossRef]
- Mei, Y.; Sun, B.; Li, D. Recent advances of target tracking applications in aquaculture with emphasis on fish. Comput. Electron. Agric. 2022, 201, 107335. [Google Scholar] [CrossRef]
- Sutterlin, A.M.; Jokola, K.J.; Holte, B. Swimming Behavior of Salmonid Fish in Ocean Pens. J. Fish. Res. Board Can. 1979, 36, 948–954. [Google Scholar] [CrossRef]
- Yada, S.; Chen, H. Weighing Type Counting System for Seedling Fry. Nihon-Suisan-Gakkai-Shi 1997, 63, 178–183. [Google Scholar] [CrossRef]
- Li, D.; Hao, Y.; Duan, Y. Nonintrusive methods for biomass estimation in aquaculture with emphasis on fish: A review. Rev. Aquac. 2019, 12, 1390–1411. [Google Scholar] [CrossRef]
- An, D.; Hao, J.; Wei, Y.; Wang, Y.; Yu, X. Application of computer vision in fish intelligent feeding system—A review. Aquac. Res. 2020, 52, 423–437. [Google Scholar] [CrossRef]
- Yang, X.; Zhang, S.; Liu, J.; Gao, Q.; Dong, S.; Zhou, C. Deep learning for smart fish farming: Applications, opportunities and challenges. Rev. Aquac. 2020, 13, 66–90. [Google Scholar] [CrossRef]
- Li, D.; Du, L. Recent advances of deep learning algorithms for aquacultural machine vision systems with emphasis on fish. Artif. Intell. Rev. 2021, 55, 4077–4116. [Google Scholar] [CrossRef]
- Kutlu, Y.; Iscimen, B.; Turan, C. Multi-stage fish classification system using morphometry. Fresenius Environ. Bull. 2017, 26, 1910–1916. [Google Scholar]
- Lalabadi, H.M.; Sadeghi, M.; Mireei, S.A. Fish freshness categorization from eyes and gills color features using multi-class artificial neural network and support vector machines. Aquac. Eng. 2020, 90, 102076. [Google Scholar] [CrossRef]
- Zhao, Y.-P.; Sun, Z.-Y.; Du, H.; Bi, C.-W.; Meng, J.; Cheng, Y. A novel centerline extraction method for overlapping fish body length measurement in aquaculture images. Aquac. Eng. 2022, 99, 102302. [Google Scholar] [CrossRef]
- LeCun, Y.; Bengio, Y.; Hinton, G. Deep learning. Nature 2015, 521, 436–444. [Google Scholar] [CrossRef]
- Alzubaidi, L.; Zhang, J.; Humaidi, A.J.; Al-Dujaili, A.; Duan, Y.; Al-Shamma, O.; Santamaría, J.; Fadhel, M.A.; Al-Amidie, M.; Farhan, L. Review of deep learning: Concepts, CNN architectures, challenges, applications, future directions. J. Big Data 2021, 8, 1–74. [Google Scholar] [CrossRef]
- Zhao, Z.-Q.; Zheng, P.; Xu, S.-T.; Wu, X. Object Detection With Deep Learning: A Review. IEEE Trans. Neural Netw. Learn. Syst. 2019, 30, 3212–3232. [Google Scholar] [CrossRef]
- Ranjan, R.; Patel, V.M.; Chellappa, R. HyperFace: A Deep Multi-Task Learning Framework for Face Detection, Landmark Localization, Pose Estimation, and Gender Recognition. IEEE Trans. Pattern Anal. Mach. Intell. 2017, 41, 121–135. [Google Scholar] [CrossRef]
- Liu, W.; Hasan, I.; Liao, S. Center and Scale Prediction: Anchor-free Approach for Pedestrian and Face Detection. Pattern Recognit. 2023, 135, 109071. [Google Scholar] [CrossRef]
- Ma, J.; Shao, W.; Ye, H.; Wang, L.; Wang, H.; Zheng, Y.; Xue, X. Arbitrary-Oriented Scene Text Detection via Rotation Proposals. IEEE Trans. Multimed. 2018, 20, 3111–3122. [Google Scholar] [CrossRef]
- Xu, Y.; Fu, M.; Wang, Q.; Wang, Y.; Chen, K.; Xia, G.-S.; Bai, X. Gliding Vertex on the Horizontal Bounding Box for Multi-Oriented Object Detection. IEEE Trans. Pattern Anal. Mach. Intell. 2020, 43, 1452–1459. [Google Scholar] [CrossRef] [PubMed]
- Li, J.; Liang, X.; Shen, S.; Xu, T.; Feng, J.; Yan, S. Scale-aware Fast R-CNN for Pedestrian Detection. IEEE Trans. Multimed. 2017, 20, 985–996. [Google Scholar] [CrossRef]
- Islam, M.M.; Newaz, A.A.R.; Karimoddini, A. Pedestrian Detection for Autonomous Cars: Inference Fusion of Deep Neural Networks. IEEE Trans. Intell. Transp. Syst. 2022, 23, 23358–23368. [Google Scholar] [CrossRef]
- Wang, H.; Yu, Y.; Cai, Y.; Chen, X.; Chen, L.; Liu, Q. A Comparative Study of State-of-the-Art Deep Learning Algorithms for Vehicle Detection. IEEE Intell. Transp. Syst. Mag. 2019, 11, 82–95. [Google Scholar] [CrossRef]
- Li, G.; Ji, Z.; Qu, X. Stepwise Domain Adaptation (SDA) for Object Detection in Autonomous Vehicles Using an Adaptive CenterNet. IEEE Trans. Intell. Transp. Syst. 2022, 23, 17729–17743. [Google Scholar] [CrossRef]
- Ben Tamou, A.; Benzinou, A.; Nasreddine, K. Multi-stream fish detection in unconstrained underwater videos by the fusion of two convolutional neural network detectors. Appl. Intell. 2021, 51, 5809–5821. [Google Scholar] [CrossRef]
- Liu, T.; Li, P.; Liu, H.; Deng, X.; Liu, H.; Zhai, F. Multi-class fish stock statistics technology based on object classification and tracking algorithm. Ecol. Inform. 2021, 63, 101240. [Google Scholar] [CrossRef]
- Monkman, G.G.; Hyder, K.; Kaiser, M.J.; Vidal, F.P. Using machine vision to estimate fish length from images using regional convolutional neural networks. Methods Ecol. Evol. 2019, 10, 2045–2056. [Google Scholar] [CrossRef]
- Álvarez-Ellacuría, A.; Palmer, M.; Catalán, I.A.; Lisani, J.-L. Image-based, unsupervised estimation of fish size from commercial landings using deep learning. ICES J. Mar. Sci. 2019, 77, 1330–1339. [Google Scholar] [CrossRef]
- Hu, J.; Zhao, D.; Zhang, Y.; Zhou, C.; Chen, W. Real-time nondestructive fish behavior detecting in mixed polyculture system using deep-learning and low-cost devices. Expert Syst. Appl. 2021, 178, 115051. [Google Scholar] [CrossRef]
- Wang, H.; Zhang, S.; Zhao, S.; Wang, Q.; Li, D.; Zhao, R. Real-time detection and tracking of fish abnormal behavior based on improved YOLOV5 and SiamRPN++. Comput. Electron. Agric. 2021, 192, 106512. [Google Scholar] [CrossRef]
- Fisher, R.B.; Chen-Burger, Y.-H.; Giordano, D.; Hardman, L.; Lin, F.-P. Fish4Knowledge: Collecting and Analyzing Massive Coral Reef Fish Video Data; Springer: Berlin/Heidelberg, Germany, 2016. [Google Scholar] [CrossRef]
- Joly, A.; Goëau, H.; Glotin, H.; Spampinato, C.; Bonnet, P.; Vellinga, W.P.; Planque, R.; Rauber, A.; Fisher, R.; Müller, H. Lifeclef 2014: Multimedia life species identification challenges. In Information Access Evaluation. Multilinguality, Multimodality, and Interaction, Proceedings of the 5th International Conference of the CLEF Initiative, CLEF 2014, Sheffield, UK, 15–18 September 2014; Springer: Berlin/Heidelberg, Germany, 2014; pp. 229–249. [Google Scholar] [CrossRef]
- Joly, A.; Goëau, H.; Glotin, H.; Spampinato, C.; Bonnet, P.; Vellinga, W.-P.; Planqué, R.; Rauber, A.; Palazzo, S.; Fisher, B. LifeCLEF 2015: Multimedia life species identification challenges. In Experimental IR Meets Multilinguality, Multimodality, and Interaction, Proceedings of the 6th International Conference of the CLEF Association, CLEF’15, Toulouse, France, 8–11 September 2015; Springer: Berlin/Heidelberg, Germany, 2015; pp. 462–483. [Google Scholar] [CrossRef]
- Cutter, G.; Stierhoff, K.; Zeng, J. Automated detection of rockfish in unconstrained underwater videos using haar cascades and a new image dataset: Labeled fishes in the wild. In Proceedings of the 2015 IEEE Winter Applications and Computer Vision Workshops, Waikoloa, HI, USA, 6–9 January 2015; pp. 57–62. [Google Scholar]
- Ali-Gombe, A.; Elyan, E.; Jayne, C. Fish classification in context of noisy images. In Engineering Applications of Neural Networks, Proceedings of the 18th International Conference, EANN 2017, Athens, Greece, 25–27 August 2017; Springer International Publishing: Cham, Switzerland, 2017; pp. 216–226. [Google Scholar] [CrossRef]
- Deng, J.; Dong, W.; Socher, R.; Li, L.-J.; Li, K.; Li, F.-F. ImageNet: A large-scale hierarchical image database. In Proceedings of the 2009 IEEE Conference on Computer Vision and Pattern Recognition, Miami, FL, USA, 20–25 June 2009; pp. 248–255. [CrossRef]
- Li, Q.; Li, Y.; Niu, J. Real-time detection of underwater fish based on improved Yolo and transfer learning. Pattern Recognit. Artif. Intell. 2019, 32, 193–203. [Google Scholar] [CrossRef]
- Arvind, C.; Prajwal, R.; Bhat, P.N.; Sreedevi, A.; Prabhudeva, K. Fish detection and tracking in pisciculture environment using deep instance segmentation. In Proceedings of the TENCON 2019—2019 IEEE Region 10 Conference (TENCON), Kochi, India, 17–20 October 2019; pp. 778–783. [Google Scholar] [CrossRef]
- Costa, C.; Scardi, M.; Vitalini, V.; Cataudella, S. A dual camera system for counting and sizing Northern Bluefin Tuna (Thunnus thynnus; Linnaeus, 1758) stock, during transfer to aquaculture cages, with a semi automatic Artificial Neural Network tool. Aquaculture 2009, 291, 161–167. [Google Scholar] [CrossRef]
- Petritoli, E.; Cagnetti, M.; Leccese, F. Simulation of Autonomous Underwater Vehicles (AUVs) Swarm Diffusion. Sensors 2020, 20, 4950. [Google Scholar] [CrossRef]
- Wu, Y.; Duan, Y.; Wei, Y.; An, D.; Liu, J. Application of intelligent and unmanned equipment in aquaculture: A review. Comput. Electron. Agric. 2022, 199, 107201. [Google Scholar] [CrossRef]
- Zhou, C.; Zhang, B.; Lin, K.; Xu, D.; Chen, C.; Yang, X.; Sun, C. Near-infrared imaging to quantify the feeding behavior of fish in aquaculture. Comput. Electron. Agric. 2017, 135, 233–241. [Google Scholar] [CrossRef]
- Lin, K.; Zhou, C.; Xu, D.; Guo, Q.; Yang, X.; Sun, C. Three-dimensional location of target fish by monocular infrared imaging sensor based on a L–z correlation model. Infrared Phys. Technol. 2018, 88, 106–113. [Google Scholar] [CrossRef]
- Cai, K.; Miao, X.; Wang, W.; Pang, H.; Liu, Y.; Song, J. A modified YOLOv3 model for fish detection based on MobileNetv1 as backbone. Aquac. Eng. 2020, 91, 102117. [Google Scholar] [CrossRef]
- Salman, A.; Jalal, A.; Shafait, F.; Mian, A.; Shortis, M.; Seager, J.; Harvey, E. Fish species classification in unconstrained underwater environments based on deep learning. Limnol. Oceanogr. Methods 2016, 14, 570–585. [Google Scholar] [CrossRef]
- Garcia, R.; Prados, R.; Quintana, J.; Tempelaar, A.; Gracias, N.; Rosen, S.; Vågstøl, H.; Løvall, K. Automatic segmentation of fish using deep learning with application to fish size measurement. ICES J. Mar. Sci. 2019, 77, 1354–1366. [Google Scholar] [CrossRef]
- Zhou, W.-H.; Zhu, D.-M.; Shi, M.; Li, Z.-X.; Duan, M.; Wang, Z.-Q.; Zhao, G.-L.; Zheng, C.-D. Deep images enhancement for turbid underwater images based on unsupervised learning. Comput. Electron. Agric. 2022, 202, 107372. [Google Scholar] [CrossRef]
- Ranjan, R.; Sharrer, K.; Tsukuda, S.; Good, C. Effects of image data quality on a convolutional neural network trained in-tank fish detection model for recirculating aquaculture systems. Comput. Electron. Agric. 2023, 205, 107644. [Google Scholar] [CrossRef]
- Hu, X.; Liu, Y.; Zhao, Z.; Liu, J.; Yang, X.; Sun, C.; Chen, S.; Li, B.; Zhou, C. Real-time detection of uneaten feed pellets in underwater images for aquaculture using an improved YOLO-V4 network. Comput. Electron. Agric. 2021, 185, 106135. [Google Scholar] [CrossRef]
- Lu, Y.; Chen, D.; Olaniyi, E.; Huang, Y. Generative adversarial networks (GANs) for image augmentation in agriculture: A systematic review. Comput. Electron. Agric. 2022, 200, 107208. [Google Scholar] [CrossRef]
- Girshick, R.; Donahue, J.; Darrell, T.; Malik, J. Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Columbus, OH, USA, 23–28 June 2014; pp. 580–587. [Google Scholar] [CrossRef]
- Girshick, R. Fast R-CNN. In Proceedings of the 2015 IEEE International Conference on Computer Vision (ICCV), Santiago, Chile, 7–13 December 2015; pp. 1440–1448. [Google Scholar] [CrossRef]
- Ren, S.; He, K.; Girshick, R.; Sun, J. Faster R-CNN: Towards real-time object detection with region proposal networks. Adv. Neural Inf. Process. Syst. 2015, 28, 1137–1149. [Google Scholar] [CrossRef]
- He, K.; Gkioxari, G.; Dollár, P.; Girshick, R. Mask R-CNN. In Proceedings of the 2017 IEEE International Conference on Computer Vision (ICCV), Venice, Italy, 22–29 October 2017; pp. 2961–2969. [Google Scholar] [CrossRef]
- Redmon, J.; Divvala, S.; Girshick, R.; Farhadi, A. You Only Look Once: Unified, Real-Time Object Detection. In Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016; pp. 779–788. [Google Scholar] [CrossRef]
- Liu, W.; Anguelov, D.; Erhan, D.; Szegedy, C.; Reed, S.; Fu, C.Y.; Berg, A.C. SSD: Single shot multibox detector. In Proceedings of the Computer Vision—ECCV 2016: 14th European Conference, Amsterdam, The Netherlands, 11–14 October 2016. [Google Scholar] [CrossRef]
- Redmon, J.; Farhadi, A. YOLO9000: Better, Faster, Stronger. In Proceedings of the 30th IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 7263–7271. [Google Scholar] [CrossRef]
- Redmon, J.; Farhadi, A. YOLOv3: An incremental improvement. arXiv 2018, arXiv:1804.02767. [Google Scholar] [CrossRef]
- Bochkovskiy, A.; Wang, C.; Liao, H. YOLOv4: Optimal Speed and Accuracy of Object Detection. arXiv 2020, arXiv:2004.10934. [Google Scholar] [CrossRef]
- Li, X.; Shang, M.; Qin, H.; Chen, L. Fast accurate fish detection and recognition of underwater images with Fast R-CNN. In Proceedings of the OCEANS 2015—MTS/IEEE Washington, Washington, DC, USA, 19–22 October 2015; pp. 1–5. [Google Scholar] [CrossRef]
- Li, X.; Shang, M.; Hao, J.; Yang, Z. Accelerating fish detection and recognition by sharing CNNs with objectness learning. In Proceedings of the OCEANS 2016—Shanghai, Shanghai, China, 10–13 April 2016. [Google Scholar] [CrossRef]
- Mandal, R.; Connolly, R.M.; Schlacher, T.A.; Stantic, B. Assessing fish abundance from underwater video using deep neural networks. In Proceedings of the 2018 International Joint Conference on Neural Networks (IJCNN), Rio de Janeiro, Brazil, 8–13 July 2018; pp. 1–6. [Google Scholar]
- Sung, M.; Yu, S.-C.; Girdhar, Y. Vision based real-time fish detection using convolutional neural network. In Proceedings of the OCEANS 2017—Aberdeen, Aberdeen, UK, 19–22 June 2017. [Google Scholar] [CrossRef]
- Xu, W.; Matzner, S. Underwater fish detection using deep learning for water power applications. In Proceedings of the 2018 International Conference on Computational Science and Computational Intelligence (CSCI), Las Vegas, NV, USA, 12–14 December 2018; pp. 313–318. [Google Scholar] [CrossRef]
- Li, D.; Miao, Z.; Peng, F.; Wang, L.; Hao, Y.; Wang, Z.; Chen, T.; Li, H.; Zheng, Y. Automatic counting methods in aquaculture: A review. J. World Aquac. Soc. 2020, 52, 269–283. [Google Scholar] [CrossRef]
- Yu, X.; Wang, Y.; An, D.; Wei, Y. Counting method for cultured fishes based on multi-modules and attention mechanism. Aquac. Eng. 2021, 96, 102215. [Google Scholar] [CrossRef]
- Zhao, Y.; Li, W.; Li, Y.; Qi, Y.; Li, Z.; Yue, J. LFCNet: A lightweight fish counting model based on density map regression. Comput. Electron. Agric. 2022, 203, 107496. [Google Scholar] [CrossRef]
- Ditria, E.M.; Lopez-Marcano, S.; Sievers, M.; Jinks, E.L.; Brown, C.J.; Connolly, R.M. Automating the Analysis of Fish Abundance Using Object Detection: Optimizing Animal Ecology With Deep Learning. Front. Mar. Sci. 2020, 7, 429. [Google Scholar] [CrossRef]
- Labao, A.B.; Naval, P.C., Jr. Cascaded deep network systems with linked ensemble components for underwater fish detection in the wild. Ecol. Inform. 2019, 52, 103–121. [Google Scholar] [CrossRef]
- Li, H.; Yu, H.; Gao, H.; Zhang, P.; Wei, S.; Xu, J.; Cheng, S.; Wu, J. Robust detection of farmed fish by fusing YOLOv5 with DCM and ATM. Aquac. Eng. 2022, 99, 102301. [Google Scholar] [CrossRef]
- Salman, A.; Siddiqui, S.A.; Shafait, F.; Mian, A.; Shortis, M.R.; Khurshid, K.; Ulges, A.; Schwanecke, U. Automatic fish detection in underwater videos by a deep neural network-based hybrid motion learning system. ICES J. Mar. Sci. 2019, 77, 1295–1307. [Google Scholar] [CrossRef]
- Levy, D.; Belfer, Y.; Osherov, E.; Bigal, E.; Scheinin, A.P.; Nativ, H.; Tchernov, D.; Treibitz, T. Automated analysis of marine video with limited data. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, Salt Lake City, UT, USA, 18–22 June 2018; pp. 1385–1393. [Google Scholar] [CrossRef]
- Mohamed, H.E.-D.; Fadl, A.; Anas, O.; Wageeh, Y.; ElMasry, N.; Nabil, A.; Atia, A. MSR-YOLO: Method to Enhance Fish Detection and Tracking in Fish Farms. Procedia Comput. Sci. 2020, 170, 539–546. [Google Scholar] [CrossRef]
- White, D.; Svellingen, C.; Strachan, N. Automated measurement of species and length of fish by computer vision. Fish. Res. 2006, 80, 203–210. [Google Scholar] [CrossRef]
- Shafry, M.R.M. FiLeDI framework for measuring fish length from digital images. Int. J. Phys. Sci. 2012, 7, 607–618. [Google Scholar] [CrossRef]
- Muñoz-Benavent, P.; Andreu-García, G.; Valiente-González, J.M.; Atienza-Vanacloig, V.; Puig-Pons, V.; Espinosa, V. Enhanced fish bending model for automatic tuna sizing using computer vision. Comput. Electron. Agric. 2018, 150, 52–61. [Google Scholar] [CrossRef]
- Palmer, M.; Álvarez-Ellacuría, A.; Moltó, V.; Catalán, I.A. Automatic, operational, high-resolution monitoring of fish length and catch numbers from landings using deep learning. Fish. Res. 2021, 246, 106166. [Google Scholar] [CrossRef]
- Huang, K.; Li, Y.; Suo, F.; Xiang, J. Stereo vision and Mask R-CNN segmentation-based 3D point cloud matching for fish dimension measurement. In Proceedings of the 2020 39th Chinese Control Conference (CCC), Shenyang, China, 27–29 July 2020; pp. 6345–6350. [Google Scholar] [CrossRef]
- Bolya, D.; Zhou, C.; Xiao, F.; Lee, Y.J. YOLACT: Real-time instance segmentation. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Seoul, Republic of Korea, 27 October–2 November 2019; pp. 9157–9166. [Google Scholar] [CrossRef]
- Wang, X.; Kong, T.; Shen, C.; Jiang, Y.; Li, L. Solo: Segmenting objects by locations. In Proceedings of the Computer Vision–ECCV 2020: 16th European Conference, Glasgow, UK, 23–28 August 2020; pp. 649–665. [Google Scholar] [CrossRef]
- Fernandes, A.F.; Turra, E.; de Alvarenga, R.; Passafaro, T.L.; Lopes, F.B.; Alves, G.F.; Singh, V.; Rosa, G.J. Deep Learning image segmentation for extraction of fish body measurements and prediction of body weight and carcass traits in Nile tilapia. Comput. Electron. Agric. 2020, 170, 105274. [Google Scholar] [CrossRef]
- Yu, X.; Wang, Y.; Liu, J.; Wang, J.; An, D.; Wei, Y. Non-contact weight estimation system for fish based on instance segmentation. Expert Syst. Appl. 2022, 210, 118403. [Google Scholar] [CrossRef]
- Chen, F.; Sun, M.; Du, Y.; Xu, J.; Zhou, L.; Qiu, T.; Sun, J. Intelligent feeding technique based on predicting shrimp growth in recirculating aquaculture system. Aquac. Res. 2022, 53, 4401–4413. [Google Scholar] [CrossRef]
- Liu, J.; Bienvenido, F.; Yang, X.; Zhao, Z.; Feng, S.; Zhou, C. Nonintrusive and automatic quantitative analysis methods for fish behaviour in aquaculture. Aquac. Res. 2022, 53, 2985–3000. [Google Scholar] [CrossRef]
- Zhou, C.; Xu, D.; Chen, L.; Zhang, S.; Sun, C.; Yang, X.; Wang, Y. Evaluation of fish feeding intensity in aquaculture using a convolutional neural network and machine vision. Aquaculture 2019, 507, 457–465. [Google Scholar] [CrossRef]
- Sun, L.; Wang, B.; Yang, P.; Wang, X.; Li, D.; Wang, J. Water quality parameter analysis model based on fish behavior. Comput. Electron. Agric. 2022, 203, 107500. [Google Scholar] [CrossRef]
- Måløy, H.; Aamodt, A.; Misimi, E. A spatio-temporal recurrent network for salmon feeding action recognition from underwater videos in aquaculture. Comput. Electron. Agric. 2019, 167, 105087. [Google Scholar] [CrossRef]
- Xu, W.; Zhu, Z.; Ge, F.; Han, Z.; Fengli, G. Analysis of Behavior Trajectory Based on Deep Learning in Ammonia Environment for Fish. Sensors 2020, 20, 4425. [Google Scholar] [CrossRef]
- Han, F.; Zhu, J.; Liu, B.; Zhang, B.; Xie, F. Fish shoals behavior detection based on convolutional neural network and spatio-temporal information. IEEE Access 2020, 8, 126907–126926. [Google Scholar] [CrossRef]
- Wang, G.; Muhammad, A.; Liu, C.; Du, L.; Li, D. Automatic Recognition of Fish Behavior with a Fusion of RGB and Optical Flow Data Based on Deep Learning. Animals 2021, 11, 2774. [Google Scholar] [CrossRef]
- Wang, M.; Deng, W. Deep visual domain adaptation: A survey. Neurocomputing 2018, 312, 135–153. [Google Scholar] [CrossRef]
- Chen, J.C.; Chen, T.-L.; Wang, H.-L.; Chang, P.-C. Underwater abnormal classification system based on deep learning: A case study on aquaculture fish farm in Taiwan. Aquac. Eng. 2022, 99, 102290. [Google Scholar] [CrossRef]
- Darapaneni, N.; Sreekanth, S.; Paduri, A.R.; Roche, A.S.; Murugappan, V.; Singha, K.K.; Shenwai, A.V. AI Based Farm Fish Disease Detection System to Help Micro and Small Fish Farmers. In Proceedings of the 2022 Interdisciplinary Research in Technology and Management (IRTM), Kolkata, India, 24–26 February 2022; pp. 1–5. [Google Scholar] [CrossRef]
Datasets | Total Videos/Images | Annotation | URL |
---|---|---|---|
Fish4Knowledge [34] | 700,000 underwater videos with 3000 fish species | - | https://homepages.inf.ed.ac.uk/rbf/Fish4Knowledge/resources.htm (accessed on 15 June 2022) |
LifeCLEF2014 [35] | 1000 underwater videos with 10 fish species | 20,000 labeled fish | https://www.imageclef.org/2014/lifeclef/fish (accessed on 15 June 2022) |
LifeCLEF2015 [36] | 93 underwater videos with 15 fish species | 9000 annotations in videos and 20,000 images with fewer labels | http://www.imageclef.org/lifeclef/2015/fish (accessed on 15 June 2022) |
NOAA [37] | 929 underwater images | 1005 labeled fish | https://swfscdata.nmfs.noaa.gov/labeled-fishes-in-the-wild/ (accessed on 15 June 2022) |
NCFM [38] | 3777 images | - | https://www.kaggle.com/c/the-nature-conservancy-fisheries-monitoring (accessed on 5 April 2023) |
ImageNet [39] | over 14 million images | - | http://www.image-net.org/ (accessed on 5 April 2023) |
Category | Object Detection Algorithms | Advantages | Disadvantages
---|---|---|---
Two-stage object algorithms | R-CNN [54] | Introduced DL to object detection for the first time | Slow training process; heavy demand on computing resources
 | Fast R-CNN [55] | Uses ROI pooling to rescale region features | Time-consuming selective search for region proposals
 | Faster R-CNN [56] | End-to-end training | Low detection accuracy for multi-scale and small objects
 | Mask R-CNN [57] | Accurate instance segmentation and high detection accuracy | Computationally expensive instance segmentation
One-stage object algorithms | YOLO [58] | A novel one-stage detection algorithm with fast detection speed | Low detection accuracy; weak generalization ability
 | SSD [59] | Combines regression and anchor mechanisms | Loses small-object features
 | YOLOV2 [60] | Further improved detection speed and recall rate | Poor detection accuracy for small objects
 | YOLOV3 [61] | Improved detection accuracy for small objects | Low recall rate
 | YOLOV4 [62] | Incorporates a variety of tuning techniques | Largely unchanged detection model
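All of the detectors in the table above, one-stage and two-stage alike, prune overlapping candidate boxes with non-maximum suppression (NMS). A minimal sketch in Python (box format and the 0.5 threshold are illustrative assumptions, not taken from any of the cited papers):

```python
def iou(a, b):
    """Intersection-over-union of two boxes in (x1, y1, x2, y2) format."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter) if inter else 0.0

def nms(boxes, scores, iou_thresh=0.5):
    """Greedy NMS: repeatedly keep the highest-scoring remaining box
    and discard any box that overlaps it above iou_thresh."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        order = [i for i in order if iou(boxes[best], boxes[i]) < iou_thresh]
    return keep
```

For example, of two heavily overlapping candidates on the same fish, only the higher-scoring one survives, while a distant box is kept.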
Data | References | Approaches | Fish Species/Public Dataset | Data Preprocessing | Results
---|---|---|---|---|---
Image | Li et al. [64] | Faster R-CNN | LifeCLEF2014 | N/A | mAP = 82.7%; Time = 0.102 s/image
 | Mandal et al. [65] | Faster R-CNN | 50 species of fish and crustaceans | N/A | mAP = 82.4%; FPS = 5
 | Sung et al. [66] | YOLO | NOAA dataset | N/A | Sensitivity = 93%; IOU = 65.2%; FPS = 16.7
 | Xu et al. [67] | YOLOV3 | Unknown species | N/A | mAP = 53.9%
 | Ditria et al. [71] | Mask R-CNN | Luderick | N/A | F1-Score = 92.4%; mAP50 = 92.5%
 | Labao and Naval Jr. [72] | Improved Faster R-CNN | Unknown species | N/A | Precision = 53.29%; Recall = 37.77%; F1-Score = 44.21%
 | Li et al. [73] | Improved YOLOV5 | Takifugu rubripes | Resize | Precision = 97.53%; Recall = 98.09%
 | Li et al. [40] | Improved YOLO | Unknown species | CLAHE; Rotation; Brightness transformation | Precision = 89%; Recall = 73%; IOU = 66%; FPS = 122
 | Cai et al. [47] | Improved YOLOV3 | Takifugu rubripes | Resize | AP = 78.63%
Video | Salman et al. [74] | Improved Faster R-CNN | Fish4Knowledge with Complex Scenes (FCS); LifeCLEF2015 | N/A | F1-Score = 87.44% (FCS); F1-Score = 80.02% (LifeCLEF2015)
 | Ben et al. [28] | Improved Faster R-CNN | LifeCLEF2015 | N/A | F1-Score = 83.16%; mAP = 73.69%
 | Levy et al. [75] | RetinaNet + SORT | Unknown species | Resize | Precision = 74%
 | Arvind et al. [41] | Mask R-CNN + GOTURN | Ornamental fish | Resize | Precision = 99.81%; Recall = 83.11%; F1-Score = 90.70%; FPS = 16
 | Mohamed et al. [76] | YOLO + optical flow | Goldfish | Multi-Scale Retinex | Detected an average of 8 fish from above water and 3 fish underwater
 | Liu et al. [29] | Improved YOLOV4 + Kalman filter | Sebastodes fuscescens; Asteroidea; Hexagrammos otakii | Color compensation; CLAHE; Resize | ACC = 95.6%; Recall = 93.3%; IOU = 83%; FPS = 33; MOTA = 83.6%; IDF1 = 83.2%; ID Switches = 59
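The tracking pipelines in the table above (e.g., SORT in Levy et al. [75] and the Kalman-filter tracker of Liu et al. [29]) link per-frame detections into trajectories by associating each new detection with an existing track. A greedy IoU-based matcher conveys the idea; note that SORT itself uses the Hungarian algorithm for optimal assignment, and the function names and 0.3 threshold here are illustrative:

```python
def iou(a, b):
    """Intersection-over-union of two boxes in (x1, y1, x2, y2) format."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union else 0.0

def associate(tracks, detections, iou_thresh=0.3):
    """Greedily match predicted track boxes to detection boxes by
    descending IoU. Returns (matches, unmatched_tracks, unmatched_dets);
    unmatched tracks may be coasted or dropped, unmatched detections
    start new tracks."""
    pairs = sorted(((iou(t, d), ti, di)
                    for ti, t in enumerate(tracks)
                    for di, d in enumerate(detections)), reverse=True)
    matches, used_t, used_d = [], set(), set()
    for score, ti, di in pairs:
        if score < iou_thresh:
            break
        if ti not in used_t and di not in used_d:
            matches.append((ti, di))
            used_t.add(ti)
            used_d.add(di)
    unmatched_t = [i for i in range(len(tracks)) if i not in used_t]
    unmatched_d = [i for i in range(len(detections)) if i not in used_d]
    return matches, unmatched_t, unmatched_d
```

Frequent association failures are exactly what the ID-switch and IDF1 metrics quantify.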
Camera | References | Approaches | Fish Species | Data Preprocessing | Results
---|---|---|---|---|---
Monocular camera | Monkman et al. [30] | R-CNN | European sea bass | N/A | Mean bias error = 2.2%
 | Álvarez-Ellacuría et al. [31] | Mask R-CNN + Statistical model | European hake | N/A | Root-mean-square deviation = 1.9 cm
 | Palmer et al. [80] | Mask R-CNN + Statistical model | Dolphinfish | N/A | Root-mean-square deviation = 2.4 cm
Stereo camera | Huang et al. [81] | Mask R-CNN + GrabCut + 3D point cloud + Coordinate transformation | Porphyry seabream | N/A | Average error = 5.5 mm (length); average error = 2.7 mm (width)
 | Garcia et al. [49] | Mask R-CNN + Local gradients + Morphological operations + Curve fitting | Saithe; Blue whiting; Redfish; Atlantic mackerel; Velvet belly lanternshark; Norway pout; Atlantic herring | Image linearization; correction of non-uniform lighting | Average IOU = 0.89 (single fish); average IOU = 0.79 (overlapping fish)
Data | References | Approaches | Fish Species/Feed | Behaviors | Data Preprocessing | Results
---|---|---|---|---|---|---
Image | Hu et al. [32] | Improved YOLOV3 | Crucian carp; catfish | Hunger and oxygen-deprivation behavior | CLAHE; DWT; Median filter; Flipping; Rotation; Gaussian blurring; Resize | Precision = 89.7%; Recall = 88.4%; IOU = 89.2%; FPS = 240
 | Hu et al. [52] | Improved YOLOV4 | Uneaten feed | Feeding status | CLAHE; Mosaic | Precision = 94%; Recall = 89%; F1-Score = 91%; AP50 = 92.61%
Video | Xu et al. [91] | Faster R-CNN; YOLOV3 | Red goldfish | Fish behavior under different ammonia concentrations | Random cropping | ACC = 98.13% (Faster R-CNN); ACC = 95.66% (YOLOV3)
 | Wang et al. [33] | Improved YOLOV5 + SiamRPN++ | Porphyry seabream | Turning-over behavior | N/A | Detection: AP50 = 99.4%; Tracking: Precision = 76.7%
Evaluation Metrics | Better Results | Description |
---|---|---
ACC | Larger | The ratio of the number of correctly identified samples to the total number of identified samples |
Precision | Larger | The ratio of correctly identified fish to all identified fish |
Recall | Larger | The ratio of correctly identified fish to all fish in the sample |
F1-Score | Larger | The harmonic mean of precision and recall
mAP | Larger | Takes both precision and recall into consideration |
IOU | Larger | The overlap rate between the candidate area and the ground truth area |
MAE | Smaller | The expected value of the absolute difference between the predicted value and the ground truth |
MAPE | Smaller | Considers not only the error between the predicted value and the ground truth but also the ratio between the error and the ground truth |
MSE | Smaller | The expected value of the square of the difference between the predicted value and the ground truth |
RMSE | Smaller | The square root of the MSE |
ID switch | Smaller | The average total number of times that a resulting trajectory switches its matched ground-truth identity with another trajectory |
MOTA | Larger | Combines false positives, missed targets, and identity switches
FPS | Larger | The number of images processed by the algorithm per second |
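To make the count-based metrics in the table concrete, precision, recall, and F1-Score follow directly from the true-positive, false-positive, and false-negative counts of a detector. A short sketch (the counts in the example are illustrative, not taken from any study above):

```python
def precision_recall_f1(tp, fp, fn):
    """Precision, recall, and F1-Score from detection counts.
    tp: correctly identified fish; fp: spurious detections;
    fn: fish the detector missed."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    # F1 is the harmonic mean of precision and recall
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1
```

With, say, 80 correct detections, 20 false alarms, and 10 missed fish, precision is 0.80 while recall is about 0.89, illustrating why both must be reported together.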
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Liu, H.; Ma, X.; Yu, Y.; Wang, L.; Hao, L. Application of Deep Learning-Based Object Detection Techniques in Fish Aquaculture: A Review. J. Mar. Sci. Eng. 2023, 11, 867. https://doi.org/10.3390/jmse11040867