Please use this identifier to cite or link to this item: https://er.nau.edu.ua/handle/NAU/63103
Title: A Comprehensive Framework for Underwater Object Detection Based on Improved YOLOv8
Other Titles: A Network for Underwater Object Detection Using a Modified YOLOv8 Architecture
Authors: Sineglazov, Victor
Синєглазов, Віктор Михайлович
Keywords: underwater object detection
classification problem
YOLO
hybrid neural networks
deep learning
Issue Date: 29-Mar-2024
Publisher: National Aviation University
Citation: Sineglazov V. M. A Comprehensive Framework for Underwater Object Detection Based on Improved YOLOv8 / V. M. Sineglazov, M. V. Savchenko // Electronics and Control Systems. Kyiv: NAU, 2024. – No 1(79). – pp. 9–15.
Series/Report no.: Electronics and Control Systems; No 1(79)
Abstract: Underwater object detection poses unique challenges due to issues such as poor visibility, small densely packed objects, and target occlusion. In this paper, we propose a comprehensive framework for underwater object detection based on improved YOLOv8, addressing these challenges and achieving superior performance. Our framework integrates several key enhancements, including Contrast Limited Adaptive Histogram Equalization (CLAHE) for image preprocessing, a lightweight GhostNetV2 backbone, the Coordinate Attention mechanism, and Deformable ConvNets v4 for improved feature representation. Through experiments on the UTDAC2020 dataset, our model achieves 82.35% precision, 80.98% recall, and 86.21% mean average precision at IoU = 0.5. Notably, our framework outperforms the YOLOv8s model by a significant margin while also being 15.1% smaller in terms of computational complexity. These results underscore the efficiency of our proposed framework for underwater object detection tasks, demonstrating its potential for real-world applications in underwater environments.
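As an illustration of the preprocessing stage mentioned above, the sketch below applies CLAHE to the lightness channel of an underwater image using OpenCV. This is a minimal sketch, not the authors' code: the clip limit, tile grid size, and input file name are assumptions chosen for illustration rather than values from the paper.

import cv2

def clahe_preprocess(bgr_image, clip_limit=2.0, tile_grid=(8, 8)):
    # Equalize only the lightness channel in LAB space so the colour
    # casts typical of underwater imagery are not amplified.
    lab = cv2.cvtColor(bgr_image, cv2.COLOR_BGR2LAB)
    l, a, b = cv2.split(lab)
    clahe = cv2.createCLAHE(clipLimit=clip_limit, tileGridSize=tile_grid)
    l_eq = clahe.apply(l)
    return cv2.cvtColor(cv2.merge((l_eq, a, b)), cv2.COLOR_LAB2BGR)

enhanced = clahe_preprocess(cv2.imread("underwater_frame.jpg"))  # hypothetical input file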
In this work, a neural network for underwater object detection is developed on the basis of a modified YOLOv8 architecture. The framework combines an image preprocessing module based on Contrast Limited Adaptive Histogram Equalization, the GhostNetV2 architecture for efficient feature extraction with a reduced overall parameter count, the Coordinate Attention mechanism, and the Deformable ConvNets v4 operator for improved feature representation. The model was validated on the UTDAC2020 dataset (82.35% precision, 80.98% recall, and 86.21% mAP at IoU = 0.5), surpassing YOLOv8s on this dataset while reducing computational complexity by 15.1%. The results of this work can be applied to the development of software for unmanned underwater vehicles.
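For readers unfamiliar with the attention block named in the abstract, below is a minimal PyTorch sketch of Coordinate Attention following Hou et al. (2021). The reduction ratio and Hardswish activation are assumptions for illustration; the authors' exact integration into the modified YOLOv8 may differ.

import torch
import torch.nn as nn

class CoordinateAttention(nn.Module):
    def __init__(self, channels, reduction=32):
        super().__init__()
        mid = max(8, channels // reduction)
        self.pool_h = nn.AdaptiveAvgPool2d((None, 1))  # one descriptor per row
        self.pool_w = nn.AdaptiveAvgPool2d((1, None))  # one descriptor per column
        self.conv1 = nn.Conv2d(channels, mid, kernel_size=1)
        self.bn = nn.BatchNorm2d(mid)
        self.act = nn.Hardswish()
        self.conv_h = nn.Conv2d(mid, channels, kernel_size=1)
        self.conv_w = nn.Conv2d(mid, channels, kernel_size=1)

    def forward(self, x):
        n, c, h, w = x.shape
        x_h = self.pool_h(x)                      # (n, c, h, 1)
        x_w = self.pool_w(x).permute(0, 1, 3, 2)  # (n, c, w, 1)
        # Shared 1x1 transform over the concatenated directional descriptors.
        y = self.act(self.bn(self.conv1(torch.cat([x_h, x_w], dim=2))))
        y_h, y_w = torch.split(y, [h, w], dim=2)
        a_h = torch.sigmoid(self.conv_h(y_h))                      # (n, c, h, 1)
        a_w = torch.sigmoid(self.conv_w(y_w.permute(0, 1, 3, 2)))  # (n, c, 1, w)
        return x * a_h * a_w  # position-aware reweighting of the input

print(CoordinateAttention(64)(torch.randn(1, 64, 80, 80)).shape)  # (1, 64, 80, 80)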
URI: https://er.nau.edu.ua/handle/NAU/63103
ISSN: 1990-5548
DOI: 10.18372/1990-5548.79.18429
Appears in Collections: Scientific publications and materials of the Department of Aviation Computer-Integrated Complexes (НОВА)

Files in This Item:
3.pdf (Scientific article), 1.39 MB, Adobe PDF

