Advancing Automated Surveillance: Real-Time Detection of Crown-of-Thorns Starfish via YOLOv5 Deep Learning


  • Guokun Xu Computer Science, Beijing Foreign Studies University, Beijing, China
  • Ying Xie Computer Science, San Francisco Bay University, Fremont, USA
  • Yang Luo Computer Science, China CITIC Bank Software Development Center, Beijing, China
  • Yibo Yin Computer Science, Contemporary Amperex Technology USA Inc, Auburn Hills, USA
  • Zhengning Li Computer Science, Georgetown University, Washington, D.C., USA
  • Zibu Wei Computer Science, University of California, Los Angeles, Los Angeles, USA



Great Barrier Reef, Crown-of-Thorns Starfish, YOLOv5, Object Detection, Deep Learning, Marine Conservation


The Great Barrier Reef faces significant threats from crown-of-thorns starfish (COTS), which consume coral polyps and contribute to reef degradation. Traditional methods for detecting these starfish are manual and labor-intensive, which limits their scalability and efficiency. This study proposes a real-time detection system that uses deep learning and computer vision to identify COTS in underwater video frames. We use the YOLOv5 model, known for its speed and accuracy in object detection tasks. Extensive data augmentation is employed to handle the challenges of the underwater environment, such as varying lighting conditions and water turbidity. Additionally, we modify the YOLOv5 architecture to improve the detection of small objects like COTS, which often blend into the reef. To enhance detection consistency, we integrate a video object tracking system that maintains object continuity across frames, reducing false positives. Our approach achieved a Public Leaderboard score of 0.715, which places us in the top 2% of submissions. This result highlights the potential of our method for scalable and effective monitoring of the Great Barrier Reef, contributing to conservation efforts by providing a tool for continuous, automated detection of harmful species like COTS.
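The abstract does not specify which tracker is used, so the following is only a minimal sketch of the general idea of using cross-frame continuity to suppress false positives. It assumes detections arrive as per-frame lists of `(x1, y1, x2, y2)` boxes; the function names `iou` and `filter_by_continuity` and the parameters `iou_thresh` and `min_hits` are hypothetical and are not taken from the paper.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), assumed pixel coordinates


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def filter_by_continuity(
    frames: List[List[Box]],
    iou_thresh: float = 0.3,  # hypothetical matching threshold
    min_hits: int = 2,        # hypothetical number of consecutive frames required
) -> List[List[Box]]:
    """Keep a detection only once it has been matched in `min_hits` consecutive frames.

    A detection is "matched" when it overlaps a box from the previous frame
    with IoU >= iou_thresh. Isolated single-frame detections are dropped.
    """
    tracks: List[Tuple[Box, int]] = []  # (box, consecutive hit count) from previous frame
    kept: List[List[Box]] = []
    for dets in frames:
        new_tracks: List[Tuple[Box, int]] = []
        confirmed: List[Box] = []
        for det in dets:
            # Inherit the longest streak among overlapping boxes in the previous frame.
            best_hits = 0
            for prev_box, hits in tracks:
                if iou(det, prev_box) >= iou_thresh:
                    best_hits = max(best_hits, hits)
            hits = best_hits + 1
            new_tracks.append((det, hits))
            if hits >= min_hits:
                confirmed.append(det)
        tracks = new_tracks
        kept.append(confirmed)
    return kept


if __name__ == "__main__":
    frames = [
        [(10, 10, 50, 50)],
        [(12, 12, 52, 52), (200, 200, 220, 220)],  # second box appears only once
        [(14, 14, 54, 54)],
    ]
    print(filter_by_continuity(frames))
```

In this toy input the box that persists across frames is kept from its second appearance onward, while the box that appears in a single frame is discarded. A production tracker would typically also handle missed detections and motion prediction, which this sketch deliberately omits.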






How to Cite

Xu, G., Xie, Y., Luo, Y., Yin, Y., Li, Z., & Wei, Z. (2024). Advancing Automated Surveillance: Real-Time Detection of Crown-of-Thorns Starfish via YOLOv5 Deep Learning. Journal of Theory and Practice of Engineering Science, 4(06), 1–10.