Backbone Alignment and Cascade Tiny Object Detecting Techniques for Dolphin Detection and Classification

Yih-Cherng LEE  Hung-Wei HSU  Jian-Jiun DING  Wen HOU  Lien-Shiang CHOU  Ronald Y. CHANG  

IEICE TRANSACTIONS on Fundamentals of Electronics, Communications and Computer Sciences   Vol.E104-A   No.4   pp.734-743
Publication Date: 2021/04/01
Publicized: 2020/09/29
Online ISSN: 1745-1337
DOI: 10.1587/transfun.2020EAP1054
Type of Manuscript: PAPER
Category: Image
object detection,  deep learning,  object segmentation,  rotation measurement,  image classification,  

Full Text: PDF(4.1MB)>>
Buy this Article

Automatic tracking and classification are essential for studying the behaviors of wild animals. Owing to dynamic far-shooting photos, the occlusion problem, protective coloration, the background noise is irregular interference for designing a computerized algorithm for reducing human labeling resources. Moreover, wild dolphin images are hard-acquired by on-the-spot investigations, which takes a lot of waiting time and hardly sets the fixed camera to automatic monitoring dolphins on the ocean in several days. It is challenging tasks to detect well and classify a dolphin from polluted photos by a single famous deep learning method in a small dataset. Therefore, in this study, we propose a generic Cascade Small Object Detection (CSOD) algorithm for dolphin detection to handle small object problems and develop visualization to backbone based classification (V2BC) for removing noise, highlighting features of dolphin and classifying the name of dolphin. The architecture of CSOD consists of the P-net and the F-net. The P-net uses the crude Yolov3 detector to be a core network to predict all the regions of interest (ROIs) at lower resolution images. Then, the F-net, which is more robust, is applied to capture the ROIs from high-resolution photos to solve single detector problems. Moreover, a visualization to backbone based classification (V2BC) method focuses on extracting significant regions of occluded dolphin and design significant post-processing by referencing the backbone of dolphins to facilitate for classification. Compared to the state of the art methods, including faster-rcnn, yolov3 detection and Alexnet, the Vgg, and the Resnet classification. All experiments show that the proposed algorithm based on CSOD and V2BC has an excellent performance in dolphin detection and classification. Consequently, compared to the related works of classification, the accuracy of the proposed designation is over 14% higher. Moreover, our proposed CSOD detection system has 42% higher performance than that of the original Yolov3 architecture.