YOLO-UNET Fusion For Medical Imaging Small Object Detection
DOI:
https://doi.org/10.64252/d4s1sv72
Keywords:
Small Object Detection, YOLOv4, Deep Learning, Computer Vision, DarkNet Framework, Neural Networks
Abstract
Small object detection remains one of the most challenging problems in computer vision because small targets carry limited spatial information, appear at low resolution, and are contextually ambiguous. This paper introduces a framework based on the YOLOv4 Deep DarkNet architecture and optimized specifically for small object detection. Our approach integrates an enhanced Cross-Stage Partial DarkNet53 (CSPDarkNet53) backbone with improved Spatial Pyramid Pooling (SPP) modules and augmented Path Aggregation Networks (PANet) for feature extraction and fusion. The framework incorporates data augmentation techniques including Mosaic augmentation, CutMix transformations, and multi-scale training. Additionally, we implement an optimized Complete Intersection over Union (CIoU) loss function combined with enhanced Non-Maximum Suppression (NMS). Extensive experiments on the COCO dataset demonstrate that our framework achieves 47.2% overall mAP and 28.3% mAP on small objects, improvements of 8.5% and 14.6% respectively over the baseline YOLOv4 model. The proposed architecture maintains real-time performance at 62 FPS while introducing minimal computational overhead. These results establish new benchmarks for small object detection and provide practical solutions for applications that require high-precision detection of diminutive targets.
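For readers unfamiliar with the CIoU loss named in the abstract, the following is a minimal sketch of the standard CIoU definition (IoU term, normalized center distance, and aspect-ratio penalty). It is not the optimized variant proposed in this paper, whose details are not given in the abstract; the function name `ciou_loss`, the `(x1, y1, x2, y2)` box format, and the `eps` stabilizer are assumptions made for illustration.

```python
import math


def ciou_loss(box, gt, eps=1e-9):
    """Standard CIoU loss for two boxes given as (x1, y1, x2, y2).

    Illustrative sketch only; not the paper's optimized loss.
    """
    x1, y1, x2, y2 = box
    gx1, gy1, gx2, gy2 = gt
    w, h = x2 - x1, y2 - y1
    gw, gh = gx2 - gx1, gy2 - gy1

    # Intersection over union.
    iw = max(0.0, min(x2, gx2) - max(x1, gx1))
    ih = max(0.0, min(y2, gy2) - max(y1, gy1))
    inter = iw * ih
    union = w * h + gw * gh - inter
    iou = inter / (union + eps)

    # Squared distance between box centers, normalized by the squared
    # diagonal of the smallest box enclosing both.
    rho2 = ((x1 + x2) / 2 - (gx1 + gx2) / 2) ** 2 \
         + ((y1 + y2) / 2 - (gy1 + gy2) / 2) ** 2
    cw = max(x2, gx2) - min(x1, gx1)
    ch = max(y2, gy2) - min(y1, gy1)
    c2 = cw ** 2 + ch ** 2 + eps

    # Aspect-ratio consistency term and its trade-off weight.
    v = (4 / math.pi ** 2) * (
        math.atan(gw / (gh + eps)) - math.atan(w / (h + eps))
    ) ** 2
    alpha = v / (1 - iou + v + eps)

    return 1 - iou + rho2 / c2 + alpha * v


# The same 4-pixel localization error on a small and a large box.
small = ciou_loss((4, 0, 12, 8), (0, 0, 8, 8))
large = ciou_loss((4, 0, 68, 64), (0, 0, 64, 64))
print(small > large)
```

The comparison at the end illustrates why small objects are hard to localize: an identical pixel offset removes a much larger fraction of the overlap for an 8x8 box than for a 64x64 box, so the loss is considerably higher for the small box.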