A Lightweight and Efficient Hybrid CNN Model for Face Detection
DOI: https://doi.org/10.64252/edmkva81
Keywords: Face Detection, Occlusion, Viola-Jones, Convolutional Neural Networks, Hybrid Model, Real-Time Detection.
Abstract
Face detection under occlusion remains a significant challenge in real-world computer vision applications. This paper proposes a hybrid two-stage detection framework that combines the real-time efficiency of the Viola-Jones algorithm with the accuracy of a lightweight, modified AlexNet-based Convolutional Neural Network (CNN). The system first uses Viola-Jones to propose candidate face regions, which are then verified by the CNN. The CNN is trained on over 70,000 face and non-face images, half of which include partial occlusions such as masks, sunglasses, or hands, and it incorporates dropout and batch normalisation to improve generalisation. Experimental results show that the proposed hybrid model achieves a detection accuracy of 93%, a precision of 95%, and a false positive rate of only 3%, outperforming state-of-the-art models such as MTCNN, SSD, YOLOv3, and RetinaFace in occlusion-specific scenarios. With a processing speed of approximately 12 frames per second on standard CPU hardware and a memory footprint of only 60 MB, the model is well suited to real-time applications such as surveillance and access control in occlusion-prone environments.
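The two-stage flow described in the abstract (a fast proposal stage followed by CNN verification) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `two_stage_detect`, `fake_propose`, and `fake_verify`, the 0.5 score threshold, and the hard-coded boxes and scores are all assumptions made purely for illustration. In a real system the proposal stage might be a Viola-Jones cascade (for example OpenCV's `cv2.CascadeClassifier.detectMultiScale`) and the verifier a trained CNN.

```python
from typing import Any, Callable, List, Tuple

Box = Tuple[int, int, int, int]  # (x, y, width, height)


def two_stage_detect(
    image: Any,
    propose: Callable[[Any], List[Box]],
    verify: Callable[[Any, Box], float],
    threshold: float = 0.5,  # assumed cut-off, not taken from the paper
) -> List[Tuple[Box, float]]:
    """Keep only the proposed boxes whose verifier score reaches the threshold."""
    accepted = []
    for box in propose(image):      # stage 1: cheap candidate regions
        score = verify(image, box)  # stage 2: face probability for the region
        if score >= threshold:
            accepted.append((box, score))
    return accepted


# Hypothetical stand-ins so the sketch runs without any model or image.
def fake_propose(image: Any) -> List[Box]:
    return [(10, 10, 40, 40), (100, 20, 30, 30), (5, 80, 50, 50)]


def fake_verify(image: Any, box: Box) -> float:
    scores = {
        (10, 10, 40, 40): 0.97,   # clear face
        (100, 20, 30, 30): 0.12,  # false alarm from the proposal stage
        (5, 80, 50, 50): 0.81,    # partially occluded face
    }
    return scores[box]


detections = two_stage_detect(None, fake_propose, fake_verify)
```

The design intent is that the expensive verifier only runs on the few regions the proposal stage returns, which is what makes CPU-only real-time operation plausible; the verifier's job is to reject the proposal stage's false alarms.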