Video Indexing Through Human Faces Represented as an EAN-8 Linear Bar Code Using Deep Learning Algorithms
DOI: https://doi.org/10.64252/r5vhep65

Keywords: MTCNN, Viola-Jones, DSFD, BlazeFace, YOLOv3, EAN-8 linear bar code, key frame, human face, window technique

Abstract
This paper proposes a technique that uses a facial image as a key frame to create an EAN-8 linear bar code from video. Face recognition has gained popularity because of its applications in information security, smart cards, video surveillance, and law enforcement, and it also supports pattern recognition and image processing tasks. Numerous approaches index videos using human faces as cues, but their effectiveness is limited by changes in face orientation, variations in image brightness, facial illumination, and similar factors. To overcome these problems, this research introduces a method for video indexing through the human face represented as an EAN-8 linear bar code, built on machine learning and deep learning algorithms. Viola-Jones, DSFD, Multi-Task Cascaded Convolutional Neural Networks (MTCNN), BlazeFace, and YOLOv3 are used to extract key frames, i.e., human faces, from video frames. Each identified key frame is then converted to an EAN-8 linear bar code for video indexing and retrieval. The primary application is the identification of human face frames in videos and their representation as EAN-8 bar codes for indexing, which is useful for security, human activity recognition, video surveillance, and describing communication channels. The approach indexes the video as a set of linear EAN-8 bar codes, one for each human face appearing in it. The study first compares the performance of several widely used face detection and recognition models (Viola-Jones with Haar cascades, MTCNN, DSFD, BlazeFace, and YOLOv3) based on machine learning and deep learning concepts. The best-performing model (MTCNN) is then used to extract individual faces from the input video footage. The extracted faces are used to generate EAN-8 linear bar codes, providing a simple and fast facial signature derived from natural features and thereby reducing bandwidth, storage, and time complexity. The proposed method was evaluated on the Hollywood video dataset, the FDDB face dataset, the WIDER FACE dataset, and the LFW dataset, and was shown to index videos efficiently using human faces.
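As a concrete illustration of the pipeline summarized above, the following Python sketch detects a face in a video frame with MTCNN and encodes a derived 7-digit value as an EAN-8 bar code. This is a minimal sketch rather than the paper's exact implementation: it assumes the mtcnn, opencv-python, and python-barcode packages are available, and the mapping from a cropped face to a 7-digit code (here, a hash of the face pixels) is a hypothetical stand-in for the paper's encoding scheme.

```python
# Minimal sketch: detect a face key frame with MTCNN and encode a derived
# 7-digit value as an EAN-8 bar code. The face-to-digits mapping (a hash of
# the cropped face) is a hypothetical placeholder, not the paper's encoding.
import hashlib

import cv2                      # pip install opencv-python
from mtcnn import MTCNN         # pip install mtcnn
import barcode                  # pip install python-barcode
from barcode.writer import ImageWriter


def face_to_ean8(video_path, out_stem="face_index"):
    cap = cv2.VideoCapture(video_path)
    detector = MTCNN()

    while True:
        ok, frame = cap.read()
        if not ok:
            break
        rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
        detections = detector.detect_faces(rgb)
        if not detections:
            continue

        # Crop the highest-confidence face as the key frame.
        best = max(detections, key=lambda d: d["confidence"])
        x, y, w, h = best["box"]
        face = rgb[max(y, 0):y + h, max(x, 0):x + w]

        # Hypothetical mapping: hash the face crop down to 7 digits;
        # python-barcode appends the EAN-8 check digit automatically.
        digits = f"{int(hashlib.sha1(face.tobytes()).hexdigest(), 16) % 10**7:07d}"
        ean8 = barcode.get("ean8", digits, writer=ImageWriter())
        path = ean8.save(out_stem)   # writes e.g. face_index.png
        cap.release()
        return path

    cap.release()
    return None


if __name__ == "__main__":
    print(face_to_ean8("sample_video.mp4"))
```

In this sketch the first detected face is used as the index; the paper's full method additionally compares detectors (Viola-Jones, DSFD, MTCNN, BlazeFace, YOLOv3) and generates one bar code per distinct face in the video.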