Face Detection Using FaceBoxes

Deep Learning Nov 08, 2019

NOTE: An implementation of FaceBoxes in TensorFlow is discussed here: a deep learning model to detect faces in images, with code.


Do you wonder how Facebook detects all the faces in a given image, or how your phone locates your face to focus on it? None of this is rocket science. These are simple applications of neural networks and deep learning, known as face detection.

Face detection is the first and essential step of face recognition: it detects the faces present in images and videos. Face detection is a part of object detection and can be used in many areas, such as security, biometrics, law enforcement, entertainment, personal safety, and culprit detection.

Face detection is used for real-time surveillance of passengers at airports and for real-time object tracking. It is also widely used in cameras, for example mobile phone cameras and DSLRs, to identify multiple faces in the frame. Facebook uses face detection algorithms to detect the faces in an image and then recognize the people present for tagging.


  1. First, we'll see how face detection works.
  2. Then we'll learn about the architecture of FaceBoxes.
  3. Then we'll see how to train a face detection model.
  4. Finally, we'll perform face detection on our own images using a pre-trained model.

How does Face Detection work?

Face detection applications use deep learning algorithms and neural networks that determine whether images are positive images (i.e. images with a face) or negative images (i.e. images without a face). To do this accurately, the algorithms must be trained on large datasets containing hundreds of thousands of face and non-face images.

Once trained, the algorithms can answer two questions when we give an image as input:

  • Are there any faces present in the given image (yes or no)?
  • If yes, where are they located inside the image (place a bounding box around each detected face)?
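These two questions map onto the usual output of a detector: a list of candidate boxes, each with a confidence score. Below is a minimal sketch of that idea. The `Detection` record, the `answer_questions` helper, the sample values, and the 0.5 threshold are all assumptions made for illustration and do not come from FaceBoxes itself.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Detection:
    """Hypothetical detector output: a box in pixels plus a confidence score."""
    box: Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max)
    score: float                    # confidence in [0, 1]


def answer_questions(detections: List[Detection], threshold: float = 0.5):
    """Return (has_face, boxes) for detections scoring at least `threshold`."""
    boxes = [d.box for d in detections if d.score >= threshold]
    return len(boxes) > 0, boxes


# Made-up raw outputs for one image.
raw = [
    Detection((40, 30, 120, 130), 0.97),
    Detection((200, 50, 260, 125), 0.81),
    Detection((10, 300, 40, 330), 0.12),  # low score: treated as background
]
has_face, boxes = answer_questions(raw)
print(has_face, boxes)
```

The first return value answers "are there any faces?", and the second answers "where are they?". Raising the threshold trades missed faces for fewer false alarms.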

We will learn about FaceBoxes, from the paper "FaceBoxes: A CPU Real-time Face Detector with High Accuracy", because it tackles two major problems:

1) The large visual variation of faces against cluttered backgrounds, which requires face detectors to accurately solve a complicated face versus non-face classification problem;

2) The large search space of possible face positions and face sizes in the image, which additionally imposes a time-efficiency requirement.
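To get a feel for the size of that search space, we can count how many candidate windows a naive sliding-window search would have to classify. This is only a rough sketch: the function names, the face-size range, the scale step, and the stride are illustrative assumptions, not values from FaceBoxes.

```python
def count_windows(img_w, img_h, window, stride):
    """Number of square windows of side `window` that fit at the given stride."""
    if window > img_w or window > img_h:
        return 0
    cols = (img_w - window) // stride + 1
    rows = (img_h - window) // stride + 1
    return cols * rows


def search_space(img_w, img_h, min_size, max_size, scale_step, stride_frac):
    """Total candidate windows over a geometric range of face sizes."""
    total = 0
    size = float(min_size)
    while size <= max_size:
        window = int(round(size))
        # Stride grows with the window so larger faces are sampled more coarsely.
        stride = max(1, int(window * stride_frac))
        total += count_windows(img_w, img_h, window, stride)
        size *= scale_step
    return total


# VGA image (640x480); size range, scale step and stride are illustrative.
print(search_space(640, 480, min_size=24, max_size=480,
                   scale_step=1.25, stride_frac=0.1))
```

Even with these modest settings, the smallest window size alone contributes tens of thousands of candidates, which is why running an expensive classifier on every window is too slow.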

To meet these two requirements, face detection has been implemented in two ways.

  • Using hand-crafted features: These methods depend heavily on non-robust
    hand-crafted features, and each component is optimized separately by hand,
    which makes the face detection pipeline sub-optimal. In brief, they are
    efficient and fast on the CPU but not accurate enough against the large
    visual variation of faces.

  • Using CNNs: CNN-based face detection methods are robust to the large
    variety of facial appearances and demonstrate state-of-the-art
    performance, but they are too time-consuming to achieve real-time speed,
    especially on CPU devices, because their large, dense networks require a
    lot of computation.
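As an illustration of what a hand-crafted feature looks like, here is a sketch of one well-known example, the Haar-like feature used by the Viola-Jones detector. The text above does not name a specific feature, so this choice is ours. It is computed in constant time from an integral image, which is why such methods are fast on the CPU. The toy patch and the function names are assumptions for the example.

```python
def integral_image(img):
    """Summed-area table with an extra zero row and column."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def rect_sum(ii, x, y, w, h):
    """Sum of pixels in the rectangle with top-left (x, y), in four lookups."""
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x]


def two_rect_feature(ii, x, y, w, h):
    """Haar-like feature: top half minus bottom half of a w x h window."""
    half = h // 2
    return rect_sum(ii, x, y, w, half) - rect_sum(ii, x, y + half, w, half)


# Toy 4x4 grayscale patch: bright top, dark bottom.
patch = [
    [9, 9, 9, 9],
    [9, 9, 9, 9],
    [1, 1, 1, 1],
    [1, 1, 1, 1],
]
ii = integral_image(patch)
print(two_rect_feature(ii, 0, 0, 4, 4))  # 72 - 8 = 64
```

A feature like this responds to one fixed brightness pattern, which hints at why hand-designed features struggle with the large visual variation of real faces.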

To perform well on both speed and accuracy, one obvious idea is to combine the advantages of these two types of methods.

Therefore, cascaded CNN-based methods have been proposed, which put features learned by CNNs into a cascade framework to achieve high accuracy in real time. However, there are three main problems with cascaded CNN-based face detection methods:

  1. Their speed is inversely related to the number of faces present in the image: it degrades dramatically as the number of faces increases;

  2. Cascade-based detectors optimize each component separately, which makes the training process extremely complicated and the final model sub-optimal;

  3. For VGA-resolution images, their runtime efficiency on the CPU is about 14 FPS, which is not fast enough to achieve real-time speed.
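The frame-rate figure in the last point is easier to reason about as a time budget per frame. The small calculation below is a sketch: the 14 FPS value is the one quoted above, while the 20 FPS target and the helper names are assumptions chosen only for illustration.

```python
def ms_per_frame(fps: float) -> float:
    """Per-frame time budget in milliseconds at a given frame rate."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1000.0 / fps


def required_speedup(current_fps: float, target_fps: float) -> float:
    """Factor by which per-frame processing time must shrink to hit the target."""
    return target_fps / current_fps


cascade_fps = 14.0  # figure quoted in the list above (VGA images, CPU)
target_fps = 20.0   # assumed real-time target, for illustration only

print(f"{ms_per_frame(cascade_fps):.1f} ms per frame at {cascade_fps:.0f} FPS")
print(f"{ms_per_frame(target_fps):.1f} ms per frame at {target_fps:.0f} FPS")
print(f"speed-up needed: {required_speedup(cascade_fps, target_fps):.2f}x")
```

At 14 FPS each frame takes roughly 71 ms, so under the assumed target the detector would need to process a frame in 50 ms or less.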

Architecture of FaceBoxes

Fig. 1: Architecture of FaceBoxes and the detailed information table about our anchor designs.

To solve these problems with existing face detection methods, we will explore the three contributions FaceBoxes makes for accurate and efficient face detection on CPU devices:

  1. Rapidly Digested Convolutional Layers (RDCL)
  2. Multiple Scale Convolutional Layers (MSCL)
  3. The anchor densification strategy

(See the architecture in the diagram above.)
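To make the third idea more concrete: the FaceBoxes paper defines the tiling density of an anchor as its scale divided by its tiling interval, and densifies sparse anchors by tiling n × n of them inside one cell instead of a single anchor at the center. The sketch below is a rough illustration under assumptions. The anchor scales (32, 64, 128), the shared interval of 32 pixels, and the factors (4, 2, 1) follow our reading of the paper and should be checked against it. `densified_centers` is a hypothetical helper, and its exact offsets may differ from those in real implementations.

```python
def anchor_density(scale: float, interval: float) -> float:
    """Tiling density of an anchor: its scale divided by its tiling interval."""
    return scale / interval


def densified_centers(cell_x, cell_y, interval, n):
    """n*n anchor centers tiled uniformly inside one interval-sized cell.

    With n = 1 this reduces to the single center of the cell.
    """
    offsets = [(k + 0.5) * interval / n for k in range(n)]
    x0, y0 = cell_x * interval, cell_y * interval
    return [(x0 + dx, y0 + dy) for dy in offsets for dx in offsets]


interval = 32  # assumed stride of the feature map carrying the small anchors
for scale, n in [(32, 4), (64, 2), (128, 1)]:
    centers = densified_centers(0, 0, interval, n)
    # Density after densification, and number of anchors tiled per cell.
    print(scale, anchor_density(scale, interval) * n, len(centers))
```

Under these assumptions, all three anchor scales end up with the same density of 4, so small faces are matched by as many anchors, relative to their size, as large ones.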

To learn more about the detailed architecture, go here.


Implementation of FaceBoxes in TensorFlow: a deep learning model to detect faces in images, with code.


sheetala tiwari

I am passionate about Data Science and Machine Learning. I am currently building an AI community on DataDiscuss and we are committed to providing free access to education for everyone.