Some sample outputs from our implementation

EXAMPLES


BACKGROUND AND MOTIVATION

There are multiple ways to perform face and facial landmark detection. Recent studies show that deep learning approaches can solve the problem with impressive accuracy and robustness [1].

Robustness is a major requirement for facial algorithms to produce the correct output. Even when a landmark point is not visible, facial landmark detection algorithms typically estimate a position for it [2]. These estimates make it possible, for instance, to detect the rotation of the face. Although critical, this requirement poses a great challenge to such algorithms.

Both [1] and [5] use deep cascaded multi-task neural networks to efficiently exploit the correlation between face bounding boxes and landmark locations. For this reason, we aim to implement a robust algorithm based on these studies. An algorithm that is highly accurate and remains robust in the hardest scenarios (face covered, looking sideways, etc.) will allow applications to perform even better face recognition.

Real-world applications also require algorithms to run efficiently and quickly. A drawback of deep convolutional neural networks is their high computational cost, which makes such implementations slow, as shown by the results in [5]. Using Caffe [10], we can run the neural network models faster, particularly with GPUs, and even more so when training the models. To further improve the speed of the implementation, we use C++ as our programming language, which enables faster execution than high-level languages such as Matlab.

Basics

SETTING UP THE ENVIRONMENT

The test system runs Ubuntu 16.04 with gcc 5.4.0. These versions matter because some incompatibility issues must be addressed to make Caffe and the rest of the applications work correctly.

To use the Matlab implementation from [1], we first need to install Caffe [10] by following its installation steps, including all library dependencies. If you have an NVIDIA GPU, you can also follow the steps to set up the system to use the GPU for all computations (including the cuDNN libraries). When you compile Caffe, some issues might appear due to hdf5 naming, because the default path and name of the hdf5 header files and libraries changed after Ubuntu 15.10.

To fix this problem, modify the build files: append /usr/include/hdf5/serial/ to INCLUDE_DIRS at line 90 of Makefile.config, and change hdf5_hl and hdf5 to hdf5_serial_hl and hdf5_serial at line 181 of Makefile. Once the build is done, we need to verify that matCaffe works correctly by building it with make and then running the tests. Because the g++ version on my system was more recent than the one Matlab used for its own libraries, I received error messages and had to apply some manual fixes. The following commands were executed to fix the problems:

Besides pointing Matlab to the system libraries, we need to make sure libstdc++ is installed and set an environment variable pointing to the shared object for this library, as shown by the following commands:

Now we are ready to compile matCaffe and run its tests:

export glnxa=/usr/local/MATLAB/R2016b/bin/glnxa64
export x86lib=/usr/lib/x86_64-linux-gnu

sudo mv $glnxa/libopencv_imgproc.so.2.4 $glnxa/libopencv_imgproc.so.2.4.old
sudo cp $x86lib/libopencv_imgproc.so.2.4 $glnxa/

sudo mv $glnxa/libopencv_core.so.2.4 $glnxa/libopencv_core.so.2.4.old
sudo cp $x86lib/libopencv_core.so.2.4 $glnxa/

sudo mv $glnxa/libopencv_highgui.so.2.4 $glnxa/libopencv_highgui.so.2.4.old
sudo cp $x86lib/libopencv_highgui.so.2.4 $glnxa/

C++ IMPLEMENTATION

The objective of this work was to implement the Matlab code from [1] in C++ with Caffe. This implementation can use the authors' trained Caffe models, which lets us avoid retraining the model; they trained it exhaustively on three tasks: face/non-face classification, bounding box regression, and facial landmark localization.

Their algorithm can be divided into 5 steps [1]:
1) PNET stage (Deep Convolutional Neural Network, DCNN)
2) NMS (Non-maximum suppression and bounding box regression)
3) RNET stage (DCNN)
4) NMS (Non-maximum suppression and bounding box regression)
5) ONET stage (DCNN)

Each DCNN stage is modeled with a Caffe prototype. The PNET stage (Proposal stage) receives as input one of 12 scaled images based on the original picture and, for each scaled image, outputs multiple candidate bounding boxes for the faces. The second step takes these bounding boxes and merges the highly overlapped candidates by creating a "heat map" with the highest potential candidates.
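The scaled inputs of the proposal stage can be sketched as an image pyramid. The snippet below is a minimal illustration and not the code of our Detector class: the function name pyramidScales and the default values for the minimum face size, the shrink factor, and the 12-pixel network window are assumptions chosen for the example. The number of levels it produces depends on the image size and on these parameters.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Returns the scale factors of an image pyramid for a proposal network that
// looks at fixed-size windows. All default values are assumptions made for
// this illustration, not values taken from the poster.
std::vector<double> pyramidScales(int width, int height,
                                  int minFaceSize = 20,   // assumed smallest face, in pixels
                                  double factor = 0.709,  // assumed shrink ratio between levels
                                  int cellSize = 12) {    // assumed network window size
    assert(width > 0 && height > 0 && minFaceSize > 0 && cellSize > 0);
    assert(factor > 0.0 && factor < 1.0);

    std::vector<double> scales;
    // The first scale maps a face of minFaceSize pixels onto one window.
    double scale = static_cast<double>(cellSize) / minFaceSize;
    double minSide = std::min(width, height) * scale;

    // Keep shrinking until the shorter image side no longer fits one window.
    while (minSide >= cellSize) {
        scales.push_back(scale);
        scale *= factor;
        minSide *= factor;
    }
    return scales;
}
```

Each returned factor would be used to resize the original picture before feeding it to the proposal network, so larger faces are found at the smaller scales.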

The RNET stage (Refine Network) further rejects false candidates and decreases the number of possible bounding boxes. The fourth step runs NMS on the output again to further merge overlapped candidates. The ONET (Output Network) provides the final bounding boxes and the five facial landmark positions for each face.
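The NMS steps can be illustrated with the standard greedy formulation of non-maximum suppression. This is a minimal sketch under our own assumptions: the Box struct, the function names, and the use of intersection-over-union as the overlap measure are hypothetical choices for the example and may differ from the variant used in [1] or in our code.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical candidate box for this example.
struct Box {
    float x1, y1, x2, y2;  // corners, with x2 >= x1 and y2 >= y1
    float score;           // face/non-face confidence
};

// Intersection-over-union of two axis-aligned boxes.
float iou(const Box& a, const Box& b) {
    const float w = std::max(0.0f, std::min(a.x2, b.x2) - std::max(a.x1, b.x1));
    const float h = std::max(0.0f, std::min(a.y2, b.y2) - std::max(a.y1, b.y1));
    const float inter = w * h;
    const float areaA = (a.x2 - a.x1) * (a.y2 - a.y1);
    const float areaB = (b.x2 - b.x1) * (b.y2 - b.y1);
    const float uni = areaA + areaB - inter;
    return uni > 0.0f ? inter / uni : 0.0f;
}

// Greedy NMS: visit boxes from highest to lowest score and keep a box only
// if it does not overlap an already kept box by more than iouThreshold.
std::vector<Box> nonMaxSuppression(std::vector<Box> boxes, float iouThreshold) {
    assert(iouThreshold >= 0.0f && iouThreshold <= 1.0f);
    std::sort(boxes.begin(), boxes.end(),
              [](const Box& a, const Box& b) { return a.score > b.score; });

    std::vector<Box> kept;
    for (const Box& candidate : boxes) {
        bool suppressed = false;
        for (const Box& k : kept) {
            if (iou(candidate, k) > iouThreshold) {
                suppressed = true;
                break;
            }
        }
        if (!suppressed) kept.push_back(candidate);
    }
    return kept;
}
```

The threshold trades recall for duplicates: a lower value removes more overlapping candidates, which reduces the work of the next network but can merge faces that are close together.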

To implement this in C++, a "Detector" class was created with the functions needed to feed data into the different Caffe models and correctly interpret their output, as described in the C++ Object Tree section of the final report.

The performance of this implementation is still not acceptable, and further debugging is required to understand why it is slow. Despite using a GPU, the speedup over the Matlab implementation is only 2x, whereas we expected a much higher one.

For future work, I would continue with the implementation of the DCNNs described in [5], as shown in figure 3. This implementation is even more complex because no Caffe model is available for these CNNs. The information provided in the paper might not be enough to successfully recreate and train a model in the time remaining in the semester.

The approach in [5] is more robust because it uses three levels of neural networks. The first level is a DCNN that correlates the bounding box with the landmark positions. The next two levels fine-tune the detected positions with small CNNs applied to each landmark individually; these CNNs are not allowed to make significant changes to the previously calculated positions. These three stages give the algorithm great robustness at the cost of execution time.

For the complete code, please visit the GitHub page:

https://github.com/jgutierrezm113/Face-Detection

MATLAB HACK

Our implementation actually works as it should (after lots of testing and debugging). The reason it was not producing correct results is that the image processing library we use differs from the one used in the Matlab implementation.

When the bounding boxes are scaled to fit the input size of the trained Caffe models, the resulting input is slightly different, which affects the overall scores and other data.
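To illustrate how two resize routines can disagree on the same input, the sketch below resamples a one-dimensional signal with linear interpolation under two common coordinate conventions. It is a hypothetical example: the names Mapping and resizeLinear are ours, and we do not claim that this particular convention difference is the exact cause of the mismatch between our image library and Matlab.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Two common ways to map an output sample index to an input position.
enum class Mapping { HalfPixelCenters, AlignCorners };

// Resamples src to outLen samples using linear interpolation.
std::vector<double> resizeLinear(const std::vector<double>& src,
                                 std::size_t outLen, Mapping mapping) {
    assert(src.size() >= 2 && outLen >= 2);
    const double inLen = static_cast<double>(src.size());
    std::vector<double> dst(outLen);

    for (std::size_t i = 0; i < outLen; ++i) {
        double pos;
        if (mapping == Mapping::HalfPixelCenters) {
            // Samples sit at the centers of their pixels.
            pos = (i + 0.5) * inLen / outLen - 0.5;
        } else {
            // First and last samples of input and output coincide.
            pos = i * (inLen - 1.0) / (outLen - 1.0);
        }
        // Clamp to the valid range, then blend the two nearest samples.
        pos = std::min(std::max(pos, 0.0), inLen - 1.0);
        const std::size_t lo = static_cast<std::size_t>(std::floor(pos));
        const std::size_t hi = std::min(lo + 1, src.size() - 1);
        const double t = pos - static_cast<double>(lo);
        dst[i] = (1.0 - t) * src[lo] + t * src[hi];
    }
    return dst;
}
```

Shrinking the signal {0, 10, 20, 30} to two samples gives {5, 25} with half-pixel centers and {0, 30} with aligned corners. Differences of this kind in the pixels fed to a network are enough to shift its scores.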

To fix this issue, I hardcoded a fix in the last model: it calls the Matlab engine to create the input data from the values obtained from the previous model, and then passes this data to the C++ code to run the model.

This fix makes the output of the program much more accurate, but it takes a lot longer because it has to open Matlab to execute the pre-processing there.

I decided not to apply the fix to PNET or RNET because the output from those nets is “good enough”.

C++ Object Tree

Build

CPU: 6700K OC @ 4.5 GHz

GPU: GTX1080

RAM: 16 GB OC @ 3.2 GHz

SSD: 240 GB

Robust Facial Landmark Detection Tracking in a Deep Learning Framework

Julian Gutierrez

Final Project

ECE7398 - Human Centered Computing

BASICS

BASELINE

C++

FINAL OBJECTIVE: GPU

The idea is to eventually port this code to the GPU using NVIDIA's CUDA framework to achieve even higher performance.

Using C++, we can achieve higher performance than Matlab implementations.

We also use the OpenCV framework to manipulate the images and Caffe as our deep learning framework.

In this project, we explore the capabilities of deep neural networks for accurate and robust face and facial landmark detection by implementing a state-of-the-art algorithm [1] in C++, with Caffe as the neural network interface, in order to achieve higher throughput and speed in the computation.

Environment

CONCLUSIONS AND FUTURE WORK

This project became increasingly difficult for me. I had never worked with deep learning algorithms, let alone neural networks. Although I understand the underlying basics of how they work, learning how Caffe models them has proven to be a steep learning curve.

Considering all of the dependencies required to successfully run a program using Caffe, setting up the environment proved to be a challenge as well. The Caffe interface is also not a simple install-and-use tool; it requires in-depth knowledge of how it works to use it correctly (for example, to learn how to interpret the output data from a net).

I learned how face detection algorithms work, and in particular which approaches are commonly taken when using deep learning frameworks. I also learned a lot about Matlab and how it works (I had little knowledge before), in particular how the Matlab Engine/C++ interface works.

Future work will involve training the model with the new way of feeding in the data, which will include creating a wrapper around the training functionality; creating a wrapper to test the whole FDDB database to obtain accuracy results; improving CPU performance by improving the PNET scaling manipulation, the NMS approach, and other options; implementing a new layer to improve the accuracy of the facial features; and finally, porting the algorithm to run completely on a GPU to achieve higher performance for the whole algorithm.


sudo apt-get install lib64stdc++6
export LD_PRELOAD=$x86lib/libstdc++.so.6

make all matcaffe

make mattest

TESTING MATLAB IMPLEMENTATION

To run the MTCNN code, we also need to install Piotr’s Matlab toolbox [12]. Unzip the code and add it to the Matlab toolbox folder.

The final setup step is to copy the MTCNN code into a folder inside your Caffe directory and then launch Matlab from the Caffe main directory. You’ll have to run the following commands to set up Matlab correctly to use their implementation. When you open the Demo.m file and execute it, a prompt will appear asking to change the current working directory. Accept it.

addpath('./matlab')
addpath('/usr/local/MATLAB/R2016b/toolbox/ptoolbox')
addpath('/usr/local/MATLAB/R2016b/toolbox/ptoolbox/channels')

After doing so, we executed small tests to make sure the code was working correctly.