.. _alfacedetection:

ALFaceDetection
===============

.. toctree::
   :hidden:
   :maxdepth: 1

   alfacedetection-api
   alfacedetection-tuto

Overview | :ref:`API` | :ref:`Tutorials`

.. seealso::

   - :ref:`Video camera Hardware`

What it does
------------

**ALFaceDetection** is a vision module in which NAO tries to detect, and
optionally recognize, faces in front of him.

How it works
------------

**ALFaceDetection** is based on a face detection/recognition solution provided
by OKI, with an upper layer improving recognition results.

Face detection
++++++++++++++

Face detection detects faces and provides their position, as well as a list of
angular coordinates for important face features (eyes, eyebrows, nose, mouth).

Recognition
+++++++++++

To make NAO not only detect but also recognize people, a learning stage is
necessary. For further details, see the :ref:`alfacedetection-learning-stage`
section.

For every image, the recognition feature returns the names of the people that
are recognized.

**Temporal filter**: in addition, a temporally filtered output makes it easy
to build higher-level features on top of recognition. We do not want NAO to
say "Hello Michel" several times per second, so someone's name is output only
the first time he is recognized and is then placed in a short-term memory.
This memory is kept as long as some face is detected by NAO, whether or not it
is recognized. As soon as more than 4 seconds elapse without any face being
detected, the short-term memory is cleared and Michel's name will be output
again if NAO encounters him. This is the output used by the Choregraphe
**Face Reco** box.

FaceDetected ALMemory key
+++++++++++++++++++++++++

Once ALFaceDetection is started, the results are written to a variable in
ALMemory called *"FaceDetected"*, organized as follows:

.. code-block:: guess

   FaceDetected =
   [
     TimeStamp,
     [ FaceInfo[N], Time_Filtered_Reco_Info ],
     Camera_Position_Info
   ]

**TimeStamp**: this field is the time stamp of the image that was used to
perform the detection.

.. code-block:: guess

   TimeStamp = [ TimeStamp_Seconds, Timestamp_Microseconds ]

**FaceInfo**: for each detected face, we have one FaceInfo field.

.. code-block:: guess

   FaceInfo = [ ShapeInfo, ExtraInfo[N] ]

*ShapeInfo*: shape information about a face.

.. code-block:: guess

   ShapeInfo = [ 0, alpha, beta, sizeX, sizeY ]

- **alpha** and **beta** represent the face's location in terms of camera angles
- **sizeX** and **sizeY** are the face's size in camera angles

*ExtraInfo*: additional information about a face.

.. code-block:: guess

   ExtraInfo = [ faceID, scoreReco, faceLabel,
                 leftEyePoints, rightEyePoints,
                 leftEyebrowPoints, rightEyebrowPoints,
                 nosePoints, mouthPoints ]

- **faceID** is the ID number of the face
- **scoreReco** is the score returned by the recognition process (the higher, the better)
- **faceLabel** is the name of the face, if it has been recognized
- **leftEyePoints** and **rightEyePoints** provide the positions of interesting
  points of the eyes (given in camera angles)

  .. code-block:: guess

     EyePoints = [ eyeCenter_x, eyeCenter_y,
                   noseSideLimit_x, noseSideLimit_y,
                   earSideLimit_x, earSideLimit_y,
                   topLimit_x, topLimit_y,
                   bottomLimit_x, bottomLimit_y,
                   midTopEarLimit_x, midTopEarLimit_y,
                   midTopNoseLimit_x, midTopNoseLimit_y ]

- **leftEyebrowPoints** and **rightEyebrowPoints** provide the positions of
  interesting points of the eyebrows (given in camera angles)

  .. code-block:: guess

     EyebrowPoints = [ noseSideLimit_x, noseSideLimit_y,
                       center_x, center_y,
                       earSideLimit_x, earSideLimit_y ]

- **nosePoints** provides the positions of interesting points of the nose
  (given in camera angles)

  .. code-block:: guess

     NosePoints = [ bottomCenterLimit_x, bottomCenterLimit_y,
                    bottomLeftLimit_x, bottomLeftLimit_y,
                    bottomRightLimit_x, bottomRightLimit_y ]

- **mouthPoints** provides the positions of interesting points of the mouth
  (given in camera angles)

  .. code-block:: guess

     MouthPoints = [ leftLimit_x, leftLimit_y,
                     rightLimit_x, rightLimit_y,
                     topLimit_x, topLimit_y,
                     bottomLimit_x, bottomLimit_y,
                     midTopLeftLimit_x, midTopLeftLimit_y,
                     midTopRightLimit_x, midTopRightLimit_y,
                     midBottomRightLimit_x, midBottomRightLimit_y,
                     midBottomLeftLimit_x, midBottomLeftLimit_y ]

**Time_Filtered_Reco_Info** can be equal to:

- [] if there is nothing new
- [ 2, [ faceLabel ] ] if one face has been recognized
- [ 3, [ faceLabel0, ..., faceLabelP ] ] if several faces have been recognized
- [ 4 ] if a face has been detected for more than 8 seconds without being
  recognized. Getting this result is a suggestion to learn this face if
  desired, but keep in mind that recognition only works for faces looking
  towards NAO.

**Camera_Position_Info**: the Position6D of the camera at the time the image
was taken, given in NAO space.
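The sketch below is a minimal, hypothetical example of reading this key from a
remote Python script: it subscribes to **ALFaceDetection**, polls the
*"FaceDetected"* variable in ALMemory and prints the face positions and the
time-filtered recognition result, following the layout documented above. The
robot IP, the subscriber name and the refresh period are placeholders, and the
exact indexing may need adjustment depending on your NAOqi version.

.. code-block:: python

   # Hypothetical sketch: poll the "FaceDetected" key described above.
   # <robot_ip>, the subscriber name and the period are placeholders.
   import time
   from naoqi import ALProxy

   ROBOT_IP = "<robot_ip>"
   faceProxy = ALProxy("ALFaceDetection", ROBOT_IP, 9559)
   memoryProxy = ALProxy("ALMemory", ROBOT_IP, 9559)

   # Start the extractor; here a new image is processed every 500 ms.
   faceProxy.subscribe("Test_Face", 500, 0.0)
   try:
       for _ in range(20):
           time.sleep(0.5)
           val = memoryProxy.getData("FaceDetected")
           if not (val and isinstance(val, list) and len(val) >= 2):
               continue  # no face in the current image
           # val = [ TimeStamp,
           #         [ FaceInfo_0, ..., FaceInfo_N-1, Time_Filtered_Reco_Info ],
           #         Camera_Position_Info ]
           faceInfoArray = val[1]
           for faceInfo in faceInfoArray[:-1]:
               shapeInfo = faceInfo[0]  # [ 0, alpha, beta, sizeX, sizeY ]
               print("face at alpha=%.3f beta=%.3f size=(%.3f, %.3f)"
                     % (shapeInfo[1], shapeInfo[2], shapeInfo[3], shapeInfo[4]))
           recoInfo = faceInfoArray[-1]  # Time_Filtered_Reco_Info
           if recoInfo and recoInfo[0] in (2, 3):
               print("recognized: %s" % str(recoInfo[1]))
           elif recoInfo and recoInfo[0] == 4:
               print("unknown face seen for more than 8 s; consider learning it")
   finally:
       faceProxy.unsubscribe("Test_Face")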
Performances and Limitations
----------------------------

Detection
+++++++++

**Performances**

* **Size range** for the detected faces:

  - Minimum: ~45 pixels in a QVGA image. For an adult, this corresponds to
    around 3m with v3.x VGA cameras and more than 2m with v4 HD cameras.
  - Maximum: ~160 pixels in a QVGA image.

* **Tilt**: +/- 20 deg (0 deg corresponding to a face facing the camera)
* **Rotation** in image plane: +/- 20 deg

**Limitations**

* **Lighting**: face detection has been tested under office lighting
  conditions, i.e. from 100 to 500 lux. If you feel that the detection is not
  running well, try activating the camera auto gain - via the Monitor
  interface - or try adjusting the camera contrast manually.

Recognition
+++++++++++

**Performances**

When learning someone's face, the subject is supposed to face the camera and
keep a neutral expression, because a neutral expression lies between sadness
and happiness; otherwise, it would be harder to recognize someone looking sad
if he was smiling during the learning process.

In order to get a more robust output, NAO first checks that he recognizes the
same person in 2 consecutive images from the camera before outputting the
name.

Sometimes, depending on a change of location or haircut, a known face can be
difficult to recognize. To improve robustness, a reinforcement process has
been added: if someone is not recognized, or is mistaken for someone else,
just learn him again, and this new learning will be added to that person's
database. After some days, you should get more reliable recognitions. A
minimal sketch of this process is given at the end of this section.

**Limitations**

Recognition is less robust than detection regarding pan, tilt, rotation and
maximum distance. The reason is that the recognition algorithm does not have a
3D representation of the person to recognize, and relies on information such
as distances between keypoints (in a way, it partially works like an identikit
would). If the head is turned, these distance ratios change.

Learning
++++++++

**Performances**

The learning stage is an intelligent process in which NAO checks that the face
is correctly exposed (e.g. no backlighting, no partial shadows) in 3
consecutive images.

**Limitations**

The learning stage can only be achieved with one face in the field of view at
a time.
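As mentioned in the Recognition section above, the reinforcement process
simply consists in learning an already known person again. The following
hypothetical sketch does this from a Python script using the learnFace bound
method; the robot IP and the person's name are placeholders, and the person
should be alone in front of the camera, as for the initial learning.

.. code-block:: python

   # Hypothetical sketch of the reinforcement process: learning an already
   # known person again adds the new samples to his entry in the database.
   from naoqi import ALProxy

   ROBOT_IP = "<robot_ip>"  # placeholder, adapt to your robot
   faceProxy = ALProxy("ALFaceDetection", ROBOT_IP, 9559)

   def reinforce(person_name):
       # learnFace is assumed here to return True when learning succeeded.
       if faceProxy.learnFace(person_name):
           print("New reference added for %s" % person_name)
       else:
           print("Learning failed; retry closer to NAO, facing the camera")

   reinforce("Michel")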
Getting Started
---------------

Detection
+++++++++

To get a feel for what ALFaceDetection can do, you can use Monitor and launch
its vision plugin. Activate the face detection checkbox and start the camera
acquisition. Then, if you present your face to the camera - or show a picture
with a face on it - Monitor should report the detected faces with blue
crosses.

.. image:: /medias/dev/modules/vision/face_detection_telepathe.png

Another way to use face detection is to launch the Choregraphe **Walk
Tracker** or **WB Tracker** box and switch its default value from **Red Ball**
to **Face**. Doing so, you can ask NAO to move toward a person in order to
always keep the face in the middle of his field of view.

.. _alfacedetection-learning-stage:

Learning stage for recognition
++++++++++++++++++++++++++++++

The **learning stage** can be done via the learnFace bound method of the API
or through the user-friendly interface of the Choregraphe **Learn Face** box
(a rough scripted equivalent is sketched at the end of this section).

* Once you have clicked on the box and entered the name of the person, this
  person has 5 seconds to place his face in front of NAO.
* Then the learning process is launched, during which NAO's eyes turn blue.
* His eyes turn green in less than a second if the face is seen by NAO in
  correct conditions (e.g. no partial shadow on the face, no backlight, person
  not too far away).
* If the eyes are still blue after a few seconds, the person should move in
  order to change the learning conditions.

.. note::

   The algorithm requires better conditions for the learning stage than those
   needed for detection.

.. note::

   You can launch the `WB Tracker` box in parallel with the learning stage so
   that the face to learn is always in the middle of NAO's field of view.
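As a rough scripted equivalent of the **Learn Face** box workflow described
above, the hypothetical sketch below announces the instruction, leaves the
person a few seconds to face the camera, then retries learnFace until the
learning conditions are good enough (the equivalent of the eyes staying blue
until they turn green). The robot IP, the spoken sentences and the timings are
placeholders.

.. code-block:: python

   # Hypothetical sketch of a scripted learning stage, using the learnFace
   # bound method of the API. IP, names and timings are placeholders.
   import time
   from naoqi import ALProxy

   ROBOT_IP = "<robot_ip>"
   tts = ALProxy("ALTextToSpeech", ROBOT_IP, 9559)
   faceProxy = ALProxy("ALFaceDetection", ROBOT_IP, 9559)

   def learn_new_face(person_name, max_attempts=10):
       tts.say("Please stand in front of me and look at my camera.")
       time.sleep(5.0)  # leave the person time to place his face in front of NAO
       for _ in range(max_attempts):
           # learnFace is assumed to return True once the face has been seen
           # in good enough conditions (no backlight, no partial shadow, ...).
           if faceProxy.learnFace(person_name):
               tts.say("Thank you, I will recognize you from now on.")
               return True
           time.sleep(1.0)  # let the person move slightly, then try again
       tts.say("Sorry, I could not learn your face.")
       return False

   learn_new_face("Michel")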