.. _alfacedetection:

ALFaceDetection
===============

.. toctree::
   :hidden:
   :maxdepth: 1

   alfacedetection-api
   alfacedetection-tuto

Overview | :ref:`API` | :ref:`Tutorials`

.. seealso::

   - :ref:`Video camera Hardware`

What it does
------------

**ALFaceDetection** is a vision module in which NAO tries to detect, and
optionally recognize, faces in front of him.

How it works
------------

**ALFaceDetection** is based on a face detection/recognition solution provided
by OKI, with an upper layer improving recognition results.

Face detection
++++++++++++++

Face detection detects faces and provides their position, as well as a list of
angular coordinates for important face features (eyes, eyebrows, nose, mouth).

Recognition
+++++++++++

To make NAO not only detect but also recognize people, a learning stage is
necessary. For further details, see the :ref:`alfacedetection-learning-stage`
section.

For every image, the recognition feature returns the names of the people that
are recognized.

**Temporal filter**: in addition, a temporally filtered output makes it easy
to build higher-level features on top of recognition. We do not want NAO to
say "Hello Michel" several times per second, so someone's name is output only
the first time he is recognized and is then placed in a short-term memory.
This memory is kept as long as some face is detected by NAO, whether or not it
is recognized. As soon as more than 4 seconds elapse without any face being
detected, the short-term memory is cleared and Michel's name will be output
again if NAO encounters him. This is the output used by the Choregraphe
**Face Reco** box.

FaceDetected ALMemory key
+++++++++++++++++++++++++

Once ALFaceDetection is started, the results are written to a variable in
ALMemory called *"FaceDetected"*, organized as follows:

.. code-block:: guess

   FaceDetected =
   [
     TimeStamp,
     [ FaceInfo[N], Time_Filtered_Reco_Info ],
     Camera_Position_Info
   ]

**TimeStamp**: this field is the time stamp of the image that was used to
perform the detection.

.. code-block:: guess

   TimeStamp = [ TimeStamp_Seconds, Timestamp_Microseconds ]

**FaceInfo**: for each detected face, we have one FaceInfo field.

.. code-block:: guess

   FaceInfo = [ ShapeInfo, ExtraInfo[N] ]

*ShapeInfo*: shape information about a face.

.. code-block:: guess

   ShapeInfo = [ 0, alpha, beta, sizeX, sizeY ]

- **alpha** and **beta** represent the face's location in terms of camera angles
- **sizeX** and **sizeY** are the face's size in camera angles

*ExtraInfo*: additional information about a face.

.. code-block:: guess

   ExtraInfo = [ faceID, scoreReco, faceLabel,
                 leftEyePoints, rightEyePoints,
                 leftEyebrowPoints, rightEyebrowPoints,
                 nosePoints, mouthPoints ]

- **faceID** is the ID number of the face
- **scoreReco** is the score returned by the recognition process (the higher, the better)
- **faceLabel** is the name of the face, if it has been recognized
- **leftEyePoints** and **rightEyePoints** provide the positions of interesting
  points of the eyes (given in camera angles)

  .. code-block:: guess

     EyePoints = [ eyeCenter_x, eyeCenter_y,
                   noseSideLimit_x, noseSideLimit_y,
                   earSideLimit_x, earSideLimit_y,
                   topLimit_x, topLimit_y,
                   bottomLimit_x, bottomLimit_y,
                   midTopEarLimit_x, midTopEarLimit_y,
                   midTopNoseLimit_x, midTopNoseLimit_y ]

- **leftEyebrowPoints** and **rightEyebrowPoints** provide the positions of
  interesting points of the eyebrows (given in camera angles)

  .. code-block:: guess

     EyebrowPoints = [ noseSideLimit_x, noseSideLimit_y,
                       center_x, center_y,
                       earSideLimit_x, earSideLimit_y ]

- **nosePoints** provides the positions of interesting points of the nose
  (given in camera angles)

  .. code-block:: guess

     NosePoints = [ bottomCenterLimit_x, bottomCenterLimit_y,
                    bottomLeftLimit_x, bottomLeftLimit_y,
                    bottomRightLimit_x, bottomRightLimit_y ]

- **mouthPoints** provides the positions of interesting points of the mouth
  (given in camera angles)

  .. code-block:: guess

     MouthPoints = [ leftLimit_x, leftLimit_y,
                     rightLimit_x, rightLimit_y,
                     topLimit_x, topLimit_y,
                     bottomLimit_x, bottomLimit_y,
                     midTopLeftLimit_x, midTopLeftLimit_y,
                     midTopRightLimit_x, midTopRightLimit_y,
                     midBottomRightLimit_x, midBottomRightLimit_y,
                     midBottomLeftLimit_x, midBottomLeftLimit_y ]

**Time_Filtered_Reco_Info** can be equal to:

- [] if there is nothing new
- [ 2, [ faceLabel ] ] if one face has been recognized
- [ 3, [ faceLabel0, ..., faceLabelP ] ] if several faces have been recognized
- [ 4 ] if a face has been detected for more than 8 seconds without being
  recognized. Getting this result is a suggestion to learn this face if
  desired, but keep in mind that recognition only works for faces looking
  towards NAO.

**Camera_Position_Info**: the Position6D of the camera at the time the image
was taken, given in NAO space.
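The sketch below is a minimal, hypothetical example of reading this key from a
remote Python script: it subscribes to **ALFaceDetection**, polls the
*"FaceDetected"* variable in ALMemory and prints the face positions and the
time-filtered recognition result, following the layout documented above. The
robot IP, the subscriber name and the refresh period are placeholders, and the
exact indexing may need adjustment depending on your NAOqi version.

.. code-block:: python

   # Hypothetical sketch: poll the "FaceDetected" key described above.
   # <robot_ip>, the subscriber name and the period are placeholders.
   import time
   from naoqi import ALProxy

   ROBOT_IP = "<robot_ip>"
   faceProxy = ALProxy("ALFaceDetection", ROBOT_IP, 9559)
   memoryProxy = ALProxy("ALMemory", ROBOT_IP, 9559)

   # Start the extractor; here a new image is processed every 500 ms.
   faceProxy.subscribe("Test_Face", 500, 0.0)
   try:
       for _ in range(20):
           time.sleep(0.5)
           val = memoryProxy.getData("FaceDetected")
           if not (val and isinstance(val, list) and len(val) >= 2):
               continue  # no face in the current image
           # val = [ TimeStamp,
           #         [ FaceInfo_0, ..., FaceInfo_N-1, Time_Filtered_Reco_Info ],
           #         Camera_Position_Info ]
           faceInfoArray = val[1]
           for faceInfo in faceInfoArray[:-1]:
               shapeInfo = faceInfo[0]  # [ 0, alpha, beta, sizeX, sizeY ]
               print("face at alpha=%.3f beta=%.3f size=(%.3f, %.3f)"
                     % (shapeInfo[1], shapeInfo[2], shapeInfo[3], shapeInfo[4]))
           recoInfo = faceInfoArray[-1]  # Time_Filtered_Reco_Info
           if recoInfo and recoInfo[0] in (2, 3):
               print("recognized: %s" % str(recoInfo[1]))
           elif recoInfo and recoInfo[0] == 4:
               print("unknown face seen for more than 8 s; consider learning it")
   finally:
       faceProxy.unsubscribe("Test_Face")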
Performances and Limitations
----------------------------

Detection
+++++++++

**Performances**

* **Size range** for the detected faces:

  - Minimum: ~45 pixels in a QVGA image. For an adult, this corresponds to
    around 3m with v3.x VGA cameras and more than 2m with v4 HD cameras.
  - Maximum: ~160 pixels in a QVGA image.

* **Tilt**: +/- 20 deg (0 deg corresponding to a face facing the camera)
* **Rotation** in image plane: +/- 20 deg

**Limitations**

* **Lighting**: face detection has been tested under office lighting
  conditions, i.e. from 100 to 500 lux. If you feel that the detection is not
  running well, try activating the camera auto gain - via the Monitor
  interface - or try adjusting the camera contrast manually.

Recognition
+++++++++++

**Performances**

When learning someone's face, the subject is supposed to face the camera and
keep a neutral expression, because a neutral expression lies between sadness
and happiness; otherwise, it would be harder to recognize someone looking sad
if he was smiling during the learning process.

In order to get a more robust output, NAO first checks that he recognizes the
same person in 2 consecutive images from the camera before outputting the
name.

Sometimes, depending on a change of location or haircut, a known face can be
difficult to recognize. To improve robustness, a reinforcement process has
been added: if someone is not recognized, or is mistaken for someone else,
just learn him again, and this new learning will be added to that person's
database. After some days, you should get more reliable recognitions. A
minimal sketch of this process is given at the end of this section.

**Limitations**

Recognition is less robust than detection regarding pan, tilt, rotation and
maximum distance. The reason is that the recognition algorithm does not have a
3D representation of the person to recognize, and relies on information such
as distances between keypoints (in a way, it partially works like an identikit
would). If the head is turned, these distance ratios change.

Learning
++++++++

**Performances**

The learning stage is an intelligent process in which NAO checks that the face
is correctly exposed (e.g. no backlighting, no partial shadows) in 3
consecutive images.

**Limitations**

The learning stage can only be achieved with one face in the field of view at
a time.
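As mentioned in the Recognition section above, the reinforcement process
simply consists in learning an already known person again. The following
hypothetical sketch does this from a Python script using the learnFace bound
method; the robot IP and the person's name are placeholders, and the person
should be alone in front of the camera, as for the initial learning.

.. code-block:: python

   # Hypothetical sketch of the reinforcement process: learning an already
   # known person again adds the new samples to his entry in the database.
   from naoqi import ALProxy

   ROBOT_IP = "<robot_ip>"  # placeholder, adapt to your robot
   faceProxy = ALProxy("ALFaceDetection", ROBOT_IP, 9559)

   def reinforce(person_name):
       # learnFace is assumed here to return True when learning succeeded.
       if faceProxy.learnFace(person_name):
           print("New reference added for %s" % person_name)
       else:
           print("Learning failed; retry closer to NAO, facing the camera")

   reinforce("Michel")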
Getting Started
---------------

Detection
+++++++++

To get a feel for what ALFaceDetection can do, you can use Monitor and launch
its vision plugin. Activate the face detection checkbox and start the camera
acquisition. Then, if you present your face to the camera - or show a picture
with a face on it - Monitor should report the detected faces with blue
crosses.

.. image:: /medias/dev/modules/vision/face_detection_telepathe.png

Another way to use face detection is to launch the Choregraphe **Walk
Tracker** or **WB Tracker** box and switch its default value from **Red Ball**
to **Face**. Doing so, you can ask NAO to move toward a person in order to
always keep the face in the middle of his field of view.

.. _alfacedetection-learning-stage:

Learning stage for recognition
++++++++++++++++++++++++++++++

The **learning stage** can be done via the learnFace bound method of the API
or through the user-friendly interface of the Choregraphe **Learn Face** box
(a rough scripted equivalent is sketched at the end of this section).

* Once you have clicked on the box and entered the name of the person, this
  person has 5 seconds to place his face in front of NAO.
* Then the learning process is launched, during which NAO's eyes turn blue.
* His eyes turn green in less than a second if the face is seen by NAO in
  correct conditions (e.g. no partial shadow on the face, no backlight, person
  not too far away).
* If the eyes are still blue after a few seconds, the person should move in
  order to change the learning conditions.

.. note::

   The algorithm requires better conditions for the learning stage than those
   needed for detection.

.. note::

   You can launch the `WB Tracker` box in parallel with the learning stage so
   that the face to learn is always in the middle of NAO's field of view.
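As a rough scripted equivalent of the **Learn Face** box workflow described
above, the hypothetical sketch below announces the instruction, leaves the
person a few seconds to face the camera, then retries learnFace until the
learning conditions are good enough (the equivalent of the eyes staying blue
until they turn green). The robot IP, the spoken sentences and the timings are
placeholders.

.. code-block:: python

   # Hypothetical sketch of a scripted learning stage, using the learnFace
   # bound method of the API. IP, names and timings are placeholders.
   import time
   from naoqi import ALProxy

   ROBOT_IP = "<robot_ip>"
   tts = ALProxy("ALTextToSpeech", ROBOT_IP, 9559)
   faceProxy = ALProxy("ALFaceDetection", ROBOT_IP, 9559)

   def learn_new_face(person_name, max_attempts=10):
       tts.say("Please stand in front of me and look at my camera.")
       time.sleep(5.0)  # leave the person time to place his face in front of NAO
       for _ in range(max_attempts):
           # learnFace is assumed to return True once the face has been seen
           # in good enough conditions (no backlight, no partial shadow, ...).
           if faceProxy.learnFace(person_name):
               tts.say("Thank you, I will recognize you from now on.")
               return True
           time.sleep(1.0)  # let the person move slightly, then try again
       tts.say("Sorry, I could not learn your face.")
       return False

   learn_new_face("Michel")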