Community Research and Development Information Service - CORDIS


StillNoFace Report Summary

Project ID: 656094
Funded under: H2020-EU.1.3.2.

Periodic Reporting for period 1 - StillNoFace (Identity matching from still images without face information)

Reporting period: 2015-09-01 to 2016-08-31

Summary of the context and overall objectives of the project

Problem statement
In computer vision, human identity matching from images and/or video has been an active research topic for more than two decades, and its popularity is growing as computing power increases. Integrating soft biometrics such as gender, height, weight, age, and ethnicity into a primary biometric system (e.g., face) has been studied. In the majority of existing methods, human classification assisted by soft biometrics has been approached using facial information. However, in real-life scenarios, such information might not be available (e.g., the face might be covered or occluded). This has led to methods that use information from the human body to perform human identification and tracking based on soft biometrics.

In this research project, we propose methods for predicting a person’s identity from images without facial information, based on soft biometrics. Given as input a still image or a video showing only the body of an individual in the wild, the overall objectives of the project are to:
• Estimate the gender of the individual
• Estimate soft biometrics of the individual, such as his/her age and height

Benefit for society
A major application of the outcome of the project is the automated recognition of individuals from images captured by standard cameras, in order to allow them to enter their house or office or to control a car. Moreover, a prominent category of applications involves security and safety in public and private spaces (e.g., airports, train stations, concert halls). In these places, surveillance cameras generally do not provide facial information, as the individual’s image may be acquired from behind. Furthermore, even when it is available, the face may be captured at low resolution, which makes it difficult to extract information from it.

Work performed from the beginning of the project to the end of the period covered by the report and main results achieved so far

Different types of anthropometric measurements are available in different computer vision application scenarios, which makes the task difficult to address with a global training algorithm that relies on a predefined set of anthropometric measurements as features. Several approaches, such as multi-task learning and domain adaptation, have been proposed for dealing with multimodal problems. In this context, learning using privileged information (LUPI) [1] has been explored to cope with the inhomogeneity between training and testing information. The idea of privileged information is that one may have access to additional information about the training samples that is not available during testing. The LUPI framework resembles the way an educator teaches his/her students by providing additional knowledge, comments, explanations, or rewards in class, while the students are later forced to solve problems without access to this additional knowledge. In brief, the learning model places a nontrivial teacher in the training process, who supplies the training set with additional information (i.e., features) that is not available for test examples. However, deciding which information should be considered privileged and which regular is not straightforward, and a lack of informative data or the presence of misleading information may bias the model and degrade its performance.
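The teacher/student idea described above can be illustrated with a minimal sketch. It is not the SVM+ formulation of [1] nor the method developed in the project; it swaps in a simpler distillation-style scheme, in which a teacher trained on privileged features produces soft labels that guide a student restricted to regular features. The toy data, the function names, and the `imitation` weight are all assumptions made for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def add_bias(X):
    # Append a constant column so the model learns an intercept.
    return np.hstack([X, np.ones((X.shape[0], 1))])

def fit_logistic(X, targets, lr=0.5, steps=2000):
    # Gradient descent on cross-entropy; targets may be hard (0/1) or soft.
    Xb = add_bias(X)
    w = np.zeros(Xb.shape[1])
    for _ in range(steps):
        w -= lr * Xb.T @ (sigmoid(Xb @ w) - targets) / len(targets)
    return w

def predict_proba(w, X):
    return sigmoid(add_bias(X) @ w)

def fit_student(X_regular, X_privileged, y, imitation=0.5):
    # The teacher sees privileged features, available only at training time.
    teacher = fit_logistic(X_privileged, y)
    soft_labels = predict_proba(teacher, X_privileged)
    # The student sees regular features only; its targets blend the true
    # labels with the teacher's soft labels (hypothetical weighting).
    blended = (1.0 - imitation) * y + imitation * soft_labels
    return fit_logistic(X_regular, blended)

def make_toy_data(n, rng):
    # Synthetic data: privileged features are a cleaner view of the label.
    y = rng.integers(0, 2, size=n).astype(float)
    signal = (2.0 * y - 1.0)[:, None]
    X_regular = signal + rng.normal(scale=1.0, size=(n, 2))
    X_privileged = signal + rng.normal(scale=0.3, size=(n, 2))
    return X_regular, X_privileged, y

rng = np.random.default_rng(0)
X_reg, X_priv, y_train = make_toy_data(400, rng)
student = fit_student(X_reg, X_priv, y_train)

# At test time only the regular features are used.
X_test, _, y_test = make_toy_data(200, rng)
accuracy = np.mean((predict_proba(student, X_test) > 0.5) == (y_test == 1.0))
print(f"student accuracy on regular features only: {accuracy:.2f}")
```

The privileged features never appear in the prediction step, which mirrors the constraint that such information is unavailable for test examples.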

First, we investigated the principle of privileged information and proposed a new machine learning method that couples privileged information with conditional random fields [P1]. Then, we proposed a novel method that performs gender (binary) classification from ratios of anthropometric measurements within the LUPI paradigm [P2]. According to Cao et al. [2], using the actual values of anthropometric measurements (e.g., limb lengths in mm) from an anthropometric database results in good gender classification accuracy. We argue, though, that such information cannot be accurately obtained by state-of-the-art computer vision algorithms without depth information (e.g., data obtained from a Kinect RGB-D sensor). To address this limitation, we proposed using ratios of anthropometric measurements, which alleviates errors in the estimation of the actual values, since a scale error common to two measurements cancels in their ratio. We divided these measurements into two groups. The first group contains only ratios of body measurements that can be captured by a regular surveillance camera and computed by state-of-the-art computer vision algorithms. This set contains only observable information (e.g., arm or leg lengths) and is available during both the training and the testing phases. The second group contains ratios of body measurements that are difficult to obtain with an automated acquisition system (e.g., circumferences of body parts), as well as a few measurements that correspond to the head (e.g., head breadth or face length). This type of information is considered privileged and is not available at test time.
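A small sketch can show why ratios are less sensitive to estimation errors than absolute values, and how the two feature groups might be kept apart. The measurement names, the values in mm, and the choice of forming all pairwise ratios within each group are assumptions for illustration; the report does not specify which ratios were used.

```python
import math
from itertools import combinations

# Hypothetical measurements in mm, split into the two groups described above.
regular_mm = {"arm_length": 600.0, "leg_length": 900.0, "shoulder_width": 400.0}
privileged_mm = {"waist_circumference": 800.0, "chest_circumference": 950.0,
                 "head_breadth": 150.0}

def pairwise_ratios(measurements):
    # One ratio per unordered pair of measurements, keyed as "a/b".
    names = sorted(measurements)
    return {f"{a}/{b}": measurements[a] / measurements[b]
            for a, b in combinations(names, 2)}

regular_ratios = pairwise_ratios(regular_mm)        # training and testing
privileged_ratios = pairwise_ratios(privileged_mm)  # training only

# Simulate a vision system that overestimates every length by 15 percent.
biased_mm = {name: 1.15 * value for name, value in regular_mm.items()}
biased_ratios = pairwise_ratios(biased_mm)

# The common scale factor cancels, so the ratio features are unaffected.
unchanged = all(math.isclose(regular_ratios[key], biased_ratios[key])
                for key in regular_ratios)
print("ratios unaffected by global scale error:", unchanged)
```

Only a scale error shared by both measurements cancels in this way; independent per-measurement noise would still perturb the ratios.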

[1] V. Vapnik and A. Vashist. A new learning paradigm: Learning using privileged information. Neural Networks, 22(5–6):544–557, 2009.
[2] D. Cao, C. Chen, D. Adjeroh, and A. Ross. Predicting gender and weight from human metrology using a copula model. In Proc. 5th IEEE International Conference on Biometrics Theory, Applications and Systems, Washington DC, USA, 23-26 September 2012, pp. 162–169.

Publications of the project
[P1] M. Vrigkas, C. Nikou and I. Kakadiaris. Exploiting privileged information for facial expression recognition. IAPR/IEEE International Conference on Biometrics (ICB’16), 13-16 June 2016, Halmstad, Sweden.
[P2] I. Kakadiaris, N. Sarafianos and C. Nikou. Show me your body: gender classification from still images. IEEE International Conference on Image Processing (ICIP’16), 25-28 September 2016, Phoenix, Arizona, USA.

Progress beyond the state of the art and expected potential impact (including the socio-economic impact and the wider societal implications of the project so far)

We demonstrated that using privileged information results in an accuracy of 98% for gender classification on the CAESAR dataset. Results are also reported for different scenarios in which part of the body is not fully visible. Moreover, on real images, we used a state-of-the-art 3D pose estimation algorithm to obtain the joint locations in three dimensions, computed the ratios of the respective limb lengths, and obtained very promising results (86% correct classification).
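The step from 3D joint locations to limb-length ratios can be sketched as follows. The joint names, coordinates, skeleton definition, and the choice of the thigh as reference limb are hypothetical; the report does not describe the output format of the pose estimation algorithm or which ratios were computed.

```python
import numpy as np

# Hypothetical 3D joint locations (arbitrary units) for one side of the body.
joints = {
    "shoulder": np.array([0.2, 1.40, 0.0]),
    "elbow":    np.array([0.2, 1.10, 0.0]),
    "wrist":    np.array([0.2, 0.85, 0.0]),
    "hip":      np.array([0.1, 0.90, 0.0]),
    "knee":     np.array([0.1, 0.45, 0.0]),
    "ankle":    np.array([0.1, 0.05, 0.0]),
}

# Each limb is defined by the pair of joints at its two ends.
LIMBS = {
    "upper_arm": ("shoulder", "elbow"),
    "forearm":   ("elbow", "wrist"),
    "thigh":     ("hip", "knee"),
    "shin":      ("knee", "ankle"),
}

def limb_lengths(joints, limbs):
    # Euclidean distance between the two end joints of every limb.
    return {name: float(np.linalg.norm(joints[a] - joints[b]))
            for name, (a, b) in limbs.items()}

def limb_ratios(lengths, reference="thigh"):
    # Express every limb length relative to one reference limb.
    return {name: value / lengths[reference] for name, value in lengths.items()}

lengths = limb_lengths(joints, LIMBS)
ratios = limb_ratios(lengths)
for name, value in ratios.items():
    print(f"{name}/thigh = {value:.3f}")
```

Because each feature is a ratio, the unknown overall scale of the estimated skeleton does not need to be recovered before classification.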

Our work on gender estimation from anthropometric measurements follows an approach similar to the method of Cao et al. [2], in the sense that the same anthropometric database is used to predict soft biometric attributes. However, there are two distinctive differences. First, since the actual anthropometric measurements are highly unlikely to be obtained accurately by state-of-the-art computer vision algorithms from images or videos captured by surveillance cameras, we opted for ratios of anthropometric measurements. Second, we argue that several anthropometric measurements (e.g., circumferences of body parts) are relatively difficult to estimate automatically and that such information will not be available in automatically acquired data.
