Automatic face detection and recognition have proved to have significant potential as a research and development topic in image and real-time video processing. Though complex, demanding and often error-prone, a well-built face recognition system has considerable applicability in biometric scanning for airport control, or in any field that requires security and surveillance measures. Moreover, there is high demand from mobile companies for face recognition and detection applications, as the video cameras on their devices have improved a great deal in recent years. This project investigates face recognition techniques and classification algorithms such as principal component analysis and the nearest-neighbour algorithm, methods for performing face detection using Haar-like filters, and the implementation of these concepts in a fully functional, tested system.
Chapter 1: Introduction
1.1 Project Objectives
The aim of this project was to research techniques for performing face recognition and detection by machines, to implement a system based on those techniques, and to assess and test that system.
The specific objectives were the following:
To understand the basics of face recognition techniques and algorithms such as Principal Component Analysis (PCA) and Nearest Neighbour Algorithm
To understand image processing techniques such as resizing, thresholding, greyscale conversion, histogram equalization
To develop a real-time face recognition algorithm using PCA
To test the system performance on a database of people such as the ORL database but also in real-time.
1.2 Introduction to the concepts of Face Recognition
Over the last decade, face recognition has become a subject of great interest due to its applicability in many fields, such as Computer Vision and Biometrics, and it is rapidly advancing towards becoming one of the most active research topics in understanding human behaviour.
The act of recognising a face is so effortless for the average person that it is not even consciously acknowledged. Take the case of a person watching their favourite show on television: as soon as the protagonist appears on the screen, their face is instantly recognised. This behavioural act, simple as it may seem at first glance, is what started computer face recognition research. It is only by trying to design a system with some of the capabilities of a human being that you begin to appreciate this remarkable gift.
Face recognition systems have uses in many fields, such as security, human tracking and biometrics in controlled environments (environments in which the system is provided with the right parameters to function correctly, such as proper lighting and correct body posture), but they also show commercial potential for mobile devices. The capability to recognise human faces on a hand-held device would be a major achievement and would help to integrate such systems into the human world.
But what should we focus on if we wish to build a face recognition system? Comparing static images is simple to implement in a program, but recognition is much more than this. When we look at a person's face, the image our brain receives is mostly different at every moment in time: the facial expression, the eye focus and the angle of the head all vary. The word "mostly" is used intentionally, to reflect the fact that the similarities between images are the key to solving the face recognition problem. These similarities are what allow us to distinguish one person from another, but also to tell that two images are of the same person. There is always a pattern between images of a person, and finding a way of exploiting this characteristic systematically is paramount to the face recognition topic.
The project focuses on investigating eigenfaces for recognition (PCA) and how they might be used to recognise faces. Image processing techniques have been investigated, with the goal of constructing a functional facial recognition system. Moreover, face detection techniques have been researched, which are used to detect and isolate faces contained within an image. The two topics, detection and recognition were implemented separately for improved testability, but were later integrated into a final system that provides real-time detection and recognition of people, using a video camera.
1.3 Recognition Techniques
1.3.1 Previous Work
Previous work in the field of facial recognition focused on detecting individual features such as the eyes, nose, mouth and head outline, and on defining a model for the relationships between these features. This approach has proven inefficient, however, because the relationships between facial features are insufficient to account for the way human recognition works.
The first attempt to build a semi-automated recognition system was made by Woody Bledsoe in the 1960s. His system involved manually tracing major facial features such as the eye corners, nose tip and mouth corners. He then calculated the normalised distances of these features from a reference point and compared the differences with a set of reference data. The process was slow, as the calculations had to be done manually, so his system was far from automatic. Later on, Goldstein, Harmon and Lesk created a system that used 21 such features with standard classification techniques, but it proved hard to automate.
The first to provide a systematic way of performing face recognition were Turk and Pentland in 1991, in their widely known paper "Eigenfaces for Recognition". Their technique uses Principal Component Analysis to reduce the dimensionality of the set of components used to describe a face, as well as the noise contained in the set of pictures.
In recent years, 3D face recognition has become a popular research topic for its ability to achieve better recognition accuracy, since it is not sensitive to lighting changes, head rotation, make-up or changes in facial expression, factors which heavily degrade 2D recognition methods. Drawbacks of such systems include the large amount of memory needed to store the faces as 3D meshes and textures.
1.4 The ORL Database of Faces
The "ORL Database of Faces" is a set of 400 images of 40 individuals that is heavily used in face recognition research. There are 10 different images of each individual, taken at different times and varying the lighting, facial expression (open/closed eyes, smiling/not smiling) and facial details (wearing glasses or make-up). The same background is used in all the photos, with the subjects in an upright, frontal position with a tolerance of about 15-18 degrees for side movement.
Each image has a resolution of 92 x 112 pixels and has been cropped and centred. The files are stored in PGM (portable grey map) format, a greyscale format that stores a single 8-bit brightness value for each pixel (the ORL images use 256 grey levels). This format was used because colour is not required in the recognition process; storing and processing only one value per pixel reduces the complexity of the system.
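To make the file layout concrete, the following is a minimal sketch of a binary (P5) PGM reader in Python. It assumes a well-formed header with no comment lines and an 8-bit maximum value, which is sufficient for the ORL files; a production reader would also handle comments and malformed input.

```python
def read_pgm(data: bytes):
    """Parse a binary (P5) PGM image into (width, height, pixels).

    Minimal sketch: assumes a well-formed header with no '#' comment
    lines and a maxval of at most 255 (one byte per pixel).
    """
    # The header is four whitespace-separated tokens:
    # magic number, width, height, maxval.
    tokens, idx = [], 0
    while len(tokens) < 4:
        while data[idx:idx + 1].isspace():   # skip whitespace
            idx += 1
        start = idx
        while not data[idx:idx + 1].isspace():
            idx += 1
        tokens.append(data[start:idx])
    idx += 1  # a single whitespace byte separates maxval from the raster

    magic, width, height, maxval = tokens[0], int(tokens[1]), int(tokens[2]), int(tokens[3])
    assert magic == b"P5" and maxval <= 255

    # The raster is width * height raw brightness bytes, row by row.
    pixels = list(data[idx:idx + width * height])
    return width, height, pixels
```

For example, a 4 x 2 image with pixel values 0..7 would be stored as the bytes `b"P5\n4 2\n255\n"` followed by those eight raw bytes.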
Although the photos were taken in a methodical manner (same background, approximately the same lighting conditions) and are not fully representative of the arbitrary conditions found in a mobile real-time recognition system (changing lighting, photos taken over long periods with significant physical changes, random backgrounds), there is plenty of variation in the sets, which proved very useful for initial testing of the prototype recognition system.
Chapter 2: Initial Research and Development
This chapter focuses on the findings of the initial research of the project, which was necessary to gain an understanding of the requirements for building and testing a prototype facial recognition system.
Research began with understanding basic image processing techniques (bicubic interpolation resizing, greyscale conversion, histogram fitting), all of which are required for recognition. Later on, my research focused on the eigenface approach (PCA, or the Karhunen-Loeve transformation), classification and thresholding techniques, and ways of implementing them in Matlab.
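Two of the preprocessing steps above can be sketched compactly. The project itself was implemented in Matlab; the version below is an illustrative Python/NumPy sketch, using the standard BT.601 luminance weights for greyscale conversion and the usual cumulative-histogram mapping for equalization (it assumes the input contains at least two distinct grey levels).

```python
import numpy as np

def to_greyscale(rgb):
    # Weighted luminance average (ITU-R BT.601 coefficients).
    return rgb @ np.array([0.299, 0.587, 0.114])

def equalize_histogram(img, levels=256):
    """Spread the occupied grey levels across the full intensity range.

    `img` is an integer array of grey values in [0, levels). Each level is
    mapped through the normalised cumulative histogram, so sparsely used
    ranges are compressed and heavily used ranges are stretched.
    """
    hist = np.bincount(img.ravel(), minlength=levels)
    cdf = hist.cumsum()
    cdf_min = cdf[cdf > 0].min()          # first occupied bin
    lut = np.round((cdf - cdf_min) / (cdf[-1] - cdf_min) * (levels - 1))
    return lut[img].astype(np.uint8)
```

For instance, an image whose pixels all sit in the narrow band 50-52 is remapped so that its darkest pixel becomes 0 and its brightest becomes 255, which is what makes equalization useful for normalising lighting before recognition.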
2.2 Compressing the images (Dimensionality Reduction)
The ORL database is small in terms of number of photos, but every image contains 92 x 112 = 10,304 pixel values, and every value is processed multiple times by the PCA algorithm, so without any compression the performance of the system would be very slow. Compression, or dimensionality reduction, is clearly needed in order to save space, achieve better performance and discard unnecessary information.
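A rough back-of-the-envelope calculation illustrates the saving. The figure of k = 50 retained components below is an illustrative assumption, not a number taken from the project:

```python
# Raw storage for the ORL set: 400 images of 92 x 112 = 10,304 values each.
n_images, n_pixels = 400, 92 * 112
raw_values = n_images * n_pixels                      # 4,121,600 values

# After PCA, each image is summarised by k projection weights.
# k = 50 is an illustrative assumption, not a figure from the project.
k = 50
# Stored instead: the mean face, k eigenfaces, and one k-vector per image.
pca_values = n_pixels + k * n_pixels + n_images * k
print(raw_values, pca_values)  # 4121600 545504
```

Even counting the eigenfaces themselves, this is roughly a 7.5x reduction in stored values, and the per-image work during matching drops from 10,304 values to k.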
2.3 Principal Component Analysis
Work prior to PCA, also known as the Karhunen-Loeve transformation (KLT), used facial features (eyes, nose, mouth) as a means of recognising faces, since these features seemed intuitive to the way humans recognise faces.
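The eigenface pipeline combined with nearest-neighbour classification can be sketched end to end as follows. This is an illustrative Python/NumPy version (the project used Matlab), and the training data here is random stand-in data rather than real ORL faces; the SVD of the centred data matrix is used, which yields the same principal components as Turk and Pentland's small-covariance-matrix trick without forming the full 10,304 x 10,304 covariance matrix.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in training set: 20 "images" of 92 x 112 = 10,304 pixels,
# generated randomly here in place of real ORL face images.
D, N, k = 92 * 112, 20, 5
train = rng.random((N, D))

# 1. Centre the data on the mean face.
mean_face = train.mean(axis=0)
centred = train - mean_face

# 2. The right singular vectors of the centred data are the eigenfaces
#    (principal components), ordered by explained variance.
_, _, vt = np.linalg.svd(centred, full_matrices=False)
eigenfaces = vt[:k]                       # k x D, keep top k components

# 3. Project every training image into the k-dimensional face space.
weights = centred @ eigenfaces.T          # N x k

def recognise(image):
    # Project the probe into face space, then nearest neighbour
    # (Euclidean distance) over the stored training weights.
    w = (image - mean_face) @ eigenfaces.T
    dists = np.linalg.norm(weights - w, axis=1)
    return int(dists.argmin())            # index of the closest training image

# A slightly noisy copy of training image 3 should still match image 3.
probe = train[3] + 0.01 * rng.random(D)
print(recognise(probe))                   # prints 3
```

The key point is that matching happens entirely in the k-dimensional weight space, so each comparison costs O(k) rather than O(10,304).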
References and Works Cited
Turk and Pentland, "Eigenfaces for Recognition". [Online] http://www.face-rec.org/algorithms/PCA/jcn.pdf
Wikipedia, "Facial recognition system". [Online] http://en.wikipedia.org/wiki/Facial_recognition_system
Bledsoe, "Man-Machine Facial Recognition", 1966.
Goldstein, Harmon and Lesk, "Identification of Human Faces", 1971.
Wikipedia, "3D Face Recognition". [Online] http://en.wikipedia.org/wiki/Three-dimensional_face_recognition
The ORL Database of Faces, University of Cambridge. [Online] http://www.cl.cam.ac.uk/research/dtg/attarchive/facedatabase.html