The computer vision problem of face detection has over the years become a common high-requirements benchmark for machine learning methods. In the last decade, highly efficient face detection systems have been developed that make extensive use of the structure of the image domain to achieve accurate real-time performance. However, such systems would not be effective without the progress made in the underlying machine learning and classification methods.
Support vector machine (SVM) learning is a relatively recent method that offers good generalisation performance. As with other methods, SVM learning has been applied to the task of face detection, where the drawbacks of the technique became evident. Research focusing on accuracy found that competitive performance is possible, but that training on adequately large datasets is hard. Others tackled the speed issue, and while various approximation methods made interactive response times possible, these generally came at the price of reduced accuracy.
This thesis holds the position that SVM learning can be extended in ways that make it an adequate approach to high-requirements problems such as face detection. An SVM-based face detection system is described that uses the three main contributions of the research: a novel dataset upscaling method combined with an improved training algorithm for large datasets, which together yield highly accurate SVM classifiers, and a new strategy for producing highly efficient classifier cascades.
The resulting face detection system is shown to achieve competitive results in both accuracy and speed, and the new methods are shown to be applicable more generally to other computer vision problems: a system for mouth state estimation in video sequences is presented that achieves real-time performance at a 94% classification rate.
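As a rough illustration of why classifier cascades are efficient, the sketch below shows the general early-rejection idea only; it is not the cascade construction strategy developed in the thesis. The names cascade_predict and linear_stage, the weights and the thresholds are all hypothetical, and simple linear scoring functions stand in for trained SVM stages.

```python
from typing import Callable, List, Sequence, Tuple

# A stage is a (scoring function, rejection threshold) pair.
# Hypothetical structure chosen for illustration only.
Stage = Tuple[Callable[[Sequence[float]], float], float]


def linear_stage(weights: Sequence[float], bias: float) -> Callable[[Sequence[float]], float]:
    """Build a linear scoring function w.x + b (a stand-in for a trained classifier)."""
    def score(x: Sequence[float]) -> float:
        return sum(w * xi for w, xi in zip(weights, x)) + bias
    return score


def cascade_predict(stages: List[Stage], x: Sequence[float]) -> Tuple[bool, int]:
    """Run x through the stages in order, rejecting at the first failed stage.

    Returns (accepted, number_of_stages_evaluated).
    """
    evaluated = 0
    for score, threshold in stages:
        evaluated += 1
        if score(x) < threshold:
            # Early rejection: later, more expensive stages are never run.
            return False, evaluated
    return True, evaluated


# Assumed toy cascade: a cheap first stage followed by a stricter second stage.
stages: List[Stage] = [
    (linear_stage([1.0, 0.0], 0.0), 0.0),
    (linear_stage([1.0, 1.0], -1.0), 0.0),
]

if __name__ == "__main__":
    for sample in ([-1.0, 5.0], [0.5, 0.2], [1.0, 1.0]):
        print(sample, cascade_predict(stages, sample))
```

In a detector, most candidate image windows do not contain a face, so rejecting them after one or two cheap stages keeps the average cost per window low while the full cascade is applied only to promising candidates.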
Author: Kukenys, Ignas
Source: University of Otago