|Abstract:||The set of images of a given object under varying conditions is commonly called an appearance manifold. This thesis presents methods for approximating the appearance manifold, either implicitly or explicitly, by linear subspaces. Based on this representation, a number of useful algorithms are presented for face recognition and tracking.
First, this thesis shows that for any Lambertian object, the low-dimensional linear subspace spanned by a small set of corresponding images, each acquired under a single light source from certain directions, can provide a good approximation of the image variation under all lighting conditions. Since the subspace is generated directly from real images, potentially complex modeling processes can be avoided entirely, and large numbers of training images are not needed. As shown, this representation provides good face recognition results under a wide range of difficult lighting conditions.
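As a rough illustration of recognition with such subspaces, the sketch below builds an orthonormal basis from a few images per person and assigns a probe image to the nearest subspace. The function names, the random synthetic "images", and the nearest-subspace rule are assumptions made for this example; they are not taken from the thesis.

```python
import numpy as np

def subspace_basis(images):
    """Orthonormal basis for the span of a few images (one flattened image per row)."""
    # Reduced QR: the columns of q span the same space as the training images.
    q, _ = np.linalg.qr(np.asarray(images, dtype=float).T)
    return q

def distance_to_subspace(basis, probe):
    """Euclidean distance from a probe image to the subspace spanned by basis."""
    probe = np.asarray(probe, dtype=float)
    residual = probe - basis @ (basis.T @ probe)
    return float(np.linalg.norm(residual))

def recognize(gallery, probe):
    """Return the identity whose subspace lies closest to the probe."""
    return min(gallery, key=lambda name: distance_to_subspace(gallery[name], probe))

# Hypothetical data: three training images per person, each standing in for
# an image taken under one single light source direction.
rng = np.random.default_rng(0)
n_pixels = 64
train = {name: rng.random((3, n_pixels)) for name in ("alice", "bob")}
gallery = {name: subspace_basis(imgs) for name, imgs in train.items()}

# A new lighting condition, simulated as a combination of bob's training images.
probe = np.array([0.5, 0.2, 0.3]) @ train["bob"]
print(recognize(gallery, probe))
```

The probe lies exactly in one subspace here because it is synthetic. With real images, shadows make the subspace only an approximation, so the residual is small rather than zero.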
This thesis also introduces a complementary but very simple approach for estimating directional lighting in uncalibrated frontal-pose face images. We show that this particular inverse problem can be solved using constrained least-squares and class-specific priors on shape and reflectance. This approach implicitly applies the class-specific 3D shape and an average 2D albedo to construct an illumination subspace that captures the image variation of the class-specific generic object under variable lighting. By using the illumination subspace, we can efficiently and accurately compute the lighting direction in real time, whether or not shadows are present. We then use this lighting estimate in a forward rendering step to "relight" arbitrarily lit input faces to a canonical (diffuse) form, as needed for illumination-invariant face verification. Although this technique cannot deal with large illumination changes, it does have the advantage that only one image per object is required in the gallery.
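The lighting estimate can be sketched under a plain Lambertian model, where each pixel intensity is the albedo times the clamped dot product of the surface normal and the light vector. The example below is a minimal sketch, not the thesis's constrained formulation: it uses ordinary least squares restricted to lit pixels, and the `estimate_light` function, the `shadow_threshold` parameter, and the synthetic normals and albedo are all assumptions made for illustration.

```python
import numpy as np

def estimate_light(image, normals, albedo, shadow_threshold=1e-3):
    """Least-squares light vector from one image, generic normals, and average albedo."""
    # Each row of b is albedo * normal, so a lit pixel satisfies image = b @ light.
    b = albedo[:, None] * normals
    # Assumption: pixels darker than the threshold are in shadow and are ignored.
    lit = image > shadow_threshold
    light, *_ = np.linalg.lstsq(b[lit], image[lit], rcond=None)
    return light

# Synthetic stand-ins for the class-specific shape and albedo priors.
rng = np.random.default_rng(1)
n_pixels = 500
normals = rng.normal(size=(n_pixels, 3))
normals[:, 2] = np.abs(normals[:, 2])          # surface faces the camera
normals /= np.linalg.norm(normals, axis=1, keepdims=True)
albedo = rng.uniform(0.2, 1.0, size=n_pixels)

# Render an image with attached shadows, then recover the light vector.
true_light = np.array([0.3, -0.2, 0.9])
image = albedo * np.maximum(normals @ true_light, 0.0)
estimated = estimate_light(image, normals, albedo)
print(estimated)
```

Dropping shadowed pixels keeps the remaining equations linear, which is why a single least-squares solve is enough in this simplified setting.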
Next, a novel integrated framework is presented to track and recognize human faces in video sequences. The appearance manifold representing each registered person is approximated by a collection of sub-manifolds and the connectivity between them. In turn, each sub-manifold is approximated by a low-dimensional linear subspace computed by principal component analysis using images of nearby poses sampled from training video sequences, while the connectivity is modeled by transition probabilities between pairs of subspaces. The integrated task of tracking and recognition is formulated as a maximum a posteriori estimation problem. Within this framework, the tracking and recognition modules are complementary to each other, and the capability and performance of one are enhanced by the other. This approach contrasts sharply with more rigid conventional approaches where tracking and recognition are performed independently and sequentially.
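The role of the transition probabilities can be illustrated with a small recursive Bayes filter over pose subspaces. This is a simplified sketch covering only the pose part of the problem for a single person; the Gaussian likelihood, the `sigma` parameter, the transition matrix, and the distance values are hypothetical and chosen only for the example.

```python
import numpy as np

def likelihoods(distances, sigma=1.0):
    """Turn distance-to-subspace values into unnormalized Gaussian likelihoods."""
    return np.exp(-0.5 * (np.asarray(distances, dtype=float) / sigma) ** 2)

def filter_step(prior, transition, likelihood):
    """One Bayes update: predict with the transition probabilities, then reweight."""
    # transition[i, j] = P(next subspace = j | current subspace = i)
    predicted = transition.T @ prior
    posterior = predicted * likelihood
    return posterior / posterior.sum()

# Hypothetical connectivity between three pose subspaces (rows sum to 1).
transition = np.array([[0.8, 0.2, 0.0],
                       [0.1, 0.8, 0.1],
                       [0.0, 0.2, 0.8]])

# Hypothetical distances from three successive frames to each subspace.
frame_distances = [[0.2, 1.5, 3.0],
                   [0.9, 0.8, 3.0],
                   [2.5, 0.7, 0.9]]

belief = np.full(3, 1.0 / 3.0)
path = []
for distances in frame_distances:
    belief = filter_step(belief, transition, likelihoods(distances))
    path.append(int(np.argmax(belief)))   # MAP subspace for this frame
print(path)
```

In the second frame the nearest subspace by distance alone differs from the MAP choice, which shows how temporal connectivity smooths frame-by-frame decisions.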
Finally, this thesis presents an online learning algorithm to construct the aforementioned probabilistic appearance manifolds. For a class of objects (e.g., human faces), a generic representation of the appearances of the class is learned off-line. From video of a particular person, an appearance model is incrementally learned on-line using the prior generic model and successive frames from the video. The online learning algorithm consists of two steps. The first is a pose estimation problem, where the goal is to identify the sub-manifold to which the current image of the specific object belongs with the highest posterior probability. The second step is to incrementally update the appearance manifold. The result from the first step is applied to find a set of pre-training images that are expected to appear similar to the specific object in other poses. Then all of the subspaces in the appearance manifold are updated to minimize the reconstruction error. The representation learned online is shown to be effective for face tracking, and its use in video-based face recognition compares favorably with the representation constructed with a batch technique.
While the techniques have been applied to human faces, the presented approximations to the appearance manifold are general and can be applied to other classes of objects.