Most of the cameras used in computer vision, computer graphics, and image processing applications are designed to capture images similar to those we see with our eyes. This enables easy interpretation of the visual information by a human observer.
Nowadays, though, more and more processing of visual information is done by computers. Thus, it is worth questioning whether these human-inspired "eyes" are the optimal choice for processing visual information with a machine. In this thesis I describe how one can study problems in computer vision without reference to a specific camera model, by studying the geometry and statistics of the space of light rays that surrounds us.
The study of the geometry allows us to determine all the constraints that exist in the visual input and could be utilized if we had a perfect sensor. Since no perfect sensor exists, we use signal processing techniques to examine how well the constraints between different sets of light rays can be exploited given a specific camera model. A camera is modeled as a spatio-temporal filter in the space of light rays, which lets us express the image formation process in a function approximation framework. This framework then allows us to relate the geometry of the imaging camera to the performance of the vision system on a given task.
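The idea of a camera as a filter followed by sampling can be illustrated with a minimal sketch. Everything here is an assumption for illustration, not the thesis's actual formulation: a one-dimensional slice of the space of light rays is modeled as a synthetic signal, the aperture as a box filter, and the sensor as uniform subsampling. A wide aperture suppresses high-frequency structure in the ray space, so constraints carried by those frequencies are no longer available to the vision system.

```python
import numpy as np

# Hypothetical 1D slice of the space of light rays: intensity as a function
# of one ray coordinate (the coordinate and the signal are illustrative).
x = np.linspace(0.0, 1.0, 1000, endpoint=False)
light_rays = np.sin(2 * np.pi * 3 * x) + 0.5 * np.sin(2 * np.pi * 40 * x)

def capture(signal, aperture_width, n_pixels):
    """Camera = spatio-temporal filter over light rays, then sampling."""
    # Box-shaped aperture: each pixel averages a bundle of neighboring rays.
    kernel = np.ones(aperture_width) / aperture_width
    filtered = np.convolve(signal, kernel, mode="same")
    # Finite sensor: keep only n_pixels samples of the filtered signal.
    step = len(signal) // n_pixels
    return filtered[::step]

fine = capture(light_rays, aperture_width=5, n_pixels=200)
coarse = capture(light_rays, aperture_width=101, n_pixels=50)
# The wide aperture strongly attenuates the 40-cycle component, so the
# coarse image retains less of the signal's variation than the fine one.
print(fine.std() > coarse.std())  # → True
```

The design choice this makes explicit: once the camera is a filter, the question "which constraints can this camera exploit?" becomes a function approximation question about which parts of the light-ray signal survive filtering and sampling.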
In this thesis I apply this framework to the problem of camera motion estimation. I show how, by choosing the right camera design, we can solve for the camera motion using linear, scene-independent constraints that allow for robust solutions. This is compared to motion estimation using conventional cameras. In addition, I show how spatio-temporal models can be extracted from multiple video sequences using multi-resolution subdivision surfaces.
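Why linear, scene-independent constraints matter can be sketched abstractly. The system below is synthetic and stands in for the thesis's actual constraint equations, which are not reproduced here: the only property it shares with them is that each ray measurement contributes one equation that is linear in the six rigid-motion parameters and involves no scene depth, so the motion follows from plain least squares rather than an iterative, scene-dependent optimization.

```python
import numpy as np

rng = np.random.default_rng(0)

# Unknown rigid motion: translation and rotation, six parameters in all.
true_motion = np.array([0.1, -0.2, 0.05, 0.01, 0.02, -0.03])

# Hypothetical stand-in for ray-based constraints: each captured light ray
# contributes one row a_i with a_i . m = b_i, linear in the motion m and
# free of scene depth. The coefficients here are purely synthetic.
A = rng.normal(size=(200, 6))                        # one row per ray
b = A @ true_motion + 0.001 * rng.normal(size=200)   # small sensor noise

# Linearity makes the solve a single robust least-squares step.
estimate, *_ = np.linalg.lstsq(A, b, rcond=None)
print(np.allclose(estimate, true_motion, atol=0.01))  # → True
```

With a conventional camera, by contrast, the analogous constraints couple motion to unknown per-pixel depth, which is what makes the estimation nonlinear and scene-dependent.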
Source: University of Maryland
Author: Neumann, Jan