Super Resolution

Dr Stefan van der Walt (PhD)



The images on the right, taken from the popular television series CSI, describe a typical situation: You are given a blurry photography and asked to extract additional, not visible, information from it. With some knowledge about the degradation, it is possible to do what amounts to a deconvolution, thereby improving the image. What is shown here is nonsense, it cannot be done, not now, not ever. A photograph samples the scene and there is a limit, the Nyquist limit, to the amount of information that can be encoded with a fixed number of samples.

We were talking about extracting information from a single image. How about extracting information from multiple images? The video shows a building and we would like to read the name of the building. It is unreadable from any single frame, but we have several frames from slightly different positions. The problem now becomes one of extracting information from multiple views.


In order to understand how the situation changes if one has multiple images, let us think of a signal, sampled at a fixed rate. According to Nyquist we are limited to the amount of information that can be captured. Suppose we are allowed to sample a second time, at the same rate but shifted from the first samples. For example, if the second sampling takes place halfway between the first samples, the combined samples effectively provide us with twice the sampling rate, hence a higher Nyquist limit. This is the basic idea behind super resolution, using multiple images allows one to beat Nyquist for a single image.

The super resolution process consists of a number of steps:

  1. Image acquisition.
  2. Registration. This is a crucial step where one has to align the different images exactly. After alignment it is possible to do a vertical interpolation between the aligned images, resulting in a significantly improved image.
  3. Reconstruction. This amounts a deconvolution where the image formation model is inverted. In practice one has to solve a large, sparse linear least squares problem.

Let us now apply it to the video sequence of the building below. There is no point working with the whole image if we are inly interested in reading the name of the building. We therefore make cut-outs from each frame, a typical one shown on the left.


After alignment one can do a vertical interpolation leading to a significant improvement. This is not super resolution yet, however. The final step is to do a deconvolution, i.e. inverting the image formation process. The improvement is again significant, and is probably about is good as it gets with current technology.


Courses offered
Indigenous Plants