Parallel Computer Graphics Architectures for Computer Vision

pre-alpha version Free Source library and programs available from and (faster access mirror site) at

(Upper and middle images) A computer vision machine with 6 capture cards, built for a simultaneous 6-channel capture application. Displaying projected versions of the captured images required fast processing, which was done on computer graphics hardware; this prompted the investigation of applying computer graphics hardware to computer vision. (Lower image) A computer vision machine with 6 PCI graphics cards and 1 AGP card, which provide the processing for computer vision algorithms.


In some sense, computer graphics and computer vision are inverses of one another: graphics renders images from a description of a scene, while vision recovers a description of a scene from images. Special-purpose computer vision hardware is rarely found in typical mass-produced personal computers, but the graphics processing units (GPUs) found in most personal computers often exceed the capabilities of the central processing unit (CPU), both in number of transistors and in compute power.

The present research involves implementing computer vision algorithms on modern computer graphics cards. This work investigates methods of efficiently mapping mathematical operations of computer vision onto modern computer graphics architecture.

As an example computer vision algorithm, a real-time projective camera motion tracking routine has been implemented on a modern GeForce FX class GPU, using OpenGL and nVIDIA Cg fragment shaders. Trade-offs between computer vision requirements and GPU resources, such as floating-point accuracy, high-framerate throughput, and latency, were examined. The algorithm implementation was examined closely, and hardware bottlenecks were discovered and addressed. The performance of the GPU architecture for computer vision was also examined, and it was demonstrated that significant speedups can be achieved while leaving the CPU free for other signal processing tasks.

Applications of our work include wearable, computer-mediated reality systems that use both computer vision and computer graphics and require real-time processing with the low latency and high throughput provided by modern GPUs. The ability to run many algorithms in real time is an important step toward a working mediated reality system, and allows others to easily build low-cost systems capable of real-time computer vision.

This work includes and extends concepts presented in:
James Fung, Steve Mann, "Using Multiple Graphics Cards as a General Purpose Parallel Computer: Applications to Computer Vision", Proceedings of the 17th International Conference on Pattern Recognition (ICPR2004), Cambridge, United Kingdom, August 23-26, 2004, vol. 1, pp. 805-808.

James Fung, Steve Mann, "Computer Vision Signal Processing on Graphics Processing Units", Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2004), Montreal, Quebec, Canada, May 17-21, 2004, pp. V-93 - V-96.

James Fung, Felix Tang, Steve Mann, "Mediated Reality Using Computer Graphics Hardware for Computer Vision", Proceedings of the International Symposium on Wearable Computing 2002 (ISWC2002), Seattle, Washington, USA, Oct 7-10, 2002, pp. 83-89.

Example: Correcting for Radial Distortion on a graphics card

Typical webcams have wide-angle lenses, which are useful for capturing a large part of a scene in a single image. However, these low-cost, wide-angle lenses suffer from a significant amount of radial distortion.

The left image below shows the original camera image, in which some diskettes and a pop can are visibly distorted. They appear bent, though in reality they are straight.

Original Image. The diskettes and cans, though straight, appear curved.

Corrected using the graphics hardware. The graphics hardware has corrected the radial distortion, so the straight lines appear straight in the image. Because it is done in the graphics hardware, the CPU is not loaded with this computation, and undistorted images are available directly to the CPU.

Example Graphics Processor code (Cg): corrects radial distortion of form kr^2

void FragmentProgram(
    in float2 fptexCoord0 : TEXCOORD0,
    in float2 fptexCoord1 : TEXCOORD1,
    out float4 colorO       : COLOR0,
    const uniform samplerRECT FPE0,
    const uniform samplerRECT FPE1 )
{
   float2 orig_coord;
   float2 new_coord, delta;

   float kappa =  -0.15;           //radial distortion coefficient k
   float2 center = {-0.05, 0.0}; //if known, focal point should be used

   //normalize coord [0,1.0] (input image is 320x240)
   orig_coord =  (fptexCoord0)/float2( 320.0, 240.0);
   //shift coords [-0.5,0.5]
   orig_coord = orig_coord - float2(0.5,0.5);

   //offset from the distortion center, and its length
   float2 radius = orig_coord - center;
   float r = distance(orig_coord, center);

   //displacement of form k*r^2 along the offset from the center
   delta = radius*kappa*r*r;
   new_coord = (fptexCoord0)/float2( 320.0, 240.0) + delta;

   //sample the source image at the displaced position (in pixels)
   colorO = texRECT(FPE1, new_coord*float2(320.0, 240.0));
}


When a fully calibrated camera is used, the graphics hardware could correct for the intrinsic parameters as well.
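
The coordinate remapping performed per fragment can also be checked on the CPU. The following is a minimal Python sketch of a kr^2 correction, not the project's own code: the function name `undistort_coord` is hypothetical, the defaults (320x240 image, kappa of -0.15, center of (-0.05, 0.0)) are copied from the listing above, and the displacement is assumed to be taken along the offset from the distortion center.

```python
import math

def undistort_coord(x, y, width=320.0, height=240.0,
                    kappa=-0.15, center=(-0.05, 0.0)):
    """Return the source pixel to sample for output pixel (x, y).

    Hypothetical CPU reference for a k*r^2 radial correction.
    """
    # Normalize to [0, 1], then shift to [-0.5, 0.5]
    nx = x / width - 0.5
    ny = y / height - 0.5
    # Offset from the assumed distortion center, and its length
    rx = nx - center[0]
    ry = ny - center[1]
    r = math.hypot(rx, ry)
    # Displacement of form kappa * r^2 along that offset
    dx = rx * kappa * r * r
    dy = ry * kappa * r * r
    # Convert back to pixel coordinates
    return ((x / width + dx) * width, (y / height + dy) * height)
```

With a negative kappa, pixels far from the center sample from positions closer to the center, while a pixel at the distortion center maps to itself.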

Example: Video Framerate Projective Tracking

Examples of VideoOrbits

A real-time projective camera motion tracking routine, VideoOrbits, has been implemented on a modern GeForce FX class GPU, using OpenGL and nVIDIA Cg fragment shaders. The system runs at video framerate (30 frames per second) without taxing the CPU. The graphics card calculates an 8-parameter projective transformation, which is returned to the CPU. If desired, a transformed image can also be sent to the CPU.
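
To illustrate what an 8-parameter projective transformation does to image coordinates, here is a minimal Python sketch. The function name `apply_projective` and the parameter ordering are assumptions made for illustration and may not match the ordering VideoOrbits uses; the eight values are treated as the entries of a 3x3 homography whose ninth entry is fixed to 1.

```python
def apply_projective(params, x, y):
    """Apply an 8-parameter projective transform to the point (x, y).

    params = (a11, a12, b1, a21, a22, b2, c1, c2); this ordering is an
    assumption for illustration, with the ninth matrix entry fixed to 1.
    """
    a11, a12, b1, a21, a22, b2, c1, c2 = params
    # Projective denominator; equals 1 for purely affine transforms
    w = c1 * x + c2 * y + 1.0
    if w == 0.0:
        raise ZeroDivisionError("point maps to infinity")
    return ((a11 * x + a12 * y + b1) / w,
            (a21 * x + a22 * y + b2) / w)

# Parameters that leave every point unchanged
identity = (1.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0)
```

When c1 and c2 are zero the transform reduces to an affine map; nonzero values introduce the perspective ("keystone") effect that distinguishes projective from affine motion.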

Example: UYVY to RGB colorspace conversion

Many video input sources provide images in one of several YUV formats. For instance, a common IEEE1394 camera mode provides pixels as UYVY, which must be converted to RGB. The graphics processor can perform this conversion and send the results back to the CPU, removing the task from the CPU.
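
For reference, the conversion itself can be sketched on the CPU in Python. This is an illustrative sketch rather than the project's shader: it assumes the usual UYVY packing (each 4-byte group U, Y0, V, Y1 encodes two horizontally adjacent pixels that share one U and V sample) and full-range BT.601 coefficients. A given camera may use video-range values or a different matrix, and the function names are hypothetical.

```python
def clamp8(value):
    """Round to the nearest integer and clamp to the 8-bit range."""
    return max(0, min(255, int(round(value))))

def yuv_to_rgb(y, u, v):
    """Convert one YUV sample to RGB (full-range BT.601, an assumption)."""
    d = u - 128  # blue-difference chroma, centered on zero
    e = v - 128  # red-difference chroma, centered on zero
    r = y + 1.402 * e
    g = y - 0.344136 * d - 0.714136 * e
    b = y + 1.772 * d
    return (clamp8(r), clamp8(g), clamp8(b))

def uyvy_to_rgb(data):
    """Convert packed UYVY bytes to a list of (R, G, B) tuples."""
    if len(data) % 4 != 0:
        raise ValueError("UYVY data length must be a multiple of 4")
    pixels = []
    for i in range(0, len(data), 4):
        u, y0, v, y1 = data[i:i + 4]
        # Both pixels in the group share the same chroma pair
        pixels.append(yuv_to_rgb(y0, u, v))
        pixels.append(yuv_to_rgb(y1, u, v))
    return pixels
```

Each output pixel depends on only one 4-byte group, so every pixel can be computed independently; this is the kind of per-pixel work that maps naturally onto a fragment program.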

Performing the conversion on the CPU, then texture mapping and displaying the image, runs at 100 frames/second. Performing the conversion on the graphics card runs at 640 frames/second.