GPUs for Scientific Computing
|
Introduction
GPUs changing geometry
GPUs improving quality and effects on the pixel level
GPUs doing scientific calculation
BrookGPU
Important Webpages
Introduction |
|
In the past, when we talked about video cards (GPUs - Graphics Processing Units), they were like a "black box": we sent in graphics primitives (triangles) defined by points,
together with certain transformations, and at the end of this complicated process, with some luck, we would see an image on the screen.
In fact, it still works this way. Nowadays, however, the developer has more control over this process, which we can call the "rendering pipeline"
(see figure below). GPUs have since become programmable, even in high-level languages (Cg, the OpenGL Shading Language (GLSL), HLSL).
- GPUs are programmable processors
- They have two types of programs (per-vertex and per-pixel)
- They use a high-level programming language
I do not intend to tell the whole history of how GPUs developed, but I want to point out some of their main features. The
Cg Tutorial book (Randima Fernando and Mark J. Kilgard, April 2003) divides computer graphics hardware into four generations. It only makes sense to talk about
programmable GPUs from the second generation (1999-2000) onward, which includes NVIDIA's GeForce 256 and GeForce2,
ATI's Radeon 7500 ... In this generation, both OpenGL
and DirectX 7 support hardware vertex transformation. The third generation of GPUs (2001) includes NVIDIA's GeForce3 and GeForce4 Ti and ATI's Radeon 8500.
Although its pixel level is not yet powerful enough, this generation has more powerful vertex programming and more pixel-level configurability.
With the fourth generation (2002 onward), which includes NVIDIA's GeForce FX family and ATI's Radeon 9700, programmability opens up at both the vertex and pixel
levels. This is the generation of GPUs where people start thinking about doing not only complicated shading but scientific computation as well.
GPUs changing geometry (displacement function) |
|
Since the second generation, it has been possible to send parameters to the GPU's vertex shading processor in order to apply
a function to each vertex. The following figure shows this new feature. In the first sequence (top right), the CPU evaluates
the function f(x) for each element of the vector x[n]. In the second sequence (bottom right), the function is
evaluated on the GPU's vertex processor. With this change alone, the majority of the examples presented so far speed up
rendering at least ten times. The figure on the right is a wave simulator application developed with Open Inventor,
where the GPU computes a wave displacement function (more ... for details and code).
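As a rough illustration of what moves from the CPU to the vertex processor, the sketch below evaluates a displacement function f(x) for every element of a vertex array on the CPU. This is the loop that a vertex program would replace. The sine-wave form of `wave_height` and its parameters (`amplitude`, `wavelength`, `speed`) are assumptions made for illustration; they are not taken from the Open Inventor application mentioned above.

```python
import math

def wave_height(x, t, amplitude=0.5, wavelength=4.0, speed=1.0):
    """Hypothetical displacement function f(x): a travelling sine wave."""
    k = 2.0 * math.pi / wavelength            # wave number
    return amplitude * math.sin(k * (x - speed * t))

def displace_on_cpu(xs, t):
    """CPU path: evaluate f(x) for every vertex before sending it to the GPU.

    On a programmable GPU the body of this loop would run in the vertex
    program instead, and only the parameters (t, amplitude, ...) would be sent.
    """
    return [(x, wave_height(x, t)) for x in xs]

if __name__ == "__main__":
    vertices = [i * 0.5 for i in range(9)]    # x[n] with n = 9
    for x, y in displace_on_cpu(vertices, t=0.0):
        print(f"x = {x:4.1f}  y = {y:+.3f}")
```

With the GPU version, the vertex positions can stay in video memory and only the time parameter changes from frame to frame, which is one plausible reason for the speed-up described above.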
GPUs improving quality and effects on the pixel level |
|
Although developers had acquired reasonable control over the vertex processor (since the second generation), it was not until the fourth
that they gained better control over the fragment shading process. This generation provides:
- Support for long shading programs.
- High-precision color and pixel operations (32 bits).
- High-level shading languages that become really interesting.
Examples:
GPUs doing scientific calculation |
|
With the new GPU features mentioned above, people started to look at the GPU as a powerful vector coprocessor for the CPU. The intermediate
values during a computation (using float buffers) are no longer clamped. Another good reason to use the GPU as a coprocessor is its
parallel nature at the rasterization stage (pixel level). In this way, texture images become matrices of values on which to do computation.
Nowadays, GPUs are being used for linear algebra computation, signal and image processing, physical simulation and so on ...
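The idea of treating a texture as a matrix can be emulated on the CPU. In the sketch below, `run_fragment_pass` is a hypothetical helper that applies a small function once per texel, the way a fragment program runs once per pixel. Textures are plain nested lists, and the optional clamp to [0, 1] is an assumed stand-in for a fixed-point color buffer, included only to show why unclamped float buffers matter for intermediate results.

```python
def run_fragment_pass(kernel, *textures, clamp=False):
    """Apply `kernel` once per texel, as the rasterizer would per pixel.

    Each texture is a list of rows of floats, all with the same dimensions.
    clamp=True clamps every output value to [0, 1] (fixed-point style buffer);
    clamp=False keeps the raw value (float buffer).
    """
    height, width = len(textures[0]), len(textures[0][0])
    out = []
    for i in range(height):
        row = []
        for j in range(width):
            value = kernel(*(tex[i][j] for tex in textures))
            if clamp:
                value = min(1.0, max(0.0, value))
            row.append(value)
        out.append(row)
    return out

def add(a, b):
    return a + b

def sub(a, b):
    return a - b

if __name__ == "__main__":
    A = [[0.75, 0.25], [0.5, 1.0]]
    B = [[0.5, 0.5], [0.75, 0.25]]
    # Two passes computing (A + B) - B, which should give back A.
    for clamp in (True, False):
        tmp = run_fragment_pass(add, A, B, clamp=clamp)
        result = run_fragment_pass(sub, tmp, B, clamp=clamp)
        print("clamped" if clamp else "float  ", result)
```

In the clamped run, every intermediate sum above 1.0 is cut off, so the second pass no longer recovers A; in the float run it does.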
Limitations
- Limited instructions and register space
- No branches, but you can use conditionals or multipass
- Not good enough (so far) at sending values back to the CPU
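The "conditionals instead of branches" point above can be sketched as follows: both candidate results are computed, and a 0/1 mask selects one of them arithmetically. The helpers `step` and `lerp` are written here in Python for illustration, following the conventions of the Cg/HLSL functions of the same names; `abs_kernel` is a hypothetical example kernel.

```python
def step(edge, x):
    """1.0 where x >= edge, else 0.0 (a single comparison on the GPU).

    The Python conditional here only emulates that instruction.
    """
    return 1.0 if x >= edge else 0.0

def lerp(a, b, t):
    """Linear interpolation: returns a when t == 0 and b when t == 1."""
    return a + t * (b - a)

def abs_kernel(x):
    """Branch-free |x|: evaluate both candidates, then select with a mask."""
    mask = step(0.0, x)          # 1.0 for non-negative x, 0.0 otherwise
    return lerp(-x, x, mask)     # picks -x when mask is 0, x when mask is 1

if __name__ == "__main__":
    for value in (-2.0, 0.0, 3.5):
        print(value, abs_kernel(value))
```

The cost of this approach is that both sides of the conditional are always evaluated, which is one reason expensive alternatives are sometimes split into separate passes instead.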
Examples:
more examples and applications ...
BrookGPU |
|
Prerequisites: C++ object-oriented programming and intermediate OpenGL knowledge (working with textures)
Now it is time to get a better idea of how people are using GPUs for scientific computation, which is the main goal of this webpage. To meet
this objective we chose the Brook for GPUs library (BrookGPU), which was developed at
Stanford University.
As you have seen, there are already many applications and papers that use GPUs for computing, but it is difficult to find source code and details on how
the whole CPU->GPU and GPU->CPU interaction works. This is one of the reasons we chose BrookGPU. Although this library works with both OpenGL and DirectX backends,
I will only briefly explain the OpenGL part.
Introduction questions:
What kind of buffer can we use on the GPU for computation?
They are additional non-visible rendering buffers called float buffers or
pixel buffers (pbuffers).
Do I need a special configuration or support to use them?
In the OpenGL case, you will need to check whether your hardware supports certain
OpenGL extensions (see registry).
The glxinfo application, on Linux machines, can tell you which extensions your hardware supports.
Usually you can find it in /usr/X11R6/bin/. BrookGPU requires an NVIDIA video card, so you have to look for
the GL_NV_float_buffer extension. You can also check for other extensions that do not require NVIDIA cards:
WGL_ARB_pbuffer
GLX_SGIX_pbuffer
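Checking an extension list by hand is error-prone, so a small helper may be useful. The sketch below assumes the extension names arrive as one string separated by whitespace and possibly commas, which is how OpenGL and glxinfo typically report them. The function name `has_extension` and the sample string are made up for illustration; the sample is not real glxinfo output.

```python
def has_extension(extension_string, name):
    """Return True if `name` appears as a whole token in an extension list.

    Matching whole tokens avoids false positives from a plain substring
    search, where a short name could match the start of a longer one.
    """
    tokens = extension_string.replace(",", " ").split()
    return name in tokens

if __name__ == "__main__":
    # Hypothetical extension list, for illustration only.
    sample = "GL_ARB_multitexture, GL_NV_float_buffer, GL_NV_fragment_program"
    for ext in ("GL_NV_float_buffer", "GL_NV_float", "GLX_SGIX_pbuffer"):
        print(ext, has_extension(sample, ext))
```

In practice the string would come from the text printed by glxinfo, or from the extension string returned by the OpenGL driver.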
Note: Before viewing the next presentation you should read the BrookGPU webpage. You can also look at GH03-Brook.ppt
How does BrookGPU work?
In this presentation (ppt file, pdf file) I try to explain:
- Stream creation
- Kernel creation and rendering
- Reduction operations
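The three ideas in the list above can be emulated on the CPU to make them concrete. The sketch below is not BrookGPU syntax or its API: `make_stream`, `run_kernel` and `reduce_multipass` are hypothetical names, and the pairwise multipass reduction is a common GPU approach that is assumed here rather than taken from the BrookGPU sources.

```python
def make_stream(values):
    """Stream creation: on the GPU this data would be uploaded into a texture."""
    return [float(v) for v in values]

def run_kernel(kernel, *streams):
    """Kernel execution: apply `kernel` independently at each element position."""
    return [kernel(*elems) for elems in zip(*streams)]

def reduce_multipass(op, stream):
    """Reduction: combine neighbouring pairs until a single value is left.

    Each loop iteration stands for one rendering pass into a buffer about
    half the size of the previous one. Returns (result, number_of_passes).
    """
    if not stream:
        raise ValueError("cannot reduce an empty stream")
    current = list(stream)
    passes = 0
    while len(current) > 1:
        nxt = [op(current[i], current[i + 1])
               for i in range(0, len(current) - 1, 2)]
        if len(current) % 2:               # odd length: carry the last element
            nxt.append(current[-1])
        current = nxt
        passes += 1
    return current[0], passes

if __name__ == "__main__":
    x = make_stream([1, 2, 3, 4, 5])
    y = make_stream([2, 2, 2, 2, 2])
    products = run_kernel(lambda a, b: a * b, x, y)     # element-wise kernel
    total, passes = reduce_multipass(lambda a, b: a + b, products)
    print("dot product:", total, "in", passes, "passes")
```

A dot product is a convenient test case because it needs both an element-wise kernel (the multiplications) and a reduction (the sum).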
Code examples:
- Mark J. Harris (RenderTexture class)
- NVSDK 6.0 (pbuffer.h pBuffer.cpp)
- Patrick Crawley (Unsteady Flow Advection Convolution)
Notes:
- Check out the Request for Comments: EXT_render_target proposal.
Thanks to Patrick Crawley
Important Webpages |
|
Brook for GPU
OpenGL
HLSL
Cg and HLSL FAQ
Date: 01/14/2004 Last Update: 04/09/2004