distributed computing project

Hi everybody,

I'm currently working on a distributed computing project. What I want to do now is use (if possible) the GPU of the graphics card to do large scientific calculations. I couldn't find much on this on the web, but I think it is possible with OpenGL.
Can you help me? I'm programming in C++.

(Does anybody know of a program that shows the percentage of GPU usage?)

Thanking in advance,
Nuno Lopes

Look what I’ve found: http://www.cs.caltech.edu/courses/cs101.3/cs101_files/notes/week2/RuSt01a.pdf

GPU stands for Graphics Processing Unit, not Scientific Processing Unit.

Seriously, you shouldn't try to use it for anything other than graphics. GPUs are designed to accelerate graphics calculations; they are not usable as general-purpose processors the way CPUs are.

There may be some cases where a scientific calculation is similar enough to a 3D graphics problem that you could use the power of the GPU, but those cases are usually very tricky. If you really plan to abuse the GPU for other calculations, you should first learn how to use it for advanced graphics, and only then move on to more complex calculations that are built on some graphics algorithm.

In my opinion, you may find that it is much more hassle than it is worth.

Regarding the question about reading the GPU usage percentage: that depends on the OS and drivers you are using and has really nothing to do with OpenGL.

While it can be harder to write a more general-purpose program on the GPU, doing so can have significant benefits. If utilized properly, your average GPU today has roughly the processing power of 40 CPUs. For many examples of cool stuff being done with graphics hardware, see Mark Harris' webpage:
http://wwwx.cs.unc.edu/~harrism/gpgpu/index.shtml

Brian

Overmind, don't spread your bullsh*t on people who are more pioneering than you. That's the same closed-minded crap that brilliant people who put piston engines in RX-7s have to deal with. The potential for a GPU to compute non-conditional, parallel computations is enormous! It has a limited data path and a single memory bus, but it also doesn't need to relay data over a network, doesn't require a management system, and is probably cheaper.

Simply amazing topic! However, I am a little skeptical. I think that "40 times the power" is a bit exaggerated and has to be weighed carefully.

I fear the biggest problem here is precision.
Even if you use the highest precision possible, chances are you won't get the same results on different hardware.
For example, if a number should be 1.0000015, you may get 1.0000018 on an NV card and 1.0000011 on an ATi. Even cards from the same vendor may give different results… I think this is unacceptable for scientific calculations, however…

As for what this is being used for… well, has nobody here heard about GP-GPUs (general-purpose GPUs)? Maybe one day this will be done as standard. In fact, some limited physics processing is available even now (fog volumes, cloth animation)…

Hello,

for what it's worth: I'm using OpenGL hardware to make computers see. :)

Well, I've been investigating an idea we've had to use graphics hardware to accelerate creating models from photographs. It's still early days, but it's showing promise.

cheers
John

Can anybody give me some examples of how to do this, please?

Some source code showing how to implement an algorithm in OpenGL.

Thanks for your help!

the question you are asking is too open-ended.

code examples to do what? it's like asking how long a piece of string is.

side rant:

anyways, code examples are … well, it bugs me when people ask for code examples. code is just a vehicle for expression, just like mathematics or english or any other language (human, symbolic or computer-based) you can think of. asking for code examples suggests "i don't want to understand this, i just want to put it in my program".

Originally posted by john:
anyways, code examples are … well, it bugs me when people ask for code examples. code is just a vehicle for expression, just like mathematics or english or any other language (human, symbolic or computer-based) you can think of. asking for code examples suggests "i don't want to understand this, i just want to put it in my program".

I don't want code examples just to copy-paste into my program. I'm asking for examples because I have never programmed OpenGL before and I don't know how or where to start!
I'm only asking whether anybody can explain to me how to implement an algorithm to add, multiply, divide numbers, etc… If possible, I want to work with big numbers.

The precision on a typical graphics card today is 8 bits per scalar, in fixed-point format. The latest batch of cards can store 32-bit floating-point quantities per scalar. Graphics cards typically process scalars in groups of 4 (as in "R, G, B, A").

"Large" numbers won't work at all on typical graphics cards, and will only work to the extent that 32-bit floating point is useful on the modern cards. (There are also some cards that do 16 bits per scalar as an intermediate trade-off between precision and quality, and some cards that do 24 bits instead of 32 bits of precision.)

Supposing you have set up a high-precision render target (typically a pbuffer with a floating-point format), the easiest way to perform operations is to texture out of that target and use it as one source, with whatever data you feed in as the second source.

To add two 2-dimensional vector fields, bind them to texture units 0 and 1, and draw a quad covering the target from 0 to 1 in both directions with the following fragment program:

!!ARBfp1.0

TEMP t1, t2;

TEX t1, fragment.texcoord[0], texture[0], 2D;
TEX t2, fragment.texcoord[0], texture[1], 2D;
ADD result.color, t1, t2;
END
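
For context, here is a minimal host-side sketch of the draw call that could go with a program like the one above. It assumes ARB_multitexture and ARB_fragment_program, an orthographic projection mapping 0 to 1 onto the render target, and placeholder names (texA, texB, addProgram) for objects created elsewhere:

// hypothetical sketch; texA, texB and addProgram are placeholder names
glActiveTextureARB(GL_TEXTURE0_ARB);
glBindTexture(GL_TEXTURE_2D, texA);           // first input field on unit 0
glActiveTextureARB(GL_TEXTURE1_ARB);
glBindTexture(GL_TEXTURE_2D, texB);           // second input field on unit 1

glEnable(GL_FRAGMENT_PROGRAM_ARB);
glBindProgramARB(GL_FRAGMENT_PROGRAM_ARB, addProgram);

glBegin(GL_QUADS);                            // one quad covering the target, 0-1 both ways
glTexCoord2f(0.0f, 0.0f); glVertex2f(0.0f, 0.0f);
glTexCoord2f(1.0f, 0.0f); glVertex2f(1.0f, 0.0f);
glTexCoord2f(1.0f, 1.0f); glVertex2f(1.0f, 1.0f);
glTexCoord2f(0.0f, 1.0f); glVertex2f(0.0f, 1.0f);
glEnd();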

For more examples, you might want to look in the nVIDIA OpenGL SDK (which is a big download) or the ATI OpenGL SDK (which is smaller); look for example at the “High Dynamic Range” examples which show how to use floating-point render targets.

For the basics on OpenGL, you can get started by following the tutorials at http://nehe.gamedev.net/ – do and understand at least the first 7 or 8 to get your feet wet before going on to those other references.

Last, I recommend that everyone download and refer to the OpenGL specification; version 1.4 is available as a PDF on the front page of this site. When I want to look something up, this is my preferred reference!

The CPUs that everybody has can do operations with 32 bits of precision.
But if you use a large-number library like GMP, you can work with huge numbers, which in my case can have up to 700 digits.

So, is it possible to implement such a library on a graphics card?
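
For reference, this is the sort of exact, arbitrary-precision arithmetic a CPU-side library like GMP provides; a small sketch using GMP's C++ wrapper (the 700-digit value is just an example):

// CPU-side sketch using gmpxx; build with: g++ bignum.cpp -lgmpxx -lgmp
#include <gmpxx.h>
#include <iostream>

int main()
{
    mpz_class a;
    mpz_ui_pow_ui(a.get_mpz_t(), 10, 700);   // a = 10^700, a 701-digit integer
    mpz_class b = a + 12345;                 // exact: no rounding anywhere
    std::cout << b << std::endl;
    return 0;
}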

Originally posted by nlopes:
The CPUs that everybody has can do operations with 32 bits of precision.
But if you use a large-number library like GMP, you can work with huge numbers, which in my case can have up to 700 digits.

So, is it possible to implement such a library on a graphics card?

32 bits for integers, you mean? There are plenty of ancient CPUs out there that can do 64-bit arithmetic in hardware, but that's not enough if you want 700 digits.

The problem with GPUs is input, the intermediate stages, and output.
What I mean by this is that you might put an integer in, which is then converted to something else (most likely a 32-bit IEEE float), and finally you get a color output, which can now be a 32-bit float too (great times ahead!).
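
To make that conversion problem concrete: a 32-bit IEEE float only represents integers exactly up to 2^24, so large integers are already rounded before the GPU ever works on them. A tiny illustrative sketch:

// quick illustration of the 32-bit float conversion problem
#include <cstdio>

int main()
{
    float f = 16777217.0f;        // 2^24 + 1
    std::printf("%.1f\n", f);     // prints 16777216.0 -- the +1 was rounded away
    return 0;
}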

So anyway, to answer the question: yes, I think the functionality is there. I think you'll need some compare and jump instructions (NV_vertex_program2).

To give an idea of what to do, assume your float frame buffer is an array of integers, not pixels.
Let's say we want to add the values 10 billion and 10 thousand billion.
Send a GL_POINTS primitive down the pipe, with a couple of vertex attributes.

attrib 10 for the point might be (0.0, 0.0, 10.0, 0.0)
attrib 11 for the point might be (0.0, 0.0, 10000.0, 0.0)

You need to write a special vertex program to add the values.

The result will be (0.0, 0.0, 10010.0, 0.0),
which is written to the frame buffer somewhere.

Now you do whatever you need to do with that RGBA value.
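
On the C++ side that could look something like the sketch below. It assumes a vertex program that adds generic attributes 10 and 11 and writes the sum to result.color is already bound, and that the projection lets window coordinates be passed straight to glVertex2f (px/py are just a placeholder pixel):

// hypothetical host-side sketch; the values are in units of one billion
GLint px = 0, py = 0;                                  // target pixel (placeholder)

glVertexAttrib4fARB(10, 0.0f, 0.0f, 10.0f, 0.0f);      // "10 billion"
glVertexAttrib4fARB(11, 0.0f, 0.0f, 10000.0f, 0.0f);   // "10 thousand billion"

glBegin(GL_POINTS);
glVertex2f(px + 0.5f, py + 0.5f);                      // hit the centre of that pixel
glEnd();

float sum[4];                                          // expect (0.0, 0.0, 10010.0, 0.0)
glReadPixels(px, py, 1, 1, GL_RGBA, GL_FLOAT, sum);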

Anyone see a problem with the method?

No, for all practical intents and purposes you cannot implement a large-number library on a GPU.

Using the vertex processing part of the GPU for scientific processing seems useless, as it's mostly a traditional vector processor. Modern Pentiums can probably out-process the GPU, especially if you consider the CPU memory bus (6.2 GB/s) versus AGP 4x (1.0 GB/s) or 8x.

Using the fragment processor is more interesting because of its ability to represent a vector field or other 2D data as textures/rendertargets, and the higher degree of parallelism and faster memory (REALLY fast memory – 20 GB/s on the high end!)

I have to agree. Doing arbitrary precision on a GPU would be ridiculously difficult. You're pretty okay using floating point, but you have to remember that while the cards use an IEEE floating-point format, they don't guarantee anything about the accuracy of the operations. I don't believe that up to this point any of the card manufacturers have published anything about their floating-point accuracy, so I'd be very wary about high-precision stuff.

Performing computation on a modern programmable GPU is a pretty good idea for several kinds of workloads, specifically low-precision SIMD computation. Perhaps this generation's hardware isn't quite up to the task, but it will only become more and more appropriate. GPUs need only be made more flexible in a few more aspects to be fairly useful in non-graphics applications.

Would a large-number library really be that hard to do? I suppose arbitrary-precision math is realistically impossible, but I don't see why it would necessarily be so hard to obtain better precision than what is natively supported. It probably wouldn't take much more than what is already out there, plus a bit of cleverness.

-Won

PS JONSKI - The RX-7 is cool because it has a rotary engine, not a piston engine, but I think that's what you meant. The RX-8 is coming out, so the Wankel is making a comeback. Yea.

I think that arbitrary precision would be very hard to accomplish on graphics hardware. You have to be able to store an arbitrary number of bits for the numerator, and an arbitrary number of bits for the denominator… THEN you have to perform the actual number crunching.

I agree that floating-point computation on GPUs will probably soon be as accurate as a CPU's… but I highly doubt that GPUs will go to even double precision any time soon (as in, less than 10 years).

That being said, there is a large space of problems that are interesting from a GPGPU standpoint… solving ODEs, PDEs, linear systems, and much more are all currently plausible problems for a floating-point GPU.

Brian

Originally posted by jwatte:
Using the vertex processing part of the GPU for scientific processing seems useless, as it's mostly a traditional vector processor. Modern Pentiums can probably out-process the GPU, especially if you consider the CPU memory bus (6.2 GB/s) versus AGP 4x (1.0 GB/s) or 8x.

What's a "traditional vector processor"?
Modern Pentiums can do 4 numbers in parallel using SSE/SSE2, while the GPU can probably do more. Perhaps 8 or 16 on GeForce-class hardware. Who knows.
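
To illustrate the "4 numbers in parallel" part, here is a small sketch using the SSE intrinsics shipped with modern compilers (the values are arbitrary):

// 4-wide SSE addition (requires a CPU with SSE and the <xmmintrin.h> header)
#include <xmmintrin.h>
#include <cstdio>

int main()
{
    float a[4] = { 1.0f, 2.0f, 3.0f, 4.0f };
    float b[4] = { 10.0f, 20.0f, 30.0f, 40.0f };
    float r[4];

    __m128 va = _mm_loadu_ps(a);              // load 4 floats into one register
    __m128 vb = _mm_loadu_ps(b);
    _mm_storeu_ps(r, _mm_add_ps(va, vb));     // one instruction adds all 4 lanes

    std::printf("%g %g %g %g\n", r[0], r[1], r[2], r[3]);   // 11 22 33 44
    return 0;
}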

The fragment end of the GPU might be a better place to do calculations. I imagine there is some kind of balance between the units in the pipeline to maximize performance. As long as your triangles have an area no larger than some number X, and you are not doing anything complex in the fragment program, there won't be any stalls and the GPU will run at peak performance.

and then you take those numbers and send them to marketing.

If I understood correctly, you are advising me to give up on my idea and wait for the next generation of GPUs (maybe when graphics cards return to PCI!).

In the meantime, I sent an e-mail to ATI and nVIDIA, but I haven't received any answer yet.