GPGPU Serialization/Deserialization

I want to do some GPGPU stuff on my graphics card. Leaving aside the details of the specific problem for the moment, here's what I need to be able to do: Send a structure to the graphics card (which may contain strings, though I may be able to remove those) and deserialize inside a shader (most likely a geometry shader, though potentially a fragment shader) for use. Is this kind of thing possible? Is there reference information somewhere where I could figure out how to do this? Cheers, --Brian
I don't know what you mean by deserialize in this context. The typical method of supplying data to a GPU program is by placing that data in the vertex data stream or encoding it into a texture. I guess you could "deserialize" from there into a pseudo-instance of your struct (there's no string type, so you'd have a hard time doing much of use with that data).

Can you provide more details about what you're trying to do? It sounds offhand like you're trying to do something on the GPU that is going to be exceedingly difficult and likely useless from a practical perspective (in other words, it's academic).

Have you looked at gpgpu.org and CUDA?
I saw an article in the ShaderX series of books which demonstrated a 'printf' shader for debugging shader code, allowing you to print strings to textures. Everything has its place (even if the place for most things is the bin ;-))

I would start by deciding on a shader model to work from. It would be easy with SM3 and texture arrays. Pass in a 1D texture with each character encoded as an 8-bit value, then use that value to index into the texture array, populated with matching character bitmaps.

Or for backward compatibility you could encode the string as before in a 1D texture, then bind a 2D texture with all the characters laid out so that the x,y coordinates of each character's bitmap can be computed from the value stored in the 1D texture.
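If it helps, the decode step is just index arithmetic. A rough C-style sketch (the 16x16 atlas layout and the glyph size here are assumptions for illustration, not from any real implementation):

// Hypothetical decode: map an 8-bit character code (read from the 1D
// "string" texture) to the top-left texel of its glyph in a 16x16 atlas.
// GLYPH_W and GLYPH_H are assumed glyph dimensions in texels.
#define GLYPH_W 8
#define GLYPH_H 16

void glyph_origin(unsigned char c, int *outX, int *outY)
{
    int col = c % 16;          /* column of the glyph in the atlas */
    int row = c / 16;          /* row of the glyph in the atlas    */
    *outX = col * GLYPH_W;     /* texel x of the glyph's corner */
    *outY = row * GLYPH_H;     /* texel y of the glyph's corner */
}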
Quote:
Can you provide more details about what you're trying to do? It sounds offhand like you're trying to do something on the GPU that is going to be exceedingly difficult and likely useless from a practical perspective (in other words, it's academic).


It is academic. :-)
To save on providing a whole big description of the goal, here is a sample of what I want to do (assuming for the moment that strings aren't involved):

Given a struct containing a few data members (for the moment, we'll say
int x, float y, and float z), I want to:
(1) Shove the struct into a buffer (texture, I suppose) and send that to the GPU.
(2) On the GPU's end, break apart the buffer to recreate the original structure so that I can use it (see the sketch below).
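Here's a rough sketch of what I mean by (1) and (2), assuming one RGBA float texel per struct; the names are made up, and the shader-side decode is only sketched in a comment:

#include <string.h>

struct Rule { int x; float y; float z; };

/* (1) CPU side: one texel (4 floats) per struct. The int is stored via its
   raw bit pattern so it survives the float texture unchanged. */
void pack(const struct Rule *r, float texel[4])
{
    memcpy(&texel[0], &r->x, sizeof(float));  /* int bits -> float channel */
    texel[1] = r->y;
    texel[2] = r->z;
    texel[3] = 0.0f;                          /* unused channel */
}

/* (2) GPU side (sketch only): fetch the texel and reverse the packing,
   e.g. reinterpret channel .r back to an int (or just store the int as a
   plain float if its range fits exactly), then y = .g and z = .b. */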

And here's a summary of the goal, if you like:
I have a series of Shape Grammar rewrite rules stored in my structure. I want to send those through along with a set of components to be rewritten (probably in a geometry shader, though I may be able to use a fragment shader for this). But to actually interpret the rules, I'll need them back in their original structure form.

Bit of a messy problem, I know. Basically we're trying to improve the performance of procedural building generation by shifting it to the GPU.

I had a quick look at CUDA, but I didn't get the impression it would help me significantly. I'll give it a more thorough look today.

Thanks,
--Brian

Quote:I had a quick look at CUDA, but I didn't get the impression it would help me significantly. I'll give it a more thorough look today.


CUDA completely eliminates the need for serialization, since you can manipulate structures on the GPU as you would on the CPU, without an intermediary translation step.
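For example (a minimal sketch; the struct, kernel and launch sizes are placeholders, not anything specific to your problem):

struct Rule { int x; float y; float z; };

// The same struct layout is visible on both host and device, so the kernel
// just uses the fields directly - no packing/unpacking step.
__global__ void process(Rule *rules, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        rules[i].y += rules[i].z * rules[i].x;
}

void run(Rule *hostRules, int n)
{
    Rule *devRules;
    cudaMalloc((void **)&devRules, n * sizeof(Rule));
    cudaMemcpy(devRules, hostRules, n * sizeof(Rule), cudaMemcpyHostToDevice);

    process<<<(n + 255) / 256, 256>>>(devRules, n);

    cudaMemcpy(hostRules, devRules, n * sizeof(Rule), cudaMemcpyDeviceToHost);
    cudaFree(devRules);
}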
Quote:Original post by ToohrVyk

Quote:I had a quick look at CUDA, but I didn't get the impression it would help me significantly. I'll give it a more thorough look today.


CUDA completely eliminates the need for serialization, since you can manipulate structures on the GPU as you would on the CPU, without an intermediary translation step.


Yes, the advantage of CUDA is that everything goes pretty fast as long as it stays inside the CUDA memory. However, as far as I have seen, it is always necessary to copy the results around inside the GPU memory. This means copying the computed result to a texture or vertex buffer object before it can be used by GL.
Another thing about grammars is that CUDA hasn't allowed recursive programs so far (recursion being the easiest way to implement grammars) - but this might have changed.

-Sven
Quote:Original post by spacerat
Yes, the advantage of CUDA is that everything goes pretty fast as long as it is inside the CUDA memory.

Well it's more that it gives you functionality that the graphics API lacks.

Quote:Original post by spacerat
However, as far as I have seen, it is always necessary to copy the results around inside the GPU memory. This means copying the computed result to a texture or vertex buffer object before it can be used by GL.

I'm not sure what you're getting at here. How does CUDA require any "copying" that the graphics API avoids? Anything you can do in a fragment shader you can do in CUDA, so why would you need to copy data back to GL? The only thing that you need to use GL for is the rasterizer, but I don't see why your application would need that.

Quote:Original post by spacerat
Another thing about grammars is that CUDA hasn't allowed recursive programs so far (recursion being the easiest way to implement grammars) - but this might have changed.

GPUs generally don't support recursion, which is true both in GL and in CUDA (so there's no disadvantage to CUDA). However, since you *do* have indexed temporaries as of CUDA/GPU_shader4, you can implement a stack and thus do recursion "manually".
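For instance, a recursive expansion can be flattened into a loop over an explicit per-thread stack, roughly like this (the rewrite rule itself is a made-up placeholder):

#define MAX_DEPTH 32

// Sketch of "manual" recursion: the call stack is replaced by a small
// indexed array, which is legal on G80-class hardware.
__device__ void expand(int startSymbol)
{
    int stack[MAX_DEPTH];
    int top = 0;
    stack[top++] = startSymbol;

    while (top > 0)
    {
        int symbol = stack[--top];

        // Placeholder rule: symbols below 100 are nonterminals that rewrite
        // into two children; anything else would be emitted as a terminal.
        if (symbol < 100 && top + 2 <= MAX_DEPTH)
        {
            stack[top++] = symbol * 2;
            stack[top++] = symbol * 2 + 1;
        }
    }
}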

Basically if your application is non-graphical and doesn't need the rasterizer, there's no reason not to use CUDA.
Quote:Original post by AndyTX
Basically if your application is non-graphical and doesn't need the rasterizer, there's no reason not to use CUDA.

Also assuming you have the budget for a G80+ :)
Quote:Original post by Zipster
Quote:Original post by AndyTX
Basically if your application is non-graphical and doesn't need the rasterizer, there's no reason not to use CUDA.

Also assuming you have the budget for a G80+ :)


That's what research grants are for. :-)

Thanks guys, I'll look into this stuff.
Quote:Original post by AndyTX
Well it's more that it gives you functionality that the graphics API lacks.


That's true - I even found it's possible to have classes and OO programming in CUDA, although it is supposed to be plain C only :-)
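e.g. something along these lines (just a toy illustration):

// Toy illustration: a struct with a __device__ member function, usable from
// a kernel even though CUDA is nominally "C with extensions".
struct Particle
{
    float pos, vel;
    __device__ void step(float dt) { pos += vel * dt; }
};

__global__ void integrate(Particle *p, int n, float dt)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        p[i].step(dt);
}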

Quote:Original post by AndyTX
I'm not sure what you're getting at here. How does CUDA require any "copying" that the graphics API avoids? Anything you can do in a fragment shader you can do in CUDA, so why would you need to copy data back to GL? The only thing that you need to use GL for is the rasterizer, but I don't see why your application would need that.


In my case, I am writing a raycaster and I wanted to render from CUDA to the framebuffer directly - but I didn't find any way to achieve this.
Copying around makes it possible of course, but it slows down the whole process - and I don't see why I need to copy around if it's all graphics card memory.
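The "copy around" path I mean looks roughly like this (a sketch only, using the cudaGL interop calls; the raycast kernel is just a placeholder, and GL header/extension setup and error handling are left out):

#include <cuda_gl_interop.h>

// Placeholder kernel that writes the raycast image into the mapped buffer.
__global__ void raycast(uchar4 *pixels, int width, int height);

void render_frame(GLuint pbo, int width, int height)
{
    uchar4 *devPixels = 0;

    cudaGLRegisterBufferObject(pbo);                  // really only once per buffer
    cudaGLMapBufferObject((void **)&devPixels, pbo);  // map the GL PBO into CUDA

    dim3 block(16, 16);
    dim3 grid((width + 15) / 16, (height + 15) / 16);
    raycast<<<grid, block>>>(devPixels, width, height);

    cudaGLUnmapBufferObject(pbo);                     // hand the buffer back to GL

    // GL copies the PBO contents into the framebuffer.
    glBindBuffer(GL_PIXEL_UNPACK_BUFFER, pbo);
    glDrawPixels(width, height, GL_RGBA, GL_UNSIGNED_BYTE, 0);
    glBindBuffer(GL_PIXEL_UNPACK_BUFFER, 0);
}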

In the case of shape grammars, it might depend on the complexity whether the geometry shader or CUDA is faster. In any case, CUDA will be more convenient, as the normal C code can be reused immediately.

Lastly, a little note on my CUDA experience: I found it is most important to avoid branching in the code. It's more costly than anything else. A single branch already reduced my GPU usage from 100% to 66% (!!!) according to NVIDIA's performance measurement utilities. After finishing my implementation I found I got a GPU usage of only 16% due to all the branching :(
I still can't explain why this happens or how to prevent it.
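To show the kind of thing I mean, a toy example of a single data-dependent branch (names are made up):

__global__ void branched(float *a, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
    {
        // a single data-dependent branch of the kind described above
        if (a[i] > 0.0f)
            a[i] *= 2.0f;
        else
            a[i] = 0.0f;
    }
}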

-Sven

