
okonomiyaki

Taking advantage of the GPU


I posted something related to this in the DirectX forum, but it wasn't quite the same question (that was more about making sure I was using HAL), and I've also posted in Graphics Programming and Theory about dynamic textures specifically (writing to a texture every frame). Basically it has all come down to this, and I just need a general idea, which is why I'm posting here.

I recently upgraded from a Voodoo 3 (heh, yeah, I know) to a GeForce 4. The problem is that I have no idea how to move much of the processing onto the GPU, so my program doesn't run much faster because most of the work is still being done on the CPU. I've looked all over and can't find any article that explains how to take advantage of the GPU's processing power. In my case, I need to render and change a texture almost every frame, and supposedly the GPU can do that very quickly. In general, how do I perform operations on the GPU?

As far as I know, DirectX and OpenGL commands automatically try to use the GPU as much as possible.

So... my advice would be to use their functions as much as possible ^_^

To my knowledge, which is minimal since I'm not a graphics programmer, OpenGL takes advantage of it automatically, whereas DirectX 7 at least requires you to create the correct device: it has a HAL device and a TNL_HAL device, and you want the TNL one. I can't comment on DirectX 8.

[ MSVC Fixes | STL | SDL | Game AI | Sockets | C++ Faq Lite | Boost | Asking Questions | Organising code files ]

quote:
Original post by billybob
use hardware vertex processing; that should give quite a large boost


Yep, that was the first thing I did, but my program still eats up the CPU.


quote:

To my knowledge, which is minimal since I'm not a graphics programmer, OpenGL takes advantage of it automatically, whereas DirectX 7 at least requires you to create the correct device: it has a HAL device and a TNL_HAL device, and you want the TNL one. I can't comment on DirectX 8.



As Cat said, I'm pretty sure DX 8 does the same thing and uses it automatically, as long as you're using the HAL device. You're probably right about DirectX 7, but I'm using 8.1.
Thanks for the comments though. I haven't figured out a way to reduce the CPU load yet; I just have to work out how to use DirectX 8's functions instead of my own (for example, instead of locking a texture and writing the pixels myself, somehow have DX 8 render into it).
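
For reference, this is roughly how I'm creating the device now (typed from memory, so treat the details as a sketch) - HAL device with hardware vertex processing:

// assumes pD3D (IDirect3D8*) and hWnd already exist
D3DDISPLAYMODE mode;
pD3D->GetAdapterDisplayMode(D3DADAPTER_DEFAULT, &mode);

D3DPRESENT_PARAMETERS pp;
ZeroMemory(&pp, sizeof(pp));
pp.Windowed         = TRUE;
pp.SwapEffect       = D3DSWAPEFFECT_DISCARD;
pp.BackBufferFormat = mode.Format;   // match the current display format

IDirect3DDevice8* pDevice = NULL;
pD3D->CreateDevice(D3DADAPTER_DEFAULT,
                   D3DDEVTYPE_HAL,                        // HAL device
                   hWnd,
                   D3DCREATE_HARDWARE_VERTEXPROCESSING,   // T&L on the GPU
                   &pp,
                   &pDevice);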

If what you're doing now is updating a texture in system RAM on the CPU, procedurally or otherwise, you can instead upload the texture once and use pixel shaders/programs/whatever to manipulate it as you please. If your texture manipulation involves rendering rather than some procedural function, use render-to-texture. Either way, the texture stays on the GPU, the work gets done on the GPU, and the CPU is free to do something else.
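
Something along these lines (D3D8, off the top of my head, so don't take the exact calls as gospel):

// Create a render-target texture once; render targets have to live in D3DPOOL_DEFAULT (video memory).
IDirect3DTexture8* pDynTex = NULL;
pDevice->CreateTexture(1024, 1024, 1, D3DUSAGE_RENDERTARGET,
                       D3DFMT_A8R8G8B8, D3DPOOL_DEFAULT, &pDynTex);

IDirect3DSurface8* pDynSurf = NULL;
pDynTex->GetSurfaceLevel(0, &pDynSurf);

// Every frame: point the device at the texture, draw whatever generates it, then switch back.
IDirect3DSurface8* pBackBuffer = NULL;
IDirect3DSurface8* pZBuffer    = NULL;
pDevice->GetRenderTarget(&pBackBuffer);          // these AddRef, so Release them when done
pDevice->GetDepthStencilSurface(&pZBuffer);

pDevice->SetRenderTarget(pDynSurf, NULL);        // no depth buffer needed for a 2D update
// ... draw the geometry/quads that produce the texture contents ...
pDevice->SetRenderTarget(pBackBuffer, pZBuffer); // back to the normal back buffer

pBackBuffer->Release();
if (pZBuffer) pZBuffer->Release();

// pDynTex can now be bound with SetTexture() like any other texture and never leaves the card.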

[Edit] Aside from that, if you have any vertex transformations that you're doing in software, like reorienting billboards or whatever, you can also do those on the GPU with vertex shaders.

Don't bother with the GL matrix multiplication functions if you were thinking those would be accelerated too; only matrix-vector transformations are actually done on the GPU in the T&L pipeline, while matrix-matrix multiplies are done by the driver in software.

------------
- outRider -

[edited by - outRider on July 30, 2002 7:12:26 PM]

Are you using DX vertex buffers, or are you doing your own processing? As long as you're using vertex buffers and have the device set to hardware vertex processing, you're fine.

The real key is re-architecting your application to use both processors at the same time. Usually the code flow is: do a bunch of work, then render everything. With hardware T&L you should do some work, render some, do some more, render some more, finish the work, then finish rendering. The actual number of steps will vary with the architecture of your program, but the idea is the same.

If you do all your rendering at once, the CPU just spins doing nothing while the GPU is crunching the scene.
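
In code, the interleaving looks something like this (the Update/Draw functions are just placeholders for your own):

pDevice->BeginScene();

UpdateTerrain();            // CPU work
DrawTerrain(pDevice);       // give the GPU something to chew on

UpdateCharacters();         // more CPU work while the terrain draws
DrawCharacters(pDevice);

RunPhysicsAndAI();          // heavy CPU-only work overlaps the remaining GPU work
DrawEffects(pDevice);

pDevice->EndScene();
pDevice->Present(NULL, NULL, NULL, NULL);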

Stephen Manchester
Senior Technical Lead
Virtual Media Vision, Inc.
stephen@virtualmediavision.com
(310) 930-7349

http://developer.nvidia.com/view.asp?IO=gdc2001_optimization

I found this to be quite useful. There are several other articles and samples on NVIDIA's website that are interesting as well.

Guest Anonymous Poster
I would imagine that a Pentium 4 is much faster than a GeForce 4 at certain things. The point of a GPU is not to speed up calculations, just to share some of the work. So if your program is only graphical, it's not helped much by a GPU, because transformation and lighting run at similar speeds on a CPU. The idea is to use the GPU for graphics and the CPU for physics, AI, etc.

Yeah, I'm using DX 8.1 with vertex buffers. I guess my program runs fine, except that I'm telling the CPU to change a texture too often. I can think of a few ways around this (maybe even learn pixel shaders... agh!), and I'm hoping my program will shoot a lot higher than 17 fps. Thanks guys.

quote:

I would imagine that a Pentium 4 is much faster than a GeForce 4 at certain things. The point of a GPU is not to speed up calculations, just to share some of the work. So if your program is only graphical, it's not helped much by a GPU, because transformation and lighting run at similar speeds on a CPU. The idea is to use the GPU for graphics and the CPU for physics, AI, etc.



Hm, why is it mostly anonymous posters who post such rubbish?

Let's see - yes, a P4 is much faster than a GF4 at certain things. Like disk I/O, or executing Win32 programs.

For everything graphics-related, the GPU beats the hell out of a P4. Relying on the P4 is exactly why you are CPU bound.

"T&L is of similar speed on a CPU" - where do you live? T&L is exactly where the GF4 is going to beat the HELL out of your P4. The number of transformations the GF4's transform unit can do is WAY higher than the number you can do on a P4 - specialised hardware beating general-purpose hardware.


Regards

Thomas Tomiczek
THONA Consulting Ltd.
(Microsoft MVP C#/.NET)

Thona is right; I think even a GeForce 256 can transform more than a P4. Think about all the current games: if you put them in hardware T&L they are MUCH faster than in software, even on slow T&L cards.

quote:
Original post by okonomiyaki
Yeah, I'm using DX 8.1 with vertex buffers. I guess my program runs fine, except that I'm telling the CPU to change a texture too often. I can think of a few ways around this (maybe even learn pixel shaders... agh!), and I'm hoping my program will shoot a lot higher than 17 fps. Thanks guys.


I guess the fact that you are generating a texture every frame is the problem. What size is this texture and what is it used for? Can you reduce the size? Can you update it every 2nd frame? Can you optimise the code that generates the texture? I don't know much about dynamic textures, but might it be faster to write to a texture in system memory if you're touching every pixel?
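
For example, something as simple as this (frameCount and RegenerateTexture are made-up names for your own code) halves the work, and LockRect can take a RECT so you only touch the part that changed:

// only rebuild the texture every 2nd frame
if ((frameCount & 1) == 0)
    RegenerateTexture(pTexture);

// or lock just the region that changed instead of the whole surface
RECT dirty = { 0, 0, 256, 256 };   // whatever region actually changed
D3DLOCKED_RECT lr;
if (SUCCEEDED(pTexture->LockRect(0, &lr, &dirty, 0)))
{
    // write only the dirty pixels through lr.pBits / lr.Pitch
    pTexture->UnlockRect(0);
}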



Read about my game, project #1

NEW: see my progress from last week - join the mailing list to get this and other updates every week!



John 3:16

You guys are right... d'oh, it's a 1024x1024 texture for my sky, and the resolution is pretty important. BUT I've achieved amazing results (170 fps!) by transforming the texture on the hardware with texture stage states! Yes, a GeForce 256+ can easily beat a P4 at graphics. I've found that out first hand.
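
Roughly the kind of thing I mean, simplified (scrollU/scrollV are just example offsets): the texture itself never changes, only its coordinates get transformed by the fixed-function hardware.

// move the sky by transforming the texture coordinates on the GPU
// instead of rewriting the pixels on the CPU every frame
D3DXMATRIX texMat;
D3DXMatrixIdentity(&texMat);
texMat._31 = scrollU;   // translation of 2D texture coordinates lives in the 3rd row
texMat._32 = scrollV;

pDevice->SetTransform(D3DTS_TEXTURE0, &texMat);
pDevice->SetTextureStageState(0, D3DTSS_TEXTURETRANSFORMFLAGS, D3DTTFF_COUNT2);

// ... draw the sky with the static 1024x1024 texture ...

pDevice->SetTextureStageState(0, D3DTSS_TEXTURETRANSFORMFLAGS, D3DTTFF_DISABLE);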
