Realtime raytracing

I'm working on realtime raytracing. I'm writing some of the code in assembly (it's very fast), but I don't really know how to do GPGPU (general-purpose computing on the GPU) under DirectX and OpenGL. My only idea so far was to upload a texture, process it, and read it back out. Isn't there something better? By the way, the raytracer will run on the CPU, but if you turn on DirectX 10 / OpenGL 3 (because shader support is better there), the GPU will accelerate the raytracing. Does anyone know how much that would speed things up - or maybe slow things down?
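For reference, here's a minimal C sketch of that "texture in, texture out" round trip, using the OpenGL framebuffer-object extension of that era. Everything here (the function names, the float format, the fullscreen-quad helper) is illustrative, not from this thread - just one common way the render-to-texture GPGPU pattern gets wired up:

#include <GL/glew.h>

/* Hypothetical helper: draws one quad covering the viewport, so the
   fragment shader runs exactly once per output texel. */
extern void draw_fullscreen_quad(void);

/* Create a float texture so intermediate results aren't clamped to [0,1]. */
static GLuint make_target(int w, int h)
{
    GLuint tex;
    glGenTextures(1, &tex);
    glBindTexture(GL_TEXTURE_2D, tex);
    glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA32F_ARB, w, h, 0,
                 GL_RGBA, GL_FLOAT, NULL);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST);
    return tex;
}

/* One GPGPU pass: src texture in, dst texture out, a fragment shader
   (bound as `program`) doing the actual computation in between. */
static void gpgpu_pass(GLuint fbo, GLuint src, GLuint dst,
                       GLuint program, int w, int h)
{
    glBindFramebufferEXT(GL_FRAMEBUFFER_EXT, fbo);
    glFramebufferTexture2DEXT(GL_FRAMEBUFFER_EXT, GL_COLOR_ATTACHMENT0_EXT,
                              GL_TEXTURE_2D, dst, 0);
    glViewport(0, 0, w, h);
    glUseProgram(program);
    glBindTexture(GL_TEXTURE_2D, src);
    draw_fullscreen_quad();
    /* Either glReadPixels the result back, or bind dst as the source of
       the next pass ("ping-pong") and never leave the GPU. */
}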

My current blog on programming, linux and stuff - http://gameprogrammerdiary.blogspot.com

Have you checked out the Microsoft Research Accelerator project?

http://www.google.com/search?hl=en&q=microsoft+research+accelerator+project&btnG=Search

It is basically an abstraction layer for utilizing GPUs for highly parallel computation. It requires .NET, but there's no need to mess around with DirectX/OpenGL. I'm not sure how performant it will be for what you want, but I have been very satisfied with it for supersized SIMD work.
How do you plan to accelerate your ray tracer using the GPU?
You might want to check out CUDA, which is an extension to the C language developed by nVidia for doing GPGPU on G80+ hardware. I used it for a few projects at the beginning of the year and it's very nice and easy to work with. The obvious disadvantage is that it only works on the latest nVidia hardware, but it could be interesting to explore since it's all C-based and API independent.
GPUs incur heavy performance losses for non-sequential memory accesses (such as those found in raytracing applications). I wouldn't expect anything higher than a 2x speed improvement without GPU-specific development.
to Rockoon1 - I don't really like Microsoft, but this project is the kind of thing I wanted to find (above all, though, I like messing with OpenGL and DirectX - they're very good APIs for realtime rasterizing). Maybe it'll be another backend for my raytracing API (it'll support CPU raytracing, CPU + GPU raytracing using OpenGL/Direct3D, and maybe this too - but not now; right now I'm working on the assembler version, so maybe in a few months, maybe in a year, maybe sooner - I don't really know yet).

to phantomus - I'll offload part of the calculations to the GPU - things like intersection tests ("collision"), reflection, and maybe postprocessing such as HDR, bloom, and antialiasing (that last one is a really big problem). There's a sketch of the kind of intersection test I mean at the end of this post.

to Zipster - I've got an AMD/ATI Radeon HD 2900 XT, so I won't support CUDA (maybe at some point, but first will be the ASM version, then OpenGL/Direct3D, and CUDA might come after that).

to ToohrVyk - the main speedup will come from assembly language (object-oriented C is very slow), so ASM is the first step. And if shaders turn out to be slower than ASM, then I'll release only the ASM version; otherwise it'll support many more backends.
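As promised above, here's a minimal C sketch of the kind of per-ray "collision" work in question - a standard ray-sphere intersection test. The struct layout and names are mine for illustration, not from the actual raytracer:

#include <math.h>

typedef struct { float x, y, z; } vec3;

static float dot3(vec3 a, vec3 b) { return a.x*b.x + a.y*b.y + a.z*b.z; }

/* Returns the distance t along the ray to the nearest hit, or -1 on a miss.
   origin/dir describe the ray (dir assumed normalized); center/radius the
   sphere. Solves t^2 + 2bt + c = 0 with b = oc.dir, c = oc.oc - r^2. */
static float ray_sphere(vec3 origin, vec3 dir, vec3 center, float radius)
{
    vec3 oc = { origin.x - center.x, origin.y - center.y, origin.z - center.z };
    float b = dot3(oc, dir);                 /* half the usual 2b term    */
    float c = dot3(oc, oc) - radius * radius;
    float disc = b * b - c;                  /* quarter discriminant      */
    if (disc < 0.0f) return -1.0f;           /* ray misses the sphere     */
    float t = -b - sqrtf(disc);              /* nearer of the two roots   */
    return (t > 0.0f) ? t : -1.0f;           /* hit behind origin = miss  */
}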

My current blog on programming, linux and stuff - http://gameprogrammerdiary.blogspot.com

I think someone was doing ray tracing with spherical bounding volumes which ran fast on a GPU due to good memory access patterns. Also I would suggest looking at those doing ray casting for volume rendering on the GPU. Might give you a few ideas.

As for Shader Model 4.0 / DirectX10 there is a tremendous amount of general purpose programming which can be done on this new hardware. One thing to keep in mind about these latest NVidia GPUs is that they basically run one shader program on N sets of data (32 I think with G80) in parallel, so they all have to do conditional branches in the same direction or wait on those threads which branch in the other direction.
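To make that concrete, here's a rough C sketch (plain CPU code, purely illustrative - the shading functions are made up) of how a 32-wide group executes a divergent branch: both sides run, and each lane simply ignores the side it didn't take:

#define WARP 32

static float shade_hit(float x)  { return x * 2.0f; }          /* stand-in work */
static float shade_miss(float x) { (void)x; return 0.0f; }     /* stand-in work */

/* All 32 lanes step through BOTH sides of the branch unless they
   unanimously agree on a direction; inactive lanes are masked off. */
static void divergent_branch(const float in[WARP], float out[WARP])
{
    int take[WARP];
    for (int lane = 0; lane < WARP; ++lane)
        take[lane] = (in[lane] > 0.0f);      /* each lane picks a side */

    for (int lane = 0; lane < WARP; ++lane)  /* "if" side, masked      */
        if (take[lane]) out[lane] = shade_hit(in[lane]);

    for (int lane = 0; lane < WARP; ++lane)  /* "else" side, masked    */
        if (!take[lane]) out[lane] = shade_miss(in[lane]);
}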

BTW, there was also a GPU Gems II (I think) article on Ambient Occlusion, which you can download online from the NVidia website, which goes through a hierarchical tree traversal that works well on the GPU.
_|imothy Farrar :: www.farrarfocus.com/atom
Quote:Original post by TimothyFarrar
As for Shader Model 4.0 / DirectX10 there is a tremendous amount of general purpose programming which can be done on this new hardware. [...]


One could also just use OpenGL so the programs would actually run on older platforms such as Windows XP, or cross-platform (Linux). Thank you ^^
http://sourceforge.net/projects/pingux/ <-- you know you wanna see my 2D Engine which supports DirectX and OpenGL or insert your renderer here :)
I'm counting more on OpenGL than DirectX (I like OpenGL a little bit more, and I like Linux - it's damn good - but I'm still working under Windows), but the main core will be in asm/C - combined assembly and C, not pure assembly. As I've said, OpenGL shaders (and maybe DirectX) will be added later; the first version will be the asm/C one.

My current blog on programming, linux and stuff - http://gameprogrammerdiary.blogspot.com

Quote:Original post by ToohrVyk
GPUs incur heavy performance losses for non-sequential memory accesses (such as those found in raytracing applications).

Actually GPUs rip through that sort of thing nowadays (even hundreds of "random" memory accesses in a single shader are reasonable). Of course they do better when the memory access is coherent (i.e. when nearby rays tend to travel down the same part of the acceleration structure), but that's also true for CPUs. GPUs and CPUs are actually pretty similar in this respect, with Cell being the odd one out with its local memories.

The biggest efficiency problem with GPU raytracing right now is the branching. For incoherent rays this can be a real killer. CPU raytracers generally don't run their threads in lock-step (although once you add "packet" stuff or SIMD acceleration you get the same problems!), so you don't have a branch coherence problem. However, you trade that for a load-balancing problem, and the memory coherence still sucks (particularly with modern multi-core shared caches).

Anyways, it's not terribly difficult to beat even multi-core SIMD ray tracers using the GPU, but you do have to change your data structures somewhat, although not as much now that GPUs can do "real stacks".
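As an illustration of what "real stacks" buys you, here's a skeletal C version of the usual stack-based tree traversal loop for a BVH - on earlier GPUs, with no writable per-thread stack, this loop had to be replaced with restart or backtracking tricks. The node layout and the stubbed-out helpers are hypothetical:

#define STACK_MAX 64

/* Hypothetical node: inner nodes have two children, leaves hold triangles. */
typedef struct node {
    int is_leaf;
    int left, right;        /* child indices for inner nodes */
    int first_tri, n_tris;  /* triangle range for leaf nodes */
} node;

/* Skeleton of stack-based traversal; ray/box and ray/triangle tests
   are stubbed out as comments. */
static float traverse(const node *nodes, int root /*, ray ... */)
{
    int   stack[STACK_MAX];
    int   top = 0;
    float best = 1e30f;                      /* nearest hit so far */

    stack[top++] = root;
    while (top > 0) {
        const node *n = &nodes[stack[--top]];        /* pop */
        /* if (!ray_hits_bounds(n, ...)) continue;      -- cull      */
        if (n->is_leaf) {
            /* best = min(best, intersect_tris(n, ...)); -- leaf work */
        } else if (top + 2 <= STACK_MAX) {
            stack[top++] = n->left;                  /* push both children  */
            stack[top++] = n->right;                 /* (near-first ordering
                                                        omitted for brevity) */
        }
    }
    return best;
}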

