Realtime GI, questions and my findings (big post!)

Hello again,

I implemented spherical harmonic lighting, but used 1 probe that follows the camera instead of a huge static grid. I know the GI won't be correct since it's only measured at 1 point, but I was hoping to get proper results for nearby/local lighting. But unfortunately...

So what I did is render a small (16x16) cubeMap at the camera position. The cubeMap captures the surrounding direct lighting. This cubeMap is converted to SH coefficients, and these are used in the final pass to add ambient lighting.
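The projection itself is straightforward. A minimal sketch, assuming each cubemap texel has already been turned into a direction/color/solid-angle sample (the RadianceSample struct and the helper below are just for illustration, not my exact code):

#include <cmath>
#include <vector>

// One sample of incoming light: a unit direction, its color, and the solid
// angle it covers (for a cubemap texel: (2/size)^2 / (1 + s*s + t*t)^1.5,
// with s,t the texel center in [-1,1] on its face).
struct RadianceSample { float dir[3]; float rgb[3]; float solidAngle; };

// Evaluate the 9 real SH basis functions for a unit direction.
static void shEvaluate9(const float d[3], float out[9])
{
    const float x = d[0], y = d[1], z = d[2];
    out[0] = 0.282095f;
    out[1] = 0.488603f * y;
    out[2] = 0.488603f * z;
    out[3] = 0.488603f * x;
    out[4] = 1.092548f * x * y;
    out[5] = 1.092548f * y * z;
    out[6] = 0.315392f * (3.0f * z * z - 1.0f);
    out[7] = 1.092548f * x * z;
    out[8] = 0.546274f * (x * x - y * y);
}

// Accumulate all cubemap texels (one RadianceSample each) into 9 SH
// coefficients per color channel.
void projectToSH(const std::vector<RadianceSample>& samples, float shRGB[9][3])
{
    for (int i = 0; i < 9; ++i)
        shRGB[i][0] = shRGB[i][1] = shRGB[i][2] = 0.0f;

    for (const RadianceSample& s : samples)
    {
        float basis[9];
        shEvaluate9(s.dir, basis);
        for (int i = 0; i < 9; ++i)
            for (int c = 0; c < 3; ++c)
                shRGB[i][c] += s.rgb[c] * basis[i] * s.solidAngle;
    }
}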



I ran into a problem I had in the past when making a radiosity lightmap generator. The light 'dies' very quickly. Imagine a red light shining on a wall in front of you. When you render the cubeMap near that wall, at least 1 face will be ~100% red. So walls in the opposite direction will get this red color via the ambient lighting. But when I move away from that wall, the weight of the red pixels captured by the cubeMap decreases very fast, since other (black) pixels come in from the other walls, the floor, the ceiling, etc.

This simulates attenuation, which is good. But the light influence falls off too quickly. Only very nearby geometry will catch the "reflected" light from the lit wall. I can "fix" this a little bit by making distant pixels brighter, but that is not a real solution of course.

Anyway, the result is a way too dark scene. Only when I walk inside a light beam does the scenery suddenly light up. And not smoothly either. I think the small cubeMap resolution and/or the relatively low-res SH quality makes the lighting change quite abruptly when moving around.


Now I was hoping someone has experience with (local) nearby ambient lighting using just 1 probe that moves along with the camera. Maybe there are some tricks to improve the results? And what to do about the attenuation problem? I mean, those who made lightMap generators that use the GPU instead of raytracing probably faced the same problem as well. And if there are no real solutions... well, then at least we know this technique does not work for realtime (local) GI :(

Greetings,
Rick
Hi Spek,

I've been following this post for a while, as I'm attempting something similar, albeit a lot simpler, for a demo I'm working on.

I'm working on a low frequency ambient lighting solution based on a paper I saw a while back which stores static ambient occlusion per vertex, along with bent normals. The data is captured in a pre-process step using cosine filtered hemicubes.

Anyway, I render a skydome to a hi-res floating point cubemap, downsample it to a low-res 16x16x16 cubemap, and then render another low-res cubemap from an object/character's point of view, rendering the low-res skydome cubemap and the scene geometry with ambient and direct lighting applied. When rendering this cubemap, I apply a 3x3 blur by rotating the cubemap lookup vector using rotation matrices.
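Roughly what I mean by the rotated-lookup blur, as a sketch; sampleCube is an assumed callback standing in for the actual cubemap fetch, and the exact angle step is up to you:

#include <cmath>

struct Vec3 { float x, y, z; };

static Vec3 normalize(const Vec3& v)
{
    float len = std::sqrt(v.x * v.x + v.y * v.y + v.z * v.z);
    return { v.x / len, v.y / len, v.z / len };
}

static Vec3 cross(const Vec3& a, const Vec3& b)
{
    return { a.y * b.z - a.z * b.y, a.z * b.x - a.x * b.z, a.x * b.y - a.y * b.x };
}

// Rotate v around a unit axis k by 'angle' radians (Rodrigues' rotation formula).
static Vec3 rotateAxis(const Vec3& v, const Vec3& k, float angle)
{
    float c = std::cos(angle), s = std::sin(angle);
    Vec3 kxv = cross(k, v);
    float kdv = k.x * v.x + k.y * v.y + k.z * v.z;
    return { v.x * c + kxv.x * s + k.x * kdv * (1 - c),
             v.y * c + kxv.y * s + k.y * kdv * (1 - c),
             v.z * c + kxv.z * s + k.z * kdv * (1 - c) };
}

// Average a 3x3 fan of slightly rotated lookups around 'dir'. 'sampleCube' is
// an assumed callback that fetches the cubemap color for a direction.
Vec3 blurredCubeLookup(Vec3 dir, float angleStep, Vec3 (*sampleCube)(const Vec3&))
{
    dir = normalize(dir);
    Vec3 up = std::fabs(dir.y) < 0.99f ? Vec3{ 0, 1, 0 } : Vec3{ 1, 0, 0 };
    Vec3 right = normalize(cross(up, dir));   // two axes perpendicular to dir
    Vec3 upAxis = cross(dir, right);

    Vec3 sum{ 0, 0, 0 };
    for (int i = -1; i <= 1; ++i)
        for (int j = -1; j <= 1; ++j)
        {
            Vec3 d = rotateAxis(dir, right, i * angleStep);
            d = rotateAxis(d, upAxis, j * angleStep);
            Vec3 c = sampleCube(normalize(d));
            sum.x += c.x; sum.y += c.y; sum.z += c.z;
        }
    return { sum.x / 9.0f, sum.y / 9.0f, sum.z / 9.0f };
}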

I only implemented this in the last few days, but for my test scene, which is a room with a couple of windows and a couple of omni-shadowed lights, I think the results are pretty good. I definitely think the blurring is the thing that makes the difference. I'll try to post some screenshots when I get home from work.




Hi _Lopez

I'd love to see some shots! But let's see if I understand your approach a little bit. You calculate the occlusion factor per vertex (you could also do that in a lightMap for more detail eventually). I don't know what bent normals are though. I've heard of them, maybe even used them before, but I forgot what they exactly are.

To get your (realtime) lighting, you render half a cubeMap. Where exactly do you render these? At the vertex points, using the vertex normal as a direction? You use the cosine map to give light coming from straight ahead more influence than light coming from a steep angle? If so, this sounds a lot like how I captured data to generate a (static) lightmap a long time ago. I had a couple of problems with that though. Like I described one post before, the influence of a (reflected) light source fades very quickly when the distance between the reflection spot and the receiver (the hemicube in this case) grows. 2 other problems are using per-pixel normals, and using the data for dynamic/moving objects.

Blurring might help spread out the light and make the lighting change less abruptly when the measuring point moves. But if I understand it so far, this technique is not suitable for normalMapping (unless multiple hemicubes are used to measure from multiple directions, like Halflife2 did). What I don't understand is why to use ambient occlusion values per vertex here. In theory, you won't need them if the hemicubes (or whatever we use to capture the environment) measure properly. Or maybe you only use a few hemicubes and mix them with ambient occlusion to increase detail? In that case I'd like to know at which points your cubes are placed. And a couple of screenshots please :) I'd like to see your results!

Greetings,
Rick

#spek - AFAIK bent normals are vectors I use when I do correct ambient occlusion in my ray tracer (it can even be real time in a less complex scene). The bent normal is the average direction of the unoccluded samples when sampling ambient occlusion. It's mainly used as a lookup vector for image based lighting.
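As a sketch, the whole thing is basically this (the hemisphere sampler and the ray query are assumed callbacks, not code from my tracer):

#include <cmath>

struct Vec3 { float x, y, z; };

// Compute ambient occlusion and the bent normal at a surface point.
// 'sampleHemisphere' should return the i-th of n directions distributed over
// the hemisphere around 'normal'; 'rayHit' should return true if a ray from
// 'origin' along a direction hits anything within maxDist. Both are assumed
// callbacks here.
void aoAndBentNormal(const Vec3& origin, const Vec3& normal,
                     int numSamples, float maxDist,
                     Vec3 (*sampleHemisphere)(int i, int n, const Vec3& normal),
                     bool (*rayHit)(const Vec3& origin, const Vec3& dir, float maxDist),
                     float& aoOut, Vec3& bentNormalOut)
{
    Vec3 bent{ 0, 0, 0 };
    int unoccluded = 0;
    for (int i = 0; i < numSamples; ++i)
    {
        Vec3 dir = sampleHemisphere(i, numSamples, normal);
        if (!rayHit(origin, dir, maxDist))
        {
            // This direction is open to the environment: average it in.
            bent.x += dir.x; bent.y += dir.y; bent.z += dir.z;
            ++unoccluded;
        }
    }
    aoOut = float(unoccluded) / float(numSamples);   // 1 = fully unoccluded

    float len = std::sqrt(bent.x * bent.x + bent.y * bent.y + bent.z * bent.z);
    if (len > 1e-6f)
        bentNormalOut = { bent.x / len, bent.y / len, bent.z / len };
    else
        bentNormalOut = normal;   // fully occluded: fall back to the normal
}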

Anyway, when I finish work, which I have to do now (finishing the world editor, porting my engine to other platforms, writing some decent documentation for its library, etc. - it's pretty much work, but I'm not working alone) - I'm thinking about looking deeply into hybrid rendering (or even some pure ray tracing) and using a ray tracer to approximate correct ambient occlusion (well, almost correct, but not a fake solution like SSAO) or even correct indirect illumination. Of course dynamically, in real time (this might sound like a little crazy idea, but I've experimented with hybrid approaches (and even pure ray tracing) before and I'm still experimenting).

My current blog on programming, linux and stuff - http://gameprogrammerdiary.blogspot.com

Hi Villem,

Good solutions always start with crazy ideas. I never did anything with raytracing, so I'm not sure what you're doing. Nevertheless, show us the results when you have something working :) By the way, I'm kinda 'afraid' that I'll have to learn raytracing some day as well. Not that raytracing is a bad thing, but I don't like the idea that I'll need to learn and make some radical changes someday. So much to do, but so little time. Just like you I've got work to do, and I expect a little copy of me in 2 or 3 weeks. Yep, changing diapers instead of changing rendering techniques :) If I could ask 1 thing of God, I'd like to have 48 hours per day instead of 24.

But in the little time I still have now, I think I'll try realtime radiosity. It's somewhat the same as the "probe grid" approach. I make some sort of lightMap for the static geometry. Not a high-res one; one pixel in the lightMap could easily cover 1 m2, or maybe even more. I render nearby patches into the lightMaps by placing a probe at the patch position, looking in the surface normal direction. I could do this with hemicubes, or with paraboloid mapping. With a paraboloid map, I only need to render once per patch.
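For reference, the paraboloid mapping itself is simple. A sketch in patch-local space (z along the surface normal; only the front hemisphere is stored, directions below the surface fall outside the map):

#include <cmath>

struct Vec2 { float x, y; };
struct Vec3 { float x, y, z; };

// Map a unit direction (z >= 0) to paraboloid texture coordinates in [0,1]^2.
Vec2 directionToParaboloidUV(const Vec3& d)
{
    const float denom = 1.0f + d.z;               // single front-facing paraboloid
    Vec2 uv{ d.x / denom, d.y / denom };          // in [-1,1] over the hemisphere
    return Vec2{ uv.x * 0.5f + 0.5f, uv.y * 0.5f + 0.5f };
}

// Inverse: texture coordinate back to a unit direction. Handy when summing the
// map's texels into one average incoming color with the cosine term applied.
Vec3 paraboloidUVToDirection(const Vec2& uv)
{
    const float sx = uv.x * 2.0f - 1.0f;
    const float sy = uv.y * 2.0f - 1.0f;
    const float z  = (1.0f - sx * sx - sy * sy) / (1.0f + sx * sx + sy * sy);
    const float denom = 1.0f + z;
    return Vec3{ sx * denom, sy * denom, z };
}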

First I render the direct lighting at a low resolution. Maybe with a downscale/blur pass in it, like _Lopez said. From this map, 1 average incoming color is calculated with the help of the cosine law. The results are stored in a realtime lightMap. That means if I update 8 patches per frame, 8 new dots are drawn onto that map. When I do a second bounce, I render the surrounding environment with the lightMap from pass 1. The final pass outputs not 1, but 3 or 4 colors into 3 or 4 final lightMaps. For each global light direction, I calculate the color and store it in a lightMap. This is similar to Halflife2's Radiosity NormalMapping. It allows me to do normalMapping in the final pass, which blends between the 3 or 4 lightMaps depending on the pixel normal.
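The final blend would look roughly like this (a sketch of the 3-map case; the squared-and-normalized weighting is just one common variant, not necessarily exactly what HL2 does):

#include <algorithm>
#include <cmath>

struct Vec3 { float x, y, z; };

static float dot(const Vec3& a, const Vec3& b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

// The three tangent-space basis directions used by Half-Life 2's radiosity
// normal mapping, each tilted away from the surface normal.
static const Vec3 kBasis[3] = {
    { -0.40824829f,  0.70710678f, 0.57735027f },
    { -0.40824829f, -0.70710678f, 0.57735027f },
    {  0.81649658f,  0.0f,        0.57735027f },
};

// Blend the three directional lightmap colors by the tangent-space pixel normal.
Vec3 blendRadiosityNormalMaps(const Vec3 lightmap[3], const Vec3& tangentNormal)
{
    float w[3], wSum = 0.0f;
    for (int i = 0; i < 3; ++i)
    {
        w[i] = std::max(0.0f, dot(tangentNormal, kBasis[i]));
        w[i] *= w[i];                 // squaring sharpens the directionality a bit
        wSum += w[i];
    }
    Vec3 result{ 0, 0, 0 };
    if (wSum < 1e-6f) return result;
    for (int i = 0; i < 3; ++i)
    {
        result.x += lightmap[i].x * w[i] / wSum;
        result.y += lightmap[i].y * w[i] / wSum;
        result.z += lightmap[i].z * w[i] / wSum;
    }
    return result;
}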

Advantages
+ No unused probes wasting time/memory. All patches are connected to the static geometry

+ Suitable for large (outdoor) scenes, without requiring the gigantic amounts of memory that the 3D textures storing SH coefficients in the ATI demo did. The level of detail in the lightmaps is very easy to adjust. A terrain can do with a relatively low resolution, for example. If we use 512x512 lightmaps and 2 bounces, we need
1 (pass 1) + 3 (final pass, 3 light directions) = 4 maps (RGBA 16F) = 8 MB.

+ Fewer patches means less to update before the entire world is refreshed

+ Suitable for multiple bounces, and relatively easy as well, since we only have to use the previous lightMap when capturing the environment a second (or third, or ...) time

+ The final pass is faster than decoding and using SH coefficients. Just pick 3 or 4 pixels from the lightMaps, and blend between them based on the (pixel) normal.

+ Can capture the environment for the patches in a single pass if paraboloid mapping is used. Dual paraboloid is not needed, since the backside will never be used

+ No pre-calculations needed, except that we'll have to calculate atlas texture coordinates for the static geometry. But this can be done very fast. The actual lightmaps are created in realtime. Or... if you use a low-end system, the lightmaps are only updated once when loading the scene. That is not realtime of course, but at least we don't need a completely different approach when switching off realtime ambient lighting.


Disadvantages
- NormalMapping is somewhat less accurate. But... the HL2 results were not bad, were they? These results should be somewhat similar (although with less accurate overall lighting, I think)
- Paraboloid mapping requires a highly tessellated version of the static geometry
- Even with fewer patches, it still takes time before the entire scene has been updated
- If the static geometry changes (not really static :) ), you'll need to recalculate the patch positions as well
- Lightmaps can't be used on dynamic objects
- Somewhat more difficult to determine which patches are close to the camera and thus need a higher update priority


The biggest problems are the highly tessellated scene and the lack of support for dynamic objects. I'm figuring out a way to do them in realtime as well. Probably by placing cubeMaps (or dual paraboloid maps) near the objects and updating a couple of them per frame. The nice thing about the whole lightmap approach is that we can render these cubeMaps fast as well. Just render the environment with the final lightMaps, and voila, we have a cubeMap that can be used for the dynamic objects. However, placing them dynamically is another story...

Ok, let's go, time to code :)
Rick
So spek - I'm not able to reach my PC, where I have all the data (I'm not at home) - so I quickly created a raytracer on my notebook (it took 5 hours).
It shows ambient occlusion in almost real time (just a single core) - you need at least 2 GHz for it to be interactive; use the arrow keys to rotate the view and WASD to move. Try to focus on the jumping box, which casts a dynamic shadow using distributed ray tracing (just 1 sample) and ambient occlusion (just 4 samples) - this laptop has just 1.3 GHz, so it's a little nightmarish to try some raytracing here (god bless my home PC with a dual core CPU at 2.8 GHz).
Raytracer
It was built in a hurry just to show it - so no optimisations (except my ray-triangle collision, which is highly optimised; no SIMD and no ASM, just pure C code). It's without source.
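For anyone curious, a plain (unoptimised) ray/triangle test in the Möller-Trumbore style looks roughly like this - just a sketch, not the routine from the demo:

#include <cmath>

struct Vec3 { float x, y, z; };

static Vec3  sub(const Vec3& a, const Vec3& b) { return { a.x - b.x, a.y - b.y, a.z - b.z }; }
static Vec3  cross(const Vec3& a, const Vec3& b)
{ return { a.y * b.z - a.z * b.y, a.z * b.x - a.x * b.z, a.x * b.y - a.y * b.x }; }
static float dot(const Vec3& a, const Vec3& b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

// Möller-Trumbore ray/triangle test. Returns true on a hit and writes the
// distance along the ray to tOut.
bool rayTriangle(const Vec3& orig, const Vec3& dir,
                 const Vec3& v0, const Vec3& v1, const Vec3& v2, float& tOut)
{
    const float kEps = 1e-7f;
    Vec3 e1 = sub(v1, v0), e2 = sub(v2, v0);
    Vec3 p  = cross(dir, e2);
    float det = dot(e1, p);
    if (std::fabs(det) < kEps) return false;        // ray parallel to triangle
    float invDet = 1.0f / det;

    Vec3 t = sub(orig, v0);
    float u = dot(t, p) * invDet;
    if (u < 0.0f || u > 1.0f) return false;

    Vec3 q = cross(t, e1);
    float v = dot(dir, q) * invDet;
    if (v < 0.0f || u + v > 1.0f) return false;

    tOut = dot(e2, q) * invDet;
    return tOut > kEps;                              // hit in front of the ray
}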
Anyway, I'm going to bed again (it's almost 5am ... I have to get up in 3 hours :D).

Minimal requirements:
CPU: x86-compatible, 32-bit instruction set (you can even try any old Pentium, it'll run - but very, very slowly; I recommend at least 1 GHz for non-realtime, 2 GHz for interactive and 3.8 GHz for smoothly interactive).
Memory: well, the app and its sources take around 2MB in memory :D, but I presume the Windows kernel will take much more
Graphics card: any - we just need something to send our raytraced images to the screen
System: Windows only (it might even run on Win 95, but I tested on XP)

EDIT: The ambient occlusion is true and correct (maybe too strong, but correct), enjoy and comment :D. It's calculated per pixel (and now I'll jump into my bed :-P ... hop, zzzzz zzzzz ...)

My current blog on programming, linux and stuff - http://gameprogrammerdiary.blogspot.com

What to say... it works! Not very fast, but then again, my laptop is not really a "game computer". Dual core, 1.6 GHz, and a GeForce Go 7600 video card, but I guess the video card is not really important here. The box drops a nice dark shadow beneath it. But the other (big) shadow, does that also come from ambient occlusion, or is that shadow cast from the box by a direct light?


I had a productive night as well, and implemented the realtime radiosity. Far from perfect, but it's starting to work now. And I'm pleased to say that updating the patches is very fast. I always avoided realtime radiosity, because the last time I made an offline radiosity lightmap generator, it took ~5 hours to create a nice map. But since I can do with far less detail here (no need for direct lighting in the lightmap), and optimized the generator MUCH better, I can update the entire scene lightMap in ~10 seconds. I'm not sure if it was a 32x32 lightmap or 64x64. Bigger scenes probably need a bigger map, although the current scene has too much detail in the lightMap. Fewer patches should work as well.

I update 10 patches per frame now, and the entire thing still runs at 70 FPS, including DoF, SSAO, a few spotlights with shadowMaps, parallax mapping, HDR & tone mapping, etc. Not bad? I think I could easily do 20 or even 30 patches per frame as well. But I'd rather put that energy into multiple bounces.

No matter what GI method you use, I'm starting to think rendering multiple bounces is really one of the keys to success. With just 1 indirect lighting bounce, the lighting did not look good for any of the techniques I tried so far. Usually it results in a scene that is still black, except for the floor where the direct lighting is active, and the ceiling above it which catches the light indirectly. Luckily the approach I'm trying now is very suitable for multiple bounces. And the final lightMaps can easily be blurred as well, to remove noise and artifacts.

In this case, I only need 4 tiny lightmaps for the ambient lighting with 1 extra bounce. So the huge memory requirements of the 3D grids are fixed here as well. So far, so good. Now I have to make the results better; they are still... ugly. 1 or 2 extra bounces, enabling normalMapping, checking if all the patches are really rendered properly, blurring the final lightMap. And using dual paraboloid maps for the patches instead of just lazily rendering forward with a 90 degree FoV.



If this works properly, I must come up with a proper update method. Updating patches goes lightning fast, but I need to update a whole mountain of them as well. I was thinking about making a 3D grid (on the CPU side, not for the videocard) that stores the indices of nearby patches. This way, I can update the patches nearby the camera first. Even better would be updating only where the movement is (a big object moved, a light changed, etc.). But it's damn hard to determine which patches are affected by that!
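Something like this is what I have in mind for that CPU-side grid - only a sketch, the cell size and the class layout are made up:

#include <algorithm>
#include <cmath>
#include <vector>

struct Vec3 { float x, y, z; };

// A coarse CPU-side grid over the level. Each cell lists the indices of the
// lightmap patches whose world position falls inside it.
class PatchGrid {
public:
    PatchGrid(Vec3 worldMin, Vec3 worldMax, float cellSize)
        : mMin(worldMin), mCell(cellSize)
    {
        mNx = int(std::ceil((worldMax.x - worldMin.x) / cellSize));
        mNy = int(std::ceil((worldMax.y - worldMin.y) / cellSize));
        mNz = int(std::ceil((worldMax.z - worldMin.z) / cellSize));
        mCells.resize(size_t(mNx) * mNy * mNz);
    }

    void insertPatch(int patchIndex, const Vec3& pos)
    {
        mCells[cellIndex(pos)].push_back(patchIndex);
    }

    // Gather patch indices from the cells within 'radius' of the camera,
    // i.e. the ones worth updating first this frame.
    std::vector<int> nearbyPatches(const Vec3& camera, float radius) const
    {
        std::vector<int> result;
        int r = int(std::ceil(radius / mCell));
        int cx, cy, cz; cellCoords(camera, cx, cy, cz);
        for (int z = std::max(0, cz - r); z <= std::min(mNz - 1, cz + r); ++z)
        for (int y = std::max(0, cy - r); y <= std::min(mNy - 1, cy + r); ++y)
        for (int x = std::max(0, cx - r); x <= std::min(mNx - 1, cx + r); ++x)
        {
            const auto& cell = mCells[(size_t(z) * mNy + y) * mNx + x];
            result.insert(result.end(), cell.begin(), cell.end());
        }
        return result;
    }

private:
    void cellCoords(const Vec3& p, int& x, int& y, int& z) const
    {
        x = std::clamp(int((p.x - mMin.x) / mCell), 0, mNx - 1);
        y = std::clamp(int((p.y - mMin.y) / mCell), 0, mNy - 1);
        z = std::clamp(int((p.z - mMin.z) / mCell), 0, mNz - 1);
    }
    size_t cellIndex(const Vec3& p) const
    {
        int x, y, z; cellCoords(p, x, y, z);
        return (size_t(z) * mNy + y) * mNx + x;
    }

    Vec3 mMin; float mCell;
    int mNx, mNy, mNz;
    std::vector<std::vector<int>> mCells;
};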

And then we still have the dynamic objects. I'll have to find an alternative method for them. However, the final lightmaps can still be used to render the environment around an object. Just catch that light, and we have a probe for a dynamic object. Not 100% accurate, but oh well... A bigger problem here is where to place these probes. In the ideal situation each object would have its own probe, but that would be way too much. We can only update a few of them per frame anyway. I was thinking of using a grid again. If an object enters a cell, that cell gets activated, which means a probe is rendered there. All the objects inside that cell share the same probe. Only nearby cells can be activated, otherwise I'd still have to deal with hundreds of cells in a worst case scenario.

I hope I can post a screenshot soon! A vacation card from Renderland
Rick
So after demo time, some screenshot time. I made it home today, so on my computer it looks something like this (shadows - 8 samples, ambient occlusion - 4 samples, and 4 samples of supersampling (somehow I haven't chosen the best value - so the supersampling isn't the best) ... summed up that gives me 52 ray samples per pixel, plus bilinear texture filtering math, plus some shading math and ambient occlusion math (which is quite a lot of divisions)), and it's interactive (without any optimisation). If I did some bounding volume hierarchy I could get more fps - maybe even ~50 or ~60 fps (because I'm testing every ray against every triangle - triangles = 14 (12 on the box, 2 on the floor)).
About the shadow - yes, you can see ambient occlusion under the box; the other occlusion (shadow) is from an omni area light source. With these 8 samples it looks much better (I think), the shadows are almost photorealistically soft.





Anyway, about your lightmaps - realtime radiosity is a good way to go (I'm going to try it with this ray tracer ... and I hope it'll be at least interactive; I'll post results then). The tracer can support normal maps, and this is calculated per pixel (I just haven't loaded them and haven't turned them on; let's say the notebook is too slow).
~5 hours to update a map ... wow. I used radiosity once, but that took at most 1 minute to update the whole map (and on large scenes). Heh, it was created using a rasterizer and then used as a texture lookup in my old non-realtime ray tracer :D (a nice hybrid technique).
Anyway, I apologize for the pretty long post (mainly due to the large images).

My current blog on programming, linux and stuff - http://gameprogrammerdiary.blogspot.com

The soft shadows are really nice. And no need to fake them with all kinds of crazy GPU tricks. That's what I like about raytracing, it's 'pure'. I've been thinking... People made a GPU specialized for all kinds of vector/graphics operations. People also made a card specialized in physics operations. But for raytracing, we still use the CPU (usually). Why don't they make a card specialized in this kind of stuff? Collision detection, raytracing, etcetera?


I just had an idea for the dynamic objects in combination with lightmaps. I keep saying that you can't use the lightmaps for dynamic objects, but... why not? If I shoot 6 rays (+x,-x,+y,-y,+z,-z) from an object, they will collide with 6 polygons. With that information, I can calculate 6 lightMap coordinates for the object. Instead of making a probe for the object, it just picks the lightMap patches on the left, right, floor, ceiling, front and back sides. Based on the (pixel) normal, we can blend between the 6 patch colors.

It's less accurate than rendering a probe at that position of course, but at least it's fast and simple. The energy we save by not rendering probes for objects can be used for updating even more patches per frame. And since the ambient lightmaps are low-res anyway and (should) use multiple bounces, the overall lighting won't vary that much (so the lighting on an object won't constantly change when it moves around). Calculating the 6 lightmap coordinates per object could become a problem as well (if you have thousands of objects), although you only have to do this when the object moves.
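The blend itself would be trivial, something like this (a sketch; the six patch colors are assumed to be fetched already, in +X,-X,+Y,-Y,+Z,-Z order, and the squared normal components already sum to 1 for a unit normal):

#include <cmath>

struct Vec3 { float x, y, z; };

// Blend the six sampled patch colors by the pixel normal. Each axis
// contributes its squared normal component, split between the positive and
// negative direction, so no extra normalization is needed.
Vec3 blendSixPatches(const Vec3 patch[6], const Vec3& n)
{
    float w[6] = {
        n.x > 0 ? n.x * n.x : 0.0f,  n.x < 0 ? n.x * n.x : 0.0f,
        n.y > 0 ? n.y * n.y : 0.0f,  n.y < 0 ? n.y * n.y : 0.0f,
        n.z > 0 ? n.z * n.z : 0.0f,  n.z < 0 ? n.z * n.z : 0.0f,
    };
    Vec3 result{ 0, 0, 0 };
    for (int i = 0; i < 6; ++i)
    {
        result.x += patch[i].x * w[i];
        result.y += patch[i].y * w[i];
        result.z += patch[i].z * w[i];
    }
    return result;
}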

Keep up the good work!
Rick
Quote:Original post by spek
The soft shadows are really nice. And no need to fake them with all kinds of crazy GPU tricks. That's what I like about raytracing, it's 'pure'. I've been thinking... People made a GPU specialized for all kinds of vector/graphics operations. People also made a card specialized in physics operations. But for raytracing, we still use the CPU (usually). Why don't they make a card specialized in this kind of stuff? Collision detection, raytracing, etcetera?


Larrabee should be coming out soon, hopefully. Still, raytracing isn't the ultimate rendering solution; rasterization has huge benefits.

