spek

Realtime GI, questions and my findings (big post!)


Hi,

>>> Technique 1, uniform grid with probes & SH (from the ATI demo)

Not the first time I ask about realtime ambient lighting, and probably not the last time either :) With the help from here, I finally managed to do realtime GI by rendering small cubeMaps at a lot of positions (a grid), 'compressing' them into Spherical Harmonic coefficients, and then using those in the final pass. Because I use a uniform grid (10x10x4 = 400 probes, each probe covers ~2 m^3), I can easily store the SH coefficients in 3D textures and later access them based on the pixel's world position. AND, the pixel will automatically blend between 8 coefficients because the 3D texture benefits from linear filtering. Another big plus is that dynamic objects (characters) can make use of the same ambient lighting and access the SH coefficients in exactly the same way ('impossible' with lightmaps).

I update 8 probes per frame now. That will be reduced later, because a lot of other techniques have to be done as well. On the other hand, this doesn't have to run fast on current hardware. I'm quite happy with the results, and I can even do multiple bounces (although the framerate drops dramatically for each extra bounce, of course). BUT, there are 4 major drawbacks (everyone experimenting with GI for the first time, read along with me here):

1.- The 10x10x4 grid is manageable for a relatively small scene, but what if the scene is a lot bigger (an outdoor scene, a large warehouse, etc.)? My 3D textures will explode, since they need to cover the entire scene (so far).

2.- The more probes, the more time is needed before all probes are updated. Now it takes ~3 seconds before the entire scene is updated. Not a big problem in this particular case, but what if I suddenly switch a light off? The room could still be lit indirectly for ~2 seconds. I can help a little by giving priority to nearby probes. In the ideal situation I would only update the probes that are affected somehow (door opened, light switched off, big object moved, ...). But it's very hard to know which probe(s) are affected, because it's INDIRECT lighting. Even more so if you want to use multiple bounces as well.

3.- As said, in the final pass the pixel picks its SH coefficients just by "tex3D( shCoeffTex, pixelWorldPos )". Works nice and fast, but there is also the risk of picking coefficients from behind a wall / below the floor / outside the world. The bigger the distance between 2 probes, the bigger the chance the wrong probe is chosen. For example, a thin wall is placed between 2 chambers with different lighting. The wall pixels are exactly in the middle of 2 probes: 1 behind and 1 in front of the wall. Because of the linear filtering, ~50% of the lighting on that pixel will actually come from the probe behind the wall -> light from the wrong chamber is used. We can fix this somewhat by shifting the pixel position forward (pixPos += pixNormal * gridSize/2), as in the sketch after this list. It doesn't fix all situations though, and can still produce wrong coordinates in narrow environments.

4.- A lot of the probes won't be used (most of the time). Probes are outside the world, or floating in the middle of a chamber while nothing has to be rendered there. A waste of memory and time. But I can't just remove them; maybe there will be an object at the unused probe later on. I could try to detect which probes are outside the world (never update them), which probes are near static geometry (always update them), and which are floating in empty space, waiting for an object to come -> when an object moves, render the probes nearby. This helps the update cycle time, but the 3D texture is still filled with useless data, which costs precious memory.
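
To make that final-pass lookup a bit more concrete, it is roughly something like this (just a sketch with made-up names; the real shader samples more textures, one per SH band/color):

// Sketch of the final-pass ambient lookup (made-up names, only one RGBA set
// of SH coefficients shown).
uniform sampler3D shCoeffTex;      // 10x10x4 volume, linear filtering enabled
uniform float3    gridOrigin;      // world position of probe {0,0,0}
uniform float3    gridInvExtents;  // 1.0 / world size covered by the whole grid
uniform float     probeSpacing;    // ~2 meters between probes

float4 SampleAmbientSH( float3 pixPos, float3 pixNormal )
{
    // Problem 3 workaround: push the sample point away from the surface, so we
    // blend less with probes behind the wall / below the floor.
    float3 shifted = pixPos + pixNormal * (probeSpacing * 0.5);
    float3 tc      = (shifted - gridOrigin) * gridInvExtents;  // 0..1 texcoords
    // The hardware trilinear filter blends the 8 surrounding probes for free.
    return tex3D( shCoeffTex, tc );
}
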
>>>>> Technique 2, manually placed probes at "smart" positions, with a blend map

The biggest problem is how to access all the probes, and where to store them. Someone here suggested earlier to manually place probes at "smart" locations. In many cases we really don't need that many probes for proper results, so a lot of memory can be saved here. But how to access the right probes (and blend between multiple probes)? He was thinking about using a "blend map": R = probe 1 index, G = probe 2 index, B = blend weight 1, A = blend weight 2. So, a pixel in the final pass first looks up into this blend map, then picks the proper probes and blends between them. That could work, and it fixes problem 3 (picking the wrong probes). But we somehow need to store this blend texture as well. A simple top-down 2D texture is not enough, since our scene could have multiple storeys, or at least probes placed above each other. So we still need a 3D texture, and a big scene still requires a lot of memory for that map. Or we use a low-res blend map, but that reintroduces the "picking the wrong probe" problem. Another little drawback is that we have to manually place the probes and generate this blend map. I also tried storing probe indices per vertex. Very compact, no need for a map at all. But... a "vertex lighting" look is so 1998.
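
In shader terms I imagine the lookup roughly like this (again just a sketch with made-up names). Note that the index channels have to be point filtered; linearly filtering probe indices makes no sense:

// Sketch of the blend-map lookup (made-up names).
// blendMap: R = index of probe 1, G = index of probe 2, B = weight 1, A = weight 2
uniform sampler3D blendMap;         // low-res volume over the scene, POINT filtered
uniform sampler2D probeSHTex;       // one texel per probe (one RGBA SH set shown)
uniform float     numProbes;
uniform float3    sceneOrigin;
uniform float3    sceneInvExtents;  // 1.0 / world size covered by the blend map

float4 SampleBlendedSH( float3 pixPos )
{
    float4 b  = tex3D( blendMap, (pixPos - sceneOrigin) * sceneInvExtents );
    // indices are stored as 0..1 bytes; turn them into texcoords into the probe table
    float  u1 = (floor( b.r * 255.0 + 0.5 ) + 0.5) / numProbes;
    float  u2 = (floor( b.g * 255.0 + 0.5 ) + 0.5) / numProbes;
    return tex2D( probeSHTex, float2( u1, 0.5 ) ) * b.b
         + tex2D( probeSHTex, float2( u2, 0.5 ) ) * b.a;
}
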
>>>>>> Technique 3, probe grid projected on the screen

I wanted to get rid of fixed probes / a grid, so I cooked up a completely different plan. Would it be the solution...? I haven't tried it in reality yet, but I'm afraid not. Anyway, how about ONLY placing probes at the geometry you actually see? The camera shoots 9 rays into the screen:

-----------------  probe 5 is the center of the screen (camera focus point)
|  1    2    3  |  probe 1 is topleft of the screen, etcetera
|               |
|  4    5    6  |
|               |
|  7    8    9  |
-----------------

The rays collide with a wall/floor/ceiling/object in front of the camera. At the intersection points, the 9 probes are placed and updated. Why 9? Well, the more the better, but I need to update ALL the probes as fast as possible, because the camera rotates and moves all the time, of course. The 9 SH coefficients are stored in a nice tiny 3x3 texture (with linear filtering to allow blending between coefficients). In the final pass, pixels don't look into a 3D texture, but simply pick the SH coefficients corresponding to their screen XY position (and blend with the neighbouring coefficients). Nearby small objects make use of probe 10, which sits inside the camera. Not 100% correct, but hey, we're talking about realtime GI here! At least we only need 9 probes here (or more if the hardware allows), and all the other problems stated in technique 1 are fixed too.
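
The nice thing is that the final-pass lookup becomes trivial (sketch; the texel centers of a clamped 3x3 texture line up with the 9 ray positions on the screen):

// Sketch: the 9 probes live in a 3x3 texture with linear filtering and clamp
// addressing. Texel centers sit at 1/6, 3/6 and 5/6 of the screen, matching
// the 9 ray positions above.
uniform sampler2D screenProbeTex;   // 3x3, one set of SH coefficients per probe

float4 SampleScreenSH( float2 screenUV )   // pixel position in [0,1]
{
    // bilinear filtering blends the 4 nearest probes automatically
    return tex2D( screenProbeTex, screenUV );
}
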
This idea gave me a smile on my face when cycling to work. The best ideas always pop up when I cycle to work in the fresh early morning. However, my smile always disappears when I travel back home and re-think my ideas. 4 new problems here:

1.- No multiple bounces. You can't reflect light from behind the camera... because there are no probes/SH coefficients there. Not a super disaster, since I'm already happy if I can get 1 bounce. But for the future...

2.- The camera rotates a lot. If the depth in front of me changes much, the lighting can suddenly change as well when I (slightly) rotate. Light on walls should look 'static'; it should not change every time I move or rotate. I could soften this with a time delay (let the probe move to its new position instead of placing it there directly), but it still changes.

3.- I'm afraid the 9 probes will produce 9 noticeable "squares" on the screen. You won't see it if the lighting is the same everywhere, but as soon as the lighting changes a lot locally, I get 9 different results = 9 differently lit squares on my screen. Of course, the blending between the probes will blur this, but I doubt it's enough. The only way to fix this is to render MUCH more probes in front of the camera (giving each of them a smaller section of the screen). But how the heck can I update them all fast enough? Simple answer: I can't. Well, not for now.

4.- Imagine a very long corridor, and a small pillar near the camera in the middle. When looking straight ahead, the center probe on the screen collides with the pillar and is thus rendered there. That's OK for the pillar, but not for the neighbouring pixels which are a far distance behind the pillar. When I move a little to the left, the center probe shoots into the corridor, which gives wrong lighting on the pillar (and that "changing lighting" effect discussed in problem #2).

I think we'll have to accept that there will always be drawbacks. But some of these problems are fatal, making all 3 techniques impractical for somewhat complex scenes (like we wish for in games). Unless the brilliant minds here have smart solutions for those problems, or a technique #4... :)

Still awake? Thanks for reading!
Rick

[Edited by - spek on July 9, 2008 3:47:19 PM]

Hi spek,

I've been working on real-time global illumination for some time. Because I'm using hybrid rendering now - rasterization & raytracing - I decided to try some dynamic global illumination with that technology. I even released a demonstration not long ago (visit my web pages).
First I'll say something about your 1st and 3rd techniques; I tried them some time ago. The 1st technique worked very well for smaller scenes, mainly interiors. I even tried some exteriors with it (but it only worked for small scenes and the update wasn't immediate). I presume that you used geometry shaders to calculate this; that's not the fastest way IMHO (well, in rasterizers yes... in raytracers no). I used raytracing through a hierarchy - a KD-tree (the scene was static, but I think it can be ported to dynamic scenes using BVHs) - to get the values into the 3D texture, and I was able to update more probes: several hundred of them per second. The raytracer wasn't optimised at the time and had memory leaks, so nothing special (I think that with optimisation it could even do thousands or tens of thousands of probes per second). So it would be a good solution even for huge exterior scenes.
Your third solution would produce just nine huge squares on the screen, but if you used, say, 160x120 probes at a 320x240 resolution, it would give a pretty realistic result. That would mean at least 10,000 probe updates per frame (around 600,000 updates per second)... and that's really too much for today's hardware.

OK, let's talk about another one... let's call it:
>>> Technique 4, reflective projected maps

This is a more fake approach than the first two, but it works (I tried it once, even successfully, but my GPU was too slow for it).
I'll describe the approach just for direct lights; for omni lights it might be a little harder (and a much bigger performance eater).
Let's assume we've got three walls and a floor (e.g. the left wall is yellow, the right wall is red, the far wall is blue and the floor is white), and we light all of them with a 90° spotlight, so we've got the projected scene buffer. Divide the screen into 4 quarter screens and take the middle point of each one. Place a new light at each of those middle points and render a new texture over the hemisphere above it (use paraboloid mapping), with several calculations performed on it. Then project that texture back onto the scene, and we've got reflected indirect light.
I know this can sound like nonsense babble - I'm not a native English speaker - so be kind and ask if you don't understand something.
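
Roughly, projecting one of those bounce lights back onto the scene could look something like this (just a sketch with made-up names, not my exact code; the calculations that turn the hemisphere render into outgoing bounce light are left out):

// Sketch: applying one "bounce light" through a single (front) paraboloid map.
uniform sampler2D bounceMap;      // hemisphere rendered around the bounce point
uniform float3    bouncePos;      // world position of the bounce light
uniform float3x3  bounceBasis;    // rotates world space into the bounce light's
                                  // space (z axis = hemisphere direction)

float3 BounceLight( float3 pixPos, float3 pixNormal )
{
    float3 toPix = pixPos - bouncePos;
    float3 d     = normalize( mul( bounceBasis, toPix ) );
    if( d.z <= 0.0 )
        return float3( 0.0, 0.0, 0.0 );               // pixel is behind the hemisphere
    // standard paraboloid mapping of a hemisphere direction into [0,1]^2
    float2 uv    = d.xy / (1.0 + d.z) * 0.5 + 0.5;
    float3 color = tex2D( bounceMap, uv ).rgb;
    float  nDotL = saturate( dot( pixNormal, -normalize( toPix ) ) );
    float  atten = 1.0 / (1.0 + dot( toPix, toPix ));  // crude distance falloff
    return color * nDotL * atten;
}
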

And here's my leading horse
>>> Technique 5, dynamic "radiosity" to simulate GI (It's not correct radiosity, but it's similar to it)

OK, almost everyone knows what radiosity is - I'm calculating the light arriving at some point, its color and direction (similar to probes). BUT I'm doing it like this:

let's assume that n is the number of samples, and m the number of bounces (we begin at m = 1)

I trace the first ray
|
From the hit point I trace 'n' rays into the scene and accumulate color/light/shadow from their hit points
|
Now I've got new hit points, so I can do step two again (recursive bounces) and increase 'm'
|
...
|
I end the recursion when I reach 'm' bounces, BUT I get a really realistic solution.

This technique is really good and realistic, but it has several bad points. If we have a 160x120 buffer for this "radiosity" and we use 8 samples and 2 bounces... it'll give us around 160x120x8x8^2 rays with calculation (hm... I think this isn't right, correct me please - it's around 1:30 AM here, so I'm not really well slept). And they're incoherent (so we can't use a packet tracer). By "calculation" I mean computing N.L plus sending a shadow ray to each light... so in the end it'll be roughly (numberOfLights+1)*resolutionX*resolutionY*samples^bounces rays! Maybe you're asking why this is my leading horse when it sounds this unreal... but it isn't that unreal. If we had a good enough hierarchy, a fast enough PC (let's assume 16 CPU cores) and a well-optimised ray tracer, we could make it. I tried this with just 1 bounce (so no further bounces) on a pretty simple scene (several thousand triangles in a decent hierarchy) with one light - so ~153,600 rays. And that's possible with today's hardware.
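
Counting it roughly (and I might still be off): assume every hit point spawns n new rays for up to m bounces, plus one shadow ray per light per hit point. Then:

bounce rays  ~= width * height * (n + n^2 + ... + n^m)
shadow rays  ~= numberOfLights * width * height * (1 + n + ... + n^m)

With a 160x120 buffer, n = 8, m = 2 and one light that is about 1.4 million bounce rays plus 1.4 million shadow rays; with m = 1 the bounce rays alone are 160*120*8 = 153,600, which matches the number above.
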

*I apologize if something isn't understandable or correct - I've been more than 48 hours without sleep, so it might be less clear and there might be some mistakes (I really apologize if there are). I'm going to get some sleep now, so if something isn't understandable, please ask, and if there's a mistake, please correct it. Anyway, good night.

Quote:
Original post by Vilem Otte
I even released a demonstration not long ago (visit my web pages).


Vilem: I tried your demo but it always gives an error:
"Cannot create physics context" and then crashes..

Do you need some DLLs, or something like Ageia installed?

technique 2 - I still haven't had enough time to implement a proof-of-concept of this yet :(


I like your "Technique 3" because it's so simple ;)

Re problem 3/4 - The center probe hits the pillar, but the pillar is skinny.
X = far pixel, | = near pixel
depth:  probes:
XXX|XX 112233
XXX|XX 445566
XXX|XX 778899
The X's to the side of the pillar are going to use the same lighting info as the pillar, when the other probes would give a better result.


You've mentioned probe-blending to reduce the appearance of 9-square blocks. Perhaps this blending equation could take pixel/probe depth into account?

E.g. In my pic above, the X's to the left of the column would have depth values that are more similar to probe #1/4/7 than to probe #2/5/8. Using depth in the blending/weight algorithm might let these pixels use probe #1 instead of probe #2, etc...
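
Something along these lines perhaps (rough sketch with made-up names; the 4 surrounding probes are fetched with point sampling and blended manually instead of relying on the hardware filter):

// Rough sketch: blend the 4 surrounding screen probes manually, scaling each
// bilinear weight down when the probe's depth differs from the pixel's depth.
float4 DepthAwareBlend( float4 probeSH[4],     // SH sets of the 4 surrounding probes
                        float  probeDepth[4],  // depth stored with each probe
                        float  bilinearW[4],   // the normal bilinear weights
                        float  pixelDepth )
{
    float4 shSum = float4( 0.0, 0.0, 0.0, 0.0 );
    float  wSum  = 0.0;
    for( int i = 0; i < 4; i++ )
    {
        // probes at a similar depth to the pixel keep (almost) their full weight
        float w = bilinearW[i] / (1.0 + abs( probeDepth[i] - pixelDepth ));
        shSum  += probeSH[i] * w;
        wSum   += w;
    }
    return shSum / wSum;
}
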

@Vilem
48 hours no sleep?! Hop hop, into your bed mister!

It seems you're experimenting on the cutting edge with raytracers. Although I'd like to make an engine that can live through the next ~4-5 years (it's really tiring to re-program my hobby engine every 1-2 years), I still want to use some "present day" techniques -> rasterizing. I was thinking about using raytracing to check the incoming light at each probe/patch as well. Instead of rendering heavy cubeMaps (or Dual Paraboloid Maps) at each probe, maybe it would be possible to simplify and approximate the incoming light, based on other (nearby) sources. However, if you want to do this a little bit properly, you'd still need a lot of rays.

In the end, the main focus of my hobby project is to make a nice (horror) game, not to concentrate all my energy on just 1 technique. Sometimes I ask myself "why not just accept pre-calculated lightMaps?"; in many cases they even look better than the currently available GI tricks. But even the good old lightMap has limitations (static, can't be used for dynamic objects, hard to combine with normalMapping, etc.).

Like you said, the uniform grid works OK (except for a longer update cycle time) for (small) indoor scenes. When doing portal culling, I could make a grid for each chamber so I can adjust the grid density only where needed. And for outdoor scenes, well, do they really need such a grid? I think it's perfectly possible to fake it there (GTA IV, Crysis, ...), since you usually only have 1 light (and maybe 1 nearby lamp post).


But I'm getting a little off topic. I think I saw Vilem's reflected projected maps in a paper somewhere. Could that be correct (if so, could you link me to it)? The real question is: is it capable of handling complex scenes (many lights, moving objects, large scenes) and can it be used on dynamic objects? A lot of papers show a perfect new technique applied to the Cornell box. I'd like to see that Cornell box replaced by a Half-Life 2 level or something :)


@Hodgman
Technique 2 came indeed from that post :)

I was thinking about using the depth too, but I was not sure how. A relatively simple first check would be comparing the pixel-camera distance with the probe-camera distance. If the difference is too big, the pixel should shift its texcoords to another probe. But which probe? There is no guarantee that the neighbour probe is nearby; it could be even further away, and/or all probes could be far away in the worst case.

My very first idea with this grid was to make a string of probes along each ray. So instead of only placing probes at the ray intersection points, probes would be placed in between as well. However, this needs a whole lot more probes of course. And how the heck to blend here?

Come to think of it... I could make 2 screen grids: one at the intersection points, and a second one nearby. "Nearby" = the smallest distance between the camera and a probe intersection point. For example, if 8 rays hit at 100 meters and ray 9 already collides at 2 meters, the nearby probes will be placed at 2 meters from the camera. Or maybe using a fixed distance is fine as well (the fewer moving parts, the better). Anyway, when rendering, pixels have to choose/lerp between the 2 grid textures (each texture holds the 9 probe SH coefficients). This can be done with the probe distance:

uniform sampler2D probeDistanceMap;   // distance from camera to each far-grid probe
uniform sampler2D probeNearbyMap;     // SH coefficients of the nearby grid
uniform sampler2D probeFarMap;        // SH coefficients of the far grid
uniform float     mostNearbyProbeDistance;
...
float  camDist = length( pixelWorldPos - cameraPos );   // or the vertex position
float  range   = tex2D( probeDistanceMap, tx ).r - mostNearbyProbeDistance;
// 0 = pixel sits at the nearby grid, 1 = pixel sits at the far grid
float  weight  = saturate( (camDist - mostNearbyProbeDistance) / range );

float4 Pixel_SHCoefficient = lerp( tex2D( probeNearbyMap, tx ),
                                   tex2D( probeFarMap,    tx ), weight );

This could fix the problem of pixels using SH coefficients from probes at a completely different distance from the camera. You could maybe even adjust it so that most of the foreground uses the nearby grid, while the background uses the far grid. We have 18 probes to update now instead of 9, but the background texture can do with fewer updates per frame.

Slower hardware could maybe simplify it even further: only 1 (yes, 1) probe in front of the camera is used for all the nearby surfaces. Take the nearest collision point, shoot the center ray forward over that distance, and render the probe there. Everything nearby will use that probe. All the distant objects can make use of the far grid. Of course it's less accurate, but that ugly "9-square" grid becomes less visible.

Nevertheless, we still have the problem of lighting that changes all the time when the camera moves/rotates. I don't know if it's that much of a problem. If so, maybe we could render 4 probes around the camera at stationary positions. When moving far enough to the left, the probes on the right would be re-placed at new positions. The result for the nearby probe is then interpolated between the 4 surrounding probes. Maybe... maybe... it works.

Greetings and thanks for the interest,
Rick

#Matt Aufderheide - Sorry for not mentioning it on the pages; it needs Ageia PhysX installed (or rather the new NVIDIA PhysX).

#spek - About the reflected projected maps - I haven't written any paper about it, but I know of one paper about reflective shadow maps (LINK), although I don't know if it's the same solution as I described here. I don't know how their solution handles it, but the solution I described here can be applied to dynamic objects. It's not correct... but it looks much more real than without it, because it only works on several planes (mainly the important ones, like walls).

Someone said "the best way to compute is to precompute", so pre-computed light maps will always have the highest quality (until we have CPUs powerful enough to do correct photon mapping dynamically in real time). But it's nice to have something technically advanced - dynamic GI - that is also beautiful to look at.

About the probes and their update in your app - 3 seconds isn't that much, but when you open a door you still have to wait 3 seconds. My idea is this: what about doing some importance indexing? Update the "important" probes first and the less important ones (e.g. in corners) after. That might work (setting the importance to a value between 0 and 1 through a pre-computed 3D texture). I never tried this, but if you do, let me know if it works ;-)

@Vilem
Right now the probes are updated in a strict order: first xyz {0,0,0}, last {width,height,depth}. So when the map is generated, you can see the probes switching on like the lampposts in a street going on one by one. Funny, but not a good solution of course. The ATI demo improved on that by simply shuffling the update order randomly.

Like you said, you can certainly distinguish important probes from less important ones (although I guess corner probes are important as well). I was thinking about putting the probes into 3 classes:
1.- Never used (outside the world, in between walls, etc.). Always skip them
2.- Probes nearby a wall/static geometry. Highest update priority
3.- Probes floating free inside chambers (only used when a dynamic object is nearby). Don't update them unless an object is at that position. When an object moves, it can enable/disable probes.

To make it smarter, I would give priority to the probes near the camera. If I updated 6 probes per frame, I could for example update 4 nearby probes and 2 distant ones. Or even better, give priority to probes inside the view frustum. All that kind of stuff can certainly help.

But before I try that, I'll first try to breathe some new life into my "probe grid" technique. I'm thinking maybe just 1 probe near the camera and a few for the background could do the trick, and it might even be capable of doing multiple bounces. It's less accurate, but the result is what matters. The average gamer really won't be able to tell if the lighting sucks or not :)



I'd like to see the website/demo you and Matt are talking about. But what is the link? I tried http://www.otte.cz, but the Czech language... :) Where should I be?

Greetings,
Rick


Hm...
Try it here (it's in English): http://www.otte.cz/engine/index.php
And the demo is here - Demonstration

Anyway, it's a slightly older version; it uses light probes to accumulate indirect light (like radiosity, but with probes instead of the capture points from the "radiosity" solution described above), updated using both techniques - render buffers and raytracing (so I can update them even faster).
The probe solution in the demo has really low quality (I apologize for that, next time I'll create a better quality solution), but several hundred probes are updated within ~3-5 frames. The app will probably run slowly, because it uses raytracing. I don't remember how many probes that version updates per frame.

A little OT (but not that much, it's still about GI):
I'm now working on a world editor for the engine this demo runs on (I hope to finish the first version in a few days). The final version (a matter of weeks) will include several GI simulations, both precomputed and dynamic: precomputed radiosity, precomputed photon mapping (this one will be hard to implement), a precomputed probe technique (large storage on the HDD - OMG, I'll have to debug this technique correctly :-@), a dynamic probe technique and dynamic radiosity (described in one of my posts here).
If you have some interest in testing that editor (especially the GI parts), let me know via PM (I'll mention this once more when I post some images and info about the editor, so it's probably best to wait until after that ;)).

Hey, thanks for the links. I'm running into problems with the physics driver though: nvcuda.dll not found... I installed the Ageia driver, but maybe the wrong one?

Good luck with your next demo, I'm curious!

Rick
