
Realtime GI, questions and my findings (big post!)


Hi,

Not the first time I ask about realtime ambient lighting, and probably not the last time either :)

>>> Technique 1, uniform grid with probes & SH (from the ATI demo)

With the help from here, I finally managed to do realtime GI by rendering small cubeMaps at a lot of positions (a grid), 'compressing' them to Spherical Harmonic coefficients, and then using them in the final pass. Because I use a uniform grid (10x10x4 = 400 probes, each probe covers ~2 m^3), I can easily store the SH coefficients in 3D textures and later access them based on the pixel world position. AND the pixel will automatically blend between 8 coefficients, because the 3D texture benefits from linear filtering. Another big plus is that dynamic objects (characters) can make use of the same ambient lighting and access the SH coefficients in exactly the same way ('impossible' with lightmaps).

I update 8 probes per frame now. That will be reduced later, because a lot of other techniques have to be done as well. On the other hand, this doesn't have to run fast on current hardware. I'm quite happy with the results, and I can even do multiple bounces (although the framerate dramatically drops for each extra bounce of course). BUT, there are 4 major drawbacks (everyone experimenting with GI for the first time, read along with me here):

1.- The 10x10x4 grid is manageable for a relatively small scene, but what if the scene is a lot bigger (an outdoor scene, a large warehouse, etc.)? My 3D textures will explode, since they need to cover the entire scene (so far).

2.- The more probes, the more time needed before all probes are updated. Now it takes ~3 seconds before the entire scene is updated. Not a big problem in this particular case, but what if I suddenly switch a light off? The room could still be lit indirectly for ~2 seconds. I can help a little by giving priority to nearby probes. In the ideal situation I only update the probes that are affected somehow (door opened, light switched off, big object moved, ...). But it's very hard to know which probe(s) are affected, because it's INDIRECT lighting. Even more so if you want multiple bounces as well.

3.- As said, in the final pass the pixel picks its SH coefficients with just "tex3D( shCoeffTex, pixelWorldPos )". Works nice and fast, but there is also the risk of picking coefficients from behind a wall / below the floor / outside the world. The bigger the distance between 2 probes, the bigger the chance the wrong probe is chosen. For example, a thin wall is placed between 2 chambers with different lighting. The wall pixels are exactly in the middle of 2 probes, one behind and one in front of the wall. Because of the linear filtering, ~50% of the lighting on that pixel will actually come from the probe behind the wall -> light from the wrong chamber is used. We can fix this somewhat by shifting the pixel position forward (pixPos += pixNormal * gridSize/2); see the little snippet after this list. It doesn't fix all situations though, and can still produce wrong coordinates in narrow environments.

4.- A lot of the probes won't be used (most of the time). Probes are outside the world, or floating in the middle of a chamber while nothing has to be rendered there. A waste of memory and time. But I can't just remove them; maybe there will be an object at the unused probe later on. I could try to detect which probes are outside the world (never update them), which probes are near static geometry (always update them), and which are floating in empty space waiting for an object to come -> when an object moves, render the probes nearby. This helps the update cycle time, but the 3D texture is still filled with useless data, which costs precious memory.
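
For reference, the lookup in the final pass is basically this (a minimal Cg-style sketch; cellSize, gridOrigin, gridInvSize and shCoeffTex are placeholder names, and only the first two SH bands of one color channel are shown):

// world position -> 3D texture coords; the half-cell normal offset is the 'fix' mentioned in problem 3
float3 samplePos = pixelWorldPos + pixelNormal * (cellSize * 0.5);
float3 uvw       = (samplePos - gridOrigin) * gridInvSize;   // gridInvSize = 1 / total grid extents

// hardware trilinear filtering blends the 8 surrounding probes for free
float4 sh = tex3D( shCoeffTex, uvw );   // sh.x = band 0, sh.yzw = band 1 (one color channel)

// cheap 2-band irradiance estimate (assumes the linear coefficients are stored in x,y,z order)
float ambient = sh.x * 0.886 + dot( sh.yzw, pixelNormal ) * 1.023;
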
>>> Technique 2, manually placed probes at "smart" positions, with a blend map

The biggest problem is how to access all the probes, and where to store them. Someone here suggested earlier to manually place probes at "smart" locations. In many cases we really don't need that many probes for proper results, so a lot of memory can be saved here. But how to access the right probes (and blend between multiple probes)? He was thinking about a "blend map": R = probe 1 index, G = probe 2 index, B = blend weight 1, A = blend weight 2. So a pixel in the final pass first looks up in this blend map, and then picks the proper probes and blends between them. Could work, and it fixes problem 3 (picking the wrong probes). But we somehow need to store this blend texture as well. A simple top-down-view 2D texture is not enough, since our scene could have multiple storeys, or at least probes placed above each other. So we still need a 3D texture, and a big scene still requires a lot of memory for that map. Or we use a low-res blend map, but that re-introduces the "picking the wrong probe" problem. Another little drawback is that we have to manually place the probes and generate this blend map. I tried storing probe indices per vertex. Very compact, no need for a map. But... a "vertex lighting" look is so 1998.

>>> Technique 3, probe grid projected on the screen

I wanted to get rid of fixed probes / a grid, so I cooked up a completely different plan. Would it be the solution...? Haven't tried it in reality yet, but I'm afraid not. Anyway, how about ONLY placing probes at the geometry you actually see? The camera shoots 9 rays into the screen:
-----------------  probe 5 is the center of the screen (camera focus point)
|  1    2    3  |  probe 1 is topleft of the screen, etcetera
|               |
|  4    5    6  |
|               |
|  7    8    9  |
-----------------
The rays collide with a wall/floor/ceiling/object in front of the camera. At the intersection points, the 9 probes are placed and updated. Why 9? Well, the more the better, but I need to update ALL the probes as fast as possible, because the camera rotates and moves all the time of course. The 9 SH coefficients are stored in a nice tiny 3x3 texture (with linear filtering to allow blending between coefficients). In the final pass, pixels won't look into a 3D texture, but just pick the SH coefficient corresponding to their screen XY position (and blend with the neighbour coefficients). Nearby small objects make use of probe 10, which sits at the camera. Not 100% correct, but hey, we're talking about realtime GI here! At least we only need 9 probes (or more if the hardware allows), and all the other problems stated in technique 1 are fixed too.

This idea put a smile on my face while cycling to work. The best ideas always pop up when I cycle to work in the fresh early morning. However, my smile always disappears when I travel back home and re-think my ideas. 4 new problems here:

1.- No multiple bounces. You can't reflect light behind the camera... because there are no probes/SH coefficients there. Not a super disaster, since I'm already happy if I can get 1 bounce. But for the future...

2.- The camera rotates a lot. If the depth in front of me changes much, the lighting can suddenly change as well when I (slightly) rotate. Light on walls should look 'static'; it should not change every time I move or rotate. I could soften this with a time delay (let the probe move towards its new position instead of directly placing it there), but it still changes.

3.- I'm afraid the 9 probes will produce 9 noticeable "squares" on the screen. You won't see it if the lighting is the same everywhere, but as soon as the lighting changes a lot locally, I get 9 different results = 9 differently filled squares on my screen. Of course the blending between the probes will blur this, but I doubt it's enough. The only way to fix this is to render MUCH more probes in front of the camera (giving each a smaller section of the screen). But how the heck can I update them all fast enough? Simple answer: I can't. Well, not for now.

4.- Imagine a very long corridor, and a small pillar in the middle near the camera. When looking straight forward, the center probe on the screen will collide with the pillar and thus be rendered there. That's ok for the pillar, but not for the neighbouring pixels which are a far distance behind the pillar. When I move a little bit to the left, the center probe shoots into the corridor, which will give wrong lighting on the pillar (and that "changing lighting" effect discussed in problem #2).

I think we'll have to accept there will always be drawbacks. But some of the problems are fatal, making all 3 techniques impractical for somewhat complex scenes (like we wish for in games). Unless brilliant minds here have smart solutions for those problems, or a technique #4.... :)

Still awake? Thanks for reading!
Rick

Hi spek,

I've been working on real-time global illumination for some time. Because I'm now using hybrid rendering - rasterization & raytracing - I decided to try some dynamic global illumination with this technology. I even released a demonstration not long ago (visit my web pages).
First I'll say something about your 1st and 3rd techniques. I tried them some time ago. The 1st technique worked very well for smaller scenes, mainly interiors. I even tried some exteriors with it (but it only worked for small scenes, and the update wasn't immediate). I presume that you used geometry shaders to calculate this; that's not the fastest way IMHO (well, in rasterizers yes ... in raytracers no). I used raytracing in a hierarchy - a KD-tree (the scene was static, but I think it can be ported to dynamic scenes using BVHs) - to get the values into the 3D texture, and I was able to update more probes ... several hundred of them per second, though the raytracer wasn't optimised at the time and had memory leaks, so nothing special (I think that with optimisation it could even do thousands or tens of thousands of probes per second). So it could be a good solution even for huge exterior scenes.
Your third solution would produce just nine huge squares on screen, but if you used, say, 160x120 probes at a 320x240 resolution, it'd give a pretty realistic result. That would mean about 19,200 probe updates per frame though (over a million updates per second at 60 fps) ... and that's really a lot for today's hardware.

Ok, let's talk about another ... let's call it:
>>> Technique 4, reflective projected maps

This is more of a fake approach than the first two named, but it works (I tried to use it once, even successfully, but my GPU was too slow for it).
Anyway, I'll describe this approach just for direct (spot) lights; for omni lights it might be a little harder (and a much bigger performance eater).
Let's assume we've got three walls and a floor (e.g. the left wall is yellow, the right wall is red, the far wall is blue and the floor is white), and we're lighting each of them with a 90° spotlight. Now we've got the projected scene buffer. Divide the screen into 4 quarter screens and take the middle point of each one. Make a new light at each of these middle points and render a new texture over the hemisphere above it (use paraboloid mapping), with several calculations performed on it. Then project it onto the scene again. And we've got reflected indirect light.
I know this can sound like nonsense babble - I'm not a native English speaker - so be kind and ask if you don't understand something.
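
To make the last step a bit more concrete, the "project it onto the scene again" pass could look roughly like this (Cg-style sketch; bounceLightPos, bounceLightMatrix and bounceMap are made-up names for the quarter-screen middle point, the world-to-bounce-light transform and the hemisphere texture rendered from it):

// Sketch: add the indirect contribution of one "bounce light" to a scene pixel.
float3 AddBounceLight( float3 worldPos, float3 worldNormal,
                       uniform float3    bounceLightPos,
                       uniform float4x4  bounceLightMatrix,   // world -> bounce light space (+z = hemisphere)
                       uniform sampler2D bounceMap )
{
    // direction from the bounce light to this pixel, in the bounce light's space
    float3 dir  = mul( bounceLightMatrix, float4( worldPos, 1 ) ).xyz;
    float  dist = length( dir );
    dir /= dist;

    // paraboloid lookup for the front hemisphere (only valid for dir.z >= 0)
    float2 uv     = dir.xy / (1.0 + dir.z) * 0.5 + 0.5;
    float3 bounce = tex2D( bounceMap, uv ).rgb;

    // simple N.L and distance attenuation for the indirect light
    float3 L   = normalize( bounceLightPos - worldPos );
    float  att = saturate( dot( worldNormal, L ) ) / (1.0 + dist * dist);
    return bounce * att;
}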

And here's my leading horse
>>> Technique 5, dynamic "radiosity" to simulate GI (it's not correct radiosity, but it's similar)

Ok, almost everyone knows what radiosity is - I'm calculating the light arriving at some point, its color and direction (similar to probes). BUT I'm doing it like this:

Let's assume that n is the number of samples and m the number of bounces (we begin at m = 1):

I trace the first ray
|
From the hitpoint I trace 'n' rays into the scene and accumulate color/light/shadow from their hitpoints
|
Now I've got more hitpoints, so I can do step two again (recursive bounces) and increase 'm'
|
...
|
I end the recursion when I reach the maximum number of bounces, BUT I get a really realistic solution.

This technique looks really good and correct, but it has several downsides. If we have a 160x120 buffer for this "radiosity" and use 8 samples and 2 bounces, that gives us 160x120x8^2 ≈ 1.2 million gather rays for the last bounce alone. And they're incoherent (so we can't use a packet tracer). By "with calculation" I mean computing N.L plus sending a shadow ray to each light, so in the end it's roughly (numberOfLights+1) * resolutionX * resolutionY * samples^bounces rays. Maybe you're asking why this is my leading horse, because it sounds unreal ... but it isn't that unreal. If we had a good enough hierarchy, a fast enough PC (let's assume 16 CPU cores) and a well-optimised ray tracer, we could make it. I tried this with just 1 bounce (so no further bounces) on a pretty simple scene (several thousand triangles in a decent hierarchy) and one light - so ~153,600 rays. And that's possible with today's hardware.

*I apologize if something is not understandable or not correct - I've been awake for more than 48 hours, so this post might be less clear and contain some mistakes. I'm going to get some sleep now; if something isn't understandable, please ask, and if there's a mistake, please correct it. Anyway, good night.

Quote:
Original post by Vilem Otte
I even released a demonstration not long ago (visit my web pages).


Vilem: I tried your demo but it always gives an error:
"Cannot create physics context" and then crashes..

Do you need some DLLs, or something like Ageia installed?

Technique 2 - I still haven't had enough time to implement a proof of concept of this yet :(


I like your "Technique 3" because it's so simple ;)

Re problem 3/4 - The center probe hits the pillar, but the pillar is skinny.
X = far pixel, | = near pixel
depth:  probes:
XXX|XX 112233
XXX|XX 445566
XXX|XX 778899
The X's to the side of the pillar are going to use the same lighting info as the pillar, when the other probes would give a better result.


You've mentioned probe-blending to reduce the appearance of 9-square blocks. Perhaps this blending equation could take pixel/probe depth into account?

E.g. In my pic above, the X's to the left of the column would have depth values that are more similar to probe #1/4/7 than to probe #2/5/8. Using depth in the blending/weight algorithm might let these pixels use probe #1 instead of probe #2, etc...
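
Something like this, maybe (Cg-style sketch; probeSHMap is your 3x3 SH texture and probeDepthMap is an assumed extra 3x3 texture storing each probe ray's hit distance):

// Manual bilinear over the 3x3 probe grid, with each probe's weight reduced
// when its hit depth differs a lot from the pixel's depth (bilateral-style).
float4 FetchProbeSH( float2 screenUV, float pixelDepth,
                     uniform sampler2D probeSHMap,      // 3x3, one SH coefficient set per texel
                     uniform sampler2D probeDepthMap )  // 3x3, hit distance of each probe ray
{
    float2 gridPos = screenUV * 3.0 - 0.5;   // position in 3x3 probe space
    float2 base    = floor( gridPos );
    float2 f       = gridPos - base;

    float4 sum    = 0;
    float  totalW = 0;
    for (int j = 0; j < 2; j++)
      for (int i = 0; i < 2; i++)
      {
          // clamp addressing is assumed at the screen borders
          float2 uv     = (base + float2(i, j) + 0.5) / 3.0;
          float  pDepth = tex2Dlod( probeDepthMap, float4(uv, 0, 0) ).r;
          float  wBilin = (i == 1 ? f.x : 1.0 - f.x) * (j == 1 ? f.y : 1.0 - f.y);
          float  wDepth = 1.0 / (abs( pDepth - pixelDepth ) + 0.01);
          float  w      = wBilin * wDepth;
          sum    += tex2Dlod( probeSHMap, float4(uv, 0, 0) ) * w;
          totalW += w;
      }
    return sum / totalW;
}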

@Vilem
48 hours no sleep?! Hop hop, into your bed mister!

It seems you're experimenting on the cutting edge with raytracers. Although I'd like to make an engine that can live through the next ~4-5 years (it's really tiring to re-program my hobby engine every 1-2 years), I still want to use 'present day' techniques -> rasterizing. I was thinking about using raytracing to check the incoming light for probes/patches as well. Instead of rendering heavy cubeMaps (or dual-paraboloid maps) at each probe, maybe it would be possible to simplify and approximate the incoming light, based on other (nearby) sources. However, if you want to do this a little bit properly, you'd still need a lot of rays.

In the end, the main focus of my hobby project is to make a nice (horror) game, not to concentrate all my energy on just 1 technique. Sometimes I ask myself "why not just accept pre-calculated lightMaps?"; in many cases they even look better than the currently available GI tricks. But even the good old lightMap has limitations (static, can't be used for dynamic objects, hard to combine with normalMapping, etc.).

Like you said, the uniform grid works ok (except for a longer update cycle time) for (small) indoor scenes. When doing portal culling, I could make a grid for each chamber so I can adjust the grid density only where needed. And for outdoor scenes, well, do they really need such a grid? I think it's perfectly possible to fake it there (GTA IV, Crysis, ...), since you usually only have 1 light (and maybe 1 nearby lamp post).

But I'm getting a little bit off topic. I think I saw Vilem's reflected projected maps in a paper somewhere. Could that be correct (if so, could you link me to it)? The real question is: is it capable of handling complex scenes (many lights, moving objects, large scenes), and can it be used on dynamic objects? A lot of papers show a perfect new technique applied to the Cornell box. I'd like to see that Cornell box replaced by a Half-Life 2 level or something :)


@Hodgeman
Technique 2 came indeed from that post :)

I was thinking about using the depth too, but I wasn't sure how. A relatively simple first check would be comparing the pixel-camera distance with the probe-camera distance. If the difference is too big, the pixel should shift its texture coordinates to another probe. But which probe? There is no guarantee the neighbouring probe is nearby. It could be even further away, and/or all probes are far away in the worst case.

My very first idea with this grid was to make a string of probes along each ray. So instead of only placing probes at the ray intersection points, probes would be placed in between as well. However, this needs a whole lot more probes of course. And how the heck to blend here?

Come to think of it... I could make 2 screen grids. One at the intersection points, and a second one nearby. "Nearby" = the smallest distance between the camera and a probe intersection point. For example, if 8 rays hit 100 meters away and ray 9 already collides at 2 meters, the nearby probes will be at 2 meters distance from the camera. Or maybe a fixed distance is fine as well (the fewer moving parts, the better). Anyway, when rendering, pixels have to choose/lerp between 2 grid textures (each texture holds the 9 probe SH coefficients). This can be done with the probe distance:

uniform sampler2D probeDistanceMap;   // distance of the far probe at this screen position
uniform sampler2D probeNearbyMap;     // SH coefficients of the nearby grid
uniform sampler2D probeFarMap;        // SH coefficients of the far grid
uniform float     mostNearbyProbeDistance;
...
float camDist = length( pixelWorldPos - cameraPos );
float range   = tex2D( probeDistanceMap, tx ).r - mostNearbyProbeDistance;
float blend   = saturate( (camDist - mostNearbyProbeDistance) / range );

float4 pixelSHCoefficient = lerp( tex2D( probeNearbyMap, tx ),
                                  tex2D( probeFarMap,    tx ), blend );

This could fix the problem of pixels using SH coefficients from probes at a completely different distance from the camera. You could maybe even adjust it so that most of the foreground uses the nearby grid, while the background uses the far grid. We have 18 probes to update now instead of 9, but the background texture can do with fewer updates per frame.

Slower hardware could maybe even simplify it further: only 1 (yes, 1) probe in front of the camera is used for all nearby surfaces. Take the most nearby collision point, shoot the center ray forward with that distance, and render the probe there. Everything nearby uses that probe; all the distant objects use the far grid. Of course, less accurate. But that stupid "9-square" grid becomes less visible.

Nevertheless, we still have the problem of lighting that changes all the time when the camera moves/rotates. I don't know if it's that much of a problem. If so, maybe we could render 4 probes around the camera at stationary positions. When moving far enough to the left, the right probes get placed at new positions. The result of the nearby probe is interpolated between the 4 surrounding probes. Maybe... maybe... it works.

Greetings and thanks for the interest,
Rick

#Matt Aufderheide - Sorry for not mentioning it on the pages: it needs Ageia PhysX installed (or the newer NVIDIA PhysX).

#spek - About the reflected projected maps - I haven't written any paper about it, but I know one paper about reflective shadow maps LINK, although I don't know if it's the same solution as I've described here. Anyway, I don't know how their solution handles it, but the solution I described here can be applied to dynamic objects. It's not correct ... but it looks much more real than without it, because it works just for several planes (mainly the important ones like walls).

Someone said "the best way to compute is to precompute", so pre-computed light maps will always have the highest quality (until we have CPUs powerful enough to do correct photon mapping dynamically in real time). But it's nice to have something technically advanced = dynamic GI, and beautiful when you look at it.

About the probes and their update in your app - 3 seconds isn't that much, but when you open a door you have to wait 3 seconds. My idea is this: what about doing some importance indexing? Do the "important" probes first and the less important ones (e.g. in corners) after? That might work (setting the importance to a value between 0 and 1 through a pre-computed 3D texture). I never tried this, but if you do, let me know if it works ;-)

@Vilem
Right now the probes are updated in perfectly ordered strokes: first xyz {0,0,0}, last {width,height,depth}. So while the map is generated, you can see the probes switching on one by one, like all the lampposts in a street going on in sequence. Funny, but not a good solution of course. The ATI demo improved on that by simply shuffling the update order randomly.

Like you said, you can certainly distinguish important probes from less important ones (although I guess corner probes are important as well). I was thinking about putting the probes in 3 classes:
1.- Never used (outside the world, in between walls, etc.). Always skip them.
2.- Probes near a wall / static geometry. Highest update priority.
3.- Probes floating free inside chambers (only used when a dynamic object is nearby). Don't update them unless an object is at that position. When an object moves, it can enable/disable probes.

To make it smarter, I would give priority to the probes near the camera. If I update 6 probes per frame, I could for example update 4 nearby probes and 2 distant probes. Or even better, give priority to probes inside the view frustum. All that kind of stuff can certainly help.

But before I try that, I'll first try to breathe some new life into my "probe grid" technique. I'm thinking maybe just 1 probe near the camera, and a few for the background, could do the trick. And it would even be capable of multiple bounces. It's less accurate, but the result is what matters. The average gamer really won't be able to tell if the lighting sucks or not :)



I'd like to see the website/demo you and Matt are talking about, but what is the link? I tried http://www.otte.cz , but the Czech language... :) Where should I look?

Greetings,
Rick


Hm...
Try it here (it's in English): http://www.otte.cz/engine/index.php
And the demo is here - Demonstration

Anyway, it's a slightly older version, because it uses light probes to accumulate the indirect light (like radiosity, but with fixed light probes instead of the capture points of the "radiosity" solution I described), updated using both techniques - render buffers and raytracing (so I can update them even faster).
The probe solution in the demo has really low quality (I apologize for that, next time I'll make a better quality solution), but several hundred probes are updated every ~3 - ~5 frames. The app will probably run slowly though, because it uses raytracing. I don't remember how many probes that version updates per frame.

A little OT (but not so much, it's still about GI):
I'm now working (and I hope to finish the first version in a few days) on a world editor for the engine this demo is running on. The final version (a matter of weeks to complete) will include several GI simulations (both precomputed and dynamic) - precomputed radiosity, precomputed photon mapping (this one will be hard to implement), a precomputed probe technique (large storage on HDD - OMG, I'll have to debug this technique properly :-@), a dynamic probe technique and dynamic radiosity (described in one of my posts here).
If you're interested in testing that editor (especially some of the GI parts), let me know via PM (I'll mention this again when posting some images and info about the editor, so it's probably best to wait until then ;)).

Hey, thanks for the links. I'm running into problems with the physics driver though: nvcuda.dll not found... I installed the Ageia driver, but maybe the wrong one?

Success with your next demo, I'm curious!

Rick

Hello again,

I implemented spherical harmonic lighting, but used 1 probe that follows the camera instead of a huge static grid. I know the GI won't be correct since it's only measured at 1 point, but I was hoping to get proper results for nearby/local lighting. But unfortunately...

So what I did was render a small (16x16) cubeMap at the camera position. The cubeMap captures the direct lighting around it. This cubemap is converted to SH coefficients, and these are used in the final pass to add ambient lighting.
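
For the "used in the final pass" part, I evaluate the irradiance from the 9 coefficients roughly with the standard Ramamoorthi/Hanrahan formula (Cg-style sketch; the L[] array and its coefficient order are just placeholders for however you store them):

// Irradiance from 9 SH coefficients (Ramamoorthi & Hanrahan constants).
// L[0..8] = SH-projected cubemap, one float3 (RGB) per coefficient, in the order:
// L00, L1-1, L10, L11, L2-2, L2-1, L20, L21, L22   (an assumption about storage order)
float3 SHIrradiance( float3 n, uniform float3 L[9] )
{
    const float c1 = 0.429043, c2 = 0.511664, c3 = 0.743125, c4 = 0.886227, c5 = 0.247708;
    float x = n.x, y = n.y, z = n.z;
    return  c1 * L[8] * (x*x - y*y)
          + c3 * L[6] * z*z
          + c4 * L[0]
          - c5 * L[6]
          + 2.0 * c1 * ( L[4]*x*y + L[7]*x*z + L[5]*y*z )
          + 2.0 * c2 * ( L[3]*x + L[1]*y + L[2]*z );
}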



I ran into a problem I already had in the past when making a radiosity lightmap generator: the light 'dies' very quickly. Imagine a red light shining on a wall in front of you. When you render the cubeMap near that wall, at least 1 face will be ~100% red. So walls in the opposite direction will get this red color via the ambient lighting. But when I move away from that wall, the weight of the red pixels captured by the cubeMap decreases very fast, since other (black) pixels come in from other walls, the floor, the ceiling, etc.

This simulates attenuation, which is good. But the light influence falls off too quickly. Only very nearby geometry catches the "reflected" light from the lit wall. I can "fix" this a little by making distant pixels brighter, but that's not a real solution of course.

Anyway, the result is a way too dark scene. Only when I walk inside a light beam does the scenery suddenly light up. And not smoothly either. I think the small cubeMap resolution and/or the relatively low SH quality makes the lighting change quite abruptly when moving around.

Now I was hoping someone has experience with (local) nearby ambient lighting using just 1 probe that moves along with the camera. Maybe there are some tricks to improve the results? And what to do about the attenuation problem? I mean, those who made lightmap generators that use the GPU instead of raytracing probably faced the same problem as well. And if there are no real solutions... well, then at least we know this technique does not work for realtime (local) GI :(

Greetings,
Rick

Hi Spek,

I've been following this thread for a while, as I'm attempting something similar, albeit a lot simpler, for a demo I'm working on.

I'm working on a low-frequency ambient lighting solution based on a paper I saw a while back which stores static ambient occlusion per vertex, along with bent normals. The data is captured in a pre-process step using cosine-filtered hemicubes.

Anyway, I render a skydome to a hi-res floating-point cubemap, downsample it to a low-res 16x16x16 cubemap, and then render another low-res cubemap from an object's/character's point of view, rendering the low-res skydome cubemap and the scene geometry with ambient and direct lighting applied. When rendering this cubemap, I apply a 3x3 blur by rotating the cubemap lookup vector using rotation matrices.
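
The blur boils down to something like this (Cg-style sketch with made-up names; instead of building explicit rotation matrices I nudge the lookup vector in a small tangent frame, which amounts to the same small rotations):

// Average 3x3 cubemap lookups around 'dir', offsetting the vector slightly
// in a tangent frame instead of building explicit rotation matrices.
float3 BlurredCubeLookup( uniform samplerCUBE envMap, float3 dir, float blurAmount )
{
    // build any tangent frame around the lookup direction
    float3 up = abs( dir.y ) < 0.99 ? float3(0, 1, 0) : float3(1, 0, 0);
    float3 t  = normalize( cross( up, dir ) );
    float3 b  = cross( dir, t );

    float3 sum = 0;
    for (int j = -1; j <= 1; j++)
      for (int i = -1; i <= 1; i++)
      {
          float3 d = normalize( dir + (t * i + b * j) * blurAmount );
          sum += texCUBE( envMap, d ).rgb;
      }
    return sum / 9.0;
}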

I only implemented this in the last few days, but for my test scene, which is a room with a couple of windows and a couple of omni-shadowed lights, I think the results are pretty good. I definitely think the blurring is the thing that makes the difference. I'll try to post some screenshots when I get home from work.




Hi _Lopez

I'd love to see some shots! But let's see if I understand your approach a little. You calculate the occlusion factor per vertex (you could eventually also do that in a lightMap for more detail). I don't know what bent normals are though. I've heard of them, maybe even used them before, but I forgot what exactly they are.

To get your (realtime) lighting, you render half a cubeMap. Where exactly do you render these? At the vertex positions, using the vertex normal as a direction? Do you use the cosine map to give light coming from straight ahead more influence than light coming in at a steep angle? If so, this sounds a lot like how I captured data to generate a (static) lightmap a long time ago. I had a couple of problems with that though. Like I described one post earlier, the influence from a (reflected) light source fades very quickly when the distance between the reflection spot and the receiver (the hemicube in this case) grows. 2 other problems are using per-pixel normals, and using the data for dynamic/moving objects.

Blurring might help spread out the light and make the lighting change less abruptly when the measuring point moves. But if I understand it so far, this technique is not suitable for normalMapping (unless multiple hemicubes are used to measure from multiple directions, like Half-Life 2 did). What I don't understand is why you use ambient occlusion values per vertex here. In theory you won't need them if the hemicubes (or whatever we use to capture the environment) measure properly. Or maybe you only use a few hemicubes, and mix them with ambient occlusion to increase detail? In that case I'd like to know where your cubes are placed. And a couple of screenshots, please :) I like to see your results!

Greetings,
Rick

#spek - AFAIK bent normals are the vectors I get when I compute proper ambient occlusion in my ray tracer (which can even be real time for less complex scenes). A bent normal is the average direction of the unoccluded samples taken while sampling ambient occlusion. It's mainly used as a lookup vector for image-based lighting.

Anyway, when I finish the work I have to do now (finishing the world editor, porting my engine to other platforms, writing some decent documentation for its library, etc. - it's quite a lot of work, but I'm not working alone), I'm thinking about looking deeply into hybrid rendering (or even some pure ray tracing) and using a ray tracer to approximate correct ambient occlusion (well, almost correct, but not a fake solution like SSAO) or even correct indirect illumination. Of course dynamically, in real time (this might sound like a slightly crazy idea, but I've experimented with hybrid approaches - and even pure ray tracing - before, and I'm still experimenting).

Hi Vilem,

Good solutions always start with crazy ideas. I never did anything with raytracing, so I'm not sure what you're doing exactly. Nevertheless, show us the results when you have something working :) By the way, I'm kind of 'afraid' that I'll have to learn raytracing some day as well. Not that raytracing is a bad thing, but I don't like the idea of having to make and learn some radical changes someday. So much to do, but so little time. Just like you I've got work, and I expect a little copy of me in 2 or 3 weeks. Yep, changing diapers instead of changing rendering techniques :) If I could ask God for 1 thing, I'd like to have 48 hours per day instead of 24.

But in the little time I still have, I think I'll try realtime radiosity. It's somewhat the same as the "probe grid" approach. I make some sort of lightMap for the static geometry. Not a high-res one; one pixel in the lightMap could easily cover 1 m², or maybe even more. I render nearby patches into the lightMap by placing a probe at the patch position, looking along the surface normal. I could do this with hemicubes, or with paraboloid mapping. With a paraboloid map, I only need to render once per patch.

First I render the direct lighting at a low resolution, maybe with a downscale/blur pass, like _Lopez said. From this map, one average incoming color is calculated with the help of the cosine law. The results are stored in a realtime lightMap. That means if I update 8 patches per frame, 8 new dots are drawn onto that map. When I do a second bounce, I render the surrounding environment with the lightMap from pass 1. The final pass outputs not 1, but 3 or 4 colors into 3 or 4 final lightMaps: for each global light direction, I calculate the color and store it in a lightMap. This is similar to Half-Life 2's Radiosity NormalMapping. It allows me to do normalMapping in the final pass, which blends between the 3 or 4 lightMaps depending on the pixel normal.
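
The blend in the final pass would look something like this (Cg-style sketch using the well-known Half-Life 2 tangent-space basis; the lightMap samplers and atlasUV are placeholder names):

// Blend the 3 directional lightmaps with the Half-Life 2 basis.
// tangentNormal = normal from the normal map, in tangent space.
float3 RadiosityNormalMapping( float2 atlasUV, float3 tangentNormal,
                               uniform sampler2D lightMap0,
                               uniform sampler2D lightMap1,
                               uniform sampler2D lightMap2 )
{
    // the three HL2 basis directions in tangent space
    const float3 basis0 = float3( -0.40825,  0.70711, 0.57735 );
    const float3 basis1 = float3( -0.40825, -0.70711, 0.57735 );
    const float3 basis2 = float3(  0.81650,  0.0,     0.57735 );

    // squared cosine weights, normalized so they sum to 1
    float3 w = float3( saturate( dot( tangentNormal, basis0 ) ),
                       saturate( dot( tangentNormal, basis1 ) ),
                       saturate( dot( tangentNormal, basis2 ) ) );
    w *= w;
    w /= dot( w, float3(1, 1, 1) );

    return tex2D( lightMap0, atlasUV ).rgb * w.x
         + tex2D( lightMap1, atlasUV ).rgb * w.y
         + tex2D( lightMap2, atlasUV ).rgb * w.z;
}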

Advantages
+ No unused probes wasting time/memory. All patches are connected to the static geometry.

+ Suitable for large (outdoor) scenes, without requiring gigantic amounts of memory like the 3D textures storing SH coefficients in the ATI demo did. The level of detail in lightmaps is very easy to adjust; a terrain can do with a relatively low resolution, for example. If we use 512x512 lightmaps and 2 bounces, we need 1 (pass 1) + 3 (final pass, 3 light directions) = 4 maps (RGBA 16F) ≈ 8 MB.

+ Fewer patches means less to update before the entire world is refreshed.

+ Suitable for multiple bounces, and relatively easy as well, since we only have to use the previous lightMap when capturing the environment for a second (or third, or ...) time.

+ The final pass is faster than decoding and using SH coefficients. Just pick 3 or 4 pixels from the lightMaps and blend between them based on the (pixel) normal.

+ Can capture the environment for the patches in a single pass if paraboloid mapping is used (see the little vertex shader sketch after this list). Dual paraboloid is not needed, since the backside will never be used.

+ No pre-calculations needed, except that we'll have to calculate atlas texture coordinates for the static geometry. But this can be done very fast. The actual lightmaps will be created in realtime. Or... on a low-end system, the lightmaps are only updated once when loading the scene. That's not realtime of course, but at least we don't need a completely different approach when switching off realtime ambient lighting.
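
For that single-pass capture, the vertex shader of the patch-capture pass would do roughly this (Cg-style sketch; patchMatrix, nearPlane and farPlane are placeholder names, and I assume the static geometry is already in world space):

// Paraboloid projection for capturing the hemisphere above a patch in one pass.
// The projection is non-linear and done per vertex, which is why the geometry
// needs to be finely tessellated; vertices behind the hemisphere should be culled.
void ParaboloidCaptureVS( float4 worldPos : POSITION,
                          uniform float4x4 patchMatrix,   // world -> patch space (+z = patch normal)
                          uniform float    nearPlane,
                          uniform float    farPlane,
                          out float4 outPos : POSITION )
{
    float3 p = mul( patchMatrix, worldPos ).xyz;
    float  d = length( p );
    p /= d;                                   // direction from the patch to the vertex

    outPos.xy = p.xy / (1.0 + p.z);           // paraboloid projection of the front hemisphere
    outPos.z  = (d - nearPlane) / (farPlane - nearPlane);
    outPos.w  = 1.0;
}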


Disadvantages
- NormalMapping is somewhat less accurate. But... the HL2 results were not bad, were they? These results should be roughly the same (although with less accurate overall lighting, I think).
- Paraboloid mapping requires a highly tessellated version of the static geometry.
- Even with fewer patches, it still takes time before the entire scene has been updated.
- If the static geometry changes (so not really static :) ), you'll need to recalculate the patch positions as well.
- Lightmaps can't be used on dynamic objects.
- Somewhat more difficult to determine which patches are close to the camera and thus need a higher update priority.


The biggest problems are the highly tessellated scene and the lack of support for dynamic objects. I'm figuring out a way to handle those in realtime as well, probably by placing cubeMaps (or dual-paraboloid maps) near objects and updating a couple of them per frame. The nice thing about the whole lightmap approach is that we can render these cubeMaps quickly too: just render the environment with the final lightMaps, and voila, we have a cubeMap that can be used for the dynamic objects. However, placing them dynamically is another story...

Ok, let's go, time to code :)
Rick

So spek - I can't reach the PC where I have all the data (I'm not at home), so I quickly wrote a raytracer on my notebook (took 5 hours).
It shows ambient occlusion in almost real time (single core only) - you need at least 2 GHz for it to be interactive; use the arrow keys to rotate the view and WASD to move. Try to focus on the jumping box, which casts a dynamic shadow using distributed ray tracing (just 1 sample) and ambient occlusion (just 4 samples). This laptop has just 1.3 GHz, so it's a little nightmarish to try raytracing here (god bless my home PC with a dual-core CPU at 2.8 GHz).
Raytracer
It was built in a hurry just to show the idea - so no optimisations (except my ray-triangle intersection, which is highly optimised; no SIMD and no ASM, just pure C code). It's without source.
Anyway, I'm going to bed again (it's almost 5 AM ... I have to get up in 3 hours :D).

Minimal requirements:
CPU: x86-compatible, 32-bit instruction set (you can try even an old Pentium, it'll run - but very, very slowly; I recommend at least 1 GHz for non-realtime, 2 GHz for interactive, and 3.8 GHz or more for something close to real time).
Memory: well, the app and its data take around 2 MB in memory :D, but I presume the Windows kernel will take much more.
Graphics card: any - we just need something to send our raytraced images to the screen.
System: Windows only (it might even run on Win 95, but I tested on XP).

EDIT: The ambient occlusion is true and correct (maybe too strong, but correct), and it's calculated per pixel. Enjoy and comment :D (And now I'll jump into my bed :-P ... hop, zzzzz zzzzz ...)

What can I say... it works! Not very fast, but then again, my laptop is not really a "game computer": dual core, 1.6 GHz, and a GeForce Go 7600 video card (but I guess the video card is not really important here). The box drops a nice dark shadow beneath it. But the other (big) shadow - does that also come from ambient occlusion, or is it cast from the box by a direct light?


I had a productive night as well: I implemented the realtime radiosity. Far from perfect, but it's starting to work now. And I'm pleased to say that updating the patches is very fast. I always avoided realtime radiosity, because the last time I made an offline radiosity lightmap generator, it took ~5 hours to create a nice map. But since I can do with far less detail here (no need for direct lighting in the lightmap), and I optimized the generator MUCH better, I can update the entire scene lightMap in ~10 seconds. I'm not sure if it was a 32x32 lightmap or 64x64. Bigger scenes probably need a bigger map, although the current scene has too much detail in the lightMap anyway; fewer patches should work as well.

I update 10 patches per frame now, and the entire thing still runs at 70 FPS, including DoF, SSAO, a few spotlights with shadowMaps, parallax mapping, HDR & tone mapping, etc. Not bad? I think I could easily do 20 or even 30 patches per frame as well. But I'd rather put that energy into multiple bounces.

No matter what GI method you use, I'm starting to think rendering multiple bounces is really one of the keys to success. With just 1 indirect lighting bounce, the lighting did not look good for any of the techniques I've tried so far. Usually it results in a scene that is still black, except for the floor where the direct lighting is active, and the ceiling above it which catches the light indirectly. Luckily the approach I'm trying now is very suitable for multiple bounces. And the final lightMaps can easily be blurred as well, to remove noise and artifacts.

In this case I only need 4 tiny lightmaps for the ambient lighting with 1 extra bounce, so the huge memory requirements of the 3D grids are fixed here as well. So far, so good. Now I have to make the results better; they're still... ugly. 1 or 2 extra bounces, enabling normalMapping, checking if all the patches are really rendered properly, blurring the final lightMap. And using dual-paraboloid maps for the patches instead of just lazily rendering forward with a 90 degree FoV.



If this works properly, I'll have to make a proper update method. Updating patches is lightning fast, but I need to update a whole mountain of them as well. I was thinking about making a 3D grid (on the CPU side, not for the video card) that stores the indices of nearby patches. This way I can update patches near the camera first. Even better would be updating only where the movement is (big object moved, light changed, etc.). But it's damn hard to determine which patches are affected by that!

And then we still have the dynamic objects. I'll have to find an alternative method for them. However, the final lightmaps can still be used to render the environment around an object. Just catch that light, and we have a probe for a dynamic object. Not 100% accurate, but oh well... A bigger problem is where to place these probes. In the ideal situation each object would have its own probe, but that will be way too much; we can only update a few of them per frame anyway. I was thinking of using a grid again. If an object enters a cell, that cell is activated, which means a probe is rendered there. All the objects inside that cell share the same probe. Only nearby cells can be activated, otherwise I'd still have to deal with hundreds of cells in a worst-case scenario.

I hope I can post a screenshot soon! A vacation card from Renderland.
Rick

So after demo time, some screenshot time. I made it home today, so on my computer it looks something like this (shadows - 8 samples, ambient occlusion - 4 samples, and 4 samples of supersampling (somehow I haven't chosen the best value, so the supersampling isn't the best) ... summed up that gives me 52 ray samples per pixel, plus the bilinear texture filtering math, plus some shading math and ambient occlusion math (that's quite a lot of divisions)), and it's interactive (without any optimisation). If I added a bounding volume hierarchy I could get more fps - maybe even ~50 or ~60 fps (because right now I'm testing every ray against every triangle - triangles = 14 (12 on the box, 2 on the floor)).
About the shadow - yes, you can see ambient occlusion under the box; the other occlusion (shadow) is from an omni area light source. With these 8 samples it looks much better (I think); the shadows are almost photorealistically soft.

Anyway, about your lightmaps - realtime radiosity is a good way to go (I'm going to try it with this ray tracer ... and I hope it'll be at least interactive, I'll post results then). The tracer can also support normal maps, and this is calculated per pixel (I just haven't loaded them and haven't turned them on - let's say the notebook is too slow).
~5 hours to update a map ... wow. I used radiosity once, but that took max. 1 minute to update the whole map (even on large scenes), heh - it was created using a rasterizer and then used as a texture lookup in my old non-realtime ray tracer :D (a nice hybrid technique).
Anyway, I apologize for the pretty long post (mainly due to the large images).

The soft shadows are really nice. And no need to fake them with all kinds of crazy GPU tricks; that's what I like about raytracing, it's 'pure'. I've been thinking... People made a GPU specialized for all kinds of vector/graphics operations. People also made a card specialized for physics operations. But for raytracing we still use the CPU (usually). Why don't they make a card specialized for this kind of stuff? Collision detection, raytracing, etcetera?


I just had an idea for the dynamic objects in combination with lightmaps. I keep saying that you can't use the lightmaps for dynamic objects, but... why not? If I shoot 6 rays (+x, -x, +y, -y, +z, -z) from an object, they will collide with 6 polygons. With that information, I can calculate 6 lightMap coordinates for the object. Instead of making a probe for the object, it just picks the lightMap patches on the left, right, floor, ceiling, front and back side. Based on the (pixel) normal, we can blend between the 6 patch colors.

It's less accurate than rendering a probe at that position of course, but at least it's fast and simple. The energy we save by not rendering probes for objects can be used to update even more patches per frame. And since the ambient lightmaps are low-res anyway and (should) use multiple bounces, the overall lighting won't vary that much, so the lighting on an object won't constantly change as it moves around. Calculating the 6 lightmap coordinates per object could become a problem as well (if you have thousands of objects), although you only have to do this when an object moves.
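
The blending itself could be as simple as this (Cg-style sketch; I'd fetch the six patch colors once per object on the CPU and pass them in as constants - names are made up):

// Blend six ambient colors (one per axis) with the squared components of the normal,
// similar to Half-Life 2's "ambient cube".
float3 AmbientFromSixPatches( float3 n, uniform float3 patchColor[6] )   // +x,-x,+y,-y,+z,-z
{
    float3 nSq = n * n;   // squared components sum to 1 for a normalized normal
    return nSq.x * (n.x >= 0.0 ? patchColor[0] : patchColor[1])
         + nSq.y * (n.y >= 0.0 ? patchColor[2] : patchColor[3])
         + nSq.z * (n.z >= 0.0 ? patchColor[4] : patchColor[5]);
}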

Keep on the good work!
Rick

Quote:
Original post by spek
The soft shadows are really nice. And no need to fake them with all kinds of crazy GPU tricks; that's what I like about raytracing, it's 'pure'. I've been thinking... People made a GPU specialized for all kinds of vector/graphics operations. People also made a card specialized for physics operations. But for raytracing we still use the CPU (usually). Why don't they make a card specialized for this kind of stuff? Collision detection, raytracing, etcetera?


Larrabee should be coming out soon, hopefully. Still, raytracing isn't the ultimate rendering solution; rasterization has huge benefits.

I don't know too much about Larrabee. But if you connected two dual-socket PCs with hexa-core Intel Xeons (which are coming out this fall), you'd get 24 CPU cores for raytracing, and believe me, you'd be able to do supersampling, diffuse recursive reflections, recursive refractions/reflections, area shadows, global illumination, etc. in real time (I mean all of this in one app).
The bad thing is that every programmer would like an API/library to do this - but there isn't any (and don't say OpenRT; that API is slow and pretty useless ... I've tried it, and since then I've been working with my own ray tracers).
The only big benefits rasterization has over ray tracing are the demos and the big companies behind it (NVIDIA and ATI).

So, here I am posting results another time :D (= bothering you all, MUHAHA :D)

I've managed to get the real-time radiosity I described here (that pseudo-radiosity for ray tracing) working dynamically.
I used just 1 sample for shadows and no antialiasing, and 4 samples for the radiosity per pixel. Still on that unoptimised ray tracer (this time it's even a little slower), but here are the results:

The whole scene is slightly darker (to get some horror atmosphere into this 14-triangle heaven). Anyway, you can still see ambient occlusion here (who would expect that - with this RADIOSITY), and finally there is color bleeding too :-) It's not that strong (I don't wanna kill that dark atmosphere), but I think it can be seen easily.
Anyway, I can post a demo if someone would like it - just ask me (here on the forum, you don't have to use PM).

Vilem, what hardware are you using, and what framerates are you achieving?
