
[Theory] Unraveling the Unlimited Detail plausibility


168 replies to this topic

#101 bwhiting   Members   -  Reputation: 618


Posted 13 August 2011 - 04:47 AM

demo


I thought their demos weren't too bad!

But you're right about lighting! It makes a world of difference to the look of a scene.
I'm not very clued up on these methods, so am I wrong in assuming that if they can do shadows, they can also do lighting? Surely they can?


#102 zoborg   Members   -  Reputation: 139


Posted 13 August 2011 - 05:37 AM

demo


I thought their demos weren't too bad!

But you're right about lighting! It makes a world of difference to the look of a scene.
I'm not very clued up on these methods, so am I wrong in assuming that if they can do shadows, they can also do lighting? Surely they can?


Well, I did mention that for the trees and the elephant, but it's the same thing. The shadows are baked in. What that means is that they pre-multiplied lighting and shadows into the voxel colors when they generated the voxel model. The light cannot move, and every instance of, say, a tree or that elephant has to be at the same orientation with respect to the light.

But now that you mention it, I see that the world only looks unlit because they're using a purely diffuse top-down light with high ambient term. It's no coincidence they used top-down lighting, because that allows them to rotate any of these objects about the up axis without it affecting lighting. So you won't see one of those elephants knocked over on its side, for instance.

That isn't to say that they can't have unique lighting conditions for some objects, it's just that every time they do they have to replicate the voxel model for that orientation. Which costs more memory. And that places limits on the supposed unlimited detail.

And just to be clear, I'm not trying to be pedantic - the only reason they're able to have an island with a billion blades of grass (or whatever) is because it's all the same grass, even down to the lighting and placement respective to other objects. Exactly how interesting is a world where you see the same 20 things millions of times? There's another great Carmack interview here that touches on just this subject.
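To make "baked in" concrete, here's a minimal sketch in Python, assuming a hypothetical per-voxel color + normal layout (not UD's actual format), of pre-multiplying a fixed top-down light plus an ambient term into the stored colors:

def bake_lighting(voxels, light_dir=(0.0, -1.0, 0.0), ambient=0.4):
    # Pre-multiply a fixed directional light plus ambient into voxel colors.
    # voxels: list of dicts with 'color' (r, g, b in 0..1) and 'normal' (unit vector).
    # After baking, normals are no longer needed at render time,
    # but the light can never move again.
    lx, ly, lz = light_dir
    baked = []
    for v in voxels:
        nx, ny, nz = v['normal']
        # Lambertian term: how much the surface faces against the incoming light.
        diffuse = max(0.0, -(nx * lx + ny * ly + nz * lz))
        shade = min(1.0, ambient + (1.0 - ambient) * diffuse)
        r, g, b = v['color']
        baked.append({'color': (r * shade, g * shade, b * shade)})
    return baked

Note that with a straight-down light the diffuse term depends only on the vertical component of each normal, so a baked model can still be rotated about the up axis without the lighting looking wrong - which is exactly why that light direction is so convenient.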

#103 D_Tr   Members   -  Reputation: 362


Posted 13 August 2011 - 05:54 AM

+1 zoborg

I believe that at some point nVidia, AMD and other GPU makers will add some bit twiddling functionality in their cards (probably as instructions initially) in order to accelerate voxel rendering. Nvidia's demo already casts tens of millions of rays per second (and the geometry is totally unique). Furthermore, when cheap solid state storage becomes common, fast streaming of many gigabytes worth of data will be possible. But all this is going to take some time, and that time surely has not arrived when someone comes in your face and insults your intelligence by saying "I've got 512 trillion atoms here" when really meaning "I have a spaghetti octree with a ton of nodes pointing to the same 10 models". Unlimited detail octree = octgraph = scam...

#104 zoborg   Members   -  Reputation: 139


Posted 13 August 2011 - 06:30 AM

+1 zoborg

I believe that at some point nVidia, AMD and other GPU makers will add some bit twiddling functionality in their cards (probably as instructions initially) in order to accelerate voxel rendering. Nvidia's demo already casts tens of millions of rays per second. Furthermore, when cheap solid state storage becomes common, fast streaming of many gigabytes worth of data will be possible. But all this is going to take some time, and that time surely has not arrived when someone comes in your face and insults your intelligence by saying "I've got 512 trillion atoms here" when really meaning "I have a spaghetti octree with a ton of nodes pointing to the same 10 models".


Agreed. There's definitely some good research being done in this area. One of the main things preventing it from becoming mainstream is that modern GPU hardware is designed to render triangles, very fast. Large voxel worlds (and ray-tracing for that matter) require non-linear memory access patterns that GPUs just weren't designed for. Any significant sea-change in how rendering is performed is going to require collaboration with the GPU vendors.

CUDA is a step in the right direction, but what we really need is some custom hardware that's good at handling intersections against large spatial databases (think texture unit, but for ray-casting). It's a shame Larrabee didn't work out, but it'll happen eventually. And it'll be a hardware vendor to do it, not some upstart with a magical new algorithm they can't describe or even show working well.

#105 Ezbez   Crossbones+   -  Reputation: 1164


Posted 13 August 2011 - 07:10 AM


-Misinterpreting Notch's post as saying "This is super easy". The actual words Notch used were "It’s a very pretty and very impressive piece of technology."

It’s a very pretty and very impressive piece of technology, but they’re carefully avoiding to mention any of the drawbacks, and they’re pretending like what they’re doing is something new and impressive. In reality, it’s been done several times before.

-Notch
The point made was that their exact technique hasn't been used yet.

That's not what he was using Notch's quote for. He kept paraphrasing it as "what UD is doing is unimpressive" and the Carmack quote as being "this is impossible". Clearly that's not what Notch was saying.


-His tessellation explanation. Was it just me or was he just describing parallax mapping? TBH, I don't know much about this

Tessellation often uses a displacement map as input. It takes a patch and generates more triangles as the camera gets closer. His explanation was right about the current usage. (Unigine uses tessellation in this way.)

Fair point, and you probably know more about this than me, but I did feel like his description missed the point of tessellation, and if you didn't know what it was already you wouldn't know what to think. He shows one static picture of a mesh with slight bumps and implies that that's all tessellation can do.
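For reference, a minimal sketch of the displacement-map usage described above (a toy patch in Python, not Unigine's implementation): the subdivision level rises as the camera gets closer, and the newly generated vertices are displaced by a height map.

import math

def tessellate_patch(corner, size, camera_dist, height, max_level=6):
    # Pick a subdivision level from camera distance: closer camera -> more triangles.
    level = max(1, min(max_level, int(max_level - math.log2(max(camera_dist, 1.0)))))
    n = 2 ** level
    verts = []
    for i in range(n + 1):
        for j in range(n + 1):
            x = corner[0] + size * i / n
            z = corner[1] + size * j / n
            # Displace the new vertex along +y by the height map.
            verts.append((x, height(x, z), z))
    return verts

For example, tessellate_patch((0.0, 0.0), 10.0, camera_dist=2.0, height=lambda x, z: 0.3 * math.sin(x) * math.cos(z)) returns a much denser displaced grid than the same call with camera_dist=50.0.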


-Him counting or estimating the number of polygons in games these days.

20 polygons per meter? That's a pretty close estimation. Turn on wireframe in a game and you'll notice how triangulated things really are. Characters are usually the exception to this.

Sure, that's accurate if you don't look at much of the grass, but the denser grass areas directly above where he was looking look to me like they'd be several times that, and 20 certainly isn't "throwing them a bone". But when I made that comment I was thinking of some other number he threw out at another time, and I can't remember what it was or when now.


-Acting as if the 500K triangle mesh he scanned from the elephant is unfeasible for games and as if normal mapping it to reduce polygons would be particularly difficult

POM or QDM would really be needed to get the grooves right, including self-shadowing. It's not as cheap as it sounds. I agree it would be nice to see the comparison between the two techniques when it's done.

Fair point, but he really just dismisses this as impossible. The really impressive thing with their technology is that it doesn't matter how many of these elephants they put on the screen - that's what he should be emphasizing, not making it sound as if current games couldn't possibly show such a detailed model. No, it's that they choose to use those polygons for the parts of the game we pay closest attention to - usually humans.

I've realized what has been *really* bugging me about his talk now though: He refuses to show us so much because he thinks we'll react badly to it. But this only makes sense to do if we'd react even worse to seeing it than to not seeing it. Our negative reaction to not seeing animation isn't strong enough - he's not showing us the animation because we're currently underestimating how bad it is. Showing us 7 year old animation with a "don't worry it's gotten better" gives a better reaction than showing us the current animation. Ouch. Or maybe he's irrational or doesn't care what reaction he gets.

#106 Hodgman   Moderators   -  Reputation: 26991


Posted 13 August 2011 - 07:33 AM

I think this thread is starting to look more like theology than theory. Maybe it would be a good idea to start a new one where no 'he's full of bs, no you are full of bs' is allowed and where we make a list of claims that may lead to what this guy is doing. I remember that somewhere in his old technology preview he states that there are 64 atoms per cubic mm, and he says no raytracing... If I have some spare time tomorrow I might just look at all the videos he's made and compose such a list, as I think that was the original intention of this thread.

QFT.

#107 zoborg   Members   -  Reputation: 139


Posted 13 August 2011 - 07:58 AM


I think this thread is starting to look more like theology than theory. Maybe it would be a good idea to start a new one where no 'he's full of bs, no you are full of bs' is allowed and where we make a list of claims that may lead to what this guy is doing. I remember that somewhere in his old technology preview he states that there are 64 atoms per cubic mm, and he says no raytracing... If I have some spare time tomorrow I might just look at all the videos he's made and compose such a list, as I think that was the original intention of this thread.

QFT.


From what I've seen, it really just looks like a less-sophisticated version of Nvidia's SVO, but with instancing. All the rest is hyperbole, stomach-churning marketing spiel, and redefining terminology (it's not voxels, it's atoms; it's not ray-tracing, it's the non-union Mexican equivalent). So anyone seriously interested in this should just start from the paper or any of the other copious research that pops up from a quick google search.

#108 Syranide   Members   -  Reputation: 375


Posted 13 August 2011 - 08:56 AM


+1 zoborg

I believe that at some point nVidia, AMD and other GPU makers will add some bit twiddling functionality in their cards (probably as instructions initially) in order to accelerate voxel rendering. Nvidia's demo already casts tens of millions of rays per second. Furthermore, when cheap solid state storage becomes common, fast streaming of many gigabytes worth of data will be possible. But all this is going to take some time, and that time surely has not arrived when someone comes in your face and insults your intelligence by saying "I've got 512 trillion atoms here" when really meaning "I have a spaghetti octree with a ton of nodes pointing to the same 10 models".


Agreed. There's definitely some good research being done in this area. One of the main things preventing it from becoming mainstream is that modern GPU hardware is designed to render triangles, very fast. Large voxel worlds (and ray-tracing for that matter) require non-linear memory access patterns that GPUs just weren't designed for. Any significant sea-change in how rendering is performed is going to require collaboration with the GPU vendors.

CUDA is a step in the right direction, but what we really need is some custom hardware that's good at handling intersections against large spatial databases (think texture unit, but for ray-casting). It's a shame Larrabee didn't work out, but it'll happen eventually. And it'll be a hardware vendor to do it, not some upstart with a magical new algorithm they can't describe or even show working well.


Features and quality are just a matter of more performance and optimization.
What we need more than anything else is "unlimited memory/storage", or this technology has very limited usefulness.

+1 zoborg

I believe that at some point nVidia, AMD and other GPU makers will add some bit twiddling functionality in their cards (probably as instructions initially) in order to accelerate voxel rendering. Nvidia's demo already casts tens of millions of rays per second (and the geometry is totally unique). Furthermore, when cheap solid state storage becomes common, fast streaming of many gigabytes worth of data will be possible. But all this is going to take some time, and that time surely has not arrived when someone comes in your face and insults your intelligence by saying "I've got 512 trillion atoms here" when really meaning "I have a spaghetti octree with a ton of nodes pointing to the same 10 models". Unlimited detail octree = octgraph = scam...


SSDs are also limited in their ability to randomly access data, which could be a huge issue since you're streaming nodes of an octree. Meaning, unless you find very smart ways of packing the data, every node might be a random access. And you'll also likely run into problems with predicting which nodes must be streamed in the future. Regardless, being able to stream 1TB of data is only useful if you actually find a way to distribute 1TB of data (or whatever amount you fancy).

In the end, I really doubt the "general usefulness" of SVOs; they surely have a purpose, and there might actually be genuine uses for them.



#109 Syranide   Members   -  Reputation: 375


Posted 13 August 2011 - 09:17 AM

A lot of people seem to be drawing conclusions from what is being demoed now, which sure is impressive, but really lacks all of the visual quality we see in modern games, as well as lacking performance... and then that is compared to year-old games that are meant to be able to run on year-old computers (hell, just compare it to what the 3DMark developers are demoing). If you were to go 3 years into the future ("when SVOs are practical"), and be allowed to create a demo that likewise only targets modern hardware and which showcases the potential of triangles, then I'm pretty sure SVOs wouldn't be all they're cracked up to be, and without the ridiculous memory issues.

Also, the only thing UD really impresses with is up-close detail (but funnily enough, it suffers from poor quality in the distance), which surely is nice but is something you rarely are bothered by in games unless you actually specifically look for it. I would be more concerned with walking around in a world that is completely static and solid; that sure would break immersion for me before I even started playing. What many people also forget is that a modern GPU can crank out ridiculous amounts of unshaded triangles and pixels, we are talking billions of them, every second. The main reason we don't do that is because somewhere along the way smart people realized that shading and effects were more important than extreme detail alone (also, storage and memory limitations!)... so we spend most of the GPU performance on shading and effects. So why are people hyping a technology that doesn't perform well yet, and doesn't even do shading!

EDIT: To be clear, the benefit of performance being "independent" of geometry complexity is great, like when we went from Forward Shading to Deferred Shading. But unless people find a way to get rid of the ridiculous memory constraints, I don't see how it could ever really work out. Perhaps SVOs could be used to find which triangles may be visible for a given pixel and use that to reduce geometry, and similar "ideas" that wouldn't trade "geometry complexity" for "ridiculous amounts of data".



And since some seem to have forgotten what games look like today:

[screenshots of recent games]



#110 D_Tr   Members   -  Reputation: 362


Posted 13 August 2011 - 09:57 AM

@Syranide: But still, aren't SSDs much faster than HDDs at random reads? Especially if you read chunks that have a size of several hundred KB? As for the distribution and storage concerns, you are right that 1TB is a lot of data to download or distribute in retail stores, but storage media are getting denser and connection speeds are getting faster. 1TB would take about 1 day to download on a 100 Mbps connection. This does not seem too long considering that you don't even need 1 TB to have impressively detailed voxel graphics in a game (along with polygonal ones), and that the voxel data can be distributed in a format quite a bit more compact than the one used during the execution of the program. I totally agree with your last comment about the 'general usefulness' of the technology. There is room for advancement in polygon technology too. Moore's law still holds and GPUs are getting more general-purpose, which is great, because programmers will be able to use whichever technique is better for each situation.

-EDIT: Just saw your last post where you make very good points in the first 2 paragraphs.

#111 Hodgman   Moderators   -  Reputation: 26991


Posted 13 August 2011 - 10:31 AM

So anyone seriously interested in this should just start from the [Efficient SVO] paper or any of the other copious research that pops up from a quick google search.

That's not quite the same thing as what Chargh was pointing out, or what the title of this thread asks for though... The very first reply to the OP contains these kinds of existing research, but it would be nice to actually analyze the clues that UD have inadvertently revealed (seeing as they're so intent on being secretive...)

All UD is, is a data structure, which may well be something akin to an SVO (which is where the 'it's nothing special' point is true), but it's likely somewhat conceptually different -- having been developed by someone who has no idea what they're on about, and who started as much as 15 years ago.

There have been a few attempts in this thread to collect Dell's claims and actually try to analyze them and come up with possibilities. Some kind of SVO is a good guess, but if we actually investigate what he's said/shown, there's a lot of interesting clues. Chargh was pointing out that this interesting analysis has been drowned out by the 'religious' discussion about Dell being a 'scammer' vs 'marketer', UD being simple vs revolutionary, etc, etc...

For example, in bwhiting's link, you can clearly see aliasing and bad filtering in the shadows, which is likely caused by the use of shadow-mapping and a poor-quality PCF filter. This leads me to believe that the shadows aren't baked in, and are actually done via a regular real-time shadow-mapping implementation, albeit in software.
[screenshot: aliased, blocky shadow edges in the UD video]
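For reference, a minimal sketch of the kind of low-sample PCF lookup being described, assuming a hypothetical software shadow map (illustrative only, not UD's code); with a small kernel and a low-resolution map you get exactly these blocky, stair-stepped shadow edges:

def pcf_shadow(shadow_map, u, v, receiver_depth, radius=1, bias=0.002):
    # Average a binary depth comparison over a small neighbourhood of shadow-map texels.
    # shadow_map: 2D list of depths as seen from the light.
    # (u, v): integer texel coordinates of the receiver projected into light space.
    # Returns 0..1 (0 = fully shadowed, 1 = fully lit).
    h, w = len(shadow_map), len(shadow_map[0])
    lit = samples = 0
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            x = min(max(u + dx, 0), w - 1)
            y = min(max(v + dy, 0), h - 1)
            # Lit if the receiver is no further from the light than the stored occluder depth.
            if receiver_depth - bias <= shadow_map[y][x]:
                lit += 1
            samples += 1
    return lit / samples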

Also, around this same part of the video, he accidentally flies through a leaf, and a near clipping-plane is revealed. If he were using regular ray-tracing/ray-casting, there'd be no need for him to implement this clipping-plane, and when combined with other statements, this implies the traversal/projection is based on a frustum, not individual rays. Also, unlike rasterized polygons, the plane doesn't make a clean cut through the geometry, telling us something about the voxel structure and the way the clipping tests are implemented.
[screenshot: the near clipping plane cutting through a leaf]

It's this kind of analysis / reverse-engineering that's been largely drowned out.

The latter algorithm works for unlit geometry simply because each cell in the hierarchy can store the average color of all of the (potentially millions of) voxels it contains. But add in lighting, and there's no simple way to precompute the lighting function for all of those contained voxels. They can all have normals in different directions - there's no guarantee they're even close to one another (imagine if the cell contained a sphere - it would have a normal in every direction). You also wouldn't be able to blend surface properties such as specularity.

This doesn't mean it doesn't work, or isn't what they're doing, it just implies a big down-side (something Dell doesn't like talking about).
For example, in current games, we might bake a 1-million-polygon model down to a 1000-polygon model. In doing so we bake all the missing details into texture maps: each low-poly triangle is textured with the data of 1000 high-poly triangles. Thanks to mip-mapping, if the model is far enough away that the low-poly triangle covers a single pixel, then the data from all 1000 of those high-poly triangles is averaged together. Yes, often this makes no sense, like you point out with normals and specularity, yet we do it anyway in current games. It causes artifacts for sure, but we still do it, and so can Dell.
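As a minimal sketch of that averaging, assuming a toy SVO node layout (illustrative only, not UD's or Nvidia's actual structures): each interior cell stores the mean of its children's colors, which is trivial for unlit color but has no equally cheap analogue for normals or specular terms.

class Node:
    def __init__(self, color=None, children=None):
        self.color = color               # (r, g, b) for leaves; filled in for interior nodes
        self.children = children or []   # up to 8 child Nodes

def build_lod_colors(node):
    # Post-order pass: every interior cell ends up storing the average color of
    # everything beneath it, so a distant object can be drawn from a single coarse
    # cell. There is no equally simple pre-computation for lighting, because the
    # voxels inside a cell can face in every direction.
    if not node.children:
        return node.color
    child_colors = [build_lod_colors(c) for c in node.children]
    n = len(child_colors)
    node.color = tuple(sum(c[i] for c in child_colors) / n for i in range(3))
    return node.color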

I believe that at some point nVidia, AMD and other GPU makers will add some bit twiddling functionality in their cards (probably as instructions initially) in order to accelerate voxel rendering.

They already have, in modern cards. Bit-twiddling is common in DX11. It's also possible to implement your own software 'caches' nowadays to accelerate this kind of stuff.
Too bad UD haven't even started on their GPU implementation yet though!

Tessellation often uses a displacement map as input. It takes a patch and generates more triangles as the camera gets closer. His explanation was right about the current usage. (Unigine uses tessellation in this way.)

No, height-displacement is not the *only* current usage of tessellation.
He also confuses the issue deliberately by comparing a height-displaced plane with a scene containing a variety of different models. It would've been fairer to compare a scene of tessellated meshes with a scene of voxel meshes...

#112 Sirisian   Crossbones+   -  Reputation: 1626


Posted 13 August 2011 - 12:04 PM

Too bad UD haven't even started on their GPU implementation yet though!

In the first video they say they have started one though. I guess since we didn't see a video of it then it's hard to take their word for it.

we're also running at 20 frames a second in software, but we have versions that are running much faster than that aren't quite complete yet

Not sure if that means GPU or still software.

Also drop the "it's not unlimited". They clearly said 64 atoms per cubic mm. That is a very specific level of detail. :lol:

Also regarding the lighting. This clip explains a lot about why it looks so bad.

#113 bwhiting   Members   -  Reputation: 618


Posted 13 August 2011 - 12:05 PM

wooo go hodgman that's what we wanna see, someone who will look at what evidence we have and make an educated guess as to what is going on behind the scenes.

much more interesting than cries of "fake" or "bullshit". The fact is, it's impressive; it might not be as impressive as his bold claims make it out to be, but it is more detail than I have ever seen in a demo... and that warrants trying to figure out what he is doing.

#114 Chargh   Members   -  Reputation: 110


Posted 13 August 2011 - 01:37 PM


Too bad UD haven't even started on their GPU implementation yet though!

In the first video they say they have started one though. I guess since we didn't see a video of it then it's hard to take their word for it.

we're also running at 20 frames a second in software, but we have versions that are running much faster than that aren't quite complete yet

Not sure if that means GPU or still software.

Also drop the "it's not unlimited". They clearly said 64 atoms per cubic mm. That is a very specific level of detail. :lol:

Also regarding the lighting. This clip explains a lot about why it looks so bad.


If all the time goes to this search algorithm, what could he do on the GPU that would increase its speed? He could add more post-processing, but that wouldn't make it any faster.

#115 forsandifs   Members   -  Reputation: 154


Posted 13 August 2011 - 01:51 PM

Agreed. There's definitely some good research being done in this area. One of the main things preventing it from becoming mainstream is that modern GPU hardware is designed to render triangles, very fast. Large voxel worlds (and ray-tracing for that matter) require non-linear memory access patterns that GPUs just weren't designed for. Any significant sea-change in how rendering is performed is going to require collaboration with the GPU vendors.

CUDA is a step in the right direction, but what we really need is some custom hardware that's good at handling intersections against large spatial databases (think texture unit, but for ray-casting). It's a shame Larrabee didn't work out, but it'll happen eventually. And it'll be a hardware vendor to do it, not some upstart with a magical new algorithm they can't describe or even show working well.


This reminds me of a question I have on the subject of hardware and ray casting. Isn't the new AMD Fusion chip what you describe? The GPU and CPU have shared memory with the GPU being programmable in a C++ like way, if I'm not mistaken.

#116 Sirisian   Crossbones+   -  Reputation: 1626


Posted 13 August 2011 - 03:11 PM

If all the time goes to this search algorithm, what could he do on the GPU that would increase its speed? He could add more post-processing, but that wouldn't make it any faster.

I have to imagine his search algorithm is a per-pixel algorithm. The GPU is really good at doing those kinds of operations. Also, he'll probably be grabbing the data back into a g-buffer of sorts to perform post-processing the deferred way, so you're right on that part. This should look much nicer with actual shading, hopefully.

#117 rouncer   Members   -  Reputation: 355


Posted 13 August 2011 - 03:18 PM

the reasons why this project is going to flop, guaranteed:

* the models you see aren't unique, they are just duplications of the exact same objects

* the models all have their OWN level of detail, the only way he gets 64 atoms per cubic millimetre is by SCALING SOME OF THEM SMALLER, the rest have SHIT detail.

* he can't paint his world uniquely like what happens in megatexture

* he can't perform CSG operations, all he can do is soup yet more and more disjointed models together

* there's no way he could bake lighting at all, so the lighting all has to be dynamic and eat processing power

* this has nothing to do with voxels, you could get a similar effect just by rastering together lots of displacement mapped models!!!

#118 zoborg   Members   -  Reputation: 139


Posted 13 August 2011 - 03:53 PM

So anyone seriously interested in this should just start from the [Efficient SVO] paper or any of the other copious research that pops up from a quick google search.

That's not quite the same thing as what Chargh was pointing out, or what the title of this thread asks for though... The very first reply to the OP contains these kinds of existing research, but it would be nice to actually analyze the clues that UD have inadvertently revealed (seeing as they're so intent on being secretive...)

All UD is, is a data structure, which may well be something akin to an SVO (which is where the 'it's nothing special' point is true), but it's likely somewhat conceptually different -- having been developed by someone who has no idea what they're on about, and who started as much as 15 years ago.

Well, if you started 15 years ago from scratch, you'd have 15 years of experience in the topic. And it's not like you'd do that research in a complete vacuum. It's quite possible that he's invented something different, but I have no particular reason to believe that while he only shows things that could definitely be done using well-documented techniques.

There have been a few attempts in this thread to collect Dell's claims and actually try to analyze them and come up with possibilities. Some kind of SVO is a good guess, but if we actually investigate what he's said/shown, there's a lot of interesting clues. Chargh was pointing out that this interesting analysis has been drowned out by the 'religious' discussion about Dell being a 'scammer' vs 'marketer', UD being simple vs revolutionary, etc, etc...

For example, in bwhiting's link, you can clearly see aliasing and bad filtering in the shadows, which is likely caused by the use of shadow-mapping and a poor-quality PCF filter. This leads me to believe that the shadows aren't baked in, and are actually done via a regular real-time shadow-mapping implementation, albeit in software.

Where do you think baked-in shadows come from? They have to be rendered sometime, and any offline shadow baking performed can be subject to similar quality issues. I'm just saying there's no way to infer from a shot that the lighting is dynamic, because any preprocess could generate the lighting in the same exact way with the same exact artifacts.

So I obviously don't know if it's baked or not, right? Well, there are several reasons to suspect this, and I prefer to take the tack that until given evidence otherwise, the simplest answer is correct.

Why do I think the shadows are baked?
1) First and foremost, the light never moves. This guy goes on and on about how magical everything else is, so why doesn't he ever mention lighting? Why doesn't he just move the light?
2) The light is top-down - the most convenient position for baked-in light and shadows because it allows for arbitrary orientation about the up axis. Why else would you choose this orientation since it makes the world so flat looking?
3) No specular. That's another reason the lighting looks terrible.
4) It fits in perfectly with the most obvious theory of the implementation.

Also, around this same part of the video, he accidentally flies through a leaf, and a near clipping-plane is revealed. If he were using regular ray-tracing/ray-casting, there'd be no need for him to implement this clipping-plane, and when combined with other statements, this implies the traversal/projection is based on a frustum, not individual rays. Also, unlike rasterized polygons, the plane doesn't make a clean cut through the geometry, telling us something about the voxel structure and the way the clipping tests are implemented.

Well, when you're ray-casting you don't need to explicitly implement a clipping plane to get that effect. You'd get that effect if you projected each ray from the near plane instead of the eye. But an irregular cut like that just suggests to me that yes, they're using voxels and raycasting and not triangle rasterization, so any discontinuities would be at voxel instead of pixel granularity.

It's this kind of analysis / reverse-engineering that's been largely drowned out.

The latter algorithm works for unlit geometry simply because each cell in the hierarchy can store the average color of all of the (potentially millions of) voxels it contains. But add in lighting, and there's no simple way to precompute the lighting function for all of those contained voxels. They can all have normals in different directions - there's no guarantee they're even close to one another (imagine if the cell contained a sphere - it would have a normal in every direction). You also wouldn't be able to blend surface properties such as specularity.

This doesn't mean it doesn't work, or isn't what they're doing, it just implies a big down-side (something Dell doesn't like talking about).
For example, in current games, we might bake a 1-million-polygon model down to a 1000-polygon model. In doing so we bake all the missing details into texture maps: each low-poly triangle is textured with the data of 1000 high-poly triangles. Thanks to mip-mapping, if the model is far enough away that the low-poly triangle covers a single pixel, then the data from all 1000 of those high-poly triangles is averaged together. Yes, often this makes no sense, like you point out with normals and specularity, yet we do it anyway in current games. It causes artifacts for sure, but we still do it, and so can Dell.

I think you're understating the potential artifacts. In their demo, a single pixel could contain ground, thousands of clumps of grass, dozens of trees, and even a few spare elephants. How do you approximate a light value for that that's good enough? We do approximations all the time in games, but we do that by throwing away perceptually unimportant details. The direction of a surface with respect to the light is something that can be approximated (e.g. normal-maps), but not if the surface is a chaotic mess. At best, your choice of normal would be arbitrary (say, up). But if they did that, you'd see noticeable lighting changes as the LoD reduces, whereas in the demo it's a continuous blend.

That's not to say dynamic lighting can't be implemented, just that they haven't demonstrated it. Off hand, if I were to attempt dynamic lighting for instanced voxels, I would probably approach it as a screen-space problem. I.e.
  • Render the scene, but output a depth value along with each color pixel.
  • Generate surface normals using depth gradients from adjacent pixels (with some fudge-factor to eliminate silhouette discontinuities).
  • Perform lighting in view-space, as with typical gbuffer techniques.
To render shadows, you could do the same thing, but first render the scene depth-only from the light's perspective (with some screen-based warp to improve effective resolution). Off hand, I couldn't say how good the results of this technique would be, as generating surface normals from depth may result in a lot of noise and/or muted detail. But it is something ideally suited to a GPU implementation (which they insist they don't use for anything other than splatting the results on-screen).
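As a rough sketch of that screen-space idea, assuming made-up depth and albedo buffers (nothing UD has demonstrated): reconstruct per-pixel normals from the depth gradients of adjacent pixels, then apply an ordinary diffuse term, exactly as in a g-buffer pass.

import numpy as np

def lighting_from_depth(depth, albedo, light_dir=(0.3, -0.8, 0.5), ambient=0.2):
    # depth: (H, W) array of view-space depths; albedo: (H, W, 3) colors in 0..1.
    # Approximate surface orientation from the depth gradients of adjacent pixels.
    dz_dy, dz_dx = np.gradient(depth)
    # Treat each pixel as a tiny surface patch: normal ~ (-dz/dx, -dz/dy, 1), normalized.
    normals = np.dstack([-dz_dx, -dz_dy, np.ones_like(depth)])
    normals /= np.linalg.norm(normals, axis=2, keepdims=True)
    light = np.asarray(light_dir, dtype=float)
    light /= np.linalg.norm(light)
    # Ordinary per-pixel diffuse term, as in any deferred shading pass.
    diffuse = np.clip(np.tensordot(normals, light, axes=([2], [0])), 0.0, 1.0)
    shade = ambient + (1.0 - ambient) * diffuse
    return albedo * shade[..., None]

The silhouette problem mentioned above (depth discontinuities producing bogus gradients) is ignored here; a real implementation would need that fudge factor.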

But there's nothing in any of the demos to suggest they're doing this or any other form of dynamic lighting. I prefer to just take the simplest explanation: that his avoidance is intentional because he knows full well what the limitations of his technique are. They haven't shown anything that couldn't be baked-in, so I have no reason to believe they've done anything more complicated than that.

#119 Syranide   Members   -  Reputation: 375


Posted 13 August 2011 - 04:20 PM

I did a little math earlier today to see what I could find out, and I hope you'll find the answer as interesting as I found it (or I've made a fool of myself :P).

First off, the computer he runs the demo on has 8 GB of memory, the resolution is 64 voxels per mm^3 (4*4*4), I estimate the size of the base of each block to be 1 m^2, and let's assume that color is stored as a single byte (either through compression or by being palettized, which could actually even be the case). Since octrees are used, we very loosely assume that memory consumption doubles because of octree overhead for nodes, and that the shell of each block can be approximated by taking the 1 m^2 base block, multiplying by 6 for each side of the new 3D block, and then multiplying by 2 because the sides obviously aren't flat but have a rugged surface. (Yes, some are estimates, some may be high, some may be low, and some factor may be missing, but assume for now that it balances out.)


8 GB = 8 * 1024 * 1024 * 1024 = 8589934592 bytes
sqrt(8589934592) = 92681 (side length in units for the entire square)
92681 / 4 / 1000 = 23 m (4 from 4x4x4, 64 voxels per mm^3, 1000 from meter)
23 * 23 = 529 m^2 blocks
529 / 6 / 2 = 44 final blocks (converting from flat 2D to 3D)
44 / 2 = 22 final blocks (compensating for the octree cost)

= 22 blocks (UD uses 24 blocks)


Now, there are a bunch of approximations and guesses here... but the fact that I even came within an order of magnitude of the actual 24 known models UD shows in their demo says to me that they have indeed not made any significant progress, and even if I've made an error it apparently balances out. They might not even have made anything at all except possibly some optimizations to the SVO algorithm. Please correct me if I've made a serious mistake somewhere, but again, if my calculation had said 2 or 200 (that would be bad for UD), it would still mean that they are flat out lying and memory consumption is most definitely an issue they haven't solved, not even in the slightest.

EDIT: To clarify, this wasn't meant to show the potential of SVO memory optimizations, but rather that it is likely that UD is not using any fancy algorithms at all to minimize their memory consumption (I only assume the colors are palettized)... and that, indeed, enormous memory consumption is the real reason why they only have 24 blocks, because those 24 blocks consume all 8 GB of memory. This is meant to debunk their "No no, memory is not the issue! Our artists are!"-ish statement.



#120 zoborg   Members   -  Reputation: 139


Posted 13 August 2011 - 04:53 PM


Agreed. There's definitely some good research being done in this area. One of the main things preventing it from becoming mainstream is that modern GPU hardware is designed to render triangles, very fast. Large voxel worlds (and ray-tracing for that matter) require non-linear memory access patterns that GPUs just weren't designed for. Any significant sea-change in how rendering is performed is going to require collaboration with the GPU vendors.

CUDA is a step in the right direction, but what we really need is some custom hardware that's good at handling intersections against large spatial databases (think texture unit, but for ray-casting). It's a shame Larrabee didn't work out, but it'll happen eventually. And it'll be a hardware vendor to do it, not some upstart with a magical new algorithm they can't describe or even show working well.


This reminds me of a question I have on the subject of hardware and ray casting. Isn't the new AMD Fusion chip what you describe? The GPU and CPU have shared memory with the GPU being programmable in a C++ like way, if I'm not mistaken.


Yes, though we'll have to wait and see whether it approaches the level of practical. But ray-tracing (voxels or otherwise) is bound by memory accesses just as much as (if not more than) by processor speed and quantity.

The basic problem is O(N*K), where N is the number of pixels on screen, and K is the average cost of intersecting a ray with the world. Ideally, K is log(M), where M is the number of objects in the world. A spatial hierarchy such as an octree provides such a search algorithm.

However, the larger the database, the more spread out the results in memory. In a naive implementation, each ray through each pixel could incur multiple cache misses as it traverses nodes through the tree. This effect gets even worse as you increase the data size such that it exceeds memory and has to be streamed off-disk (or even from the internet). (BTW, this is another issue UD conveniently sidesteps - there is so little unique content it easily fits in a small amount of memory).

This can be improved by using more intelligent data structures that are structured for coherent memory accesses (a rather huge topic in and of itself). But that alone is not enough. No matter how the data is structured, you will still have loads of cache misses (unless your whole world manages to fit into just your cache memory). You need some way to hide the cost of those misses.
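To illustrate the access pattern, here's a minimal sketch of a naive octree point query and ray march, using a toy node layout (illustrative only): every level of every descent is a dependent pointer fetch, and on an out-of-core data set each of those hops is a potential cache miss or disk read.

class OctNode:
    def __init__(self, children=None, solid=False):
        self.children = children   # list of 8 OctNode, or None for a leaf
        self.solid = solid

def locate(root, x, y, z, size):
    # Descend from the root to the leaf containing point (x, y, z).
    # Each loop iteration follows another child pointer: on a data set that
    # doesn't fit in cache (or in RAM), every hop is a potential miss.
    node = root
    cx = cy = cz = half = size * 0.5
    while node.children is not None:
        half *= 0.5
        i = (x >= cx) | ((y >= cy) << 1) | ((z >= cz) << 2)
        cx += half if x >= cx else -half
        cy += half if y >= cy else -half
        cz += half if z >= cz else -half
        node = node.children[i]
    return node

def raymarch(root, origin, direction, size, step=0.01, max_t=100.0):
    # Naive ray march: sample the octree at points along the ray until a solid
    # leaf is hit. Real traversals are smarter, but the memory behaviour
    # (one dependent fetch per level per sample) is the point being illustrated.
    t = 0.0
    while t < max_t:
        px = origin[0] + t * direction[0]
        py = origin[1] + t * direction[1]
        pz = origin[2] + t * direction[2]
        if 0.0 <= px < size and 0.0 <= py < size and 0.0 <= pz < size:
            if locate(root, px, py, pz, size).solid:
                return t        # hit distance
        t += step
    return None                 # no hit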

On a modern GPU, cache misses are a common occurrence (to the frame-buffer, texture units, vertex units, etc). It cleverly hides the cost of most of these misses by queuing up the reads and writes from a massive number of parallel threads. For instance, the pixel shader unit may be running a shader program for hundreds of pixel quads at a time. Each pixel unit cycle, the same instruction is processed for each in-flight quad. If that instruction happens to be a texture read, all the reads from all those hundreds of quad threads will be batched up for processing by the texture unit. Then hopefully, by the time the read results are needed in the next cycle or few, they'll already be in the cache and execution can continue immediately.

This latency-hiding is critical for the speed of modern GPUs. Memory latency doesn't go down very much compared to processing speed or bandwidth increases. In fact, in relative cycle terms, cache miss penalties have only increased over the last decade (or longer).

To get comparable performance from ray-tracing (and ray-traced voxels), we'll need a similar method of latency hiding. With a general purpose collection of cores, you can do a whole lot of this work in software. But current PC cores are designed more for flexible bullet-proof caching than for massively parallel designs. This is why GPUs still blow CPUs out of the water for any algorithm that can be directly adapted to a gather (as opposed to scatter) approach.

To my knowledge, AMD's Fusion just combines the CPU and GPU cores onto a single chip, but the two are still separate. That has the potential to greatly improve memory latency for certain things (such as texture update as mentioned by Carmack), and reductions in chip sizes and costs. But as long as the main latency-hiding hardware is still fixed-function designed for things like 2D/3D texture accesses, we can't optimally implement latency-hiding for custom non-linear things, such as ray collision searches. But all these changes designed to make GPUs more general-purpose get us closer to the goal.



