# [Theory] Unraveling the Unlimited Detail plausibility


## Recommended Posts

Disregarding the obvious hyperbole of [i]unlimited[/i] detail (read: lies), the biggest problem is that their demo looks like shit. He passes this off as being due to programmer art, but there's a more fundamental reason for it: they cannot support lighting with this method. Not even a simple single directional light.

To explain why, here's a basic outline of how you could brute-force render a voxel scene:

For each pixel:
[list]
[*] Cast a ray (or frustum, if you like) from the camera through the pixel into the scene to find the set of voxels visible to that pixel.
[*] Get the lit color of each voxel, and blend the results to get the final pixel color.
[/list]
Now for a large scene where there are hundreds, thousands, or millions of voxels per pixel (like in the demo), that's way too slow. To get around that they use a hierarchical spatial representation, likely an octree. The algorithm becomes:

For each pixel:
[list]
[*] Same ray, but this time find the cell in the spatial partition that is roughly the size of a single pixel. You only need to find 1 cell instead of however many millions of voxels it may contain.
[*] Get the lit color of that cell, using umm, magic?
[/list]
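To make that concrete, here's a minimal sketch of the hierarchical lookup in Python (the node layout, field names, and stopping criterion are my guesses for illustration, not anything UD has actually shown):

```python
from dataclasses import dataclass, field

@dataclass
class OctreeNode:
    # Average color of all voxels contained in this cell, precomputed offline.
    avg_color: tuple
    children: list = field(default_factory=list)  # up to 8 children; empty for a leaf

def trace_cell(node, cell_size, distance, pixel_angle=0.001):
    """Descend until the cell's projected size is roughly one pixel.

    pixel_angle is the angular size of one pixel; a cell of size s at
    distance d subtends about s/d radians.
    """
    while node.children and cell_size / distance > pixel_angle:
        # A real tracer picks the child the ray actually enters; here we
        # just take the first occupied child to show the descent.
        node = node.children[0]
        cell_size /= 2.0
    return node.avg_color  # unlit color: no normals or materials needed

# A tiny two-level tree: the root stores the average of a red and a blue child.
leaf_r = OctreeNode((1.0, 0.0, 0.0))
leaf_b = OctreeNode((0.0, 0.0, 1.0))
root = OctreeNode((0.5, 0.0, 0.5), [leaf_r, leaf_b])

print(trace_cell(root, cell_size=1.0, distance=10.0))     # near: descends to a leaf
print(trace_cell(root, cell_size=1.0, distance=10000.0))  # far: stops at the root's average
```

Note that the far case never visits the leaves at all, which is the entire trick: the cost is bounded by the tree depth, not the voxel count.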
The latter algorithm works for unlit geometry simply because each cell in the hierarchy can store the average color of all of the (potentially millions of) voxels it contains. But add in lighting, and there's no simple way to precompute the lighting function for all of those contained voxels. They can all have normals in different directions - there's no guarantee they're even close to one another (imagine if the cell contained a sphere - it would have a normal in every direction). You also wouldn't be able to blend surface properties such as specularity.
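You can see the problem with a trivial experiment: averaging the colors in a cell is well-defined, but averaging the normals is not. A tiny illustrative example:

```python
def avg(normals):
    """Component-wise average of a list of 3D vectors."""
    return tuple(sum(c) / len(normals) for c in zip(*normals))

# Normals in a cell containing flat ground all agree: averaging works fine.
flat = [(0, 1, 0), (0, 1, 0), (0, 1, 0)]

# Normals in a cell containing a sphere point every which way: the
# average collapses to zero, so no single "cell normal" exists to light.
sphere = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]

print(avg(flat))    # (0.0, 1.0, 0.0): a usable normal
print(avg(sphere))  # (0.0, 0.0, 0.0): the lighting information is destroyed
```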

So, dynamic lighting is out. But I guess we can use baked-in lighting? The rest of their technique depends on baked-in static environments, so why not do the same for lighting? Well, you'll notice they (mostly) don't have baked in lighting either. There's a reason for that too: recursive instancing.

Recursive instancing is how they manage to have huge worlds (i.e. [i]unlimited[/i] detail) in a reasonable amount of memory. Let's take the clumps of grass in the demo as an example. There are millions (conceivably billions) of them in the world. Even just storing a simple compressed transform for each would cost a huge amount of memory. So instead of storing individual instances, they have the following: several clumps of grass and trees (and elephants) and ground instanced in a group. That group is then instanced multiple times to form larger plots of land. And then that group is instanced several more times to form the entire world (along with a few variations for other things, such as rivers).
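Here's a rough sketch of what that kind of instancing hierarchy buys you memory-wise (the structure and the numbers are made up for illustration):

```python
# A minimal sketch of recursive instancing: groups hold (child, transform)
# pairs, and the same child object can appear in many groups, forming a DAG.

class Model:
    def __init__(self, name, voxel_count):
        self.name, self.voxel_count = name, voxel_count

class Group:
    def __init__(self, instances):
        self.instances = instances  # list of (node, transform) pairs

def apparent_voxels(node):
    """Voxels the viewer would see if everything were stored uniquely."""
    if isinstance(node, Model):
        return node.voxel_count
    return sum(apparent_voxels(child) for child, _t in node.instances)

grass = Model("grass_clump", 1_000_000)
tree = Model("tree", 5_000_000)

# One plot reuses the same clump 100 times (identity transforms for brevity).
plot = Group([(grass, None)] * 100 + [(tree, None)] * 5)
# The island reuses the same plot 10,000 times.
island = Group([(plot, None)] * 10_000)

print(apparent_voxels(island))  # 1.25 trillion apparent voxels...
# ...while only two unique models are actually stored in memory.
```

The "atom" counts in their marketing are exactly this kind of apparent count, not the stored data.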

This allows for very efficient storage, but the world is necessarily repetitive (as they demonstrate). It's no coincidence that their earlier demos were showing off Sierpinski pyramids, as that is fundamentally the same method, just with procedural placement.

But the biggest drawback of instancing is that all instances of a model are identical. This means, for instance, that the world cannot be dynamically destructible (barring some form of copy-on-write which would cost memory and cpu time proportional to the amount of world modification). More importantly, they cannot bake-in lighting unless all instances happen to have the same exact light (i.e. all at the same absolute world rotation and with identical shadows). The only fix for this is to just duplicate models for each unique lighting condition.

Notice in the Sierpinski pyramid demo they have lighting, but all instances have the exact same orientation and lighting, and the light never moves. And in the latest demo, most of the world is completely unlit. There are a few exceptions to this, such as the elephant with some approximation of AO, and the trees with a ground shadow directly beneath them. Neither varies, because in both cases the light is just baked into the color.

So, despite the guy's protestations, these are not trivial issues that can be addressed in the future (or are apparently already fixed but for some reason can't be shown). They are [i]fundamental limitations[/i] of the technique. Now I'm not saying it's impossible to fix these, but it will be very difficult and there will be inevitable compromises to make it work. Such compromises as: limit the world size (see atomontage), or stream in data from vast archives on disk (see Id tech). And that's just if you want to do decent baked-in lighting. Implementing dynamic lighting or destructible terrain is an even bigger problem.

I'll change my tune the moment they have something that looks even comparable to a modern game, but definitely not while they keep showing videos that look worse than many PS1 games (which at least had lighting).

##### Share on other sites

I thought their demos weren't too bad!

but you're right about lighting! it makes a world of difference to the look of a scene.
I'm not very clued up on these methods - am I wrong in assuming that if they can do shadows, they can also do lighting? Surely they can?

##### Share on other sites
[quote name='bwhiting' timestamp='1313232426' post='4848584']

I thought their demos weren't too bad!

but you're right about lighting! it makes a world of difference to the look of a scene.
I'm not very clued up on these methods - am I wrong in assuming that if they can do shadows, they can also do lighting? Surely they can?
[/quote]

Well, I did mention that for the trees and the elephant, but it's the same thing. The shadows are baked in. What that means is that they pre-multiplied lighting and shadows into the voxel colors when they generated the voxel model. The light cannot move, and every instance of, say, a tree or that elephant will have to be at the same orientation with respect to the light.

But now that you mention it, I see that the world only [i]looks[/i] unlit because they're using a purely diffuse top-down light with a high ambient term. It's no coincidence they used top-down lighting, because that allows them to rotate any of these objects about the up axis without it affecting the lighting. So you won't see one of those elephants knocked over on its side, for instance.
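That claim is easy to check numerically: with a straight-down light, the diffuse term N·L depends only on the normal's vertical component, which rotation about the up axis never changes. A quick sketch (all values illustrative):

```python
import math

def rotate_y(n, angle):
    """Rotate a normal about the up (Y) axis."""
    c, s = math.cos(angle), math.sin(angle)
    x, y, z = n
    return (c * x + s * z, y, -s * x + c * z)

def diffuse(n, light=(0.0, 1.0, 0.0)):
    """Lambertian term: clamped dot product of normal and light direction."""
    return max(0.0, sum(a * b for a, b in zip(n, light)))

n = (0.6, 0.8, 0.0)
print(diffuse(n))                   # 0.8
print(diffuse(rotate_y(n, 1.234)))  # still 0.8: yaw never changes n.y

# Tip the light off-axis and the invariance is gone:
side = (1.0, 0.0, 0.0)
print(diffuse(n, side), diffuse(rotate_y(n, 1.234), side))  # these differ
```

So baked top-down lighting survives yaw-only instancing, which is consistent with everything in the demo sitting upright.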

That isn't to say that they can't have unique lighting conditions for some objects, it's just that every time they do they have to replicate the voxel model for that orientation. Which costs more memory. And that places [i]limits[/i] on the supposed [i]unlimited[/i] detail.

And just to be clear, I'm not trying to be pedantic - the only reason they're able to have an island with a billion blades of grass (or whatever) is because it's all the same grass, even down to the lighting and placement relative to other objects. Exactly how interesting is a world where you see the same 20 things millions of times? There's another great Carmack interview [url="http://www.pcper.com/reviews/Editorial/John-Carmack-Interview-GPU-Race-Intel-Graphics-Ray-Tracing-Voxels-and-more"]here[/url] that touches on just this subject.

##### Share on other sites
+1 zoborg

I believe that at some point nVidia, AMD and other GPU makers will add some bit-twiddling functionality to their cards (probably as instructions initially) in order to accelerate voxel rendering. Nvidia's demo already casts tens of millions of rays per second (and the geometry is totally unique). Furthermore, when cheap solid-state storage becomes common, fast streaming of many gigabytes worth of data will be possible. But all this is going to take some time, and that time surely has not arrived when someone comes in your face and insults your intelligence by saying "I got 512 trillion atoms here" when really meaning "I have a spaghetti octree with a ton of nodes pointing to the same 10 models". Unlimited detail octree = octgraph = scam...

##### Share on other sites
[quote name='D_Tr' timestamp='1313236455' post='4848601']
+1 zoborg

I believe that at some point nVidia, AMD and other GPU makers will add some bit-twiddling functionality to their cards (probably as instructions initially) in order to accelerate voxel rendering. Nvidia's demo already casts tens of millions of rays per second. Furthermore, when cheap solid-state storage becomes common, fast streaming of many gigabytes worth of data will be possible. But all this is going to take some time, and that time surely has not arrived when someone comes in your face and insults your intelligence by saying "I got 512 trillion atoms here" when really meaning "I have a spaghetti octree with a ton of nodes pointing to the same 10 models".
[/quote]

Agreed. There's definitely some good research being done in this area. One of the main things preventing it from becoming mainstream is that modern GPU hardware is designed to render triangles, [i]very fast[/i]. Large voxel worlds (and ray-tracing for that matter) require non-linear memory access patterns that GPUs just weren't designed for. Any significant sea-change in how rendering is performed is going to require collaboration with the GPU vendors.

CUDA is a step in the right direction, but what we really need is some custom hardware that's good at handling intersections against large spatial databases (think texture unit, but for ray-casting). It's a shame Larrabee didn't work out, but it'll happen eventually. And it'll be a hardware vendor to do it, not some upstart with a magical new algorithm they can't describe or even show working well.

##### Share on other sites
[quote name='Sirisian' timestamp='1313077391' post='4847723']
[quote name='Ezbez' timestamp='1313074776' post='4847708']
-Misinterpreting Notch's post as saying "This is super easy". The actual words Notch used were "It’s a very pretty and very impressive piece of technology."
[/quote]
[quote]It’s a very pretty and very impressive piece of technology, but they’re carefully avoiding to mention any of the drawbacks, and they’re pretending like what they’re doing is something new and impressive. In reality, it’s been done several times before.[/quote]-Notch
The point made was that their exact technique hasn't been used yet.[/quote]
That's not what he was using Notch's quote for. He kept paraphrasing it as "what UD is doing is unimpressive" and the Carmack quote as saying "this is impossible". Clearly that's not what Notch was saying.

[quote][quote name='Ezbez' timestamp='1313074776' post='4847708']
-His tessellation explanation. Was it just me or was he just describing parallax mapping? TBH, I don't know much about this
[/quote]
Tessellation often uses a displacement-map input. It takes a patch and generates more triangles as the camera gets closer. His explanation was right about the current usage. (Unigine uses tessellation in this way.)
[/quote]
Fair point, and you probably know more about this than me, but I did feel like his description missed the point of tessellation, and if you didn't know what it was already you wouldn't know what to think. He shows one static picture of a mesh with slight bumps and implies that that's all tessellation can do.

[quote][quote name='Ezbez' timestamp='1313074776' post='4847708']
-Him counting or estimating the number of polygons in games these days.
[/quote]
20 polygons per meter? That's a pretty close estimate. Turn on wireframe in a game and you'll notice how triangulated things really are. Characters are usually the exception to this.[/quote]
Sure, that's accurate if you don't look at much of the grass, but the denser grass areas directly above where he was looking look to me like they'd be several times that, and 20 certainly isn't "throwing them a bone". When I made that comment, though, I was thinking of some other number he threw out at some point, but I can't remember what it was or when.

[quote][quote name='Ezbez' timestamp='1313074776' post='4847708']
-Acting as if the 500K triangle mesh he scanned from the elephant is unfeasible for games and as if normal mapping it to reduce polygons would be particularly difficult
[/quote]
POM or QDM would really be needed to get the grooves right, including self-shadowing. It's not as cheap as it sounds. I agree it would be nice to see the comparison between the two techniques when it's done.[/quote]
Fair point, but he really just dismisses this as impossible. The really impressive thing with their technology is that it doesn't matter how many of these elephants they put on the screen - that's what he should be emphasizing, not making it sound as if current games couldn't possibly show such a detailed model. They could; it's just that they choose to spend those polygons on the parts of the game we pay closest attention to - usually humans.

I've realized what has been *really* bugging me about his talk now though: He refuses to show us so much because he thinks we'll react badly to it. But this only makes sense to do [i]if we'd react even worse to seeing it than to not seeing it.[/i] Our negative reaction to not seeing animation isn't strong enough - he's not showing us the animation because we're currently underestimating how bad it is. Showing us 7 year old animation with a "don't worry it's gotten better" gives a better reaction than showing us the current animation. Ouch. Or maybe he's irrational or doesn't care what reaction he gets.

##### Share on other sites
[quote name='Chargh' timestamp='1313168004' post='4848304']
I think this thread is starting to look more like theology than theory. Maybe it would be a good idea to start a new one where no 'he's full of bs, no you are full of bs' is allowed, and where we make a list of claims which may hint at what this guy is doing. I remember that somewhere in his old technology preview he states that there are 64 atoms per cubic mm, and he says no raytracing... If I have some spare time tomorrow I might just look at all the videos he's made and compose such a list. As I think that was the original intention of this thread.[/quote]QFT.

##### Share on other sites
[quote name='Hodgman' timestamp='1313242426' post='4848626']
[quote name='Chargh' timestamp='1313168004' post='4848304']
I think this thread is starting to look more like theology than theory. Maybe it would be a good idea to start a new one where no 'he's full of bs, no you are full of bs' is allowed, and where we make a list of claims which may hint at what this guy is doing. I remember that somewhere in his old technology preview he states that there are 64 atoms per cubic mm, and he says no raytracing... If I have some spare time tomorrow I might just look at all the videos he's made and compose such a list. As I think that was the original intention of this thread.[/quote]QFT.
[/quote]

From what I've seen, it really just looks like a less-sophisticated version of [url="http://www.tml.tkk.fi/%7Esamuli/publications/laine2010tr1_paper.pdf"]Nvidia's SVO[/url], but with instancing. All of the rest is hyperbole, stomach-churning marketing spiel, and redefining terminology (it's not voxels, it's atoms; it's not ray-tracing, it's the non-union Mexican equivalent). So anyone seriously interested in this should just start from the paper or any of the other copious research that pops up from a quick google search.

##### Share on other sites
[quote name='zoborg' timestamp='1313238605' post='4848610']
[quote name='D_Tr' timestamp='1313236455' post='4848601']
+1 zoborg

I believe that at some point nVidia, AMD and other GPU makers will add some bit-twiddling functionality to their cards (probably as instructions initially) in order to accelerate voxel rendering. Nvidia's demo already casts tens of millions of rays per second. Furthermore, when cheap solid-state storage becomes common, fast streaming of many gigabytes worth of data will be possible. But all this is going to take some time, and that time surely has not arrived when someone comes in your face and insults your intelligence by saying "I got 512 trillion atoms here" when really meaning "I have a spaghetti octree with a ton of nodes pointing to the same 10 models".
[/quote]

Agreed. There's definitely some good research being done in this area. One of the main things preventing it from becoming mainstream is that modern GPU hardware is designed to render triangles, [i]very fast[/i]. Large voxel worlds (and ray-tracing for that matter) require non-linear memory access patterns that GPUs just weren't designed for. Any significant sea-change in how rendering is performed is going to require collaboration with the GPU vendors.

CUDA is a step in the right direction, but what we really need is some custom hardware that's good at handling intersections against large spatial databases (think texture unit, but for ray-casting). It's a shame Larrabee didn't work out, but it'll happen eventually. And it'll be a hardware vendor to do it, not some upstart with a magical new algorithm they can't describe or even show working well.
[/quote]

Features and quality are just a matter of more performance and optimization.
What we need more than anything else is "unlimited memory/storage", or this technology has very limited usefulness.

[quote name='D_Tr' timestamp='1313236455' post='4848601']
+1 zoborg

I believe that at some point nVidia, AMD and other GPU makers will add some bit-twiddling functionality to their cards (probably as instructions initially) in order to accelerate voxel rendering. Nvidia's demo already casts tens of millions of rays per second (and the geometry is totally unique). Furthermore, when cheap solid-state storage becomes common, fast streaming of many gigabytes worth of data will be possible. But all this is going to take some time, and that time surely has not arrived when someone comes in your face and insults your intelligence by saying "I got 512 trillion atoms here" when really meaning "I have a spaghetti octree with a ton of nodes pointing to the same 10 models". Unlimited detail octree = octgraph = scam...
[/quote]

SSDs are also limited in their ability to randomly access data, which could be a huge issue since you're streaming nodes of an octree. Meaning, unless you find very smart ways of packing data, every node might be a random access. And you'll also likely run into problems with predicting which nodes must be streamed in the future. Regardless, being able to stream 1TB of data is only useful if you actually find a way to distribute 1TB of data (or whatever amount you fancy).
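For what it's worth, the usual mitigation for the random-access problem (a sketch of a generic approach, not anything UD or Nvidia has confirmed) is to pack subtrees depth-first into page-sized chunks, so that one seek fetches many related nodes instead of one:

```python
# Sketch: assign octree nodes to fixed-size "pages" in depth-first order,
# so a streaming loader fetches one contiguous page per seek rather than
# one node per seek. Page size and node layout are illustrative only.

PAGE_NODES = 4  # real pages would hold thousands of nodes

class N:
    def __init__(self, *children):
        self.children = list(children)

def pack_pages(root):
    """Return a node-id -> page-id mapping in depth-first order."""
    order = []

    def visit(node):
        order.append(node)
        for child in node.children:
            visit(child)

    visit(root)
    return {id(node): i // PAGE_NODES for i, node in enumerate(order)}

# A small tree: a root with two subtrees of three nodes each (7 nodes total).
root = N(N(N(), N()), N(N(), N()))
pages = pack_pages(root)

# Depth-first order keeps each subtree mostly within one page, so a descent
# from the root touches only a couple of pages.
print(sorted(set(pages.values())))
```

Prediction of *which* pages to prefetch is still the hard part, of course; this only amortizes the seeks you were going to do anyway.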

In the end, I really doubt the "general usefulness" of SVOs; they surely have a purpose, and there might actually be genuine uses for them.

##### Share on other sites
A lot of people seem to be drawing conclusions from what is being [b]demoed[/b] now, which sure is impressive, but really lacks all of the visual quality we see in modern games, as well as lacking performance... and then that is compared to years-old [b]games[/b] that are meant to be able to run on years-old computers (hell, just compare it to what the 3DMark developers are demoing). If you were to go 3 years into the future ("when SVOs are practical"), and be allowed to create a demo that likewise only targets modern hardware and showcases the potential of triangles, then I'm pretty sure SVOs wouldn't be all they're cracked up to be, and that's without the ridiculous memory issues.

Also, the only thing I've seen UD really impress with is up-close detail (which, funnily enough, suffers from poor quality in the distance). That surely is nice, but it's something you're rarely bothered by in games unless you specifically look for it. I would be more concerned with walking around in a world that is completely static and solid; that would break immersion for me before I even started playing. What many people also forget is that a modern GPU can crank out ridiculous amounts of unshaded triangles and pixels - we are talking billions of them, every second. The main reason we don't do that is that somewhere along the way smart people realized that shading and effects were more important than extreme detail alone (also, storage and memory limitations!)... so we spend most of the GPU performance on shading and effects. So why are people hyping a technology that doesn't perform well yet, and doesn't even do shading?

EDIT: To be clear, the benefit of performance being "independent" of geometry complexity is great, like when we went from Forward Shading to Deferred Shading. But unless people find a way to get rid of the ridiculous memory constraints, I don't see how it could ever really work out. Perhaps SVOs could be used to find which triangles may be visible for a given pixel and use that to reduce geometry, and similar "ideas" that wouldn't trade "geometry complexity" for "ridiculous amounts of data".

And since some seem to have forgotten what games look like today:

[img]http://sphotos.ak.fbcdn.net/hphotos-ak-ash1/hs764.ash1/165531_10150146309707501_301712412500_8129250_5194790_n.jpg[/img]

##### Share on other sites
@Syranide: But still, aren't SSDs much faster than HDDs at random reads? Especially if you read chunks several hundred KB in size? As for the distribution and storage concerns, you are right that 1TB is a lot of data to download or distribute in retail stores, but storage media are getting denser and connection speeds are getting faster. 1TB would take about 1 day to download on a 100 Mbps connection. This does not seem too long considering that you don't even need 1TB to have impressively detailed voxel graphics in a game (alongside polygonal ones), and that the voxel data can be distributed in a format quite a bit more compact than the one used during execution of the program. I totally agree with your last comment about the 'general usefulness' of the technology. There is room for advancement in polygon technology too. Moore's law still holds, and GPUs are getting more general-purpose, which is great, because programmers will be able to use whichever technique is better for each situation.

-EDIT: Just saw your last post where you make very good points in the first 2 paragraphs.

##### Share on other sites
[quote name='zoborg' timestamp='1313243885' post='4848634']So anyone seriously interested in this should just start from the [Efficient SVO] paper or any of the other copious research that pops up from a quick google search.[/quote]That's not quite the same thing as what Chargh was pointing out, or what the title of this thread asks for though... The very first reply to the OP contains these kinds of existing research, but it would be nice to actually analyze the clues that UD have inadvertently revealed ([i]seeing as they're so intent on being secretive...[/i])

All UD is, is a data structure, which may well be something akin to an SVO ([i]which is where the 'it's nothing special' point is true[/i]), but it's likely conceptually different somewhat -- having been developed by someone who [i]has no idea what they're on about, [/i]and who started as long as 15 years ago.

There's been a few attempts in this thread to collect Dell's claims and actually try to analyze them and come up with possibilities. Some kind of SVO is a good guess, but if we actually investigate what he's said/shown, there's a lot of interesting clues. Chargh was pointing out that this interesting analysis has been drowned out by the 'religious' discussion about Dell being a 'scammer' vs 'marketer', UD being simple vs revolutionary, etc, etc...

For example, in bwhiting's link, you can clearly see aliasing and bad filtering in the shadows, which is likely caused by the use of shadow-mapping and a poor quality PCF filter. This leads me to believe that the shadows aren't baked in, and are actually done via a regular real-time shadow-mapping implementation, albeit in software.
[img]http://farm7.static.flickr.com/6074/6038279727_f6387a3e16.jpg[/img]

Also, around this same part of the video, he accidentally flies through a leaf, and a near clipping-plane is revealed. If he were using regular ray-tracing/ray-casting, there'd be no need for him to implement this clipping-plane, and when combined with other statements, this implies the traversal/projection is based on a frustum, not individual rays. Also, unlike rasterized polygons, the plane doesn't make a clean cut through the geometry, telling us something about the voxel structure and the way the clipping tests are implemented.
[img]http://farm7.static.flickr.com/6086/6038855530_3353f5d92a.jpg[/img]

It's this kind of analysis / reverse-engineering that's been largely drowned out.

[quote name='zoborg' timestamp='1313231216' post='4848578']The latter algorithm works for unlit geometry simply because each cell in the hierarchy can store the average color of all of the (potentially millions of) voxels it contains. But add in lighting, and there's no simple way to precompute the lighting function for all of those contained voxels. They can all have normals in different directions - there's no guarantee they're even close to one another (imagine if the cell contained a sphere - it would have a normal in every direction). You also wouldn't be able to blend surface properties such as specularity.[/quote]This doesn't mean it doesn't work, or isn't what they're doing, it just implies a big down-side ([i]something Dell doesn't like talking about[/i]).
For example, in current games, we might bake a 1-million-polygon model down to a 1000-polygon model. In doing so we bake all the missing details into texture maps: every 1 low-poly triangle is textured with the data of 1000 high-poly triangles. Thanks to mip-mapping, if the model is far enough away that the low-poly triangle covers a single pixel, then the data from all 1000 of those high-poly triangles is averaged together. Yes, often this makes no sense, like you point out with normals and specularity, yet we do it anyway in current games. It causes artifacts for sure, but we still do it and so can Dell.
[quote name='D_Tr' timestamp='1313236455' post='4848601']I believe that at some point nVidia, AMD and other GPU makers will add some bit twiddling functionality in their cards (probably as instructions initially) in order to accelerate voxel rendering.[/quote]They [url="http://msdn.microsoft.com/en-us/library/bb509631(v=VS.85).aspx#Bitwise_Operators"]already have[/url] in modern cards. Bit-twiddling is common in DX11. It's also possible to implement your own software 'caches' nowadays to accelerate this kind of stuff.
Too bad UD haven't even started on their GPU implementation yet though![img]http://public.gamedev.net/public/style_emoticons/default/laugh.gif[/img][quote name='Sirisian' timestamp='1313077391' post='4847723']Tesselation often uses a displacement map input. It takes a patch and generates more triangles as the camera gets closer. His explanation was right about the current usage. (Unigine uses tesselation in this way).[/quote]No, height-displacement is not the [b][i]*only*[/i][/b] current usage of tessellation.[img]http://public.gamedev.net/public/style_emoticons/default/rolleyes.gif[/img]
He also confuses the issue deliberately by comparing a height-displaced plane with a scene containing a variety of different models. It would've been fairer to compare a scene of tessellated meshes with a scene of voxel meshes...

##### Share on other sites
[quote name='Hodgman' timestamp='1313253079' post='4848674']
Too bad UD haven't even started on their GPU implementation yet though!
[/quote]
In the first video they say they have started one though. I guess since we didn't see a video of it then it's hard to take their word for it.

[quote]we're also running at 20 frames a second in software, but we have versions that are running much faster than that aren't quite complete yet[/quote] Not sure if that means GPU or still software.

Also drop the "it's not unlimited". They clearly said 64 atoms per cubic mm. That is a very specific level of detail.

##### Share on other sites
wooo go hodgman that's what we wanna see, someone who will look at what evidence we have and make an educated guess as to what is going on behind the scenes.

much more interesting than cries of "fake" or "bullshit". The fact is it's impressive - it might not be as impressive as his bold claims make it out to be, but it is more detail than I have ever seen in a demo... and that warrants trying to figure out what he is doing [img]http://public.gamedev.net/public/style_emoticons/default/dry.gif[/img]

##### Share on other sites
[quote name='Sirisian' timestamp='1313258676' post='4848697']
[quote name='Hodgman' timestamp='1313253079' post='4848674']
Too bad UD haven't even started on their GPU implementation yet though!
[/quote]
In the first video they say they have started one though. I guess since we didn't see a video of it then it's hard to take their word for it.

[quote]we're also running at 20 frames a second in software, but we have versions that are running much faster than that aren't quite complete yet[/quote] Not sure if that means GPU or still software.

Also drop the "it's not unlimited". They clearly said 64 atoms per cubic mm. That is a very specific level of detail.

[/quote]

If all the time goes to this search algorithm, what could he do on the GPU that would increase its speed? He could add more post-processing, but that wouldn't make it any faster.

##### Share on other sites
[quote name='zoborg' timestamp='1313238605' post='4848610']
Agreed. There's definitely some good research being done in this area. One of the main things preventing it from becoming mainstream is that modern GPU hardware is designed to render triangles, [i]very fast[/i]. Large voxel worlds (and ray-tracing for that matter) require non-linear memory access patterns that GPUs just weren't designed for. Any significant sea-change in how rendering is performed is going to require collaboration with the GPU vendors.

CUDA is a step in the right direction, but what we really need is some custom hardware that's good at handling intersections against large spatial databases (think texture unit, but for ray-casting). It's a shame Larrabee didn't work out, but it'll happen eventually. And it'll be a hardware vendor to do it, not some upstart with a magical new algorithm they can't describe or even show working well.
[/quote]

This reminds me of a question I have on the subject of hardware and ray casting. Isn't the new AMD Fusion chip what you describe? The GPU and CPU have shared memory with the GPU being programmable in a C++ like way, if I'm not mistaken.

##### Share on other sites
[quote name='Chargh' timestamp='1313264267' post='4848723']
If all time goes to this search algorithm what could he do on the gpu that would increase its speed? He could add more post processing but that wouldn't make it any faster.
[/quote]
I have to imagine his search algorithm is a per-pixel algorithm. The GPU is really good at doing those kinds of operations. He'll also probably grab the data back into a g-buffer of sorts to perform post-processing the deferred way, so you're right on that part. This should hopefully look much nicer with actual shading.

##### Share on other sites
the reasons why this project is going to flop guaranteed.

* the models you see aren't unique, they are just duplications of the exact same objects

* the models all have their OWN level of detail, the only way he gets 64 atoms per cubic millimetre is by SCALING SOME OF THEM SMALLER, the rest have SHIT detail.

* he cant paint his world uniquely like what happens in megatexture

* he cant perform csg operations, all he can do is soup yet more and more disjointed models together

* theres no way he could bake lighting at all, so the lighting all has to be dynamic and eat processing power

* this has nothing to do with voxels, you could get a similar effect just by rastering together lots of displacement mapped
models!!!

##### Share on other sites
[quote name='Hodgman' timestamp='1313253079' post='4848674']
[quote name='zoborg' timestamp='1313243885' post='4848634']So anyone seriously interested in this should just start from the [Efficient SVO] paper or any of the other copious research that pops up from a quick google search.[/quote]That's not quite the same thing as what Chargh was pointing out, or what the title of this thread asks for though... The very first reply to the OP contains these kinds of existing research, but it would be nice to actually analyze the clues that UD have inadvertently revealed ([i]seeing as they're so intent on being secretive...[/i])

All UD is, is a data structure, which may well be something akin to an SVO ([i]which is where the 'it's nothing special' point is true[/i]), but it's likely conceptually different somewhat -- having been developed by someone who [i]has no idea what they're on about, [/i]and who started as long as 15 years ago.
[/quote]
Well, if you started 15 years ago from scratch, you'd have 15 years of experience in the topic. And it's not like you'd do that research in a complete vacuum. It's quite possible that he's invented something different, but I have no particular reason to believe that while he shows things that could definitely be done using well documented techniques.

[quote]
There's been a few attempts in this thread to collect Dell's claims and actually try to analyze them and come up with possibilities. Some kind of SVO is a good guess, but if we actually investigate what he's said/shown, there's a lot of interesting clues. Chargh was pointing out that this interesting analysis has been drowned out by the 'religious' discussion about Dell being a 'scammer' vs 'marketer', UD being simple vs revolutionary, etc, etc...

For example, in bwhiting's link, you can clearly see aliasing and bad filtering in the shadows, which is likely caused by the use of shadow-mapping and a poor-quality PCF filter. This leads me to believe that the shadows aren't baked in, and are actually done via a regular real-time shadow-mapping implementation, albeit in software.
[/quote]
Where do you think baked-in shadows come from? They have to be rendered at some point, and any offline shadow baking can be subject to the same quality issues. I'm just saying there's no way to infer from a screenshot that the lighting is dynamic, because a preprocess could generate the lighting in exactly the same way, with exactly the same artifacts.

So I obviously don't know if it's baked or not, right? Well, there are several reasons to suspect this, and I prefer to take the tack that until given evidence otherwise, the simplest answer is correct.

Why do I think the shadows are baked?
1) First and foremost, the light never moves. This guy goes on and on about how magical everything else is, so why doesn't he ever mention lighting? Why doesn't he just move the light?
2) The light is top-down - the most convenient position for baked-in light and shadows because it allows for arbitrary orientation about the up axis. Why else would you choose this orientation since it makes the world so flat looking?
3) No specular. That's another reason the lighting looks terrible.
4) It fits in perfectly with the most obvious theory of the implementation.

[quote]
Also, around this same part of the video, he accidentally flies through a leaf, and a near clipping-plane is revealed. If he were using regular ray-tracing/ray-casting, there'd be no need for him to implement this clipping-plane, and when combined with other statements, this implies the traversal/projection is based on a frustum, not individual rays. Also, unlike rasterized polygons, the plane doesn't make a clean cut through the geometry, telling us something about the voxel structure and the way the clipping tests are implemented.
[img]http://farm7.static.flickr.com/6086/6038855530_3353f5d92a.jpg[/img]

[/quote]
Well, when you're ray-casting you don't need to explicitly implement a clipping plane to get that effect. You'd get that effect if you projected each ray from the near plane instead of the eye. But an irregular cut like that just suggests to me that yes, they're using voxels and raycasting and not triangle rasterization, so any discontinuities would be at voxel instead of pixel granularity.

[quote]
It's this kind of analysis / reverse-engineering that's been largely drowned out.

[quote name='zoborg' timestamp='1313231216' post='4848578']The latter algorithm works for unlit geometry simply because each cell in the hierarchy can store the average color of all of the (potentially millions of) voxels it contains. But add in lighting, and there's no simple way to precompute the lighting function for all of those contained voxels. They can all have normals in different directions - there's no guarantee they're even close to one another (imagine if the cell contained a sphere - it would have a normal in every direction). You also wouldn't be able to blend surface properties such as specularity.[/quote]This doesn't mean it doesn't work, or isn't what they're doing, it just implies a big down-side ([i]something Dell doesn't like talking about[/i]).

For example, in current games, we might bake a 1-million-polygon model down to a 1000-polygon model. In doing so we bake all the missing details into texture maps: every 1 low-poly triangle is textured with the data of 1000 high-poly triangles. Thanks to mip-mapping, if the model is far enough away that the low-poly triangle covers a single pixel, then the data from all 1000 of those high-poly triangles is averaged together. Yes, often this makes no sense, like you point out with normals and specularity, yet we do it anyway in current games. It causes artifacts for sure, but we still do it and so can Dell.
[/quote]
I think you're understating the potential artifacts. In their demo, a single pixel could contain ground, thousands of clumps of grass, dozens of trees, and even a few spare elephants. How do you approximate a single light value that's good enough for all of that? We make approximations all the time in games, but we do it by throwing away perceptually unimportant details. The direction of a surface with respect to the light is something that can be approximated (e.g. normal-maps), but not if the surface is a chaotic mess. At best, your choice of normal would be arbitrary (say, up). But if they did that, you'd see noticeable lighting changes as the LoD reduces, whereas in the demo it's a continuous blend.

That's not to say dynamic lighting can't be implemented, just that they haven't demonstrated it. Off hand, if I were to attempt dynamic lighting for instanced voxels, I would probably approach it as a screen-space problem. I.e.
[list=1][*]Render the scene, but output a depth value along with each color pixel.[*]Generate surface normals using depth gradients from adjacent pixels (with some fudge-factor to eliminate silhouette discontinuities).[*]Perform lighting in view-space, as with typical gbuffer techniques.[/list]
To render shadows, you could do the same thing, but first render the scene depth-only from the light's perspective (with some screen-based warp to improve effective resolution). Off hand, I couldn't say how good the results of this technique would be, as generating surface normals from depth may result in a lot of noise and/or muted detail. But it is something ideally suited to a GPU implementation (which they insist they don't use for anything other than splatting the results on-screen).
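As a rough sketch of steps 1-3 above (my own illustration, not anything UD has shown): with a depth buffer in hand, per-pixel normals can be estimated from depth gradients and fed into a standard Lambert term. The function name, light direction, and gradient-based normal formula are all assumptions for the sake of the example.

```python
import numpy as np

def shade_from_depth(depth, light_dir=(0.4, 0.4, 0.8)):
    # Step 2: depth gradients between adjacent pixels stand in for surface slope.
    dz_dy, dz_dx = np.gradient(depth)
    # Approximate a view-space normal per pixel; a steeper slope tilts it
    # further away from the viewer. (Hypothetical sketch, no fudge-factor
    # for silhouette discontinuities.)
    n = np.dstack([-dz_dx, -dz_dy, np.ones_like(depth)])
    n /= np.linalg.norm(n, axis=2, keepdims=True)
    # Step 3: a simple Lambert term against one directional light.
    l = np.asarray(light_dir, dtype=float)
    l /= np.linalg.norm(l)
    return np.clip(n @ l, 0.0, 1.0)

# A flat depth plane yields normals of (0, 0, 1), so uniform shading:
flat = np.full((4, 4), 5.0)
shading = shade_from_depth(flat)
```

As noted, normals recovered this way can be noisy where the depth itself is chaotic, which is exactly the failure mode you'd expect on sub-pixel voxel soup.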

But there's nothing in any of the demos to suggest they're doing this or any other form of dynamic lighting. I prefer to just take the simplest explanation: that his avoidance is intentional because he knows full well what the limitations of his technique are. They haven't shown anything that couldn't be baked-in, so I have no reason to believe they've done anything more complicated than that.

##### Share on other sites
I did a little math earlier today to see what I could find out, and I hope you'll find the answer as interesting as I did (or I've made a fool of myself).

First off, the computer he runs the demo on has 8 GB of memory, the resolution is 64 voxels per mm^3 (4*4*4), I estimate the size of the base of each block to be 1m^2, and let's assume that color is stored as a single byte (either through compression or by being palettized, which could actually even be the case). Since octrees are used, we very loosely assume that memory consumption doubles because of octree overhead for nodes, and that the shell of each block can be approximated by taking the 1m^2 base block, multiplying by 6 for each side of the new 3D block, and then multiplying by 2 because the sides obviously aren't flat but have a rugged surface. (Yes, some are estimates, some may be high, some may be low, and some factor may be missing, but assume for now that it balances out.)

8 GB = 8 * 1024 * 1024 * 1024 = 8589934592 bytes
sqrt(8589934592) = 92681 (side length in units for the entire square)
92681 / 4 / 1000 = 23 m (4 from 4x4x4, 64 voxels per mm^3, 1000 from meter)
23 * 23 = 529 m^2 blocks
529 / 6 / 2 = 44 final blocks (converting from flat 2D to 3D)
44 / 2 = 22 final blocks (compensating for the octree cost)

[size="4"][b]= 22 blocks [/b](UD uses 24 blocks)[/size]
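The arithmetic above can be restated as a quick script, using exactly the same assumptions (1 byte per voxel, 4 voxels per mm per axis, 1 m^2 tiles, and factors of 6, 2, and 2 for block faces, surface roughness, and octree overhead):

```python
# Back-of-envelope check of the estimate above; all constants are the
# post's assumptions, not measured facts about UD.
total_bytes = 8 * 1024 ** 3           # 8 GB, one voxel per byte
side_voxels = total_bytes ** 0.5      # lay the budget out as one flat square
side_metres = side_voxels / 4 / 1000  # /4: voxels per mm, /1000: mm per m
tiles = side_metres ** 2              # 1 m^2 ground tiles in that square
blocks = tiles / 6 / 2 / 2            # faces, roughness, octree overhead
print(round(blocks))                  # -> 22
```

Same result, 22, against the 24 blocks visible in the demo.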

[b]Now, there are a bunch of approximations and guesses here[/b]... but the fact that I came within an order of magnitude of the actual 24 known models UD shows in their demo says to me that they have indeed not made any significant progress, and even if I've made an error it apparently balances out. They might not have made anything at all except possibly some optimizations to the SVO algorithm. Please correct me if I've made a [b]serious[/b] mistake somewhere, but again, even if my calculation had said 2 or 200 (that would be bad for UD), it would still mean that they are flat out lying and memory consumption is most definitely an issue they haven't solved, not even in the slightest.

EDIT: To clarify, this wasn't meant to show the potential of SVO memory optimizations, but rather that UD is likely not using any fancy algorithms at all to minimize their memory consumption (I only assume the colors are palettized)... and that indeed, enormous memory consumption is the real reason why they only have 24 blocks: those 24 blocks consume all 8GB of memory. [b]This is meant to debunk their "Nono, memory is not the issue! Our artists are!"-ish statement.[/b]

##### Share on other sites
[quote name='forsandifs' timestamp='1313265063' post='4848727']
[quote name='zoborg' timestamp='1313238605' post='4848610']
Agreed. There's definitely some good research being done in this area. One of the main things preventing it from becoming mainstream is that modern GPU hardware is designed to render triangles, [i]very fast[/i]. Large voxel worlds (and ray-tracing for that matter) require non-linear memory access patterns that GPUs just weren't designed for. Any significant sea-change in how rendering is performed is going to require collaboration with the GPU vendors.

CUDA is a step in the right direction, but what we really need is some custom hardware that's good at handling intersections against large spatial databases (think texture unit, but for ray-casting). It's a shame Larrabee didn't work out, but it'll happen eventually. And it'll be a hardware vendor to do it, not some upstart with a magical new algorithm they can't describe or even show working well.
[/quote]

This reminds me of a question I have on the subject of hardware and ray casting. Isn't the new AMD Fusion chip what you describe? The GPU and CPU have shared memory with the GPU being programmable in a C++ like way, if I'm not mistaken.
[/quote]

Yes, though we'll have to wait and see whether it gets anywhere near practical. But ray-tracing (voxels or otherwise) is bound at least as much by memory access as by processor speed and core count.

The basic problem is O(N*K), where N is the number of pixels on screen and K is the average cost of intersecting a ray with the world. Ideally, K is log(M), where M is the number of objects in the world. A spatial hierarchy such as an octree provides such a search.

However, the larger the database, the more spread out the results in memory. In a naive implementation, each ray through each pixel could incur multiple cache misses as it traverses nodes through the tree. This effect gets even worse as you increase the data size such that it exceeds memory and has to be streamed off-disk (or even from the internet). (BTW, this is another issue UD conveniently sidesteps - there is so little unique content it easily fits in a small amount of memory).

This can be improved by using more intelligent data structures that are structured for coherent memory accesses (a rather huge topic in and of itself). But that alone is not enough. No matter how the data is structured, you will still have loads of cache misses (unless your whole world manages to fit into just your cache memory). You need some way to hide the cost of those misses.

On a modern GPU, cache misses are a common occurrence (to the frame-buffer, texture units, vertex units, etc). It cleverly hides the cost of most of these misses by queuing up the reads and writes from a massive number of parallel threads. For instance, the pixel shader unit may be running a shader program for hundreds of pixel quads at a time. Each pixel unit cycle, the same instruction is processed for each in-flight quad. If that instruction happens to be a texture read, all the reads from all those hundreds of quad threads will be batched up for processing by the texture unit. Then hopefully, by the time the read results are needed a cycle or few later, they'll already be in the cache and execution can continue immediately.

This latency-hiding is critical to the speed of modern GPUs. Memory latency improves very little compared to processing speed and bandwidth. In fact, in relative cycle terms, cache-miss penalties have only increased over the last decade (or longer).

To get comparable performance from ray-tracing (and ray-traced voxels), we'll need a similar method of latency hiding. With a general purpose collection of cores, you can do a whole lot of this work in software. But current PC cores are designed more for flexible bullet-proof caching than for massively parallel designs. This is why GPUs still blow CPUs out of the water for any algorithm that can be directly adapted to a gather (as opposed to scatter) approach.

To my knowledge, AMD's Fusion just combines the CPU and GPU cores onto a single chip, but the two are still separate. That has the potential to greatly improve memory latency for certain things (such as texture updates, as mentioned by Carmack), and to reduce chip sizes and costs. But as long as the main latency-hiding hardware is fixed-function, designed for things like 2D/3D texture accesses, we can't optimally implement latency-hiding for custom non-linear workloads such as ray collision searches. Still, all these changes that make GPUs more general-purpose get us closer to the goal.

##### Share on other sites
[quote name='Syranide' timestamp='1313274057' post='4848772']
I did a little math earlier today to see what I could find out, and I hope you'll find the answer as interesting as I did (or I've made a fool of myself).

First off, the computer he runs the demo on has 8 GB of memory, the resolution is 64 voxels per mm^3 (4*4*4), I estimate the size of the base of each block to be 1m^2, and let's assume that color is stored as a single byte (either through compression or by being palettized, which could actually even be the case). Since octrees are used, we very loosely assume that memory consumption doubles because of octree overhead for nodes, and that the shell of each block can be approximated by taking the 1m^2 base block, multiplying by 6 for each side of the new 3D block, and then multiplying by 2 because the sides obviously aren't flat but have a rugged surface. (Yes, some are estimates, some may be high, some may be low, and some factor may be missing, but assume for now that it balances out.)

8 GB = 8 * 1024 * 1024 * 1024 = 8589934592 bytes
sqrt(8589934592) = 92681 (side length in units for the entire square)
92681 / 4 / 1000 = 23 m (4 from 4x4x4, 64 voxels per mm^3, 1000 from meter)
23 * 23 = 529 m^2 blocks
529 / 6 / 2 = 44 final blocks (converting from flat 2D to 3D)
44 / 2 = 22 final blocks (compensating for the octree cost)

[size="4"][b]= 22 blocks [/b](UD uses 24 blocks)[/size]

[b]Now, there are a bunch of approximations and guesses here[/b]... but the fact that I came within an order of magnitude of the actual 24 known models UD shows in their demo says to me that they have indeed not made any significant progress, and even if I've made an error it apparently balances out. They might not have made anything at all except possibly some optimizations to the SVO algorithm. Please correct me if I've made a [b]serious[/b] mistake somewhere, but again, even if my calculation had said 2 or 200 (that would be bad for UD), it would still mean that they are flat out lying and memory consumption is most definitely an issue they haven't solved, not even in the slightest.
[/quote]

I'm not saying you're incorrect, but it's possible to do quite a lot better than that once you take into account recursive instancing.

Say you're right about each block of land being 1 meter on a side. If you were to fully populate the tree at that granularity, you'd get those results (or similar since it's an estimate). But now, imagine instead of fully populating the tree, you create a group of 100 of those blocks 10 meters on a side, then instance that over the entire world. Your tree just references that block of 100 ground plots rather than duplicating them. So now you've reduced the size requirement by approximately 100.

There's no limit to how far you can take this. The Sierpinski pyramid is an excellent example - you can describe that whole world to an arbitrary size with a simple recursive function. The only unique data storage required for that demo is the model of the pink monster thingy.

As someone mentioned earlier, the storage requirement is more appropriately measured by the entropy of the world (how much unique stuff there is, including relative placement). The repetitive nature of the demo suggests very little of that, and thus very little actual storage requirement.
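To make the Sierpinski point concrete, here's a toy 2D analogue (my own illustration, not UD's code): occupancy at any zoom level is a pure function of the cell coordinates, so the "world" needs zero stored geometry - its entropy is just the rule itself.

```python
def occupied(x: int, y: int) -> bool:
    # Classic Sierpinski-triangle rule: a cell is solid iff its integer
    # coordinates share no set bits. Valid at any depth, with no storage.
    return (x & y) == 0

# At depth d the grid is 2**d cells on a side and exactly 3**d are solid
# (each bit pair allows 3 of the 4 combinations), e.g. depth 5:
n = 2 ** 5
solid = sum(occupied(x, y) for x in range(n) for y in range(n))  # 3**5 = 243
```

Zooming in just means evaluating the same rule with more coordinate bits - "unlimited" detail from a fixed, and fixed-size, description.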

##### Share on other sites
[quote name='zoborg' timestamp='1313277665' post='4848796']
I'm not saying you're incorrect, but it's possible to do quite a lot better than that once you take into account recursive instancing.

Say you're right about each block of land being 1 meter on a side. If you were to fully populate the tree at that granularity, you'd get those results (or similar since it's an estimate). But now, imagine instead of fully populating the tree, you create a group of 100 of those blocks 10 meters on a side, then instance that over the entire world. Your tree just references that block of 100 ground plots rather than duplicating them. So now you've reduced the size requirement by approximately 100.

There's no limit to how far you can take this. The Sierpinski pyramid is an excellent example - you can describe that whole world to an arbitrary size with a simple recursive function. The only unique data storage required for that demo is the model of the pink monster thingy.

As someone mentioned earlier, the storage requirement is more appropriately measured by the entropy of the world (how much unique stuff there is, including relative placement). The repetitive nature of the demo suggests very little of that, and thus very little actual storage requirement.
[/quote]

I'm not doubting you even one bit; what I meant to show was that with some very basic assumptions, some reasonable approximations and no real optimizations, I computed the number of blocks they could be using in their demo, and arrived at the same number of blocks that they are using. My point being, unless I've made a serious mistake, they aren't using anything fancy at all... like I mention, for all we know, they might even be using an 8-bit palette for the blocks. If I had arrived at 2, then yeah, they would be using some fancy algorithms, but memory consumption would most likely still be the actual reason they aren't showing more unique blocks.

##### Share on other sites
[quote name='Syranide' timestamp='1313277921' post='4848799']
I'm not disagreeing with you one bit; what I meant to show was that with some very basic assumptions, some reasonable approximations and no real optimizations, I arrived at the same number of blocks that they are using in their demo. My point being, unless I've made a serious mistake, they aren't using anything fancy at all... like I mention, for all we know, they might even be using an 8-bit palette for the blocks. If I had arrived at 2, then yeah, they would be using some fancy algorithms, but sure enough, memory would still be a major issue regardless of what they say.
[/quote]

OK, then sorry for the misunderstanding. I do agree that there's no [i]particular reason[/i] to assume they're doing anything fancy with compression. Likewise, if someone were to show me a ray-traced sphere above an infinite checkerboard plane, I wouldn't think "Wow! How did they manage to store an infinite texture in finite memory?!"
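The checkerboard case is worth spelling out, because it's the degenerate form of the same trick: the "infinite texture" is a pure function of the hit point, so an unbounded plane costs no texel storage at all. A sketch (function name is mine):

```python
import math

def checker_color(u: float, v: float) -> int:
    # Infinite checkerboard: the color anywhere on the plane is computed
    # from the coordinates alone - zero stored texels for unbounded area.
    # math.floor (not int()) keeps adjacent squares alternating across 0.
    return (math.floor(u) + math.floor(v)) % 2
```

A ray-tracer would call this with the plane-intersection point of each ray; memory use is constant no matter how far the plane extends.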

##### Share on other sites
Also, all this talk of instancing and compression and [i]unlimited[/i] detail are just aspects of procedural content generation.

It's just a question of degree:
[list=1][*]A repeated texture. That's an incredibly simple function that's both obvious and boring, but it has [i]unlimited[/i] detail (at least in the respect they're using the term, which is up to the precision constraints of the rendering system).[*]A fractal image or environment. This function can be arbitrarily complex and the results can be spectacular. You just have very little input into the final results.[*]Guided procedural content. The simplest example of this is just instancing. But it can be quite a bit more sophisticated, such as composing environments out of recursive functions in a [url="http://www.iquilezles.org/www/material/nvscene2008/rwwtt.pdf"]4k demo[/url].[*]Fully unique artist-modeled (or scanned) textures and environments, but with discretionary reuse of assets to save time and memory.[/list]Procedural content saves us time and memory, allowing us to make things that wouldn't otherwise be possible. But the drawback is loss of control - you get what the procedure gives you. If that's a tiled texture, or a fractal, or a huge environment of repetitive chunks of land, you just have to live with it. Or write a new procedure closer to what you want. Or add new content, which consumes precious development time and hardware resources (thus making the content decidedly [i]limited[/i]).

Again I want to point out this [url="http://www.pcper.com/reviews/Editorial/John-Carmack-Interview-GPU-Race-Intel-Graphics-Ray-Tracing-Voxels-and-more"]interview with Carmack[/url], because I feel I'm just parroting him at this point. To paraphrase, "with proceduralism you get [i]something[/i], just not necessarily what you want."