
[Theory] Unraveling the Unlimited Detail plausibility


171 replies to this topic

#121 zoborg   Members   -  Reputation: 139


Posted 13 August 2011 - 03:53 PM

So anyone seriously interested in this should just start from the [Efficient SVO] paper or any of the other copious research that pops up from a quick google search.

That's not quite the same thing as what Chargh was pointing out, or what the title of this thread asks for though... The very first reply to the OP contains these kinds of existing research, but it would be nice to actually analyze the clues that UD have inadvertently revealed (seeing as they're so intent on being secretive...)

All UD is, is a data structure, which may well be something akin to an SVO (which is where the 'it's nothing special' point is true), but it's likely conceptually different somewhat -- having been developed by someone who has no idea what they're on about, and who started as long as 15 years ago.

Well, if you started 15 years ago from scratch, you'd have 15 years of experience in the topic. And it's not like you'd do that research in a complete vacuum. It's quite possible that he's invented something different, but I have no particular reason to believe that while he shows things that could definitely be done using well documented techniques.

There's been a few attempts in this thread to collect Dell's claims and actually try to analyze them and come up with possibilities. Some kind of SVO is a good guess, but if we actually investigate what he's said/shown, there's a lot of interesting clues. Chargh was pointing out that this interesting analysis has been drowned out by the 'religious' discussion about Dell being a 'scammer' vs 'marketer', UD being simple vs revolutionary, etc, etc...

For example, in bwhiting's link, you can clearly see aliasing and bad filtering in the shadows, which is likely caused by the use of shadow-mapping and a poor quality PCF filter. This leads me to believe that the shadows aren't baked in, and are actually done via a regular real-time shadow-mapping implementation, albeit in software.

Where do you think baked-in shadows come from? They have to be rendered sometime, and any offline shadow baking performed can be subject to similar quality issues. I'm just saying there's no way to infer from a shot that the lighting is dynamic, because any preprocess could generate the lighting in the same exact way with the same exact artifacts.

So I obviously don't know if it's baked or not, right? Well, there are several reasons to suspect this, and I prefer to take the tack that until given evidence otherwise, the simplest answer is correct.

Why do I think the shadows are baked?
1) First and foremost, the light never moves. This guy goes on and on about how magical everything else is, so why doesn't he ever mention lighting? Why doesn't he just move the light?
2) The light is top-down - the most convenient position for baked-in light and shadows because it allows for arbitrary orientation about the up axis. Why else would you choose this orientation since it makes the world so flat looking?
3) No specular. That's another reason the lighting looks terrible.
4) It fits in perfectly with the most obvious theory of the implementation.

Also, around this same part of the video, he accidentally flies through a leaf, and a near clipping-plane is revealed. If he were using regular ray-tracing/ray-casting, there'd be no need for him to implement this clipping-plane, and when combined with other statements, this implies the traversal/projection is based on a frustum, not individual rays. Also, unlike rasterized polygons, the plane doesn't make a clean cut through the geometry, telling us something about the voxel structure and the way the clipping tests are implemented.
Posted Image

Well, when you're ray-casting you don't need to explicitly implement a clipping plane to get that effect. You'd get that effect if you projected each ray from the near plane instead of the eye. But an irregular cut like that just suggests to me that yes, they're using voxels and raycasting and not triangle rasterization, so any discontinuities would be at voxel instead of pixel granularity.

It's this kind of analysis / reverse-engineering that's been largely drowned out.

The latter algorithm works for unlit geometry simply because each cell in the hierarchy can store the average color of all of the (potentially millions of) voxels it contains. But add in lighting, and there's no simple way to precompute the lighting function for all of those contained voxels. They can all have normals in different directions - there's no guarantee they're even close to one another (imagine if the cell contained a sphere - it would have a normal in every direction). You also wouldn't be able to blend surface properties such as specularity.

This doesn't mean it doesn't work, or isn't what they're doing, it just implies a big down-side (something Dell doesn't like talking about).
For example, in current games, we might bake a 1 million polygon model down to a 1000 polygon model. In doing so we bake all the missing details into texture maps. Every 1 low-poly triangle is textured with the data of 1000 high-poly triangles. Thanks to mip-mapping, if the model is far enough away that the low-poly triangle covers a single pixel, then the data from all 1000 of those high-poly triangles is averaged together. Yes, often this makes no sense, like you point out with normals and specularity, yet we do it anyway in current games. It causes artifacts for sure, but we still do it and so can Dell.

I think you're understating the potential artifacts. In their demo, a single pixel could contain ground, thousands of clumps of grass, dozens of trees, and even a few spare elephants. How do you approximate a light value for that that's good enough? We do approximations all the time in games, but we do that by throwing away perceptually unimportant details. The direction of a surface with respect to the light is something that can be approximated (e.g. normal-maps), but not if the surface is a chaotic mess. At best, your choice of normal would be arbitrary (say, up). But if they did that, you'd see noticeable lighting changes as the LoD reduces, whereas in the demo it's a continuous blend.
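To make the sphere case concrete, here is a minimal Python sketch. It assumes a simple Lambert term and a made-up sampling of normals over a sphere (nothing UD has described), and compares lighting every voxel and then averaging against averaging the normals first and lighting once.

```python
import math

def sphere_normals(n_lat=16, n_lon=32):
    """Unit normals sampled over a sphere (stand-in for the voxels in one cell).

    The samples are not area-weighted; that is fine for illustration.
    """
    normals = []
    for i in range(n_lat):
        # Sample latitude at band centres so the set is symmetric about the equator.
        theta = math.pi * (i + 0.5) / n_lat
        for j in range(n_lon):
            phi = 2.0 * math.pi * j / n_lon
            normals.append((math.sin(theta) * math.cos(phi),
                            math.sin(theta) * math.sin(phi),
                            math.cos(theta)))
    return normals

def average(vectors):
    """Component-wise mean of a list of 3-vectors."""
    n = len(vectors)
    return tuple(sum(v[k] for v in vectors) / n for k in range(3))

def lambert(normal, light_dir):
    """Clamped N.L diffuse term."""
    return max(0.0, sum(a * b for a, b in zip(normal, light_dir)))

light = (0.0, 0.0, 1.0)  # top-down light, as in the demo
normals = sphere_normals()

# Correct but expensive: light every voxel, then average the results.
lit_then_averaged = sum(lambert(n, light) for n in normals) / len(normals)

# Cheap but wrong here: average the normals, then light once.
averaged_then_lit = lambert(average(normals), light)
```

The averaged normal cancels to roughly the zero vector, so the cheap path goes black while the per-voxel path stays clearly lit. That is the kind of mismatch a precomputed per-cell normal would have to hide.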

That's not to say dynamic lighting can't be implemented, just that they haven't demonstrated it. Off hand, if I were to attempt dynamic lighting for instanced voxels, I would probably approach it as a screen-space problem. I.e.
  • Render the scene, but output a depth value along with each color pixel.
  • Generate surface normals using depth gradients from adjacent pixels (with some fudge-factor to eliminate silhouette discontinuities).
  • Perform lighting in view-space, as with typical gbuffer techniques.
To render shadows, you could do the same thing, but first render the scene depth-only from the light's perspective (with some screen-based warp to improve effective resolution). Off hand, I couldn't say how good the results of this technique would be, as generating surface normals from depth may result in a lot of noise and/or muted detail. But it is something ideally suited to a GPU implementation (which they insist they don't use for anything other than splatting the results on-screen).
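A rough Python sketch of the normals-from-depth step, under some loud assumptions: the depth buffer is treated as a height field with unit pixel spacing (so perspective is ignored), the sign convention for the normal is arbitrary, and `max_jump` is a hypothetical fudge factor for silhouettes. It is only meant to show the shape of the idea, not how UD or any real engine does it.

```python
import math

def normals_from_depth(depth, max_jump=0.5):
    """Estimate a per-pixel normal from depth differences of adjacent pixels.

    depth:    2D list of depths (rows of equal length).
    max_jump: hypothetical threshold; larger gradients are treated as
              silhouette edges and zeroed instead of producing a wild normal.
    """
    h, w = len(depth), len(depth[0])
    normals = [[(0.0, 0.0, 1.0)] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Clamp neighbours at the image border.
            x0, x1 = max(x - 1, 0), min(x + 1, w - 1)
            y0, y1 = max(y - 1, 0), min(y + 1, h - 1)
            dzdx = (depth[y][x1] - depth[y][x0]) / max(x1 - x0, 1)
            dzdy = (depth[y1][x] - depth[y0][x]) / max(y1 - y0, 1)
            if abs(dzdx) > max_jump:
                dzdx = 0.0
            if abs(dzdy) > max_jump:
                dzdy = 0.0
            # A surface z = d(x, y) has a normal proportional to (-dz/dx, -dz/dy, 1).
            length = math.sqrt(dzdx * dzdx + dzdy * dzdy + 1.0)
            normals[y][x] = (-dzdx / length, -dzdy / length, 1.0 / length)
    return normals
```

The resulting normals would then feed an ordinary view-space lighting pass. Noise in the depth buffer goes straight into the gradients, which is where the "a lot of noise and/or muted detail" worry comes from.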

But there's nothing in any of the demos to suggest they're doing this or any other form of dynamic lighting. I prefer to just take the simplest explanation: that his avoidance is intentional because he knows full well what the limitations of his technique are. They haven't shown anything that couldn't be baked-in, so I have no reason to believe they've done anything more complicated than that.


#122 Syranide   Members   -  Reputation: 375


Posted 13 August 2011 - 04:20 PM

I did a little math earlier today to see what I could find out, and I hope you'll find the answer as interesting as I found it (or I've made a fool of myself :P).

First off, the computer he runs the demo on has 8 GB of memory, the resolution is 64 voxels per mm^3 (4*4*4), I estimate the size of the base of each block to be 1m^2, and let's assume that color is stored as a single byte (either through compression or by being palettized, which could actually even be the case). Since octrees are used, we very loosely assume that memory consumption doubles because of octree overhead for nodes, and that the shell of each block can be approximated by taking the 1m^2 base block, multiplying by 6 for each side of the new 3D-block, and then multiplying by 2 because the sides obviously aren't flat but have a rugged surface. (Yes, some are estimates, some may be high, some may be low, and some factor may be missing, but assume for now that it balances out)


8 GB = 8 * 1024 * 1024 * 1024 = 8589934592 bytes
sqrt(8589934592) = 92681 (side length in units for the entire square)
92681 / 4 / 1000 = 23 m (4 from 4x4x4, 64 voxels per mm^3, 1000 from meter)
23 * 23 = 529 m^2 blocks
529 / 6 / 2 = 44 final blocks (converting from flat 2D to 3D)
44 / 2 = 22 final blocks (compensating for the octree cost)

= 22 blocks (UD uses 24 blocks)


Now, there are a bunch of approximations and guesses here... but the fact that I even came within an order of magnitude of the actual 24 known models UD shows in their demo... says to me that they have indeed not made any significant progress, and even if I've made an error it apparently balances out. They might not even have made anything at all except possibly some optimizations to the SVO algorithm. Please correct me if I've made a serious mistake somewhere, but again, if my calculation would have said 2 or 200 (that would be bad for UD), it would still mean that they are flat out lying and memory consumption is most definitely an issue they haven't solved, not even in the slightest.

EDIT: To clarify, this wasn't meant to show the potential of SVO memory optimizations, but rather that it is likely that UD is not using any fancy algorithms at all to minimize their memory consumption (I only assume the colors are palettized)... and that indeed, enormous memory consumption is the real reason why they only have 24 blocks, because those 24 blocks consume all 8GB of memory. This being meant to debunk their "Nono, memory is not the issue! Our artists are!"-ish statement.
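For anyone who wants to poke at the assumptions, the arithmetic above can be written as a small Python function. The parameter names are mine, the defaults are the guesses from the post, and it truncates to whole numbers at each step the same way the post does.

```python
import math

def estimate_blocks(memory_bytes=8 * 1024**3, voxels_per_mm=4,
                    bytes_per_voxel=1, faces=6, roughness=2,
                    octree_overhead=2):
    """Back-of-envelope count of unique 1 m blocks that fit in memory."""
    surface_voxels = memory_bytes // bytes_per_voxel
    side_voxels = math.isqrt(surface_voxels)            # 92681 voxels per side
    side_m = side_voxels // voxels_per_mm // 1000       # 23 m per side
    flat_blocks = side_m * side_m                       # 529 flat 1 m^2 blocks
    shell_blocks = flat_blocks // faces // roughness    # 44 blocks as 3D shells
    return shell_blocks // octree_overhead              # 22 after octree cost

print(estimate_blocks())
```

Changing any one guess (say, 2 bytes per voxel, or an octree overhead of 3) moves the answer by that same factor, so the estimate is only good to within the combined error of the guesses.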



#123 zoborg   Members   -  Reputation: 139


Posted 13 August 2011 - 04:53 PM


Agreed. There's definitely some good research being done in this area. One of the main things preventing it from becoming mainstream is that modern GPU hardware is designed to render triangles, very fast. Large voxel worlds (and ray-tracing for that matter) require non-linear memory access patterns that GPUs just weren't designed for. Any significant sea-change in how rendering is performed is going to require collaboration with the GPU vendors.

CUDA is a step in the right direction, but what we really need is some custom hardware that's good at handling intersections against large spatial databases (think texture unit, but for ray-casting). It's a shame Larrabee didn't work out, but it'll happen eventually. And it'll be a hardware vendor to do it, not some upstart with a magical new algorithm they can't describe or even show working well.


This reminds me of a question I have on the subject of hardware and ray casting. Isn't the new AMD Fusion chip what you describe? The GPU and CPU have shared memory with the GPU being programmable in a C++ like way, if I'm not mistaken.


Yes, though we'll have to wait to see if it yet approaches the level of practical. But ray-tracing (voxels or otherwise) is bound by memory accesses just as much (if not more-so) than processor speed and quantity.

The basic problem is O(N*K), where N is the number of pixels on screen, and K is average cost of intersecting a ray with the world. Ideally, K is log(M), where M is number of objects in the world. A spatial hierarchy such as an octree provides such a search algorithm.
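As a rough feel for the numbers, here is a small Python sketch of that N*K estimate. It assumes a perfectly balanced octree (branching factor 8) and exactly one root-to-leaf descent per pixel; the resolution and object count in the usage line are made up.

```python
def tree_depth(num_objects, branching=8):
    """Levels needed for a balanced tree to hold num_objects leaves."""
    depth, capacity = 0, 1
    while capacity < num_objects:
        capacity *= branching
        depth += 1
    return depth

def node_visits_per_frame(width, height, num_objects, branching=8):
    """N * K: one descent of depth log_b(M) for each of the N pixels."""
    return width * height * tree_depth(num_objects, branching)

visits = node_visits_per_frame(1024, 768, 8**10)
```

Each of those visits is a potential cache miss, which is why the memory access pattern matters at least as much as the raw operation count.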

However, the larger the database, the more spread out the results in memory. In a naive implementation, each ray through each pixel could incur multiple cache misses as it traverses nodes through the tree. This effect gets even worse as you increase the data size such that it exceeds memory and has to be streamed off-disk (or even from the internet). (BTW, this is another issue UD conveniently sidesteps - there is so little unique content it easily fits in a small amount of memory).

This can be improved by using more intelligent data structures that are structured for coherent memory accesses (a rather huge topic in and of itself). But that alone is not enough. No matter how the data is structured, you will still have loads of cache misses (unless your whole world manages to fit into just your cache memory). You need some way to hide the cost of those misses.

On a modern GPU, cache misses are a common occurrence (to the frame-buffer, texture units, vertex units, etc). It cleverly hides the cost of most of these misses by queuing up the reads and writes from a massive number of parallel threads. For instance, the pixel shader unit may be running a shader program for hundreds of pixel quads at a time. Each pixel unit cycle, the same instruction is processed for each in-flight quad. If that instruction happens to be a texture read, all the reads from all those hundreds of quad threads will be batched up for processing by the texture unit. Then hopefully, by the time the read results are needed in the next cycle or few, they'll already be in the cache and execution can continue immediately.

This latency-hiding is critical for the speed of modern GPUs. Memory latency doesn't go down very much compared to processing speed or bandwidth increases. In fact, in relative cycle terms, cache miss penalties have only increased over the last decade (or longer).

To get comparable performance from ray-tracing (and ray-traced voxels), we'll need a similar method of latency hiding. With a general purpose collection of cores, you can do a whole lot of this work in software. But current PC cores are designed more for flexible bullet-proof caching than for massively parallel designs. This is why GPUs still blow CPUs out of the water for any algorithm that can be directly adapted to a gather (as opposed to scatter) approach.

To my knowledge, AMD's Fusion just combines the CPU and GPU cores onto a single chip, but the two are still separate. That has the potential to greatly improve memory latency for certain things (such as texture update as mentioned by Carmack), and reductions in chip sizes and costs. But as long as the main latency-hiding hardware is still fixed-function designed for things like 2D/3D texture accesses, we can't optimally implement latency-hiding for custom non-linear things, such as ray collision searches. But all these changes designed to make GPUs more general-purpose get us closer to the goal.

#124 zoborg   Members   -  Reputation: 139


Posted 13 August 2011 - 05:21 PM

I did a little math earlier today to see what I could find out, and I hope you'll find the answer as interesting as I found it (or I've made a fool of myself :P).

First off, the computer he runs the demo on has 8 GB of memory, the resolution is 64 voxels per mm^3 (4*4*4), I estimate the size of the base of each block to be 1m^2, and let's assume that color is stored as a single byte (either through compression or by being palettized, which could actually even be the case). Since octrees are used, we very loosely assume that memory consumption doubles because of octree overhead for nodes, and that the shell of each block can be approximated by taking the 1m^2 base block, multiplying by 6 for each side of the new 3D-block, and then multiplying by 2 because the sides obviously aren't flat but have a rugged surface. (Yes, some are estimates, some may be high, some may be low, and some factor may be missing, but assume for now that it balances out)


8 GB = 8 * 1024 * 1024 * 1024 = 8589934592 bytes
sqrt(8589934592) = 92681 (side length in units for the entire square)
92681 / 4 / 1000 = 23 m (4 from 4x4x4, 64 voxels per mm^3, 1000 from meter)
23 * 23 = 529 m^2 blocks
529 / 6 / 2 = 44 final blocks (converting from flat 2D to 3D)
44 / 2 = 22 final blocks (compensating for the octree cost)

= 22 blocks (UD uses 24 blocks)


Now, there are a bunch of approximations and guesses here... but the fact that I even came within an order of magnitude of the actual 24 known models UD shows in their demo... says to me that they have indeed not made any significant progress, and even if I've made an error it apparently balances out. They might not even have made anything at all except possibly some optimizations to the SVO algorithm. Please correct me if I've made a serious mistake somewhere, but again, if my calculation would have said 2 or 200 (that would be bad for UD), it would still mean that they are flat out lying and memory consumption is most definitely an issue they haven't solved, not even in the slightest.


I'm not saying you're incorrect, but it's possible to do quite a lot better than that once you take into account recursive instancing.

Say you're right about each block of land being 1 meter on a side. If you were to fully populate the tree at that granularity, you'd get those results (or similar since it's an estimate). But now, imagine instead of fully populating the tree, you create a group of 100 of those blocks 10 meters on a side, then instance that over the entire world. Your tree just references that block of 100 ground plots rather than duplicating them. So now you've reduced the size requirement by approximately 100.

There's no limit to how far you can take this. The Sierpinski's pyramid is an excellent example of this - you can describe that whole world to an arbitrary size with a simple recursive function. The only unique data storage required for that demo is the model of the pink monster thingy.

As someone mentioned earlier, the storage requirement is more appropriately measured by the entropy of the world (how much unique stuff there is, including relative placement). The repetitive nature of the demo suggests very little of that, and thus very little actual storage requirement.
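Here is a minimal Python sketch of the instancing argument, with made-up sizes and a hypothetical `Node` class. It compares the storage of a fully populated tree against one where a shared subtree is stored once and only referenced. Pointer and node overhead is ignored.

```python
class Node:
    """A tree node whose children may be shared (instanced) between parents."""
    def __init__(self, children=(), payload_bytes=0):
        self.children = tuple(children)
        self.payload_bytes = payload_bytes

def expanded_bytes(node):
    """Size if every instance were duplicated (a fully populated tree)."""
    return node.payload_bytes + sum(expanded_bytes(c) for c in node.children)

def unique_bytes(node, seen=None):
    """Size when each distinct node is stored once and referenced elsewhere."""
    seen = set() if seen is None else seen
    if id(node) in seen:
        return 0
    seen.add(id(node))
    return node.payload_bytes + sum(unique_bytes(c, seen) for c in node.children)

# 100 distinct 1 m ground plots (1 MB each is an invented figure).
plots = [Node(payload_bytes=1_000_000) for _ in range(100)]
group = Node(children=plots)           # one 10 m x 10 m group
world = Node(children=[group] * 100)   # the same group referenced 100 times

ratio = expanded_bytes(world) // unique_bytes(world)
```

With these numbers the ratio comes out at 100, matching the "reduced by approximately 100" estimate. Wrapping `world` in further levels of references multiplies the apparent size again without adding any unique data.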

#125 Syranide   Members   -  Reputation: 375


Posted 13 August 2011 - 05:25 PM

I'm not saying you're incorrect, but it's possible to do quite a lot better than that once you take into account recursive instancing.

Say you're right about each block of land being 1 meter on a side. If you were to fully populate the tree at that granularity, you'd get those results (or similar since it's an estimate). But now, imagine instead of fully populating the tree, you create a group of 100 of those blocks 10 meters on a side, then instance that over the entire world. Your tree just references that block of 100 ground plots rather than duplicating them. So now you've reduced the size requirement by approximately 100.

There's no limit to how far you can take this. The Sierpinski's pyramid is an excellent example of this - you can describe that whole world to an arbitrary size with a simple recursive function. The only unique data storage required for that demo is the model of the pink monster thingy.

As someone mentioned earlier, the storage requirement is more appropriately measured by the entropy of the world (how much unique stuff there is, including relative placement). The repetitive nature of the demo suggests very little of that, and thus very little actual storage requirement.


I'm not doubting you even one bit, what I meant to show was that with some very basic assumptions, some reasonable approximations and no real optimizations... I computed the number of blocks they could be using in their demo, and arrived at the same number of blocks that they are using in their demo. My point being, unless I've made a serious mistake, they aren't using anything fancy at all... like I mention, for all we know, they might even be using an 8-bit palette for the blocks. If I would have arrived at 2, then yeah, they would have used some fancy algorithms, but that memory consumption most likely is the actual reason they aren't showing more unique blocks.



#126 zoborg   Members   -  Reputation: 139


Posted 13 August 2011 - 05:33 PM

I'm not disagreeing with you one bit, what I meant to show was that with some very basic assumptions, some reasonable approximations and no real optimizations... I arrived at the same number of blocks that they are using in their demo. My point being, unless I've made a serious mistake, they aren't using anything fancy at all... like I mention, for all we know, they might even be using an 8-bit palette for the blocks. If I would have arrived at 2, then yeah, they would have used some fancy algorithms, but sure enough, memory would still be a major issue regardless of what they say.


OK, then sorry for the misunderstanding. I do agree that there's no particular reason to assume they're doing anything fancy with compression. Likewise, if someone were to show me a ray-traced sphere above an infinite checkerboard plane, I wouldn't think "Wow! How did they manage to store an infinite texture in finite memory?!"

#127 zoborg   Members   -  Reputation: 139


Posted 13 August 2011 - 06:10 PM

Also, all this talk of instancing and compression and unlimited detail are just aspects of procedural content generation.

It's just a question of degree:
  • A repeated texture. That's an incredibly simple function that's both obvious and boring, but it has unlimited detail (at least in the respect they're using the term, which is up to the precision constraints of the rendering system).
  • A fractal image or environment. This function can be arbitrarily complex and the results can be spectacular. You just have very little input into the final results.
  • Guided procedural content. The simplest example of this is just instancing. But it can be quite a bit more sophisticated, such as composing environments out of recursive functions in a 4k demo.
  • Fully unique artist-modeled (or scanned) textures and environments, but with discretionary reuse of assets to save time and memory.
Procedural content saves us time and memory allowing us to make things that wouldn't otherwise be possible. But the drawback is loss of control - you get what the procedure gives you. If that's a tiled texture, or a fractal, or a huge environment of repetitive chunks of land, you just have to live with it. Or write a new procedure closer to what you want. Or add new content which consumes precious development time and hardware resources (thus making the content decidedly limited).
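The first item in that list, the repeated texture, fits in a couple of lines. This Python sketch (the function name and the 2x2 texture are just for illustration) shows how a tiny fixed asset answers lookups at any coordinate, which is all "unlimited" means in this sense.

```python
def tiled_texel(texture, u, v):
    """Look up a texel in an endlessly repeated texture.

    texture is a 2D list; u and v are integer texel coordinates of any size
    or sign. Python's % always returns a non-negative result for a positive
    modulus, so negative coordinates wrap correctly too.
    """
    height, width = len(texture), len(texture[0])
    return texture[v % height][u % width]

checker = [[0, 1],
           [1, 0]]
far_away = tiled_texel(checker, 10**9 + 1, -7)
```

Storage stays at four texels no matter how far out you sample. The drawback is the one described above: you get exactly what the function gives you and nothing else.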

Again I want to point out this interview with Carmack, because I feel I'm just parroting him at this point. To paraphrase, "with proceduralism you get something, just not necessarily what you want."

#128 rouncer   Members   -  Reputation: 244


Posted 13 August 2011 - 06:19 PM

its a hacked together piece of crud of an environment, and i dont see it getting much better, and it just makes me want to use a true unique world (like atomontage), with the storage/scale problem, instead of this repetitive crap.

its unlimited repetition, not unlimited detail.

#129 Sirisian   Members   -  Reputation: 1281


Posted 13 August 2011 - 07:09 PM

the reasons why this project is going to flop guaranteed.

[1]* the models you see arent unique, they are just duplications of the exact same objects

[2]* the models all have their OWN level of detail, the only way he gets 64 atoms a millimetre is by SCALING SOME OF THEM SMALLER, the rest have SHIT detail.

[3]* he cant paint his world uniquely like what happens in megatexture

[4]* he cant perform csg operations, all he can do is soup yet more and more disjointed models together

[5]* theres no way he could bake lighting at all, so the lighting all has to be dynamic and eat processing power

[6]* this has nothing to do with voxels, you could get a similar effect just by rastering together lots of displacement mapped models!!!

1) This is mentioned by Dell that they resorted to scanning objects in to get content for the video. Is this for saving memory via instancing? I personally can't tell. I mean they could have loaded a sponza model in to show things off.
2) Not sure what you mean. Some of the objects are polygon modeled and some are scanned in which utilize the full 64 atoms per cubed mm.
3) That's an assumption. Remember most of this is just surface detail. Meaning there is no data stored for the inside of the models. This brings us to your next complaint.
4) He mentioned that which is why he said he would like to work together with Atomontage. However, that's not saying implementing CSG is impossible via their model format. They just said it's not their goal.
5) A lot of engines choose to do dynamic lighting via SSAO along with other techniques (like crytek's radiosity). However, if they did bake the lighting into the models people would flip this around on them and go "they can't do dynamic lighting. It's all baked in" so it's a catch-22 unless they can do both really. (They didn't even say if they could bake the lighting).
6) Probably. You need DX11 for that to run well though. This technology is rendering the same detailed grooves a POM/QDM/Tessellation renderer would be doing except it's running on the CPU. QDM would probably run at the same performance though on the CPU as these effects, but the others require some serious hardware support.

[calculations]

lol, you did pretty much the identical calculations I did a while ago when I saw the video. Yeah that's a pretty good approximation for the amount of data in a lossless format. Compression and streaming the data in are probably where their method will excel.

its a hacked together piece of crud of an environment, and i dont see it getting much better, and it just makes me want to use a true unique world (like atomontage), with the storage/scale problem, instead of this repetitive crap.

its unlimited repetition, not unlimited detail.

You don't think their GPU implementation will be much better than a 15-20 fps CPU version? That's kind of pessimistic. I mean the shading alone on the GPU will open up most every deferred/forward rendering post-processing effect. It's just a different way to populate the g-buffers. HDR alone would probably help along with demoing specular objects. The reason for the repetition at the moment is mostly just speculation.

The problem I see with atomontage is that his effect for rendering voxels when he lacks the detail is to blur them. This ends up looking really bad even in his newer videos. The UD system even when they went very close to objects has a very nice interpolation.

#130 Syranide   Members   -  Reputation: 375


Posted 13 August 2011 - 07:16 PM


[calculations]

lol, you did pretty much the identical calculations I did a while ago when I saw the video. Yeah that's a pretty good approximation for the amount of data in a lossless format. Compression and streaming the data in are probably where their method will excel.


Problem is though, they aren't showing any streaming, and streaming is probably a monstrous issue for UD. And they aren't showing any compression either, which is also a monstrous issue... GPU textures today are at 1:4 and 1:6 with terribly lossy compression. In fact, they aren't showing much at all really other than something rendering at ~20FPS ... everything other than that is "been there, done that".


its a hacked together piece of crud of an environment, and i dont see it getting much better, and it just makes me want to use a true unique world (like atomontage), with the storage/scale problem, instead of this repetitive crap.

its unlimited repetition, not unlimited detail.

You don't think their GPU implementation will be much better than a 15-20 fps CPU version? That's kind of pessimistic. I mean the shading alone on the GPU will open up most every deferred/forward rendering post-processing effect. It's just a different way to populate the g-buffers. HDR alone would probably help along with demoing specular objects. The reason for the repetition at the moment is mostly just speculation.

The problem I see with atomontage is that his effect for rendering voxels when he lacks the detail is to blur them. This ends up looking really bad even in his newer videos. The UD system even when they went very close to objects has a very nice interpolation.


Nvidia has their own GPU implementation, it runs at ~20FPS "sometimes" on modern hardware, with virtually no shading because the "raytracing" consumes all the shader performance. And shading is the hugely expensive part in modern games.



#131 zoborg   Members   -  Reputation: 139


Posted 13 August 2011 - 08:34 PM

5) A lot of engines choose to do dynamic lighting via SSAO along with other techniques (like crytek's radiosity). However, if they did bake the lighting into the models people would flip this around on them and go "they can't do dynamic lighting. It's all baked in" so it's a catch-22 unless they can do both really. (They didn't even say if they could bake the lighting).

To quote Carl Sagan: "It pays to keep an open mind, but not so open your brains fall out."

First, SSAO is not lighting, it's just an approximation of ambient occlusion (the hint is in the name). AO itself is a means of approximating the local effects of global illumination (i.e. dark areas in creases). It just modifies the ambient term of the typical lighting equation. You still need diffuse, specular, and shadows (which Crytek's engine renders as well).
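To show where AO sits, here is a deliberately simplified single-channel lighting function in Python. The parameter names and the exact way the terms are combined are my own assumptions, not any particular engine's equation. The only point is that the AO factor scales the ambient term and leaves the direct terms untouched.

```python
def shade(albedo, ambient, ao, diffuse, specular, shadow):
    """Toy lighting equation for one colour channel.

    ao     : ambient occlusion factor in [0, 1], only touches the ambient term
    shadow : direct-light visibility in [0, 1], only touches diffuse/specular
    """
    return albedo * (ambient * ao + diffuse * shadow) + specular * shadow

# With no direct light at all, AO is the only thing varying the result.
crease = shade(albedo=1.0, ambient=0.2, ao=0.5, diffuse=0.0, specular=0.0, shadow=1.0)
open_area = shade(albedo=1.0, ambient=0.2, ao=1.0, diffuse=0.0, specular=0.0, shadow=1.0)
```

So an engine with SSAO and nothing else gets darker creases on an otherwise flat ambient image. Direction-dependent shading and shadows still have to come from the diffuse, specular and shadow inputs.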

Second, baking in lighting does not preclude dynamic lighting - most games use a combination of techniques. In order for this technique to be competitive with modern rendering, it also needs to support both. But they only support baked in lighting, and only a severely limited form at that (because it prevents instances from being placed in arbitrary lighting conditions).

If they support good baked-in lighting, why didn't they use it? If they support dynamic lighting, why don't they show it? Extraordinary claims require extraordinary evidence, not promises.



its a hacked together piece of crud of an environment, and i dont see it getting much better, and it just makes me want to use a true unique world (like atomontage), with the storage/scale problem, instead of this repetitive crap.

its unlimited repetition, not unlimited detail.

You don't think their GPU implementation will be much better than a 15-20 fps CPU version? That's kind of pessimistic. I mean the shading alone on the GPU will open up most every deferred/forward rendering post-processing effect. It's just a different way to populate the g-buffers. HDR alone would probably help along with demoing specular objects. The reason for the repetition at the moment is mostly just speculation.

If you take their current rendering, but just apply post-effects to it, it will probably look better. But that will still leave it miles behind a modern game engine, because lighting is the most important tool available for simulating realism. But this falls back into the category of hollow promises. We can only evaluate the tech on what they've shown us, and they certainly haven't shown us any of that.


The problem I see with atomontage is that his effect for rendering voxels when he lacks the detail is to blur them. This ends up looking really bad even in his newer videos. The UD system even when they went very close to objects has a very nice interpolation.

And that just demonstrates the true limitation of any rendering system: Everything can be arbitrarily unique, or the world can be arbitrarily huge - you can't have both. Atomontage takes the former approach, which allows it to support nice baked-in lighting and modifiable terrain. UD is the latter, so it has vast amounts of repetitious content. There is of course an entire spectrum in-between, which is where modern games lie.

And it's a bit unfair to compare the two engines as they're doing very different things, but if you insist: Atomontage only looks good when viewed from a distance - UD doesn't look good at any distance. Atomontage supports completely dynamic worlds - UD supports completely static worlds. I personally don't have a problem picking a winner from those two.

#132 rouncer   Members   -  Reputation: 244

Posted 13 August 2011 - 09:04 PM

Remember, the guy at the start of this thread initially asked "how do we compress all this unique voxel data?" With UD in mind: none of it is unique, that's how.
I'm sick of making such a big deal out of it though. If they wind up making a game with it, good for them (it will be pretty kooky), but I honestly would prefer to play an Atomontage game and carve up some delicious-looking unique mountains. :)

#133 Hodgman   Moderators   -  Reputation: 13471

Posted 13 August 2011 - 10:24 PM

Where do you think baked-in shadows come from? They have to be rendered sometime, and any offline shadow baking performed can be subject to similar quality issues.

Yeah, it's entirely possible that they're still baked into the static data using shadow-mapping during the baking process, which would be disappointing because the demonstrated shadow technique is cutting a lot of corners as you'd do in a bare-bones real-time version.

However, if they were storing voxel colours that are pre-multiplied with shadow values, then it would severely complicate the instancing. For example, the rock asset is sometimes instanced underneath a tree, and sometimes instanced in direct sunlight. Every (or many) unique instance would need to store unique shadow values. If these shadow values were pre-baked into the colour data, then suddenly all of these instances have become unique assets... which if it's true, dispels a lot of the criticisms about the data-set actually being quite small due to instancing, right?
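A rough sketch of why baking breaks instancing, in Python. The colour values, the shadow factors and the function name `unique_payloads` are made up for illustration; the point is only that pre-multiplying shadows turns one shared colour array into one array per distinct lighting condition.

```python
def unique_payloads(base_colours, instance_shadow_factors, bake):
    """Count the distinct colour arrays that would have to be stored.

    base_colours: voxel colours of one asset (hypothetical scalar values).
    instance_shadow_factors: one shadow factor per placed instance.
    bake: if True, the shadow factor is pre-multiplied into the colours.
    """
    stored = set()
    for shadow in instance_shadow_factors:
        if bake:
            colours = tuple(round(c * shadow, 3) for c in base_colours)
        else:
            colours = tuple(base_colours)
        stored.add(colours)
    return len(stored)


rock = [0.8, 0.6, 0.4]
# Four rock instances: two in direct sun, one under a tree, one in deep shade.
shadows = [1.0, 0.5, 0.3, 1.0]
shared = unique_payloads(rock, shadows, bake=False)
baked = unique_payloads(rock, shadows, bake=True)
```

Here the unbaked scene stores the rock once, while the baked one stores it once per distinct shadow value.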

Well, when you're ray-casting you don't need to explicitly implement a clipping plane to get that effect. You'd get that effect if you projected each ray from the near plane instead of the eye.

Yeah, but you don't need a near plane. So either, they're using a technique that doesn't need a near-plane, but decided to use one anyway, or their "ray-casting" technique actually requires a near plane for some reason.
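To illustrate the difference, a minimal sketch that reduces one ray to a list of surface distances. The name `first_hit` and the distances are hypothetical; a real ray-caster would find these by traversing its data structure.

```python
def first_hit(distances_along_ray, start_distance=0.0):
    """Return the nearest surface at or beyond start_distance, or None.

    start_distance = 0.0 models a ray that starts at the eye;
    a positive value models a ray that starts on a near plane.
    """
    hits = [d for d in distances_along_ray if d >= start_distance]
    return min(hits) if hits else None


surfaces = [0.05, 2.0]  # a leaf right in front of the camera, then the ground
from_eye = first_hit(surfaces)
from_near_plane = first_hit(surfaces, start_distance=0.1)
```

Starting at the eye sees the leaf; starting at the near plane clips it away and sees the ground behind it.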

In their demo, a single pixel could contain ground, thousands of clumps of grass, dozens of trees, and even a few spare elephants. How do you approximate a light value for that that's good enough?

That's assuming that every visible voxel inside the bounds of the pixel is retrieved and averaged, as is common in voxel renderers?
In some other videos, he actually says that their "search algorithm" only returns a single 'atom', implying they're not using this kind of anti-aliasing technique.
i.e. similar to non-anti-aliased rasterization, where each pixel can only end up holding data from a single triangle -- the chosen triangle can then return averaged surface data via mip-mapped textures.

I'm assuming their tech works a similar way, where only a single (hierarchical) voxel from a particular instance is selected, and only data down the hierarchy from the chosen point is averaged. It only has to be 'good enough' to not shimmer excessively under movement (it actually does shimmer a bit) and to look ok after having been blurred in screen space (which they seem to be doing in the distance).
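A minimal sketch of that guess, assuming each interior node stores the average colour of its children. The `Node` class and `sample` function are hypothetical and only illustrate the idea; nothing here is known about UD's real data layout.

```python
class Node:
    """Hierarchical voxel: a leaf has a colour, an interior node averages its children."""

    def __init__(self, colour=None, children=None):
        self.children = children or []
        if self.children:
            # Built bottom-up: the interior colour is the mean of the child colours.
            n = len(self.children)
            self.colour = tuple(
                sum(child.colour[i] for child in self.children) / n for i in range(3)
            )
        else:
            self.colour = colour


def sample(node, path, max_depth):
    """Follow `path` (child indices) but stop at `max_depth`.

    Stopping early returns the pre-averaged colour of everything below
    that point, which plays the role of a mip level.
    """
    depth = 0
    while node.children and depth < max_depth and depth < len(path):
        node = node.children[path[depth]]
        depth += 1
    return node.colour


root = Node(children=[Node(colour=(1, 0, 0)), Node(colour=(0, 0, 1))])
far_away = sample(root, [0], max_depth=0)   # one 'atom' for the whole node
close_up = sample(root, [0], max_depth=1)   # the individual red leaf
```

Only one node is ever returned per lookup, and the averaging was done once at build time rather than per pixel.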

We do approximations all the time in games, but we do that by throwing away perceptually unimportant details. The direction of a surface with respect to the light is something that can be approximated (e.g. normal-maps), but not if the surface is a chaotic mess. At best, your choice of normal would be arbitrary (say, up). But if they did that, you'd see noticeable lighting changes as the LoD reduces, whereas in the demo it's a continuous blend.

No, we do exactly this in games with a continuous blend. We bake highly chaotic normals into a normal map and use it on a low-poly model. Usually, as you go down the mip-chain and average the normals together, they'll converge towards 'up', but not always. Due to trilinear filtering it's a continuous blend through the mip levels.
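A small sketch of that averaging, assuming a 1D row of normals and a simple pairwise box filter (real mip generation works on 2D textures; the helper names are made up for illustration):

```python
import math


def normalize(v):
    """Scale a 3-vector to unit length."""
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)


def next_mip(normals):
    """Halve a row of normals by averaging neighbouring pairs and renormalising.

    Assumes an even number of entries and that no pair cancels to zero.
    """
    out = []
    for i in range(0, len(normals), 2):
        a, b = normals[i], normals[i + 1]
        out.append(normalize(tuple((x + y) / 2 for x, y in zip(a, b))))
    return out


def blend_levels(fine, coarse, t):
    """Linear blend between samples from two mip levels (the 'tri' in trilinear)."""
    return normalize(tuple((1 - t) * f + t * c for f, c in zip(fine, coarse)))


# Chaotic surface: normals tilted left, right, forward and back.
level0 = [(0.6, 0.0, 0.8), (-0.6, 0.0, 0.8), (0.0, 0.6, 0.8), (0.0, -0.6, 0.8)]
level1 = next_mip(level0)
level2 = next_mip(level1)
halfway = blend_levels(level0[0], level1[0], 0.5)
```

In this example the tilts cancel and the coarser levels point straight up (+z), while `halfway` sits between the detailed normal and the averaged one, so the transition is gradual rather than a pop.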

#134 bwhiting   Members   -  Reputation: 400

Posted 14 August 2011 - 02:19 AM

the bit that interests me most in this is the "search algorithm"

can anyone explain to me how they think this works.

given a pixel p... what goes on to derive its colour?
(and please bear in mind that I am not a very clever chap)

if they are not ray tracing, how the fudge do they know/work out what objects (of the potential thousands - instanced or not) are "behind" that pixel, and of those, which is closest, and then of that one, which "atom"??

okay obviously there must be some kind of hierarchy going on and data structures to speed it up but it still seems like a mammoth task. take a stone from the floor... must be a fudge load of them! still seems like a hell of a lot of work to do per pixel and I think there is credit due there for how fast it is.
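For what it's worth, here is a generic sketch of how a hierarchy cuts that work down. It is not UD's algorithm (nobody outside the company knows that); it just assumes every node knows a lower bound `near` on the distance of anything inside it along the pixel's ray, and the dictionary layout is made up for illustration.

```python
def nearest_hit(node, best=float("inf"), stats=None):
    """Find the closest 'atom' along one ray, skipping subtrees that cannot win.

    node: {"near": lower_bound, "children": [...]} or a leaf {"near": d, "hit": d}.
    stats: optional dict with a "visited" counter, to show how little is touched.
    """
    if stats is not None:
        stats["visited"] += 1
    if "hit" in node:  # leaf: an actual atom
        return min(best, node["hit"])
    # Closest children first, so far ones can be rejected without descending.
    for child in sorted(node["children"], key=lambda c: c["near"]):
        if child["near"] >= best:  # everything in here is behind the current best
            break
        best = nearest_hit(child, best, stats)
    return best


def leaf(d):
    return {"near": d, "hit": d}


scene = {
    "near": 1.0,
    "children": [
        {"near": 5.0, "children": [leaf(5.0), leaf(6.0)]},  # distant stones
        {"near": 1.0, "children": [leaf(1.5), leaf(3.0)]},  # nearby stones
    ],
}
stats = {"visited": 0}
closest = nearest_hit(scene, stats=stats)
```

Out of seven nodes only three are visited here; with deeper trees the saving grows, because whole groups of stones are rejected with a single comparison.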

#135 Chargh   Members   -  Reputation: 110

Posted 14 August 2011 - 08:02 AM

<clip>

You don't think their GPU implementation will be much better than a 15-20 fps CPU version? That's kind of pessimistic. I mean the shading alone on the GPU will open up most every deferred/forward rendering post-processing effect. It's just a different way to populate the g-buffers. HDR alone would probably help along with demoing specular objects. The reason for the repetition at the moment is mostly just speculation.

The problem I see with atomontage is that his effect for rendering voxels when he lacks the detail is to blur them. This ends up looking really bad even in his newer videos. The UD system even when they went very close to objects has a very nice interpolation.


Since many people pointed out the lack of lighting in the current videos, the things you mention will make the demo look nicer. And since the searching will be CPU-bound and the steps you mention will be done on the GPU, they might be 'free'. I don't think that will make the demo faster, just nicer.

#136 bwhiting   Members   -  Reputation: 400

Posted 14 August 2011 - 08:21 AM

... I don't think that will make the demo faster, just nicer.

Mr Dell says in his video that they already have versions working faster that utilize the GPU, so it should be nicer and faster.





#137 Syranide   Members   -  Reputation: 375

Posted 15 August 2011 - 02:57 AM


... I don't think that will make the demo faster, just nicer.

Mr Dell says in his video that they already have versions working faster that utilize the GPU, so it should be nicer and faster.


And funnily enough, so does nVidia: done by actual researchers, supported by actual public research, without bullshit claims. And they've scaled down the quality to be able to store the entire inside of the church in memory, and made a bunch of optimizations to improve the quality... what FPS do they get on modern hardware? ~25 FPS in SOME scenes, and again, without any significant shading, which would ruin the FPS. They have published two articles too; you guys should read them.

http://www.youtube.c...h?v=Mi-mNGz0YMk
http://research.nvid...e-voxel-octrees
http://www.nvidia.co...ch_pub_018.html

UD are, in my opinion, already certified bullshitters™: they have claimed whatever they need to save their ass and have shown absolutely nothing to prove that what they claim is even realistic or at all possible. What UD has shown today is nothing really technically impressive; from what we can tell it's pretty much a straight-up implementation of SVO... that is optimized to be faster than naive implementations, and possibly adds interpolation. That is all, really. Or am I missing something?

And I still don't get why people are still defending this technology in its current state, when the biggest non-instanced scene shown to date is the inside of a small low-detail church, and still consumes 2.7GB of memory.

PS. And if they have a working demo for the GPU, then they should show it, or we can just pin this to the list of bullshit claims without proof. nVidia published their GPU implementation 1.5 years ago... meaning, even if UD have it running on the GPU and show it, unless they show something faster or better than nVidia then we can just assume they straight up copied nVidia or are a bunch of amateurs.

To be blunt, they even claim their demo is only running on 1 core. Why on earth would they do that? Scaling it up to a large number of cores is TRIVIAL and should have been easy to implement in a single day. So either they are hiding behind that statement to hide critical issues (like memory performance) or they really are a bunch of amateurs... ?
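The reason it is trivial: each pixel (or row) only reads the scene and writes its own result, so rows can be handed to workers independently. A minimal sketch in Python, where `render_row` is a hypothetical stand-in for the per-pixel search and the checkerboard it produces is just placeholder output:

```python
from concurrent.futures import ThreadPoolExecutor


def render_row(y, width=4):
    """Placeholder for casting one ray per pixel in row y (reads shared data only)."""
    return [(x + y) % 2 for x in range(width)]


def render(height, workers):
    """Render all rows, splitting them across `workers` threads."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() keeps the rows in order, so the image is the same for any worker count.
        return list(pool.map(render_row, range(height)))


single_core = render(height=4, workers=1)
multi_core = render(height=4, workers=4)
```

In CPython, threads won't actually speed up pure-Python work because of the global interpreter lock; the sketch only shows the decomposition, which in a native renderer maps directly onto cores as long as memory bandwidth keeps up.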



#138 D_Tr   Members   -  Reputation: 333

Posted 15 August 2011 - 03:30 AM

In a year or so we will probably know for sure :wink:

#139 schupf   Members   -  Reputation: 213

Posted 15 August 2011 - 04:40 AM

So now we have to wait another year to see the next crappy youtube video :D

#140 GFalcon   Members   -  Reputation: 240

Posted 16 August 2011 - 10:04 AM

here is my take on this thing having also now watched the interview..

1. forget the "unlimited" bit... nothing in the universe is, so just see it as "AWESOME AMOUNTS OF" instead, which is what he means methinks. so don't waste your energy on that, we all know it's not actually unlimited. That is, if you are taking the word unlimited to mean infinite... but the two are different: unlimited could be the same as when another engine says it supports an unlimited number of lights... which is true, the engine supports it... your machine just might not be able to handle it (not a limit imposed by the engine but by the user's computer)
either way I wouldn't get hung up on it.


2. he is the guy who came up with the technology and he was a hobby programmer, this could explain how he gets some terms wrong (level of distance??!) and why he may seem quite condescending... if he has no background in traditional graphics then that would make sense. His lack of knowledge of current methodologies is what I think led to him going about it however he has done.

3. I am more and more thinking that this will lead somewhere and may indeed be the future of graphics (the guy who interviewed him was blown away) and from the sounds of it it's only going to get better and faster

4. It still "boggles my mind"!!!

5. - 10. not included as I should really be working

:)


That boggles my mind too, so I did some research on the internet about their algorithm. I didn't find much, but this post is quite interesting:

http://www.somedude.....php?f=12&t=419

To quote the post:


I'd like to mention Unlimited Detail, a technology developed by Bruce Dell, which does in fact do exactly what he claims it does... Render incredibly detailed 3D scenes at interactive frame rates... without 3D hardware acceleration. It accomplishes this using a novel traversal of an octree-style data structure. The effect is perfect occlusion without retracing tree nodes. The result is tremendous efficiency.

I have seen the system in action and I have seen the C++ source code of his inner loop. What is more impressive is that the core algorithm does not need complex math instructions like square root or trig, in fact it does not use floating point instructions or do multiplies and divides!


So it seems they are relying on some "octree-like" data structure (as many supposed). What boggles me the most is that their algorithm supposedly doesn't use multiplies, divides, or any floating-point instructions. Is there a way to traverse an octree (doing tree node intersection tests) with only simple instructions? I don't see how (I only know raycasting, and it seems difficult to me to do that without divides; I know that other ways to render an octree exist, but I do not know how they work).
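At least parts of an octree traversal clearly can be done with integer shifts, masks and XOR only. The sketch below is not UD's inner loop (that has not been published); it assumes integer voxel coordinates inside a cube of side 2**depth, and the function names are invented for illustration.

```python
def child_index(x, y, z, level):
    """Octant (0-7) containing integer point (x, y, z) at a tree level.

    Uses only shifts, masks and ORs: bit `level` of each coordinate
    says which half of the node the point lies in along that axis.
    """
    def bit(v):
        return (v >> level) & 1

    return bit(x) | (bit(y) << 1) | (bit(z) << 2)


def descend_path(x, y, z, depth):
    """Child indices from the root down to the leaf holding (x, y, z)."""
    return [child_index(x, y, z, level) for level in range(depth - 1, -1, -1)]


def front_to_back_order(dir_x_neg, dir_y_neg, dir_z_neg):
    """Order in which to visit the 8 children so nearer ones come first.

    For a ray travelling in +x, +y, +z the order 0..7 is front to back;
    a negative direction on an axis flips that axis' bit via XOR.
    """
    mask = int(dir_x_neg) | (int(dir_y_neg) << 1) | (int(dir_z_neg) << 2)
    return [i ^ mask for i in range(8)]


path = descend_path(5, 2, 7, depth=3)
order = front_to_back_order(True, False, False)
```

Whether the projection to screen space can also avoid multiplies and divides is a separate question that this sketch does not answer.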
--
GFalcon
0x5f3759df



