
# Dark Helmet

Member Since 15 May 2012
Offline Last Active Aug 14 2014 06:09 AM

### In Topic: Bones and Skeletons

19 February 2014 - 05:50 PM

Can anyone give me some pointers on how to store bone structures and import animation with Assimp and DirectX?

Beyond the assimp manual, see these for descriptions and code:

* <a href="http://ogldev.atspace.co.uk/www/tutorial38/tutorial38.html">Tutorial 38 - Skeletal Animation With Assimp</a>
* <a href="http://ephenationopengl.blogspot.com/2012/06/doing-animations-in-opengl.html">Doing animations in OpenGL (Ephenation OpenGL)</a>

### In Topic: reconstruct depth from z/w ?

19 February 2014 - 04:39 PM


...in the postprocessing pass I try to reconstruct it using this method: http://www.geeks3d.com/20091216/geexlab-how-to-visualize-the-depth-buffer-in-glsl/.

Re the Geeks3D page: yeah, I tried that years ago and it doesn't work.

Try this (for an arbitrary perspective frustum, with glDepthRange of 0..1):

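```
vec3 PositionFromDepth_DarkHelmet(in float depth)
{
  // near, far, left, right, bottom, top: frustum parameters.
  // widthInv, heightInv: 1.0 / viewport width and height.
  // All are supplied by the application (e.g. as uniforms).
  vec2 ndc;             // Reconstructed NDC-space position
  vec3 eye;             // Reconstructed EYE-space position

  eye.z = near * far / ((depth * (far - near)) - far);

  ndc.x = ((gl_FragCoord.x * widthInv) - 0.5) * 2.0;
  ndc.y = ((gl_FragCoord.y * heightInv) - 0.5) * 2.0;

  // 2.0 rather than 2: older GLSL versions reject int * float.
  eye.x = ( (-ndc.x * eye.z) * (right-left)/(2.0*near)
            - eye.z * (right+left)/(2.0*near) );
  eye.y = ( (-ndc.y * eye.z) * (top-bottom)/(2.0*near)
            - eye.z * (top+bottom)/(2.0*near) );

  return eye;
}
```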
Note: "depth" is your 0..1 window-space depth. Of course, if you assume a "symmetric" perspective frustum (but not necessarily one that is 90 deg FOV), the eye.x/.y lines simplify down to:

```
  eye.x = (-ndc.x * eye.z) * right/near;
  eye.y = (-ndc.y * eye.z) * top/near;
```
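One way to sanity-check the reconstruction above without a GPU is a CPU round trip: push an eye-space point through the standard `glFrustum` projection, then feed the resulting NDC x/y and 0..1 depth back through the same arithmetic as the GLSL. The sketch below does that in C++. The `Frustum`, `Vec3`, `projectToNdcAndDepth`, `positionFromDepth`, and `nearlyEqual` names are hypothetical helpers made up for this illustration, and it assumes `glDepthRange(0, 1)`.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper types for this sketch (not from any library).
struct Frustum { double left, right, bottom, top, nearZ, farZ; };
struct Vec3 { double x, y, z; };

// Forward path: eye space -> NDC x/y plus 0..1 window depth,
// using the standard glFrustum matrix and glDepthRange(0, 1).
Vec3 projectToNdcAndDepth(const Frustum& f, const Vec3& eye) {
    assert(eye.z < 0.0);  // the point must be in front of the camera
    const double w = -eye.z;
    const double xClip = (2.0 * f.nearZ / (f.right - f.left)) * eye.x
                       + ((f.right + f.left) / (f.right - f.left)) * eye.z;
    const double yClip = (2.0 * f.nearZ / (f.top - f.bottom)) * eye.y
                       + ((f.top + f.bottom) / (f.top - f.bottom)) * eye.z;
    const double zClip = -((f.farZ + f.nearZ) / (f.farZ - f.nearZ)) * eye.z
                       - (2.0 * f.farZ * f.nearZ) / (f.farZ - f.nearZ);
    return { xClip / w, yClip / w, 0.5 * (zClip / w) + 0.5 };
}

// Inverse path: the same arithmetic as the GLSL function, on the CPU.
Vec3 positionFromDepth(const Frustum& f, double ndcX, double ndcY, double depth) {
    Vec3 eye;
    eye.z = f.nearZ * f.farZ / (depth * (f.farZ - f.nearZ) - f.farZ);
    eye.x = (-ndcX * eye.z) * (f.right - f.left) / (2.0 * f.nearZ)
          - eye.z * (f.right + f.left) / (2.0 * f.nearZ);
    eye.y = (-ndcY * eye.z) * (f.top - f.bottom) / (2.0 * f.nearZ)
          - eye.z * (f.top + f.bottom) / (2.0 * f.nearZ);
    return eye;
}

// Tolerance-based comparison for the round-trip check.
bool nearlyEqual(double a, double b, double eps = 1e-9) {
    return std::fabs(a - b) < eps;
}
```

Projecting a point with a deliberately asymmetric frustum and reconstructing it should return the original coordinates to within floating-point tolerance, which exercises the `(right+left)` and `(top+bottom)` terms as well.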
Now of course for mere depth buffer visualization, all you really want from this is "eye.z", which is the linear depth value. So nuke the rest. Just map this eye.z value from -n..-f to 0..1, use that as your fragment intensity, and you're done:

```
  intensity = (-eye.z - near) / (far-near);
```
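To see why this linearization matters, here is a small CPU-side sketch of the same two formulas in C++. The function names `eyeZFromDepth` and `linearIntensity` are made up for the example, and the near/far values in the note below are arbitrary.

```cpp
#include <cassert>

// Eye-space Z (negative in front of the camera) from a 0..1 window depth.
double eyeZFromDepth(double depth, double nearZ, double farZ) {
    assert(nearZ > 0.0 && farZ > nearZ);
    return nearZ * farZ / (depth * (farZ - nearZ) - farZ);
}

// Maps eye.z from -near..-far to a 0..1 intensity, as in the snippet above.
double linearIntensity(double depth, double nearZ, double farZ) {
    return (-eyeZFromDepth(depth, nearZ, farZ) - nearZ) / (farZ - nearZ);
}
```

With near = 1 and far = 100, a stored depth of 0.5 corresponds to eye.z of about -1.98, so `linearIntensity(0.5, 1.0, 100.0)` comes out at 1/101, roughly 0.0099. Half of the depth buffer's range covers about 1% of the view distance, which is why raw depth values tend to look almost uniformly white when displayed directly.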

### In Topic: 300 000 fps

10 January 2014 - 05:52 PM

glutSwapBuffers, on the other hand, is the real, true thing. It actually swaps buffers, so there is really a notion of "frame". It also blocks, but synchronized to the actual hardware update frequency, and in a somewhat less rigid way (usually drivers will let you pre-render 2 or 3 frames or will only block at the next draw command after swap, or something else).

Be careful when timing with just SwapBuffers, though. The driver typically queues the swap request quickly on the CPU and returns immediately (i.e. the CPU does not block), after which it lets you start queuing render commands for future frames. At some arbitrary point in the middle of queuing one of those frames, when the FIFO fills, the CPU "then" blocks, waiting on some VSYNC event in the middle of a frame. This causes really odd timing spikes that leave you puzzled as to what's going on.

If you want reasonable full-frame timings, after SwapBuffers(), put a glFinish(), and then stop your timer.
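A minimal sketch of that timing pattern in C++ follows. To keep it compilable without a GL context, the draw, swap, and finish steps are passed in as callables; `timeFrameMs` is a hypothetical helper name, and in a real GLUT program you would presumably pass lambdas that call `glutSwapBuffers()` and `glFinish()`.

```cpp
#include <cassert>
#include <chrono>
#include <functional>

// Times one full frame: issue the draw calls, swap, then wait for the GPU
// to drain before stopping the clock, so the measurement is not cut short
// by the driver returning early from the swap.
double timeFrameMs(const std::function<void()>& draw,
                   const std::function<void()>& swapBuffers,
                   const std::function<void()>& finish) {
    assert(draw && swapBuffers && finish);
    const auto start = std::chrono::steady_clock::now();
    draw();         // queue this frame's render commands
    swapBuffers();  // e.g. glutSwapBuffers(); may return immediately
    finish();       // e.g. glFinish(); blocks until the GPU is done
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

Usage would look something like `timeFrameMs(renderScene, []{ glutSwapBuffers(); }, []{ glFinish(); })`, where `renderScene` stands in for your own draw function. Note that `glFinish()` stalls the pipeline, so this is something to do while profiling rather than in a shipping render loop.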

### In Topic: Questions about mesh rendering performance

14 December 2013 - 11:20 AM

Exactly! Furthermore, drivers pack your data in the optimal way along with all relevant information for later access. ... If you are happy with DLs, continue to use them, but in the proper way, and they'll serve you well. If you want to switch to non-deprecated functionality, abandon them. VBOs are certainly the right way to do things, but you have to know your hw better.

"Right" depends on your goals. For most of us, "right" isn't defined by the core profile but by the most efficient use of the hardware (fastest performance).

So yes, agreed. If DLs work for you, use them. If you want more control over how your GPU memory is utilized, use static VBOs, but you'll take a performance hit if you use them alone, and you have to be smart about how you encode your data within them. If you happen to be running on NVidia and want VBOs with display list performance, use NV bindless to launch your batches with those VBOs. If not, then substitute VAOs in place of bindless. That doesn't perform as well, but it's better than nothing.

Also, the more data you pack in your VBOs (the larger your batches), the less likely you are to be CPU bound launching batches (which is for the most part what bindless and VAOs strive to reduce).
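As a rough illustration of packing more data per batch, the C++ sketch below merges several indexed meshes into one vertex array and one index array, rebasing each mesh's indices so they still point at the right vertices. The `Vertex`, `Mesh`, and `appendMesh` names are hypothetical, and the sketch assumes the meshes already share the same vertex format, shader, and render state and have their transforms baked in; otherwise they can't be drawn as one batch.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical position-only vertex; a real format would carry more attributes.
struct Vertex { float px, py, pz; };

struct Mesh {
    std::vector<Vertex> vertices;
    std::vector<std::uint32_t> indices;
};

// Appends src to dst so both can be uploaded to one VBO/IBO pair and
// launched with a single draw call. Indices are rebased by the number of
// vertices already present in dst.
void appendMesh(Mesh& dst, const Mesh& src) {
    const auto base = static_cast<std::uint32_t>(dst.vertices.size());
    dst.vertices.insert(dst.vertices.end(),
                        src.vertices.begin(), src.vertices.end());
    dst.indices.reserve(dst.indices.size() + src.indices.size());
    for (std::uint32_t i : src.indices) {
        assert(i < src.vertices.size());  // source index must be in range
        dst.indices.push_back(base + i);
    }
}
```

After merging, the combined arrays would be uploaded once, so N small meshes cost one bind and one draw on the CPU side instead of N.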

### In Topic: Vertex Array Object + Direct State Access

13 December 2013 - 08:49 PM

There is no viable alternative to VAOs though, which is why we are all so confused.

Cross-platform, no. But on NVidia, bindless can easily exceed the performance you get from VAOs, and the reason is intuitive.

You ideally just "enable" your 5 attribs, and then proceed to render 300 VBOs. You can't. It sucks.

Instead I have to create 300 VAOs, with 5 "enabled" attribs each. Then render 300 VAOs.

No you don't. Having a bazillion little VAOs floating around with all the cache misses that go with them isn't necessarily the best approach. Best case, use bindless (does an end-run around the cache issues, but is NV-only) or a streaming VBO approach with reuse (which keeps the bind count down, and works cross-platform).

If you have a number of separate, preloaded VBOs that for some reason you can't (or refuse to) make large enough to keep you from being CPU bound, and you can't or won't use bindless, then fall back on one of the following (in order of the performance I've observed on NVidia): 1) VBOs with VAOs, one per batch state combination, 2) client arrays, or 3) VBOs without VAOs or with one VAO for all.
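For the streaming VBO approach with reuse mentioned above, the CPU-side bookkeeping can be sketched roughly as follows in C++. This is only one possible arrangement, not necessarily the scheme the post has in mind: `StreamCursor` is a hypothetical class that hands out byte offsets into a single large buffer and tells the caller when it has wrapped around. It ignores alignment and does not itself call OpenGL.

```cpp
#include <cassert>
#include <cstddef>

// Tracks the write position inside one large streaming VBO.
class StreamCursor {
public:
    explicit StreamCursor(std::size_t capacityBytes) : capacity_(capacityBytes) {}

    struct Allocation {
        std::size_t offset;  // byte offset at which to write the batch data
        bool orphan;         // true if the buffer should be re-specified first
    };

    // Reserves `bytes` at the current head. When the request does not fit,
    // wraps to offset 0 and flags the wrap so the caller can orphan the
    // buffer, e.g. glBufferData(target, capacity, nullptr, GL_STREAM_DRAW),
    // before writing again.
    Allocation allocate(std::size_t bytes) {
        assert(bytes <= capacity_);
        bool orphan = false;
        if (head_ + bytes > capacity_) {
            head_ = 0;
            orphan = true;
        }
        const Allocation a{head_, orphan};
        head_ += bytes;
        return a;
    }

private:
    std::size_t capacity_;
    std::size_t head_ = 0;
};
```

Between wraps, data is typically written with `glMapBufferRange` using `GL_MAP_UNSYNCHRONIZED_BIT` (or with `glBufferSubData`) at the returned offset. Batches that were already uploaded and are still in the buffer can be drawn again without re-uploading, which is where the reuse comes from, and the single buffer stays bound throughout, which keeps the bind count down.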

In case it's not obvious, I do what gives me the best performance. I'm not a core purist.

It's really unfortunate AMD hasn't stepped up and supported bindless in their OpenGL drivers, at least for launching batches (vertex attribute and index list specification).
