# Dark Helmet

1. ## Bones and Skeletons

Beyond the assimp manual, see these for descriptions and code:

* [Tutorial 38 - Skeletal Animation With Assimp](http://ogldev.atspace.co.uk/www/tutorial38/tutorial38.html)
* [Doing animations in OpenGL (Ephenation OpenGL)](http://ephenationopengl.blogspot.com/2012/06/doing-animations-in-opengl.html)
2. ## reconstruct depth from z/w ?

Re the geekslab page, yeah I tried that years ago and it doesn't work. Try this (for an arbitrary perspective frustum, with glDepthRange of 0..1):

    vec3 PositionFromDepth_DarkHelmet(in float depth)
    {
      vec2 ndc;   // Reconstructed NDC-space position
      vec3 eye;   // Reconstructed EYE-space position

      eye.z = near * far / ((depth * (far - near)) - far);

      ndc.x = ((gl_FragCoord.x * widthInv) - 0.5) * 2.0;
      ndc.y = ((gl_FragCoord.y * heightInv) - 0.5) * 2.0;

      eye.x = ( (-ndc.x * eye.z) * (right-left)/(2.0*near)
                - eye.z * (right+left)/(2.0*near) );
      eye.y = ( (-ndc.y * eye.z) * (top-bottom)/(2.0*near)
                - eye.z * (top+bottom)/(2.0*near) );

      return eye;
    }

Note: "depth" is your 0..1 window-space depth.

Of course, if you assume a "symmetric" perspective frustum (but not necessarily one that is 90 deg FOV), the eye.x/.y lines simplify down to:

    eye.x = (-ndc.x * eye.z) * right/near;
    eye.y = (-ndc.y * eye.z) * top/near;

Now of course for mere depth buffer visualization, all you really want from this is "eye.z", which is the linear depth value. So nuke the rest. Just map this eye.z value from -n..-f to 0..1, use that as your fragment intensity, and you're done:

    intensity = (-eye.z - near) / (far-near);
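One way to sanity-check the reconstruction above off the GPU is a CPU-side round trip. The sketch below is not from the original post: it assumes a glFrustum-style projection matrix and a viewport starting at (0, 0), and the function names (`frustum_project`, `position_from_depth`, `linear_intensity`) are made up for illustration. It projects an eye-space point to window space, then runs the same arithmetic as the shader to get it back.

```python
def frustum_project(eye, left, right, bottom, top, near, far, width, height):
    """Eye-space point -> (window x, window y, 0..1 depth), glFrustum-style (assumed)."""
    x, y, z = eye
    clip_x = (2.0 * near / (right - left)) * x + ((right + left) / (right - left)) * z
    clip_y = (2.0 * near / (top - bottom)) * y + ((top + bottom) / (top - bottom)) * z
    clip_z = -((far + near) / (far - near)) * z - (2.0 * far * near / (far - near))
    clip_w = -z
    ndc_x, ndc_y, ndc_z = clip_x / clip_w, clip_y / clip_w, clip_z / clip_w
    # Viewport transform with glDepthRange of 0..1.
    return ((ndc_x * 0.5 + 0.5) * width,
            (ndc_y * 0.5 + 0.5) * height,
            ndc_z * 0.5 + 0.5)


def position_from_depth(win_x, win_y, depth,
                        left, right, bottom, top, near, far, width, height):
    """Same arithmetic as the shader: window-space position + depth -> eye space."""
    eye_z = near * far / (depth * (far - near) - far)
    ndc_x = (win_x / width - 0.5) * 2.0
    ndc_y = (win_y / height - 0.5) * 2.0
    eye_x = (-ndc_x * eye_z) * (right - left) / (2.0 * near) \
        - eye_z * (right + left) / (2.0 * near)
    eye_y = (-ndc_y * eye_z) * (top - bottom) / (2.0 * near) \
        - eye_z * (top + bottom) / (2.0 * near)
    return eye_x, eye_y, eye_z


def linear_intensity(eye_z, near, far):
    """Map eye.z from -near..-far to 0..1 for depth visualization."""
    return (-eye_z - near) / (far - near)


if __name__ == "__main__":
    # Deliberately asymmetric frustum to exercise the general eye.x/eye.y lines.
    frustum = dict(left=-1.0, right=3.0, bottom=-2.0, top=1.0,
                   near=1.0, far=100.0, width=800.0, height=600.0)
    wx, wy, d = frustum_project((1.5, -0.75, -20.0), **frustum)
    print(position_from_depth(wx, wy, d, **frustum))
```

The recovered point should match the input to within floating-point rounding; if it doesn't for your own projection matrix, the matrix probably isn't the plain glFrustum form assumed here.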
3. ## 300 000 fps

When timing with just SwapBuffers though, be careful. The problem is that the driver typically queues up the request quickly on the CPU and returns immediately (i.e. the CPU does not block), after which it lets you start queuing up render commands for future frames. At some random point in the middle of queuing one of those frames, when the FIFO fills, "then" the CPU blocks, waiting on some VSYNC event in the middle of a frame. This causes really odd timing spikes that leave you puzzled as to what's going on. If you want reasonable full-frame timings, put a glFinish() after SwapBuffers(), and then stop your timer.
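The ordering described above (draw, swap, finish, then stop the timer) can be sketched as a small helper. This is only an illustration: `time_full_frame` is a made-up name, and the three callables are assumed to wrap your own frame rendering, your platform's SwapBuffers, and glFinish. No real GL binding is called here.

```python
import time


def time_full_frame(draw_frame, swap_buffers, gl_finish, clock=time.perf_counter):
    """Return the seconds taken by one full frame, including GPU completion.

    draw_frame, swap_buffers and gl_finish are caller-supplied callables
    (assumed wrappers around your renderer, SwapBuffers and glFinish).
    """
    start = clock()
    draw_frame()     # queue this frame's GL commands
    swap_buffers()   # may return immediately; the driver just queues the swap
    gl_finish()      # block until everything queued so far has completed
    return clock() - start  # stop the timer only after the finish


if __name__ == "__main__":
    # Stand-in callables so the sketch runs without a GL context.
    elapsed = time_full_frame(lambda: None, lambda: None, lambda: None)
    print(f"frame took {elapsed * 1000.0:.3f} ms")
```

Stopping the clock before the finish call would bring back the problem from the post: you would be timing how long the driver took to queue the work, not how long the frame took.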
4. ## Questions about mesh rendering performance

"Right" depends on your goals. For most of us, "right" isn't defined by core but by the most efficient use of the hardware (fastest performance). So yes, agreed. If display lists (DL) work for you, use them. If you want more control over how your GPU memory is utilized, use static VBOs, but you'll take a performance hit if you use them alone, and you have to be smart about how you encode your data within them. If you happen to be running on NVidia and want VBOs with display list performance, use NV bindless to launch your batches with those VBOs. If not, then substitute VAOs in place of bindless -- it doesn't perform as well, but it's better than nothing. Also, the more data you pack in your VBOs (the larger your batches), the less likely you are to be CPU bound launching batches (which is for the most part what bindless and VAOs strive to reduce).
5. ## Vertex Array Object + Direct State Access

Cross-platform, no. But on NVidia, bindless can easily exceed the performance you get from VAOs, and the reason is intuitive.

No you don't. Having a bazillion little VAOs floating around with all the cache misses that go with them isn't necessarily the best approach. Best case, use bindless (does an end-run around the cache issues, but is NV-only) or a streaming VBO approach with reuse (which keeps the bind count down, and works cross-platform).

If you have a number of separate, preloaded VBOs, you absolutely can't/refuse to make them large enough to keep you from being CPU bound for some reason, and you can't/won't use bindless, then fall back on (in the order of the performance I've observed on NVidia) 1) VBOs with VAOs, one per batch state combination, 2) client arrays, or 3) VBOs without VAOs or with one VAO for all.

In case it's not obvious, I do what gives me the best performance. I'm not a core purist. It's really unfortunate AMD hasn't stepped up and supported bindless in their OpenGL drivers, at least for launching batches (vertex attribute and index list specification).

7. ## Choosing specific GPU for OpenGL context?

Too bad we're talking Windows here, not Linux. There you just set up one GPU per screen, create an X connection on the appropriate screen, and create a GL context on that connection (on NVidia at least). Pretty simple.
8. ## PBO to GL_BACK

GL_BACK_LEFT is a component buffer of the default (system) framebuffer. It is not the name of a buffer object, so it can't be used with glBindBuffer or glCopyBufferSubData.

I can easily see why you made this mistake though. Components of a framebuffer are historically called buffers (e.g. color buffers, depth buffer, stencil buffer; thus glDrawBuffer/glReadBuffer/etc.). These are different from buffer objects (arbitrary blocks of driver memory you can create). With framebuffer objects, these component buffers are called "attachment points" to help disambiguate these concepts.
9. ## Skeletal Animation Using Dual Quaternions

Doesn't sound right. This should be the full inverse of the bind pose transform for the specified joint (rotation and translation). For the root joint this might just be a negative translation (i.e. 0 deg rotation), but for child joints in general, this is not the case.

Sounds like you're close. If trouble persists, I would just do your transform compositing using matrices, and then do a matToDQ on the tail end. Then later you can flip to DQs.
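To make the two pieces above concrete, here is a minimal sketch of inverting a rigid bind-pose transform and converting the result to a dual quaternion at the tail end. It is an illustration under stated assumptions, not the poster's code: transforms are rigid (rotation plus translation, no scale), vectors are columns (x' = R x + t), quaternions are (w, x, y, z), and the names `invert_rigid`, `rigid_to_dq` (standing in for "matToDQ") and `dq_transform_point` are made up here.

```python
import math


def quat_mul(a, b):
    """Hamilton product of two quaternions stored as (w, x, y, z)."""
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return (aw * bw - ax * bx - ay * by - az * bz,
            aw * bx + ax * bw + ay * bz - az * by,
            aw * by - ax * bz + ay * bw + az * bx,
            aw * bz + ax * by - ay * bx + az * bw)


def rot_to_quat(R):
    """3x3 rotation matrix (nested lists, R[row][col]) -> unit quaternion."""
    trace = R[0][0] + R[1][1] + R[2][2]
    if trace > 0.0:
        s = 2.0 * math.sqrt(trace + 1.0)
        return (0.25 * s, (R[2][1] - R[1][2]) / s,
                (R[0][2] - R[2][0]) / s, (R[1][0] - R[0][1]) / s)
    # Otherwise pivot on the largest diagonal element to stay well-conditioned.
    i = max(range(3), key=lambda n: R[n][n])
    j, k = (i + 1) % 3, (i + 2) % 3
    s = 2.0 * math.sqrt(1.0 + R[i][i] - R[j][j] - R[k][k])
    q = [0.0] * 4
    q[0] = (R[k][j] - R[j][k]) / s
    q[1 + i] = 0.25 * s
    q[1 + j] = (R[j][i] + R[i][j]) / s
    q[1 + k] = (R[k][i] + R[i][k]) / s
    return tuple(q)


def invert_rigid(R, t):
    """Full inverse of x -> R x + t, i.e. x -> R^T x - R^T t."""
    Rt = [[R[c][r] for c in range(3)] for r in range(3)]
    ti = [-sum(Rt[r][c] * t[c] for c in range(3)) for r in range(3)]
    return Rt, ti


def rigid_to_dq(R, t):
    """Rigid transform -> dual quaternion (real, dual), dual = 0.5 * (0, t) * real."""
    real = rot_to_quat(R)
    dual = tuple(0.5 * c for c in quat_mul((0.0, t[0], t[1], t[2]), real))
    return real, dual


def dq_transform_point(dq, p):
    """Apply a unit dual quaternion to a 3D point."""
    real, dual = dq
    conj = (real[0], -real[1], -real[2], -real[3])
    rotated = quat_mul(quat_mul(real, (0.0, *p)), conj)
    half_t = quat_mul(dual, conj)  # equals 0.5 * (0, t)
    return tuple(rotated[n + 1] + 2.0 * half_t[n + 1] for n in range(3))


if __name__ == "__main__":
    # Hypothetical child joint: bind pose rotated 90 deg about Z, sitting at (1, 2, 3).
    bind_R = [[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]
    bind_t = [1.0, 2.0, 3.0]
    inv_bind_dq = rigid_to_dq(*invert_rigid(bind_R, bind_t))
    # The joint's own bind position should land on the origin of joint space.
    print(dq_transform_point(inv_bind_dq, (1.0, 2.0, 3.0)))
```

Note that the inverse translation here is (-2, 1, -3), not simply the negated bind translation; that is the difference between a full inverse and a negative translation once the joint carries any rotation.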
10. ## Writing to Render Target from Itself

Check out:

* GLSL : common mistakes#Sampling and Rendering to the Same Texture

NV_texture_barrier might be useful to you on NVidia specifically, but I don't know of a cross-vendor way to support this. OpenCL IMO is a non-starter except for limited use cases, as IIRC flipping back and forth requires a full pipeline flush/sync (that is, in the absence of cl_khr_gl_event / ARB_cl_event). An OpenGL Compute Shader is much more interesting in terms of avoiding that overhead, but I'm not an expert on those yet.