clb

Member Since 22 May 2004

#4953078 Test point inside triangle, 3D space

Posted by clb on 26 June 2012 - 11:47 AM

MathGeoLib contains code for testing whether a 3D triangle contains a given point. See Triangle::Contains. To get to the implementation, click on the small [x lines of code] link at the top of the page. The code is adapted from Christer Ericson's Real-Time Collision Detection.
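
For readers who just want the gist, here is a minimal sketch of a barycentric point-in-triangle test in 3D. It is not the MathGeoLib source; the Vec3 type, the helper functions and the maxPlaneDistance tolerance are assumptions made up for this illustration.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical minimal vector type for this sketch (not MathGeoLib's float3).
struct Vec3 { double x, y, z; };

Vec3 operator-(const Vec3 &a, const Vec3 &b) { return { a.x - b.x, a.y - b.y, a.z - b.z }; }
double Dot(const Vec3 &a, const Vec3 &b) { return a.x * b.x + a.y * b.y + a.z * b.z; }
Vec3 Cross(const Vec3 &a, const Vec3 &b)
{
    return { a.y * b.z - a.z * b.y, a.z * b.x - a.x * b.z, a.x * b.y - a.y * b.x };
}

// Returns true if p lies inside triangle (a, b, c), allowing p to be at most
// maxPlaneDistance away from the plane of the triangle.
bool TriangleContains(const Vec3 &a, const Vec3 &b, const Vec3 &c, const Vec3 &p,
                      double maxPlaneDistance = 1e-6)
{
    assert(maxPlaneDistance >= 0.0);
    const Vec3 v0 = b - a, v1 = c - a, v2 = p - a;

    // Reject degenerate triangles and points that are off the triangle plane.
    const Vec3 n = Cross(v0, v1);
    const double nLengthSq = Dot(n, n);
    if (nLengthSq == 0.0)
        return false;
    if (std::fabs(Dot(n, v2)) > maxPlaneDistance * std::sqrt(nLengthSq))
        return false;

    // Solve p = a + v*v0 + w*v1 for the barycentric coordinates v and w.
    const double d00 = Dot(v0, v0), d01 = Dot(v0, v1), d11 = Dot(v1, v1);
    const double d20 = Dot(v2, v0), d21 = Dot(v2, v1);
    const double denom = d00 * d11 - d01 * d01; // Equals nLengthSq, so nonzero here.
    const double v = (d11 * d20 - d01 * d21) / denom;
    const double w = (d00 * d21 - d01 * d20) / denom;
    return v >= 0.0 && w >= 0.0 && v + w <= 1.0;
}
```

The plane-distance check is what makes this a 3D test: the barycentric coordinates alone only tell you where the projection of the point falls.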


#4950526 Strange artefacts when rendering texture fonts form freetype

Posted by clb on 19 June 2012 - 03:31 AM

I think you have an off-by-one error in the code. The line
  char p = ((char*)bmp.buffer)[x+((bmp.rows-y)*bmp.width)];

looks like it should instead be

  char p = ((char*)bmp.buffer)[x+((bmp.rows-1-y)*bmp.width)];
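
To see why the -1 is needed: for y = 0 the original expression reads row bmp.rows, which is one past the last valid row. Here is a small standalone sketch of the same flipped indexing. The FlipRows name is made up, and the sketch assumes tightly packed 8-bit rows (i.e. the row stride equals the width), which may not hold for every FreeType bitmap.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Copies a width x rows 8-bit image so that the bottom row of src becomes the
// top row of the result. Assumes src is tightly packed (stride == width).
std::vector<unsigned char> FlipRows(const unsigned char *src, int width, int rows)
{
    assert(src != nullptr && width > 0 && rows > 0);
    std::vector<unsigned char> dst(static_cast<std::size_t>(width) * rows);
    for (int y = 0; y < rows; ++y)
        for (int x = 0; x < width; ++x)
        {
            // Valid rows are 0 .. rows-1, so the mirror of row y is rows-1-y.
            dst[x + y * width] = src[x + (rows - 1 - y) * width];
        }
    return dst;
}
```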



#4950324 Strange artefacts when rendering texture fonts form freetype

Posted by clb on 18 June 2012 - 12:19 PM

Try using gDEBugger to take a snapshot of the texture in GPU memory, and see what the pixel contents are. I'm using FreeType2 for my fonts, and haven't observed such an artifact. Perhaps there's an off-by-one copying error occurring somewhere in the code, and the texture actually does contain that row of pixels in GPU memory, or you address one line too low?


#4950252 High performance texture splatting?

Posted by clb on 18 June 2012 - 08:10 AM

Perhaps try avoiding the mix and optimize the code manually:

void main()
{
    lowp vec4 alpha = texture2D(texture4, v_texcoord);

    lowp vec4 color0 = texture2D(texture0, v_texcoord);
    lowp vec4 color1 = texture2D(texture1, v_texcoord);
    lowp vec4 color2 = texture2D(texture2, v_texcoord);
    lowp vec4 color3 = texture2D(texture3, v_texcoord);

    gl_FragColor = v_color * (alpha[0] * color0 + alpha[1] * color1 + alpha[2] * color2 + alpha[3] * color3);
}

(I reindexed which alpha channel weights which color texture, for straightforwardness.) The idea is that alpha[0] is already precomputed to be 1.0f - alpha[1] - alpha[2] - alpha[3] in the texture, so one doesn't need to compute that in the shader. I feel this would be faster than using mix(), but can't be sure without profiling. Let me know how it compares.
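
The precomputation could be done offline when baking the splat texture. A rough sketch, assuming 8-bit RGBA texels; BakeSplatWeights is a hypothetical helper name, and rescaling when the three weights exceed full intensity is just one possible policy:

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Given the weights of layers 1..3 (0..255 each), returns a full RGBA texel
// whose first channel holds the precomputed remainder 255 - w1 - w2 - w3,
// so the four weights always sum to 255 (i.e. 1.0 in the shader).
std::array<std::uint8_t, 4> BakeSplatWeights(int w1, int w2, int w3)
{
    assert(w1 >= 0 && w1 <= 255 && w2 >= 0 && w2 <= 255 && w3 >= 0 && w3 <= 255);
    int sum = w1 + w2 + w3;
    if (sum > 255)
    {
        // Rescale so that the three weights fit. Integer division rounds down,
        // so the new sum can never exceed 255.
        w1 = w1 * 255 / sum;
        w2 = w2 * 255 / sum;
        w3 = w3 * 255 / sum;
        sum = w1 + w2 + w3;
    }
    const int w0 = 255 - sum;
    return { static_cast<std::uint8_t>(w0), static_cast<std::uint8_t>(w1),
             static_cast<std::uint8_t>(w2), static_cast<std::uint8_t>(w3) };
}
```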

Another potential optimization is to drop one or two of the splatted texture channels, and subdivide your mesh by which splat textures each triangle uses. Also, if the splat texture is low-frequency, try storing the splat weights as vertex attributes and passing them through to the pixel shader, which saves you one texture read.

Finally, if the splat texture is very low frequency, you can try just decaling the contents, i.e. manually generate geometry planes that you alphablend on top of the terrain.


#4950215 Future-proof technologies to start learning now

Posted by clb on 18 June 2012 - 05:57 AM

C#, Java, C/C++, Objective-C/C++, HTML(5), CSS, XML, Sockets, JavaScript, Python, OpenGL3, GLES2, Direct3D11 are all keywords that you can see desired in games-related job applications today. Android and iOS experience is very hot for several games companies.

Off the top of my head, some technologies that I think are being phased out are D3D9, MDX, OpenGL2, GLES1, XNA, Symbian.

Qt is a bit of an interesting case - Qt for mobile is pretty dead with Nokia, but for desktop and non-games/non-3D it's still strongly alive.


#4950080 Which OpenGL version?

Posted by clb on 17 June 2012 - 02:18 PM

I've done interviews for programmers, and my thought is that I don't care if you've done OGL2 or OGL3. The more important GPU-related things I test are whether candidates can write shaders, and whether they understand the concept of writing code that's executed by two separate processors (CPU & GPU) in parallel, and what kind of implications that has, performance and code structure-wise.

As a programmer, I don't touch OpenGL 2 at all. In my hobby engine where I don't care about legacy compatibility, it's just Direct3D11 and OpenGL3. That takes a lot of headache away, and keeps things simpler.


#4950064 Struct Initialization Within Struct

Posted by clb on 17 June 2012 - 01:22 PM

Yes, but unfortunately C++ doesn't allow you to conveniently do the initialization like you do on line 7. You'll have to do it in a constructor, as follows:

[source lang="cpp"]
struct A
{
    int iSize;
    A(int x = 10) { iSize = x; }
}; // A

struct B
{
    A myA;
    B()
    :myA(2)
    {
    }
}; // B
[/source]


#4950050 Trouble understanding Gimbal Lock from a mathematical perspective

Posted by clb on 17 June 2012 - 11:24 AM

In that example, the writer describes gimbal lock in the context of representing rotations using Euler angles. With Euler angles, one specifies rotations as a sequence of three successive rotations around predetermined axes. The convention of which axes to use and in which order is chosen arbitrarily, or depending on the application, e.g. one might represent orientation as rotation about X, then Y and then the Z axis.

Since you have three scalars to specify rotations with, you have 3 degrees of freedom. If you fix any one of these to a constant value (say y = 25deg), you'd expect to be left with the ability to rotate in 2 degrees of freedom, because you can still freely manipulate two rotation angles (and likewise, if you fix two of the angles to constant values, you'd expect to be left with 1 DOF). But due to the gimbal lock effect, this is not always the case. It is possible to fix only one of the angles to a specific constant value that constrains the system to only 1 degree of freedom, instead of the 2 DOF that you'd expect.

The problem here is that whichever order we pick for the Euler angles, there exists an angle value for the middle rotation that causes the first and the third axis to line up (the angle is +/-90 degrees, depending on the choice of rotation order), so that varying the angle for the first or for the third rotation produces a rotation about the same axis. If we fix the middle rotation angle to this value that causes the first and the third axes to line up, we freeze rotation not only about one dimension, but about two, since we are left with the ability to only rotate the whole object around one axis. Effectively, our expected to-be-2dof system is now only a 1dof system. Altering the angle value for the middle rotation axis immediately breaks off the gimbal lock, giving back the 3dof rotations.

Mathematically we can see this as follows. Let's represent rotation using Euler angles, R = Rx(a) * Ry(b) * Rz(c), where Ri(v) is a rotation matrix about axis i by angle v, and use the Matrix*vector convention. Fix b to 90 degrees. Ry(90deg) maps the z axis onto the x axis, so it can be seen that Ry(90deg) * Rz(c) == Rx(c) * Ry(90deg). Therefore when b=90, the rotation equation becomes R = Rx(a) * Rx(c) * Ry(90deg) = Rx(a+c) * Ry(90deg). (With b=-90 the z axis maps onto the negative x axis instead, and the same argument gives R = Rx(a-c) * Ry(-90deg).)

What this equation means is that we have two free scalars a and c left to specify the orientation, but they both produce rotation about the x axis, i.e. altering a and altering c are interchangeable and have the same end effect of rotation about the x axis. Effectively we fixed only one rotation axis to a special value, but managed to kill two degrees of freedom with that move.
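
If you want to convince yourself numerically, here is a small sketch that builds R = Rx(a) * Ry(b) * Rz(c) and checks that with b = 90 degrees only the sum a + c matters. It assumes the standard right-handed rotation matrices applied to column vectors; the Mat3 type and all function names are made up for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

struct Mat3 { double m[3][3]; };

Mat3 Mul(const Mat3 &a, const Mat3 &b)
{
    Mat3 r{};
    for (int i = 0; i < 3; ++i)
        for (int j = 0; j < 3; ++j)
            for (int k = 0; k < 3; ++k)
                r.m[i][j] += a.m[i][k] * b.m[k][j];
    return r;
}

// Standard right-handed rotations about the principal axes, angles in radians.
Mat3 RotX(double t) { double c = std::cos(t), s = std::sin(t); return {{{1.0, 0.0, 0.0}, {0.0, c, -s}, {0.0, s, c}}}; }
Mat3 RotY(double t) { double c = std::cos(t), s = std::sin(t); return {{{c, 0.0, s}, {0.0, 1.0, 0.0}, {-s, 0.0, c}}}; }
Mat3 RotZ(double t) { double c = std::cos(t), s = std::sin(t); return {{{c, -s, 0.0}, {s, c, 0.0}, {0.0, 0.0, 1.0}}}; }

Mat3 EulerXYZ(double a, double b, double c) { return Mul(RotX(a), Mul(RotY(b), RotZ(c))); }

// Largest absolute difference between corresponding matrix elements.
double MaxAbsDiff(const Mat3 &a, const Mat3 &b)
{
    double d = 0.0;
    for (int i = 0; i < 3; ++i)
        for (int j = 0; j < 3; ++j)
            d = std::max(d, std::fabs(a.m[i][j] - b.m[i][j]));
    return d;
}

// True if, at b = 90 degrees, (a, c) gives the same rotation as (a + c, 0).
bool CollapsesToSingleAxis(double a, double c)
{
    const double halfPi = std::acos(0.0);
    return MaxAbsDiff(EulerXYZ(a, halfPi, c), EulerXYZ(a + c, halfPi, 0.0)) < 1e-12;
}

void CheckGimbalLockExample()
{
    assert(CollapsesToSingleAxis(0.3, 0.5));
}
```

Swapping the values of a and c changes the result for a generic b, but at b = 90 degrees it makes no difference, which is exactly the lost degree of freedom.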

It should be remembered that there's nothing intrinsically wrong with using Euler angles, just that having that middle value be +/-90 can be problematic. If you don't need those kinds of angles, e.g. in an FPS where you can't tilt your head past 90 degrees to look backwards, you can pick an Euler convention (XYZ/XZY/ZYX/etc. depending on your coordinate system) for the camera that has the constrained rotation axis in the middle, and you are not going to have any problems.

The other solution (in the game development field) to this is to not do your rotation logic as sequences of rotations about fixed axes. Using quaternions implicitly avoids this, not because of some special mathematical property they have, but because with quaternions you don't logically use sequences of fixed axes. Or if you do, and just have QuatYaw, QuatPitch and QuatRoll, you're no better off than when you were using Euler angles and are still susceptible to gimbal lock.


#4949980 OpenGL performance question

Posted by clb on 17 June 2012 - 03:26 AM

I use the second approach (although with glDrawElements). Performance is not a problem at the moment (I can do hundreds of UI windows), and if it gets too slow, I'll investigate whether batching manually might help.

In the first approach, it might not be necessary to update the whole VB if one rectangle changes - you could update a sub-part of the vertex buffer, if you keep track of which UI element is at which index. Although, I've got to say that in my codebase that might get a bit trickier than it sounds, since I'm double-buffering my dynamically updated VBs manually (which I have observed to give a performance benefit on GLES2 even when GL_STREAM_DRAW is being used), so the sub-updates would need to be made aware of the double-buffering.
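
As a rough sketch of the bookkeeping for such a sub-update, the byte range of a single rectangle can be computed from its index. The UiVertex layout, four vertices per rectangle and the RectByteRange name are all assumptions for illustration, not something from the original code.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical UI vertex: 2D position plus texture coordinate.
struct UiVertex { float x, y, u, v; };

constexpr std::size_t kVerticesPerRect = 4; // Assumes indexed quads.

struct ByteRange { std::size_t offset; std::size_t size; };

// Returns the byte range inside the vertex buffer that holds the vertices of
// the UI rectangle stored at slot rectIndex.
ByteRange RectByteRange(std::size_t rectIndex, std::size_t rectCount)
{
    assert(rectIndex < rectCount);
    const std::size_t bytesPerRect = kVerticesPerRect * sizeof(UiVertex);
    return { rectIndex * bytesPerRect, bytesPerRect };
}
```

The resulting offset and size are what you would pass to glBufferSubData(GL_ARRAY_BUFFER, offset, size, data) for the changed rectangle. With manual double-buffering, the same range would have to be written to both buffers before each is next drawn from.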


#4949758 Techniques for reducing memory usage on HD sprite sheets

Posted by clb on 16 June 2012 - 04:20 AM

If your current sprite sheets contain a lot of potentially unrelated sprites, you can try storing the sprites individually in single files (or group together the ones you know will always be used in tandem), and perform the atlasing dynamically at runtime, when the sprites are about to be rendered.

Another potential way to optimize animations is to subdivide them into multiple sprites to avoid having to store full animation frames for parts that don't change. E.g. a sprite of a building with a flashing neon sign "Open" could store only the rectangle around the Open sign in the animations, and not the full building, if the other parts of the building sprite don't animate.
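
For the runtime atlasing, one simple way to place sprites is a "shelf" packer that fills the atlas row by row. This is only a sketch of one possible scheme (not necessarily what any particular engine does), and the Rect and ShelfAtlas names are made up:

```cpp
#include <algorithm>
#include <cassert>

struct Rect { int x, y, w, h; };

// Places rectangles left to right on horizontal shelves. When a rectangle no
// longer fits on the current shelf, a new shelf is opened below it.
class ShelfAtlas
{
public:
    ShelfAtlas(int width, int height) : width(width), height(height) {}

    // Returns true and fills 'out' if a w x h sprite fits into the atlas.
    bool Insert(int w, int h, Rect &out)
    {
        assert(w > 0 && h > 0);
        int x = cursorX, y = shelfY, rowHeight = shelfHeight;
        if (x + w > width) // Does not fit on this shelf, try the next one.
        {
            y += rowHeight;
            x = 0;
            rowHeight = 0;
        }
        if (w > width || y + h > height)
            return false; // Atlas is full (or the sprite is simply too large).

        out = { x, y, w, h };
        cursorX = x + w;
        shelfY = y;
        shelfHeight = std::max(rowHeight, h);
        return true;
    }

private:
    int width, height;
    int cursorX = 0, shelfY = 0, shelfHeight = 0;
};
```

Sorting the sprites by height before inserting them usually wastes less space with this kind of packer.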


#4949446 Vertex Attributes, VAOs and Shaders

Posted by clb on 14 June 2012 - 11:57 PM

You are correct. If you have two different shaders with differently bound input attribute positions, you can't use the same VAO to render with both shaders.

What I do is make all shader programs that are intended to be used with the same VAO use a compatible layout. A compatible layout can have extra stuff, i.e. a program with { index0: pos, index1: normal, index2: uv} and a program with { index0: pos, index1: normal} are compatible, since the common portion of them is identical. I have a function AreCompatible(ShaderProgram, VAO) that checks that the VAO and the shader program agree on the stored attribute state. This check is enabled at debug time and it yells red text in the console if they don't. That way I can manually author my shader programs and VAOs to use a common structure wherever necessary, and in release mode, the checks don't exist and there is no overhead.

So, essentially, in my case, I am able to get around the potential problems by rules of convention.
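
To illustrate the "common portion is identical" rule, here is a simplified sketch that compares two attribute layouts directly. It is not the actual AreCompatible(ShaderProgram, VAO) function mentioned above; representing a layout as a list of semantic names per attribute index is an assumption made for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical representation: element i names what attribute index i holds.
using AttributeLayout = std::vector<std::string>;

// Two layouts are compatible if they agree on every index they both define.
bool AreCompatible(const AttributeLayout &a, const AttributeLayout &b)
{
    const std::size_t common = std::min(a.size(), b.size());
    return std::equal(a.begin(), a.begin() + common, b.begin());
}

// Debug-only check: assert() compiles to nothing when NDEBUG is defined,
// so release builds pay no cost for it.
void DebugCheckCompatible(const AttributeLayout &program, const AttributeLayout &vao)
{
    assert(AreCompatible(program, vao) && "Shader program and VAO layouts disagree!");
    (void)program; (void)vao; // Avoid unused-parameter warnings in release builds.
}
```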


#4949315 Per-vertex lighting - Are these ugly lines normal?

Posted by clb on 14 June 2012 - 03:49 PM

The lighting pattern/artifact is an unfortunate problem when you have a fixed repeating grid of coarsely tessellated heightmap data. Doing per-pixel lighting does not necessarily fix it (but can alleviate the issue), since the geometry position and normal data is interpolated across the geometry. As possible solutions, I have heard of people randomly choosing the diagonal along which each height map quad is split into two triangles. This makes the pattern a bit more random, and not repeating in an obvious fashion. Another option might be to add a bit of noise to the computed lighting values to de-emphasize the repeating pattern.
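
A sketch of what the random diagonal split could look like when building the index buffer. It assumes the heightmap vertices are stored row by row, (quadsX + 1) per row; BuildGridIndices is a made-up name, and the fixed seed just keeps the mesh reproducible between runs.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Builds a triangle list for a quadsX x quadsY grid, picking the diagonal
// used to split each quad at random.
std::vector<std::uint32_t> BuildGridIndices(int quadsX, int quadsY, unsigned seed)
{
    assert(quadsX > 0 && quadsY > 0);
    std::mt19937 rng(seed);
    std::bernoulli_distribution flip(0.5);

    std::vector<std::uint32_t> indices;
    indices.reserve(static_cast<std::size_t>(quadsX) * quadsY * 6);
    const std::uint32_t stride = static_cast<std::uint32_t>(quadsX) + 1;
    for (int y = 0; y < quadsY; ++y)
        for (int x = 0; x < quadsX; ++x)
        {
            const std::uint32_t i00 = static_cast<std::uint32_t>(y) * stride + x;
            const std::uint32_t i10 = i00 + 1;
            const std::uint32_t i01 = i00 + stride;
            const std::uint32_t i11 = i01 + 1;
            if (flip(rng)) // Split along the i00 - i11 diagonal.
                indices.insert(indices.end(), { i00, i10, i11, i00, i11, i01 });
            else           // Split along the i10 - i01 diagonal.
                indices.insert(indices.end(), { i00, i10, i01, i10, i11, i01 });
        }
    return indices;
}
```

Both branches emit the triangles with the same winding, so backface culling is unaffected by the choice of diagonal.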

When you render a degenerate triangle with at least two of its three vertices being equal, there's a good chance no pixels will be rasterized. One can either use GL_LINES or GL_LINE_STRIP to draw lines, or use a pair of triangles to produce quads oriented towards the camera in billboard fashion to make them look like fat lines.


#4949256 _BLOCK_TYPE_IS_VALID

Posted by clb on 14 June 2012 - 01:39 PM

I find that when I have a potential double delete issue focused in a very small piece of code, I simply add prints to trace which pointers get freed. In terms of your code, I'd probably throw in something like

void MyWindowApplication::OnTerminate ()
{
    printf("Deleting renderer ptr %p\n", (void*)m_pkRenderer);
    delete m_pkRenderer; // Reach this breakpoint
    m_pkRenderer = 0;
}

MyDxRenderer::~MyDxRenderer()
{
    int i; // Reach this breakpoint
    for (i = 0; i < m_iMaxTextures; i++)
        m_pqDevice->SetTexture(i,0);
    for (i = 0; i < m_apTextureNames.GetQuantity(); i++)
    {
        printf("Deleting texture %p\n", (void*)m_apTextureNames[i]);
        delete m_apTextureNames[i];
    }
}

Logging those pointers, and checking that none of the values printed are identical, I can confirm that indeed there is no double free.

Then, the next thing to check is that you're not freeing a garbage pointer. I'd trace the pointer values I got for memory allocations, and compare them to the freed ones, to see that they match. Also check the actual pointer values: check that the pointer is not something like 0xDDDDDDDD (see this page). Finally, check the this pointer of the class instance that's deleting memory, to confirm that you're actually in a valid instance of a class deleting a valid piece of member data.
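
If eyeballing the printed values gets tedious, the same comparison can be automated with a tiny tracker. This is just a sketch: the PointerTracker class is made up, and you'd have to call it manually next to the new and delete calls you suspect.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <set>

// Remembers which pointers are currently allocated, and complains when a
// pointer is freed that was never allocated or has already been freed.
class PointerTracker
{
public:
    void OnAlloc(const void *p)
    {
        assert(p != nullptr);
        live.insert(p);
    }

    // Returns true if p was a live allocation, false if the free is suspicious.
    bool OnFree(const void *p)
    {
        if (live.erase(p) == 0)
        {
            std::printf("Suspicious free of %p: double delete or garbage pointer?\n", p);
            return false;
        }
        return true;
    }

    std::size_t LiveCount() const { return live.size(); }

private:
    std::set<const void *> live;
};
```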


#4948823 how to draw lines over the terrain

Posted by clb on 13 June 2012 - 08:45 AM

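To have the lines stay above the terrain, you can use glPolygonOffset to provide a constant depth bias. This effectively moves the lines above the terrain towards the camera. Note that glPolygonOffset only affects polygon primitives, so if the lines are drawn as GL_LINES rather than as polygons, apply the bias to the terrain instead to push it away from the camera.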

Most often when rendering with a depth bias, I don't have depth-writes enabled, i.e. specify glDepthMask(GL_FALSE), but have depth-testing enabled, i.e. specify glEnable(GL_DEPTH_TEST). Depth-biased renders always occur at the end of the frame, after all opaque geometry has been rendered with z-writes enabled.


#4948534 Epic Optimization for Particles

Posted by clb on 12 June 2012 - 10:17 AM

It's been several years since I used D3DXSprite. It does use some form of batching, and is likely not the slowest way to draw sprites. Your current performance sounds decent.

If you want the best control of how the CPU->GPU communication and drawing is done, you can try batching the objects manually instead of using D3DXSprite. Be sure to use a ring buffer of vertex buffers, update the data in a single lock/unlock write loop, and pay particular attention to not doing any slow CPU operations. To get to 100k particles, you'll need a very optimized update inner loop that processes the particles in a coherent, data-cache-friendly fashion.

However, if you can, I would recommend investigating a purely GPU-side particle system, since that can be an order of magnitude faster than anything you can do on the CPU. Of course, the downside is that it's more difficult to be flexible and some effects may be tricky to achieve in a GPU-based system (e.g. complex update functions, or interacting with scene geometry).
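
For the CPU route, one way to get a cache-friendly inner loop is to store each particle attribute in its own contiguous array and walk them linearly. A minimal 2D sketch, assuming simple Euler integration with gravity; the ParticleSoA layout and UpdateParticles name are made up for illustration:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Structure-of-arrays particle storage: each attribute is contiguous in
// memory, so the update loop streams through the data linearly.
struct ParticleSoA
{
    std::vector<float> x, y;   // Positions.
    std::vector<float> vx, vy; // Velocities.
};

// Advances all particles by dt seconds under a constant vertical acceleration.
void UpdateParticles(ParticleSoA &p, float dt, float gravityY)
{
    const std::size_t n = p.x.size();
    assert(p.y.size() == n && p.vx.size() == n && p.vy.size() == n);
    for (std::size_t i = 0; i < n; ++i)
    {
        p.vy[i] += gravityY * dt;
        p.x[i] += p.vx[i] * dt;
        p.y[i] += p.vy[i] * dt;
    }
}
```

A loop this simple, with no branches and no pointer chasing, is also the kind a compiler has a good chance of vectorizing.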



