Member Since 22 May 2004

#4988730 AI for a simple shedding card game

Posted by clb on 10 October 2012 - 08:27 AM

Monte Carlo sounds like an excellent approach here, but unlike in your pseudocode, you need to be very smart about where you direct your playouts. Having a fixed "Do 10,000 times" loop in the code is very inefficient and wastes a lot of the playouts.

An important aspect of Monte Carlo is to construct/expand a game tree while the playouts are being played. You should never settle for just playing a fixed number of playouts at the root of the tree (or you can, but it won't be as effective). Growing the tree is the machinery that causes the result to converge to the best play as you increase the number of playouts.

You'll also want to estimate the uncertainty of each branch. If you play 1000 random samples of a move M and always lose, whereas 1000 random samples of another move win 50% of the time, you'll know statistically that you should probably invest more in the latter, and never play the first move. This should guide the statistical playouts to balance the search towards uncertain branches. UCT tree search is one of the recent key advances in the field. This is also an interesting read that gives some more references.
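
As a rough sketch of how per-move statistics can steer the playouts, here is the UCB1 selection rule that UCT applies at each node of the tree. The names MoveStats and SelectMoveUCB1 and the default exploration constant c are hypothetical, chosen only for illustration; a real implementation would also store these statistics in tree nodes and expand the tree as playouts arrive.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Playout statistics for one candidate move at a node (hypothetical type).
struct MoveStats
{
    int visits;   // Number of playouts that started with this move.
    double wins;  // Number of those playouts that were won.
};

// Returns the index of the move to sample next according to UCB1:
// average win rate plus an exploration bonus that shrinks with more visits.
// The constant c is an assumption; it trades exploration against exploitation.
std::size_t SelectMoveUCB1(const std::vector<MoveStats> &moves, double c = 1.41421356)
{
    assert(!moves.empty());

    int totalVisits = 0;
    for (const MoveStats &m : moves)
        totalVisits += m.visits;

    std::size_t best = 0;
    double bestScore = -std::numeric_limits<double>::infinity();
    for (std::size_t i = 0; i < moves.size(); ++i)
    {
        // Try every move at least once before comparing scores.
        if (moves[i].visits == 0)
            return i;

        const double winRate = moves[i].wins / moves[i].visits;
        const double bonus = c * std::sqrt(std::log((double)totalVisits) / moves[i].visits);
        if (winRate + bonus > bestScore)
        {
            bestScore = winRate + bonus;
            best = i;
        }
    }
    return best;
}
```

With equal visit counts, the move with the higher win rate is picked, while a rarely visited move gets a large bonus and is sampled again before it is written off.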

#4988680 Should you support Linux?

Posted by clb on 10 October 2012 - 05:42 AM

In my engine I support Win7, Win8RT, OSX, Android, iOS, Web (JS+WebGL), NaCl and NPAPI, but not Linux. The main reason is that the number of distros is huge and, after evaluating it, the build and packaging process is a pain. If you ask five different Linux people what the most important distro to support is, you'll get at least five different answers. Depending on the distro * distro version * kernel version * GPU vendor combination, the quality of OpenGL driver support varies wildly. The system testing complexity is through the roof compared to any other platform. And most importantly, since the market segment is smaller than Windows and OSX and there are no good marketing channels, I can't see the point in it. If you're already a Linux whiz who knows the different distros, kernels and drivers inside and out, perhaps you'll be able to pull off decent support for all the combinations with bearable/manageable pain, but for "normal" developers, I don't think it's worth it at all.

If some big player (Valve+Steam?) comes in and manages to unify the development pain (doubt it), then I'll definitely be reconsidering.

#4988080 Exiting a game by closing window.

Posted by clb on 08 October 2012 - 01:32 PM

If this is C++, you never explicitly call destructors of objects, unless you have allocated them using placement new (which I doubt in this case).

While the OS will clean up all dynamically allocated memory at exit, it is a *very* good idea to free up all resources by calling delete for each object you new, delete[] for each array you new[], free() for everything you malloc() and so on. This is because (unless your program is very, very simple) you don't know how many times different objects are allocated and removed, so you'll easily have a million memory leaks at runtime, and your app can easily run out of memory. Also, there are other types of resources that might not be automatically freed by the OS (sockets, file handles, temporarily allocated disk space, pipes, mutexes, other Win32 API handles, etc.), or their release may be delayed (in particular, abandoned sockets will linger and go into an unusable state for security reasons).

Some argue that calling exit(0) can be better instead of manually tearing down all application objects, but unless you're 500% sure that nothing described above applies in your case, I really really recommend avoiding it. Plus, not freeing up memory makes it impossible to debug whether you have memory leaks in the first place, e.g. you can't use the amazing VLD tool then.
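
One common way to make sure every new gets its delete (and every file handle its fclose) is RAII with std::unique_ptr. This is a different technique from calling delete by hand as described above, shown here only as a minimal sketch; Texture, FileCloser and RunLevel are hypothetical names for illustration.

```cpp
#include <cassert>
#include <cstdio>
#include <memory>

// Hypothetical game object that counts live instances so leaks are easy to spot.
struct Texture
{
    static int liveCount;
    Texture() { ++liveCount; }
    ~Texture() { --liveCount; }
};
int Texture::liveCount = 0;

// Custom deleter so that a FILE handle is closed with fclose() automatically.
struct FileCloser
{
    void operator()(std::FILE *f) const { if (f) std::fclose(f); }
};
typedef std::unique_ptr<std::FILE, FileCloser> FilePtr;

// Everything acquired here is released when the scope ends, also on early
// return or exception, without any explicit delete, delete[] or fclose call.
int RunLevel()
{
    std::unique_ptr<Texture> player(new Texture);      // released with delete
    std::unique_ptr<Texture[]> tiles(new Texture[16]); // released with delete[]
    FilePtr log(std::tmpfile());                       // released with fclose()

    assert(Texture::liveCount == 17); // all 17 objects are alive inside the level
    return Texture::liveCount;
}
```

After RunLevel() returns, Texture::liveCount is back to zero, which is the kind of invariant a leak checker can then verify at shutdown.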

#4988076 DirectX 11 States in HLSL or C++ D3DAPI?

Posted by clb on 08 October 2012 - 01:21 PM

Setting particular states is not interchangeable between HLSL and C++. This means that it's not possible to enable/disable/change e.g. the culling mode or other states in HLSL code; the draw calls are rendered with whatever rasterizer state you have enabled at the time the draw call is submitted.

You should call CreateSamplerState and CreateBlendState at load time, only once for each state set you want to use, and then reuse the state objects you created on each frame.

#4986440 How to check for frame buffer support

Posted by clb on 03 October 2012 - 10:17 AM

If you are using OpenGL version 3.0 or newer, the presence of framebuffer objects (FBOs) is guaranteed. See core profile 3.0. In this case, you can just call all the functions related to FBOs (glBindRenderbuffer etc.) found in that spec. You do not need to (and preferably should not) query any extensions; what you find in that PDF is guaranteed to work.

If you are using an OpenGL version older than 3.0, you'll have to detect whether some extension is present on the system that enables similar functionality. One such extension is GL_EXT_framebuffer_object. If you detect that extension, you can call all the functions specified in that extension (glBindRenderbufferEXT etc.).

You'll find that the OpenGL 3.0 FBO docs and the GL_EXT_framebuffer_object docs are very similar (== most probably identical, except for function naming), and in fact the OpenGL 3.0 FBO feature is a result of the GL_EXT_framebuffer_object extension being "accepted" to the core.

This is very much how Khronos/OpenGL rolls. They play this game for all features in OpenGL, so it's a common OpenGL "programming pattern" to detect functionality first by checking the OpenGL version, and then by checking whether one or more equivalent extensions exist. As a programmer, you'll need to be prepared to have codepaths for the "present in core" case as well as for each extension you want to support, in case there are functional differences.
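
The "version first, then extensions" pattern can be sketched without touching the GL API itself, assuming you have already fetched the strings returned by glGetString(GL_VERSION) and, on a pre-3.0 context, glGetString(GL_EXTENSIONS). The names FboSupport, ParseGLMajorVersion, HasExtension and DetectFboSupport are hypothetical, and the parser assumes a desktop GL version string that starts with the version number.

```cpp
#include <cassert>
#include <cstdio>
#include <sstream>
#include <string>

enum class FboSupport { None, Core, Ext };

// Parses the major version from a GL_VERSION string such as "2.1 Mesa 8.0".
// Returns 0 if the string does not begin with a number.
int ParseGLMajorVersion(const std::string &versionString)
{
    int major = 0;
    if (std::sscanf(versionString.c_str(), "%d", &major) != 1)
        return 0;
    return major;
}

// Returns true if the space-separated extension list contains exactly 'name'.
// Whole tokens are compared, so a longer name with the same prefix does not match.
bool HasExtension(const std::string &extensionList, const std::string &name)
{
    assert(!name.empty());
    std::istringstream stream(extensionList);
    std::string token;
    while (stream >> token)
        if (token == name)
            return true;
    return false;
}

// Decides which FBO codepath to use: core functions, EXT functions, or none.
FboSupport DetectFboSupport(const std::string &versionString, const std::string &extensionList)
{
    if (ParseGLMajorVersion(versionString) >= 3)
        return FboSupport::Core; // call glBindRenderbuffer etc.
    if (HasExtension(extensionList, "GL_EXT_framebuffer_object"))
        return FboSupport::Ext;  // call glBindRenderbufferEXT etc.
    return FboSupport::None;
}
```

Comparing whole tokens rather than searching for a substring avoids false positives when one extension name is a prefix of another.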

As for how to do this check in Java/jMonkeyEngine3/lwjgl, I don't know. Try to find if there's a function for checking the OpenGL version and then if there's a function for checking for extensions.

#4986416 How much math do I need to use directx?

Posted by clb on 03 October 2012 - 09:21 AM

You don't necessarily need to know the implementation details or derivation of half of the formulas, but you will need to understand conceptually how to use the math libraries, and how to use them to compute the proper transforms.

If you do only 2D, the math involved is simpler, but not necessarily by much. It depends a lot on what you are doing.

Here are some links to test yourself:
- Do you think you can understand a math library like this and how to utilize it e.g. to specify object positions, move them around, rotate and scale them?
- Are you familiar with the Direct3D pipeline stages? A lot of the math you feed into the device revolves around that architecture.
- Does the chain of linear spaces "local -> world -> view -> clip -> screen" sound familiar, and can you understand the concepts related to this?

If you feel comfortable with these, you'll be pretty well set math-wise on developing your own 3D engine. Math-wise there's not much else than basic calculus and linear algebra involved.
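
To make the "local -> world -> view -> clip -> screen" chain concrete, here is a minimal sketch that carries one point through it. Vec4, Mat4, Translation, SimplePerspective and LocalToScreen are hypothetical names, the matrices use the M*v convention, and the projection is deliberately simplified (it only sets up the divide by depth, with no field of view or near/far planes).

```cpp
#include <cassert>

struct Vec4 { float x, y, z, w; };

// 4x4 matrix in row-major storage, applied to column vectors as M * v.
struct Mat4 { float m[4][4]; };

Vec4 Transform(const Mat4 &a, const Vec4 &v)
{
    const float in[4] = { v.x, v.y, v.z, v.w };
    float out[4];
    for (int r = 0; r < 4; ++r)
        out[r] = a.m[r][0]*in[0] + a.m[r][1]*in[1] + a.m[r][2]*in[2] + a.m[r][3]*in[3];
    return Vec4{ out[0], out[1], out[2], out[3] };
}

Mat4 Translation(float tx, float ty, float tz)
{
    return Mat4{{ {1,0,0,tx}, {0,1,0,ty}, {0,0,1,tz}, {0,0,0,1} }};
}

// Simplified perspective: copies -z into w so the later divide by w shrinks
// distant points. A real projection matrix encodes much more than this.
Mat4 SimplePerspective()
{
    return Mat4{{ {1,0,0,0}, {0,1,0,0}, {0,0,1,0}, {0,0,-1,0} }};
}

struct Pixel { float x, y; };

// Carries a point through local -> world -> view -> clip -> screen.
Pixel LocalToScreen(const Vec4 &local, const Mat4 &world, const Mat4 &view,
                    const Mat4 &proj, float viewportWidth, float viewportHeight)
{
    const Vec4 clip = Transform(proj, Transform(view, Transform(world, local)));
    assert(clip.w != 0.0f);
    // Perspective divide: clip space -> normalized device coordinates in [-1, 1].
    const float ndcX = clip.x / clip.w;
    const float ndcY = clip.y / clip.w;
    // Viewport mapping: NDC -> pixels, with y flipped so (0,0) is the top-left corner.
    return Pixel{ (ndcX * 0.5f + 0.5f) * viewportWidth,
                  (1.0f - (ndcY * 0.5f + 0.5f)) * viewportHeight };
}
```

For example, a local point (0, 1, 0) on an object placed at x=1, seen by a camera sitting at z=5 and looking down the negative z axis (view = Translation(0, 0, -5)), lands at roughly pixel (480, 240) in an 800x600 viewport.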

#4986142 Which versions of OpenGL is not worth to learn anymore?

Posted by clb on 02 October 2012 - 12:39 PM

I recommend discarding anything older than OpenGL 3.0.

Have a look at the Core and Compatibility in Contexts page. In OpenGL 3.0, a lot of the silly, old-fashioned and slow stuff was marked deprecated. In OpenGL 3.0 through 3.2 they got a bit confused about how deprecation, feature removal etc. should be done, and (in the typical messy OpenGL fashion) they got it sorted out for good only in OpenGL 3.2. So, primarily go for OpenGL 3.2, and if that's not available, OpenGL 3.1 with forward compatibility, and if that's not available, OpenGL 3.0, but don't call any deprecated functions.

OpenGL 3.2 is actually quite modern, and no longer has all the crap it used to have. Also, GLES2 and WebGL are conceptually (almost) identical to OpenGL 3.2 API-wise, although feature/extension-wise they differ somewhat.

#4986066 Linear motion - 2 algorithms

Posted by clb on 02 October 2012 - 09:33 AM

have another question, how do i get X2, Y2 position depending of the given time, velocity, and distance? Do i need the slope of the line?

You do not need to compute the slope of the line. In the sentence 'depending on the given time, velocity, and distance' you have over-specified the problem. If we have an object at constant linear motion, the following formula holds:
// float2 is a 2D vector type with the usual arithmetic operators.
// time is the time since t=0. At t=0, the object is at position startingPos.
float2 PositionAt(float2 startingPos, float2 velocity, float time)
{
   return startingPos + time * velocity;
}

This is the same in 3D as well, just compute using 3D vectors.

#4986029 Learning D3D; having issues creating shaders

Posted by clb on 02 October 2012 - 05:40 AM

Programming is not a "play a psychic and guess the errors without checking" game!

1. D3DX11CompileFromFile returns an HRESULT value that tells the error reason if one occurred. Why make life hard on yourself (and the forum posters) and try to guess the error, when you can actually just ask what it was?

2. You ask D3DX11CompileFromFile to create you the blobs VS and PS. Why do you blindly trust it to do so? I don't think anyone loves Microsoft that much to believe their code is bug-free! You should check their end of the deal, namely, test that the function actually assigned VS and PS, and that they are non-null:
ID3D10Blob *VS = 0;
HRESULT hr = D3DX11CompileFromFile(L"shaders.hlsl", 0, 0, "VShader", "vs_5_0", 0, 0, 0, &VS, 0, 0);
if (FAILED(hr) || !VS)
{
   printf("D3DX11CompileFromFile failed to generate ID3D10Blob! HRESULT: 0x%08X\n", (unsigned int)hr);
   // Clean up and abort here.
}

3. Do you really trust Microsoft enough that even if the function manages to return you a valid blob, that it actually contains anything?
if (VS->GetBufferPointer() == 0 || VS->GetBufferSize() == 0)
{
   printf("Error: Generated shader blob was empty!\n");
   // Clean up and return here.
}

4. Do you trust the CreateVertex/PixelShader function to always succeed? It also returns an HRESULT.

5. Do you trust the outputted pVS/pPS pointers to be valid?

I recommend you first make your program print out "Error: The file shaders.hlsl was not found!" and fail gracefully, before actually trying to work out why the file fails to load. The file shaders.hlsl should contain your HLSL shader program code. You need to write it yourself (or copy it from whatever tutorial/sample you're following). When dealing with relative paths, using _getcwd to double-check the current working directory helps to diagnose where exactly the relative path is being resolved from.

#4986013 newbie question on glBufferData?

Posted by clb on 02 October 2012 - 04:00 AM

Btw, as a note, on Android, I was in the habit of using glBufferData to initially create a VBO, then using glBufferSubData in all subsequent calls to update the full contents of that VBO, e.g. for per-frame particles. What I noticed was that this was slower than just directly using glBufferData each frame to update the particles, combined with manual double-buffering of the VBOs.

#4985492 [Answered] 3x4 Matrix instead of 4x4

Posted by clb on 30 September 2012 - 03:20 PM

If you're using the M*v convention (as is typical with OpenGL), the last row stores parameters related to projection transforms. For affine transforms, the last row is always (0, 0, 0, 1). It's fully possible to use only float3x4, but one needs to be careful about the data layout. For genericity, float4x4 is often used, since as the most generic form it allows using the same matrix type for both projection transforms and other transforms.

In MathGeoLib, I have the class float3x4, which I use in my game when I explicitly want to specify an affine transform without projection, or as storage to save a few bits, or when I want to save a few cycles off the computations. Those are rather minor though, and therefore just using the same type float4x4 for all math often trumps the rest.
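
A minimal sketch of the idea, assuming the M*v convention: the fourth row (0, 0, 0, 1) is never stored, and its effect is folded into the code. Float3 and Affine3x4 are hypothetical types written for this example; they are not MathGeoLib's actual classes.

```cpp
#include <cassert>

struct Float3 { float x, y, z; };

// Affine transform stored as 3 rows of 4 floats (M*v convention).
// The implicit fourth row is (0, 0, 0, 1), so it does not need to be stored.
struct Affine3x4
{
    float m[3][4];

    // Transforms a position: the implicit w=1 picks up the translation column.
    Float3 TransformPos(const Float3 &p) const
    {
        return Float3{ m[0][0]*p.x + m[0][1]*p.y + m[0][2]*p.z + m[0][3],
                       m[1][0]*p.x + m[1][1]*p.y + m[1][2]*p.z + m[1][3],
                       m[2][0]*p.x + m[2][1]*p.y + m[2][2]*p.z + m[2][3] };
    }

    // Transforms a direction: the implicit w=0 ignores the translation column.
    Float3 TransformDir(const Float3 &d) const
    {
        return Float3{ m[0][0]*d.x + m[0][1]*d.y + m[0][2]*d.z,
                       m[1][0]*d.x + m[1][1]*d.y + m[1][2]*d.z,
                       m[2][0]*d.x + m[2][1]*d.y + m[2][2]*d.z };
    }
};

// Usage: a 90-degree rotation about Z combined with a translation by (10, 20, 30).
void ExampleUsage()
{
    const Affine3x4 t = {{ {0,-1,0,10}, {1,0,0,20}, {0,0,1,30} }};
    const Float3 p = t.TransformPos(Float3{1, 0, 0});
    assert(p.x == 10 && p.y == 21 && p.z == 30);
}
```

Compared to a full 4x4 multiply, each transformed point skips the four multiplications and three additions of the last row, and the matrix takes 12 floats instead of 16.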

#4985488 newbie question on glBufferData?

Posted by clb on 30 September 2012 - 03:14 PM

glBufferData doesn't "queue up" the data you pass in, but specifies the full data contents. So, to specify a single triangle in a VBO, do
Vec3 points[3] = { pointA, pointB, pointC };
glBufferData(GL_ARRAY_BUFFER, sizeof(Vec3)*3, points, GL_STATIC_DRAW);

(Note, however, that you never want to draw just a single triangle from a VBO; you want to batch as many of them into a single array as possible for best performance.)

#4985280 [Answered] Good math library for OpenGL

Posted by clb on 30 September 2012 - 01:19 AM

MathGeoLib has all of the things you mention. Also, you can find other math libraries listed here.

#4983924 Math API performance: saving CPU cycles?

Posted by clb on 26 September 2012 - 03:14 AM

Answering the original question..

For example, it's my understanding than multiplication is slightly faster than division, and saves a few CPU cycles here and there

Is this correct/true, and should I be doing it this way? And what other optimizations might I use in general to make my math code blazing fast and efficient?

Assuming that the SSE instruction set is used instead of the old x87 FPU stack, a single-precision float scalar division (DIVSS instruction) has a latency of 14-32 cycles and a processing time of 14-32 cycles, depending on the architecture. Double-precision float scalar division (DIVSD) has a latency of 22-39 cycles and a processing time of 20-39 cycles.

Compare to multiplication: a single-precision float scalar multiplication (MULSS) has a latency of 4-7 cycles and a processing time of 1-2 cycles, and double-precision scalar multiplication (MULSD) has a latency of 5-7 cycles and a processing time of 1-2 cycles.

The figures were taken from the Intel Intrinsics Guide.

So, multiplication is about 20 times faster (assuming perfectly pipelined instructions).

I'm ignoring here the fact that you're not using C/C++ and direct SSE asm/intrinsics, and instead use C#, but the point is that 'yes, division is considerably slower *for the CPU* to execute even on modern CPUs than multiplication'. Whether that can be seen in C# execution environment, is then a matter of profiling.

MathGeoLib uses this 'multiplication by inverse' form, as do most of the game math libraries I've seen as well. Note that x / s versus x * (1/s) are not arithmetically identical, since first computing the inverse as a float and multiplying by it does lose some precision.
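
The precision difference is easy to observe with a small experiment. The sketch below (Comparison and CompareDivisionForms are hypothetical names) counts for how many integers x in 1..n the two forms give different floats for a divisor s, and tracks the largest relative difference seen. It assumes plain IEEE single-precision evaluation, i.e. no fast-math style compiler flags.

```cpp
#include <cassert>
#include <cmath>

struct Comparison
{
    int mismatches;          // How many inputs gave different results.
    double maxRelativeError; // Largest relative difference between the two forms.
};

// Compares x / s against x * (1 / s) for x = 1, 2, ..., n.
Comparison CompareDivisionForms(float s, int n)
{
    assert(s != 0.0f && n > 0);

    Comparison result = { 0, 0.0 };
    const float inv = 1.0f / s; // the single division, rounded once
    for (int i = 1; i <= n; ++i)
    {
        const float x = (float)i;
        const float divided = x / s;      // one rounding
        const float multiplied = x * inv; // second rounding on top of the rounded inverse
        if (divided != multiplied)
        {
            ++result.mismatches;
            const double rel = std::fabs((double)multiplied - (double)divided)
                             / std::fabs((double)divided);
            if (rel > result.maxRelativeError)
                result.maxRelativeError = rel;
        }
    }
    return result;
}
```

For a power-of-two divisor such as 4.0f the inverse is exact and the two forms always agree. For a divisor like 3.0f some inputs differ in the last bit (for instance 5.0f), but the relative difference stays on the order of one float ulp.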

And what other optimizations might I use in general to make my math code blazing fast and efficient?

It should be noted that in C/C++ both a single (non-inlined) function call and a (mispredicted) 'if' statement can be slower than performing a single division. However, again, in the context of C#, I recommend profiling your real application hotspot to see what kind of effect these have, since that's quite a different context than low-level C code at the assembly/intrinsic level.

#4982592 Rotating Hitboxes, images and more

Posted by clb on 22 September 2012 - 12:28 AM

It is a standard technique/feature that you can render your 2D sprite rotated to an arbitrary angle in realtime. They just use a rotation matrix that they apply to the sprite rectangle vertices when rendering, and the GPU deals with the rotation and filtering in realtime, no problem. Rendering a sprite axis-aligned versus rendering it at an arbitrary angle does not even carry a performance penalty; it's the same performance for the GPU. As for the assets, you only need a single animation sequence for the effect in the axis-aligned position. Googling for "2d rotating a sprite" finds some good hits, e.g. this is an example of how to do it in SDL.

For hitboxes, the case is the same. They have probably defined a rectangle or a polygon that marks the hit area of the effect. This vector shape is rotated to the appropriate angle from its default axis-aligned orientation before testing for collision against the vector shapes of the other objects. Since only very few points are needed to represent such a shape (perhaps 10 at most?), it's very cheap and could be done in realtime without performance implications even on a mobile device.
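
Rotating such a hit shape amounts to applying a 2D rotation to each vertex. A minimal sketch, with Point2 and RotateHitbox as hypothetical names; the collision test itself (e.g. polygon against polygon) is not shown and would run on the rotated vertices.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

struct Point2 { float x, y; };

// Returns a copy of 'shape' rotated by angleRadians around 'pivot'.
// The rotation is counter-clockwise when the y axis points up; in a y-down
// screen coordinate system the same math appears as a clockwise rotation.
std::vector<Point2> RotateHitbox(const std::vector<Point2> &shape, Point2 pivot, float angleRadians)
{
    assert(shape.size() >= 3); // a hit area needs at least a triangle

    const float c = std::cos(angleRadians);
    const float s = std::sin(angleRadians);

    std::vector<Point2> rotated;
    rotated.reserve(shape.size());
    for (const Point2 &p : shape)
    {
        // Move the pivot to the origin, rotate, then move back.
        const float dx = p.x - pivot.x;
        const float dy = p.y - pivot.y;
        rotated.push_back(Point2{ pivot.x + dx*c - dy*s, pivot.y + dx*s + dy*c });
    }
    return rotated;
}
```

The cost is one sine, one cosine and four multiplications per vertex, which is why doing it every frame for a shape with a handful of points is negligible.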