clb

Member Since 22 May 2004

#5158908 glDrawElements invalid operation

Posted by clb on 07 June 2014 - 08:00 AM

Try adding an assert(vao && numIndices >= 0 && numIndices % 3 == 0); in 'void Mesh::Draw()'. If everything is drawing OK but the error is being spammed, my guess is that one or two Meshes in the scene have a null vao or something similar.
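
For concreteness, here is a minimal sketch of where that assert would go. The Mesh layout, the GLEW include and the GL_UNSIGNED_INT index type are assumptions about your code, not something taken from it:

#include <cassert>
#include <GL/glew.h> // Or whichever GL loader the project already uses.

struct Mesh
{
    GLuint vao = 0;         // Assumed member names, matching the ones discussed above.
    GLsizei numIndices = 0;
    void Draw();
};

void Mesh::Draw()
{
    // Trip a breakpoint on the broken mesh instead of letting GL spam the error.
    assert(vao && numIndices >= 0 && numIndices % 3 == 0);

    glBindVertexArray(vao);
    glDrawElements(GL_TRIANGLES, numIndices, GL_UNSIGNED_INT, nullptr);
}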

 

If that doesn't help, I'd spin up AMD CodeXL (it works on all GPUs, not just AMD's) and take a trace of the GL calls with it. It will show the call stream clearly, together with the state of the GL context at the time of the error.




#5158900 glFlush/Finish Uses

Posted by clb on 07 June 2014 - 07:02 AM

The short answer is: never.

 

A bit longer answer: all GL functions are properly guarded to ensure that if you e.g. call glReadPixels or swap buffers, you will always get the data back only after all previous operations have finished. However, in some environments you have access to companion APIs (e.g. with EGL) that integrate OpenGL with some other environment, which could be a very low-level compositing library provided by the platform, or some other GPU-accelerated API. While the built-in synchronization in GL properly guards all access to the data via the GL functions themselves, it often cannot do so across different APIs or when low-level system access is performed. That is the reason the GL specification provides the glFlush() and glFinish() commands: to be able to synchronize across APIs. It is quite rare for game developers to need either, so the normal approach in a game is to avoid flushing or finishing at all; they just stall the CPU for nothing.
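
As a hedged illustration of that cross-API case, the two helper functions below are hypothetical placeholders (they stand in for "your frame" and for some non-GL consumer such as a platform compositing or video API), not real API calls:

#include <GL/gl.h>

void RenderFrameWithGL();       // Hypothetical: the GL rendering of one frame.
void HandResultToExternalApi(); // Hypothetical: a non-GL consumer of the result.

void PresentThroughExternalApi()
{
    RenderFrameWithGL();
    // GL commands are already ordered against each other; the finish is only
    // needed because the consumer below is outside GL's view of the world.
    glFinish();
    HandResultToExternalApi();
}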

 

Now, sometimes people argue that it improves performance to flush right after you have issued the last rendering command of the frame. That sounds like bs, or a bad driver in action. Neither glFlush nor glFinish is a performance primitive.

 

Also, sometimes people do a glFinish() right before starting a GPU micro-benchmark, to guarantee that the GPU is idle when the benchmark starts. That's probably the only semi-decent use of a finish that I've heard of.
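
A rough sketch of that micro-benchmark pattern; IssueBenchmarkedGLWork is a hypothetical stand-in for whatever GL work is being measured:

#include <chrono>
#include <cstdio>
#include <GL/gl.h>

void IssueBenchmarkedGLWork(); // Hypothetical: the GL calls being measured.

void RunMicroBenchmark()
{
    glFinish(); // Drain previously queued GPU work so it doesn't pollute the timing.

    auto start = std::chrono::high_resolution_clock::now();
    IssueBenchmarkedGLWork();
    glFinish(); // Wait until the measured work has actually completed on the GPU.
    auto end = std::chrono::high_resolution_clock::now();

    std::printf("GPU time: %.3f ms\n",
                std::chrono::duration<double, std::milli>(end - start).count());
}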




#5158485 Determining if a 3D point is in a 3D polygon

Posted by clb on 05 June 2014 - 01:01 PM

One needs to be very careful with the definitions. Usually a polyhedron is a closed volume of space defined by polygonal faces, which subdivides space into two disjoint regions ("inside" and "outside"). Navigation meshes usually aren't like this - they don't form a closed volume; instead they are just a connected set of planar polygonal faces in 3D space.

 

To keep a point-polygon containment test numerically robust, one usual solution is to compute the closest point on such a polygon list to the target point, then compare the distance from that closest point to the target point against a threshold value. Another way is to imagine that the polygons have a "thickness" and get extruded in both the positive and negative directions by that amount for the purposes of the test. This is what MathGeoLib does in the link I posted above.
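
As a sketch of the distance-threshold idea (the float3 type and the ClosestPointOnNavMesh helper are illustrative placeholders, not MathGeoLib's actual API):

#include <cmath>

struct float3 { float x, y, z; };

float Distance(const float3 &a, const float3 &b)
{
    float dx = a.x - b.x, dy = a.y - b.y, dz = a.z - b.z;
    return std::sqrt(dx*dx + dy*dy + dz*dz);
}

// Hypothetical: returns the point of the navmesh polygon set that is closest to p.
float3 ClosestPointOnNavMesh(const float3 &p);

// The point counts as "contained" if it lies within 'thickness' of the surface,
// i.e. as if each polygon were extruded by that amount in both directions.
bool NavMeshContains(const float3 &p, float thickness)
{
    return Distance(ClosestPointOnNavMesh(p), p) <= thickness;
}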




#5158479 Why do large engines use their own string class

Posted by clb on 05 June 2014 - 12:53 PM

In addition to the points mentioned above, I develop my own string class mostly because the std:: API is just plain stupid. I don't want to write over-verbose iterator madness or things like std::string::npos to operate the API. You can get a very good feel for how obtuse it is to work with when you look at the most-voted StackOverflow questions related to std::string: http://stackoverflow.com/questions/tagged/stdstring?sort=votes&pageSize=15 . With my own string class, I can make the functionality work as comfortably as I like. Compared to C#, JavaScript or Python, C++ has probably the weakest string API in existence.
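
To illustrate the verbosity point with a mundane example (a sketch, not anyone's production code): even a simple Split needs find()/npos bookkeeping with std::string, which a purpose-built string class can hide behind a single member function.

#include <string>
#include <vector>

std::vector<std::string> Split(const std::string &s, char delim)
{
    std::vector<std::string> parts;
    std::string::size_type start = 0, end;
    while ((end = s.find(delim, start)) != std::string::npos)
    {
        parts.push_back(s.substr(start, end - start));
        start = end + 1;
    }
    parts.push_back(s.substr(start)); // Trailing piece after the last delimiter.
    return parts;
}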

 

C++11 improves handling of the different Unicode formats, but working with UTF-8/UTF-16/UTF-32 sources with just std::string and std::wstring is a pain (see -fshort-wchar et al.). Having my own class with explicit awareness of the different encodings helps in my case.

 

Of course, when you work on something like this, it should be accompanied by a good unit-testing battery, and the development will invariably take up time. It's not an "I'll just get it done in a weekend" type of project.




#5157986 Metal API .... whait what

Posted by clb on 03 June 2014 - 08:02 PM

This is an extremely funny and interesting development. After GDC this year, I would not have dreamed that in the AMD Mantle vs. MS D3D12 race, the first close-to-the-metal rendering API docs I'd be reading would be Apple's. And by the looks of it, there's a chance that Metal will be the first of the three to ship to public developers in a stable form. Some thoughts:

 

I was glad to see how it moves from the core GLES2 spec towards D3D11-esque abstractions:

  • immutable state objects
  • no VAO-like abstraction, but a proper separation of vertex format declaration and vertex data, like D3D10+ has.
  • no shader program objects that cause a vertex x fragment shader combinatorial explosion; vertex and fragment shaders are cleanly settable separately, like D3D8+ has.
  • by the looks of it, shader data binds to the device and not to shaders/programs the way GL and GLES have it (which is a completely backwards and silly design).
  • based on UBOs and sampler objects, like D3D10+.
  • proper support for offline shader compilation, like D3D9+. Runtime compilation from strings still exists like in GL, so they didn't go the strict route that MS did on WinRT.
  • resource views in the form of MTLBuffer and MTLTexture, with D3D10+-like resource view semantics.

 

Stuff that's there that D3D11 doesn't have:

  • Very nice machinery for starting and finishing rendering to a framebuffer, which makes the buffer discard, preserve and resolve semantics explicit. It also makes it explicit that FBO switches are the most expensive state changes tiled GPUs have, so it's easy to write code that takes that heaviness into account.
  • Command buffers and queues. As noted above, I was surprised that command buffers are transient (per-frame creatable, single-use) and not permanent and offline-creatable.

 

It feels like MJP said: the upgrade is largely a jump from GLES2 to an API that is conceptually much more like D3D11; there isn't a great amount of enhancement on top of that.

 

Things that left me wondering:

  • Will it be available from C/C++ code? The examples are all in Obj-C, but perhaps they will provide a C header for access from C code?
  • What is with the 16-bytes-of-color-data-per-sample restriction on framebuffers? A single 4xFloat32 target already consumes all of that, and it's not possible to add more render targets after it. That is very low and limits deferred rendering a lot, something I'd think the latest-gen hardware is already very capable of.
  • Where are their versioning/extension/feature query APIs? They don't seem to have any in place. Perhaps they'll solve that when the first update comes, but I'd hope they develop something better than the current situation, where the restrictions are codified only as arbitrary numbers in the documentation. I wish for something that does not resemble the poor GL extension mechanism, but is instead more direct (e.g. glGetIntegerv(GL_COMPRESSED_TEXTURE_FORMATS)) or more global (e.g. D3D feature levels), and not a confusing, indirect "play spec-lawyer across all spec versions and extension registries" type of game. Since they can dictate the platform, I gladly imagine that won't be the case.
  • What error reporting machinery do they have? I hope it's something explicit like the D3D debug runtime (exact errors to the console plus a coarse error enum as a return value) and not a GL-like METAL_ERROR_INVALID_CALL generic guess-what-happened business.

 

Overall, like Promit mentioned, I'm also very glad to see this development. It's a great shakeup, the race is on, and even if in the short term we'll have multiple players, the weakest will eventually drop out. Since the performance+programmability tradeoff is the technical merit being judged this time, there's a good chance that OpenGL will take a good blow and have to reform itself properly, and not with the half-baked AZDO extensions they're coming up with. I really hope that *each* of these succeeds, and very well, to the point that Google feels threatened on their GLES3-only Android platform, which would pave the way towards a future where OpenGL as a technology either gets completely rewritten or is sunset as a legacy technology. One can always dream ;)




#5149070 Octree for object culling

Posted by clb on 23 April 2014 - 04:43 PM

A big thing that strikes me here is the variable bool Node::Visible. There should not be such a cached visibility flag stored on a node. It might be the case that for now you only make single-camera queries for rendering purposes, but later on you could have game-logic-specific code making queries against the same acceleration structure, i.e. using the structure for raycasting queries, or "give me all objects inside this bounding sphere/box", and so on. It doesn't make sense to hardcode the octree to contain state pertaining to the world camera.

 

Instead, I'd make a separate QueryResult object, which contains a vector/hash table of collected visible nodes. Then, when you are about to render:

 

QueryResult visibleObjects;
octree.FindVisibleObjects(cameraFrustum, &visibleObjects);
std::sort(visibleObjects.begin(), visibleObjects.end(), sceneObjectsSortPredicate);
for (Object *object : visibleObjects)
   Render(object);

 

This collecting pass over the scene objects allows you to sort, which is an important optimization: it gives you the ability to sort front-to-back for depth, to group objects that use the same material/shader/texture, and so on.

 

At minimum, if you need to keep that boolean, either strongly document that bool Node::Visible refers to the most recent query (effectively making it a cache of the most recent visibility query), or rename it to bool Node::VisibleToMainCamera; to signify that the boolean refers to whether the node is visible to the one specific god camera through which the player views the scene.

 

Then, I'd replace Node *children[8]; with a Node *children;, which I'd allocate with a children = new Node[8]; statement. Also, remove the IsLeaf boolean and replace it with a function bool IsLeaf() const { return children == nullptr; }
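
In other words, something along these lines (a minimal sketch; Object and AABB are placeholder types here):

#include <vector>

struct Object; // Whatever the scene stores per renderable.
struct AABB { float minX, minY, minZ, maxX, maxY, maxZ; };

struct Node
{
    std::vector<Object*> objects;
    Node *children = nullptr; // Array of 8 children allocated with new Node[8], or null.

    bool IsLeaf() const { return children == nullptr; }
};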

 

Also, Octree::AddObject makes 9 calls to a generic IntersectAABBAABB function. That is a bit excessive; instead I'd do something like:

 

void Octree::AddObject(Node *node, Object *object, const AABB &objectAABB)
{
   vec center = node->CenterPoint();
   if (objectAABB.maxX < center.x)
   {
      // Fully in the left half.
      if (objectAABB.maxY < center.y)
      {
         // Fully in the top half.
         // test z ...
      }
      else if (objectAABB.minY > center.y)
      {
         // Fully in the bottom half.
         // test z ...
      }
      // else: the object straddles the y split plane.
   }
   else if (objectAABB.minX > center.x)
   {
      // Fully in the right half.
      // test y and z.
   }
   // else: the object straddles the x split plane.
}

void Octree::AddObject(Node *node, Object *object)
{
   AddObject(node, object, ComputeObjectAABB(object));
}
 

 

This form uses a single set of efficient comparisons to place the object's AABB in the proper octant, and it also identifies the case where the AABB straddles multiple octants. As was suggested above, there are different ways to handle that:

  • support having objects up the tree, and not just in the leaves: an object is placed in the bottommost node it fully fits in (a sketch of this option follows after this list).
  • add the object to multiple children.
  • add it to a fixed child, e.g. with the top-left rule presented above; then, when querying, also check the top-left neighbors.
  • make the octree what is known as a "loose octree", where neighboring octree nodes overlap in volume.

There is no single best way to handle the straddling issue; it's a bit of a "whatever works best" decision in my experience.
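
Here is the promised sketch of the first option ("objects may live in inner nodes"), using the illustrative Node/Object/AABB types from the snippet above. The OctantFullyContaining helper is hypothetical and would be built from the same comparisons as the AddObject sketch:

// Returns the index [0,7] of the child octant that fully contains objectAABB,
// or -1 if the AABB straddles a split plane. (Hypothetical helper.)
int OctantFullyContaining(const Node *node, const AABB &objectAABB);

void Insert(Node *node, Object *object, const AABB &objectAABB)
{
    if (!node->IsLeaf())
    {
        int octant = OctantFullyContaining(node, objectAABB);
        if (octant >= 0)
        {
            Insert(&node->children[octant], object, objectAABB);
            return;
        }
    }
    // The object straddles the split planes (or this is a leaf): keep it here.
    node->objects.push_back(object);
}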




#5024654 Standard expansion of DXTn colors?

Posted by clb on 23 January 2013 - 03:46 AM

The way K-bit-wide fixed-width channels with unsigned integer values in [0, 2^K-1] (UNORM types) are interpreted is standard: the unsigned number k in the range [0, 2^K-1] represents the rational number k / (2^K - 1).

 

This means that a 5-bit channel can represent the rationals 0/31, 1/31, 2/31, 3/31, ..., 30/31, 31/31 = 1. An 8-bit color channel can represent the rationals 0/255, 1/255, ..., 254/255, 255/255 = 1.

 

 

When encoding a rational represented as a floating-point number into an unsigned integer, we (usually) pick the integer that is nearest to the floating-point number in question, since that minimizes the rounding error introduced.
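
For an 8-bit channel, that nearest-integer encoding boils down to the following (a minimal sketch):

#include <cmath>
#include <cstdint>

// Encode a real value x in [0,1] as an 8-bit UNORM by rounding to the nearest integer.
uint8_t EncodeUnorm8(double x)
{
    return (uint8_t)std::floor(x * 255.0 + 0.5);
}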

 

 

Converting the other way, from a 5-bit or an 8-bit integer to floating point, is lossless in the sense that distinct integers map to distinct floats (though values like 1/31 are not exactly representable), since floats have more than enough precision to distinguish them.

 

The method of "expanding bits" you describe is an optimization that does not have a mathematical basis, and it introduces an error.

 

Here is a small code snippet that converts colors encoded as 5-bit UNORM to floating point and then to an 8-bit representation, as well as directly from 5-bit to 8-bit using the approximation you describe:

 

for(int c = 0; c <= 31; ++c)
{
   double d = c / 31.0; // Convert the 5-bit UNORM color to its double representation.
   int u = (int)(d * 255.0); // Convert the double to an 8-bit UNORM encoding (the integer conversion truncates).
   int u2 = (c << 3) | (c >> 2); // Approximate conversion from 5-bit directly to 8-bit by bit replication.
   printf("5-bit:%3d as UNORM:%.05g Stored as 8-bit:%3d approx 5-bit->8-bit: %3d\n", c, d, u, u2);
}

 

The output for that is:

 

5-bit: 0 as UNORM:0 Stored as 8-bit: 0 approx 5-bit->8-bit: 0
5-bit: 1 as UNORM:0.032258 Stored as 8-bit: 8 approx 5-bit->8-bit: 8
5-bit: 2 as UNORM:0.064516 Stored as 8-bit: 16 approx 5-bit->8-bit: 16
5-bit: 3 as UNORM:0.096774 Stored as 8-bit: 24 approx 5-bit->8-bit: 24
5-bit: 4 as UNORM:0.12903 Stored as 8-bit: 32 approx 5-bit->8-bit: 33
5-bit: 5 as UNORM:0.16129 Stored as 8-bit: 41 approx 5-bit->8-bit: 41
5-bit: 6 as UNORM:0.19355 Stored as 8-bit: 49 approx 5-bit->8-bit: 49
5-bit: 7 as UNORM:0.22581 Stored as 8-bit: 57 approx 5-bit->8-bit: 57
5-bit: 8 as UNORM:0.25806 Stored as 8-bit: 65 approx 5-bit->8-bit: 66
5-bit: 9 as UNORM:0.29032 Stored as 8-bit: 74 approx 5-bit->8-bit: 74
5-bit: 10 as UNORM:0.32258 Stored as 8-bit: 82 approx 5-bit->8-bit: 82
5-bit: 11 as UNORM:0.35484 Stored as 8-bit: 90 approx 5-bit->8-bit: 90
5-bit: 12 as UNORM:0.3871 Stored as 8-bit: 98 approx 5-bit->8-bit: 99
5-bit: 13 as UNORM:0.41935 Stored as 8-bit:106 approx 5-bit->8-bit: 107
5-bit: 14 as UNORM:0.45161 Stored as 8-bit:115 approx 5-bit->8-bit: 115
5-bit: 15 as UNORM:0.48387 Stored as 8-bit:123 approx 5-bit->8-bit: 123
5-bit: 16 as UNORM:0.51613 Stored as 8-bit:131 approx 5-bit->8-bit: 132
5-bit: 17 as UNORM:0.54839 Stored as 8-bit:139 approx 5-bit->8-bit: 140
5-bit: 18 as UNORM:0.58065 Stored as 8-bit:148 approx 5-bit->8-bit: 148
5-bit: 19 as UNORM:0.6129 Stored as 8-bit:156 approx 5-bit->8-bit: 156
5-bit: 20 as UNORM:0.64516 Stored as 8-bit:164 approx 5-bit->8-bit: 165
5-bit: 21 as UNORM:0.67742 Stored as 8-bit:172 approx 5-bit->8-bit: 173
5-bit: 22 as UNORM:0.70968 Stored as 8-bit:180 approx 5-bit->8-bit: 181
5-bit: 23 as UNORM:0.74194 Stored as 8-bit:189 approx 5-bit->8-bit: 189
5-bit: 24 as UNORM:0.77419 Stored as 8-bit:197 approx 5-bit->8-bit: 198
5-bit: 25 as UNORM:0.80645 Stored as 8-bit:205 approx 5-bit->8-bit: 206
5-bit: 26 as UNORM:0.83871 Stored as 8-bit:213 approx 5-bit->8-bit: 214
5-bit: 27 as UNORM:0.87097 Stored as 8-bit:222 approx 5-bit->8-bit: 222
5-bit: 28 as UNORM:0.90323 Stored as 8-bit:230 approx 5-bit->8-bit: 231
5-bit: 29 as UNORM:0.93548 Stored as 8-bit:238 approx 5-bit->8-bit: 239
5-bit: 30 as UNORM:0.96774 Stored as 8-bit:246 approx 5-bit->8-bit: 247
5-bit: 31 as UNORM:1 Stored as 8-bit:255 approx 5-bit->8-bit: 255

 

I can't offhand find a page on MSDN that describes this, but you can find the same interpretation in the OpenGL specification PDFs, where it is explained with formal rigor.




#5008557 Try Angelscript live!

Posted by clb on 08 December 2012 - 12:48 PM

Hi,

here's something cool I wrote up this weekend. With a modern, WebGL-enabled browser, visit this page: MathGeoLib live test site. It is a project consisting of a few parts:
  • Uses my native C++ graphics rendering engine gfxapi. Utilizes the emscripten compiler to deploy the application to HTML5.
  • Integrates Angelscript as the live scripting engine.
  • Uses my MathGeoLib library for scripting primitives and geometry manipulation.
  • Interop between C++ MathGeoLib classes and Angelscript is achieved using an automatic bindings generator based on juj/CodeStructure.
The application allows writing Angelscript live on a web page and rendering stuff on the screen using constructs from MathGeoLib and gfxapi. I really must congratulate Andreas Jönsson on the Angelscript project - it is a very fine piece of technology! I have previously written C++-to-script interop bindings for MathGeoLib for QtScript and JavaScript, and Angelscript is the most mature, easiest and most convenient of them to work with! I evaluated Python, Lua and Mono to replace my previous JavaScript-based scripting engines, but Angelscript is the king of the hill. The support for strong typing and the ability to do value types with function and operator overloading is exactly what is needed for good interop and convenient game scripting, and something where the whole competition falls short :)


#5008441 C11 initializer lists not working in visual studio 2010/2012

Posted by clb on 08 December 2012 - 03:10 AM

Can one use the VC11 IDE with the GCC or other compilers?


See the vs-tool plugin in my signature if you want to try using MinGW or Clang from Visual Studio. There are also other plugins like it. Note however that the plugin is very experimental at this stage, so it may require some hacking and tweaking. Here's an example of what it looks like in action: https://dl.dropbox.com/u/40949268/code/vs-mingw.png


#5008372 C11 initializer lists not working in visual studio 2010/2012

Posted by clb on 07 December 2012 - 10:55 PM

Neither the VC10 nor the VC11 compiler supports initializer lists; see http://msdn.microsoft.com/en-us/library/vstudio/hh567368.aspx

Your choice is probably quite straightforward:
- Rework the code not to use C++11 initializer lists, or
- Switch to another compiler that does support them. GCC has had initializer lists since GCC 4.4: http://gcc.gnu.org/projects/cxx0x.html . Clang has supported initializer lists since Clang 3.1: http://clang.llvm.org/cxx_status.html . The Intel C++ compiler advertises partial support for initializer lists in 13.0 (I don't know what that means in practice): http://software.intel.com/en-us/articles/c0x-features-supported-by-intel-c-compiler .


#4998136 Moving a projectile from an angle.

Posted by clb on 06 November 2012 - 12:19 PM

The general update mechanism for a 2D point moving with constant velocity along a given aim angle looks something like
float newX = x + cos(angle) * velocity * deltaTime;
float newY = y + sin(angle) * velocity * deltaTime;
The angle is expressed in radians, velocity is the speed of the projectile, and deltaTime is the amount of time that has passed since the last update.
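
Put together into a tiny self-contained sketch (the Projectile struct and Update function are illustrative names, not from your code):

#include <cmath>

struct Projectile { float x, y, angle, velocity; };

// Advance the projectile by deltaTime seconds along its aim angle (in radians).
void Update(Projectile &p, float deltaTime)
{
    p.x += std::cos(p.angle) * p.velocity * deltaTime;
    p.y += std::sin(p.angle) * p.velocity * deltaTime;
}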


#4996559 Sums of uniform randoms

Posted by clb on 02 November 2012 - 09:36 AM

If you only have an API that allows you to generate uniform bits (0 or 1), you can generate a random number in the range [0, N] in the following way:
int uniform(); // Returns 0 or 1 uniformly at random.

uint32_t RandomNumber(int N) // Returns a random integer in [0, N].
{
   // NPow2 = largest power of two smaller than or equal to N.
   uint32_t NPow2 = 1;
   while ((NPow2 << 1) <= (uint32_t)N)
      NPow2 <<= 1;
   uint32_t i = 1;
   uint32_t result = 0;
   while (i <= NPow2)
   {
      if (uniform())
         result |= i;
      i <<= 1;
   }
   return (uint32_t)((uint64_t)result * (N + 1) / (NPow2 << 1));
}

However, this is most likely not the route you want to take, unless you somehow have a very special setup where your RNG can only supply individual random bits. That sounds rare; most RNG sources can directly produce random integers in a range [a,b] and random floats in the range [0,1] (which corresponds to roughly 23 bits of randomness).
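
For example, getting a float in [0,1) out of a full-width random integer uses 23 of its bits for the mantissa; Random32 here is a hypothetical RNG returning 32 uniformly random bits:

#include <cstdint>

uint32_t Random32(); // Hypothetical: 32 uniformly random bits.

// Produce a uniform float in [0,1) from 23 random bits.
float RandomFloat01()
{
    return (Random32() >> 9) * (1.0f / 8388608.0f); // 8388608 == 2^23
}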


#4995363 which 3D engine is now available for Windows 8 Metro App?

Posted by clb on 30 October 2012 - 05:42 AM

The Unity 3D engine advertises support for Windows 8 Store applications starting from Unity 4. Version 4 has not yet been released, but will apparently be out very soon.


#4990748 Link to a Game development themed math primer?

Posted by clb on 16 October 2012 - 09:10 AM

My favorite references and go-to authors for game maths are Christer Ericson, David Eberly, Eric Lengyel and Wolfgang Engel. Also see here. In particular, I've enjoyed Mathematics for 3D Game Programming and Computer Graphics by Eric Lengyel.


#4989027 AI for a simple shedding card game

Posted by clb on 11 October 2012 - 02:37 AM

Yeah, even with Monte Carlo it's customary to maintain a game tree. Instead of exploring the tree fully as in traditional minimax, the tree is expanded only selectively, into areas chosen based on the Monte Carlo simulations. When the tree search hits a leaf node (the moves after that node are not tracked by the tree), the game is continued with random play, i.e. a random playout to the end of the game is performed.
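
As a hedged sketch of just the random-playout step (the GameState type and its members are illustrative, not any particular engine's API): once the selective search reaches a leaf, the rest of the game is simply played out with random moves and the resulting score is what gets backed up the tree.

#include <cstdlib>
#include <vector>

struct Move {};

struct GameState
{
    bool IsFinished() const;
    std::vector<Move> LegalMoves() const;
    void Apply(const Move &m);
    double ScoreForPlayer(int player) const; // e.g. 1 = win, 0 = loss.
};

// Finish the game from 'state' with uniformly random moves and return the
// resulting score from 'player's point of view.
double RandomPlayout(GameState state, int player)
{
    while (!state.IsFinished())
    {
        std::vector<Move> moves = state.LegalMoves();
        state.Apply(moves[std::rand() % moves.size()]);
    }
    return state.ScoreForPlayer(player);
}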



