

Member Since 14 Feb 2007

#5058853 New professional programming language

Posted by Hodgman on 03 May 2013 - 03:10 AM

// In this example the first 4 squares will be filled in parallel and the final one will not happen until
// they are all complete because it overlaps them all

I find that feature very interesting! Does this basically mean that, when scheduling that parallel block, you can determine every byte that the function calls in the block could possibly touch? That seems like a tricky thing for a compiler to figure out; how robust is it? Can you fool it with pointer math or too much indirection?

How do you wait on these parallel jobs to complete? Does the closing brace act as a "join" point, or do you again automatically detect when a later function reads from their output regions and insert a 'wait' marker at that point?


Also, there may be some cases where you want to pass the same output array to every thread, but you know algorithmically that those threads will write to unpredictable but unique regions of the array. e.g. with a parallel radix sort, each thread will read from a non-overlapping range of the input array, but will write to non-overlapping, but also non-contiguous elements in the output array. Is there a way to kick off these kinds of "dangerous" parallel function calls (with undefined behaviour if the user is wrong about their algorithmic safety)?


Having said all that, I'm very keen for people to use it and if you think that just putting the source code out there regardless
of bugs would help to make that happen then I'll consider it. Please let me know if you have any thoughts about how I should proceed.

I recently decided to start using the ispc language, alongside C++ in my game, which was a big decision to commit to! Swaying users to a new language is no easy feat. Some of the things that swayed me were:

* The source code is published, so if it's abandoned by the authors, its development can still continue. This is quite different to the author saying that they will publish the source at some point in the future.

* They use LLVM for back-end code generation, which reduces the complexity of the project and gives me a bit more faith in the general ability for the project to be maintained.

* The compiler spits out standard C object files, conforming to C's ABI, so I can integrate it into my project very easily, and do easy interop between ispc/c/c++.

* The compiler can also spit out C++ source files (transcode from ispc->c++), so if I want to port my code to a platform that isn't yet supported, I've got this backup plan available.

* The compiler spits out standard debug files to go with the object files, so existing debuggers can be used -- this one is only theoretical for me for now, though, because this feature is Linux/Mac only at the moment.


So yeah, I think that publishing the compiler as an open-source project would be a positive thing to do.

#5058533 Row vs Column Major Matrices: Most Useful?

Posted by Hodgman on 02 May 2013 - 12:16 AM

Ignoring row-major storage vs column-major storage for a moment, there's matrices that have been constructed to transform mathematical row-vectors and matrices that have been constructed to transform mathematical column-vectors.


Whether you're using row-vectors or column-vectors simply depends on whether you write vec * mat (mat1x4 * mat4x4), or mat * vec (mat4x4 * mat4x1). With the former type of mathematics, the forward/right/up axes will be in the first 3 rows, whereas with the latter, these axes will be in the first 3 columns (they're the mathematical transpose of each other).


Therefore, if I worked with row-vectors, I'd prefer to store my matrices in row-major order, so that it's easy to extract the basis vectors (this also makes the matrix multiplication code simpler on a lot of hardware).

Likewise, if I worked with column-vectors, I'd use column-major storage order, for the same reason.


Personally, I work with column-vectors (and matrices designed to transform column-vectors), because that's the way that most maths material I've used teaches it -- it seems to be the prevalent convention. Therefore, I prefer column-major storage.


One place where I have to mix storage conventions is when saving space. Sometimes you want to pass matrices around as 3rows x 4columns, with a hard-coded/assumed 4th row of [0,0,0,1]. Often your registers (shader uniforms or CPU SIMD) are 4-wide, so this storage format isn't possible with column-major storage (or, it takes the same amount of space: 4 registers). So when storing a 3x4 matrix, I'd use row-major storage (which only uses 3 registers).

This same problem presents itself with the opposite convention: with row-vectors + row-major storage, you'd want to use a 4x3 matrix with an assumed 4th column of [0,0,0,1], forcing you to use column-major storage for space efficiency.

#5058511 Is there anything faster than A* pathing?

Posted by Hodgman on 01 May 2013 - 10:07 PM

There are a lot of variations on A*, which will be better in different circumstances.

#5058238 Custom memory manager operator new

Posted by Hodgman on 30 April 2013 - 09:46 PM

You can add extra parameters to operator new, and then you can store a pointer to the allocator inside the allocation itself so that operator delete can access it:
struct Allocator
{
	int name;
	static void* Alloc( size_t size ) { return malloc(size); }
	static void  Free ( void* data )  { free(data); }
};
class Myclass
{
	struct Block
	{
		Allocator* a;
	};
public:
	void* operator new(size_t size, Allocator& a)
	{//store the allocator pointer in a header, return the memory after the header
		Block* data = (Block*)a.Alloc(size + sizeof(Block));
		data->a = &a;
		return data+1;
	}
	void operator delete(void* memory)
	{//go back a few bytes to find the start of the header
		Block* data = ((Block*)memory)-1;
		data->a->Free( data );
	}
};

Allocator alloc = { 42 };
Myclass* object = new(alloc) Myclass;//extra parameters to new
delete object;

#5057989 Draw Call Sorting Using std::map

Posted by Hodgman on 30 April 2013 - 01:11 AM

No one mentioned one of the most important things: You don’t sort the renderstates (or whatever object you put into your render queue) directly; you sort the indices to that list.

Copying around objects wastes time.  Only copy the 32-bit indices and use that to draw in sorted order.

L. Spiro


In my renderer, from a list of "items" that need to be drawn, I've got a function that generates a list of sort-key/index pairs. This second list can then be sorted by itself (I use a radix sort for large lists), which is faster because far less memory has to be moved around.

To render, I can then either submit the original list and the list of sorted indices together (where the index list will be iterated linearly, and used to randomly iterate the original list), or, I can use the indices to re-order the original list and then submit a sorted version of the original list by itself.

#5057942 Microsoft burned down my home, where now?

Posted by Hodgman on 29 April 2013 - 07:41 PM

C++ isn't really used on its own,

Not sure what you mean by this. Most professional game developers use C++ on its own.
Well, at every games company I've worked for, they've used C++ for the engine runtime out of necessity, plus a bunch of other languages for the toolchains (C#, Python, etc), another language for gameplay programming (Lua, UnrealScript, etc), another language for build automation, and another language for graphics -- so it does hold true in my professional career.

Yes, there's too many hurdles to using C# in your cross platform, performant core engine runtime library right now, but it is becoming more and more popular for both gameplay and tools programming due to ease of use over C++.
At the last studio I worked for, all the programmers know C++, but the game itself (not the engine) is almost entirely written in Lua, because programmer productivity is more important than performance in that kind of code...

As for its popularity being tied to XNA, that's not at all true in the professional world, because we never picked up XNA, but we did pick up C# because it's a good 'productivity language'.

#5057654 Modulus Operator Usage

Posted by Hodgman on 28 April 2013 - 11:40 PM

I would say it's just because division was drilled into our minds so much in school that you learn to use it, while the modulus isn't really drilled into your mind to use in school. Only reason I can think of, I guess...?

In that case, think of it as being basically the same operation as division. With integers, division has two results. You get the quotient with "/" and the remainder with "%".

With floats, division only has one result, you get the "quotient + remainder/divisor" with  "/".


You can also use it to bring a random int into a range:

This is a very common usage, but if you require a statistically uniform distribution of random numbers, then this can screw you with a bias.

e.g. say that rand returns numbers from 0-15, but you want numbers 0-12. In this case, the numbers 13/14/15 are wrapped around to 0/1/2, which makes these first three numbers twice as likely to be picked as the rest (3-12) are!

Usually this isn't a problem, because rand generates really big numbers, making the bias very small and not noticeable, so I would still usually use modulo for this.

It's usually only a problem in very precise scientific simulators, where you'd want to cast the random number to a float and divide by RAND_MAX, to get a normalized range of 0-1, and then scale this value to the desired range.

#5057627 Microsoft burned down my home, where now?

Posted by Hodgman on 28 April 2013 - 08:39 PM


You're talking as if Microsoft has deleted all copies of XNA from the earth, when all they've done is announce that it's reached maturity and won't be developed any further. D3D9 isn't being updated either, and it's still used to create games.

  • You can keep using the existing versions of XNA.
  • You can switch over to the open-source versions of XNA, like Monogame.
  • You can use C# and D3D via SlimDX, etc.
  • You can use C++ and D3D (for free; no you don't have to buy anything).


#5057612 Modulus Operator Usage

Posted by Hodgman on 28 April 2013 - 07:37 PM

I'm not really sure how to answer it, as it's the same as any other operation -- how do you know when to use addition or division?

A lot of the time you need to know how to do an operation and the opposite of that operation. e.g. if an attack subtracts 5 from health, then to 'undo' that attack, you add 5 to health.
In these cases, mod will come up whenever you're trying to do the opposite of integer division. Keep in mind though that this is only for integers -- with floats there's no remainder in division, because they can store fractions as well as whole numbers.

One of the most common examples is calculating an index for a 2D array (when emulating a 2D array using a 1D array).
int* array2d = new int[width*height];

To calculate a 1D index from 2D x/y values, we can use:
int index = x + y*width;

If we want to then turn this index back into x and y values, we need the opposite of this function.
Mathematically, we can rearrange the above formula to solve for x/y:
x = index - y*width
y = index/width - x/width

or working with real numbers, we can find y first then use it to find x:
y = RoundDown( index / width ) (this works because we know x < width, so x/width < 1)
x = index - y*width

or still working with real numbers, we can find x first then use it to find y:
x = Remainder( index / width ) (again, this works because we know x will be in the fractional part and y in the whole part)
y = index/width - x/width

or working with integers, we can take the two formulas above (the rounded-down y and the remainder-based x) that map easily over to integer code:
int y = index / width; (integer division rounds down)
int x = index % width;

#5057465 Build Lua Source Into Project

Posted by Hodgman on 28 April 2013 - 07:51 AM

LuaJIT is great, as it is just another implementation of Lua that obeys their specifications exactly, but it has much better performance. Even if you disable the "JIT" part (where it compiles your Lua scripts into native CPU instructions), it's still faster than Lua due to large parts of it having been hand-optimized.

However, the down-side is that because it's so optimized, its source code is barely readable, unless you're also an optimization expert. This isn't a problem in normal usage, but if you use any API illegally (e.g. pass it bad pointers, etc), then you can make that API crash internally. If you make LuaJIT crash internally, it will be hard to debug, whereas with regular Lua, you might get a bit more useful debugging info, like call-stacks etc...

Once you've got Lua working fine, you can pretty much just replace your Lua source files with the LuaJIT source files and everything will still work the same.

#5057456 Economics engine

Posted by Hodgman on 28 April 2013 - 06:20 AM

What would you like to see in an agent-based economics simulator?
Farms complicate things. Is a farm one 'building' or is it made of grain silos, beehives, etc.?

I was going to say something about contracts, where an agent would gain access to a service but with some commitments. e.g. after buying a monthly bus ticket, the agent has a 'sunken cost' that it has to consider when evaluating other forms of transport. So if a better option pops up, they likely won't change at least until their monthly ticket expires.

You could use these kinds of contracts to build hierarchies to solve the farm issue too. E.g. a field can be worked to produce food, but its owner has specified that workers can only deliver the food to a specific silo (also belonging to the owner).
As with the bus ticket example, the farm owner might sign a distribution deal with a particular retailer for some time period, where breaking the contract means he has to pay a penalty fee. The retailer may sign a contract with a particular trucking group to move food from the silos to the retailer, etc, etc...

If you could build this reusable economics engine as a stand-alone library, which games could use to create their economies, then I can picture it being quite popular among indies that want to make these kinds of RTS/city-builder games, but don't have the time to build the simulation engine.

#5057434 Build Lua Source Into Project

Posted by Hodgman on 28 April 2013 - 03:49 AM

Yeah, you can treat Lua.c as basically being an example application that makes use of the Lua library in order to create a command line interpreter.

Regarding LuaJIT, I've just included the LuaJIT source directly into my engine project (so I've the opposite situation - end users can't choose to swap over to the official Lua version).

#5057429 Performance between half and float under SM3.0

Posted by Hodgman on 28 April 2013 - 03:18 AM

On some old cards, around '04/'05/'06 maybe, the half type did actually make your shaders run faster. Most GPUs, though, ignore it and treat it the same as float.

#5056814 Reasons for wanting custom malloc or allocators?

Posted by Hodgman on 25 April 2013 - 08:58 PM

Stack allocators can also be useful (where you can only free the most recently allocated object) and so on.

Yeah, stack allocators are way more useful than they first seem. In the engine I'm working on at the moment, malloc/new are treated like the global variable that they are, which means their use is avoided as much as possible. If a function needs to allocate memory, it will generally have an allocator passed in as an argument. The vast, vast majority of the time, a scope/stack allocator is used instead of a malloc-esque heap allocator.

Ranked by usage in the engine, the most used is probably Scope/Stack, then Pool, then Stack (without the scope layer), then Malloc. I can literally count malloc usage on one hand.

I actually find scope/stack allocation easier to use/maintain than shared_ptr/new allocation. Leaks are impossible because the scope is a parameter to the allocation call (I use a macro eiNew(scope, Type)(parameters)), there's no clean-up code because the scopes use the RAII pattern, and reasoning about the scope works exactly the same way as reasoning about built-in language scopes, like local variables, etc... It just extends this familiar concept to dynamic allocations.


For me, replacing malloc with a different version (dlmalloc, tcmalloc, etc) is nice to be able to do, but isn't that much of a big deal -- when I see "custom memory management", I don't think of "custom written malloc", I think of completely different paradigms for allocation, like pools and stacks.


When I worked on console games, we would allocate all the memory upfront from the system and use our own allocator to divvy it out. It was a lot faster than going to the system for every allocation and allowed us to place more strict restrictions on memory usage, i.e. this level can only use 200MB, and if it goes above that then crash dump and trace where all the memory is going.

Yeah this is very common. When I worked on an adventure/platforming game, from the main allocation, we'd allocate three large contiguous chunks for the level to use. Two would be in use at once, and a 3rd would be streaming in the background. Each geographical 'chunk' of the level had to fit within this hard memory limit, but in return, managing the streaming of chunks was dead simple. When a chunk was no longer required, we'd just let it leak (remove all pointers to its member structures), and then start streaming the next chunk over the top of it. There was no real memory allocation going on.

#5056641 Reasons for wanting custom malloc or allocators?

Posted by Hodgman on 25 April 2013 - 07:59 AM

Every big commercial game engine will have its own tools for this kind of thing, so when they choose to use a new library, they will want that library to provide a way to 'hook' all of its memory allocations.

e.g. if the library uses void* MyMalloc( size_t size ) { return malloc(size); } instead of malloc, then the engine authors can just change that one function (and the corresponding MyFree) to convert the library over to using the engine's memory system.

It's the same with a few other system features -- you'd also like to be able to hook any debug output (log/print statements) generated by the library, and if it makes use of worker threads, you'd like to be able to have it use your existing worker/job/task system rather than have it spawn its own threads.


The features of Valgrind/memcheck are pretty important -- leaks, double deletes, invalid accesses, etc -- so these features will certainly be present in the engine's memory tools. There'll probably also be a live reporting feature, where a log of memory events can be streamed out of the game over a socket/etc to a separate analysis program, so you can inspect the game's memory as it's running. This should show all current allocations (and maybe a history, so you can step back in time to view past allocations), broken down by module (which will usually be tagged by programmers -- e.g. them saying that certain allocations go in the "physics" category). e.g. if you get a crash where something has accessed memory that's already been freed, how useful would it be to be able to quickly rewind the memory history and see what objects have been allocated at that address most recently, and who allocated them?


When developing for consoles, you might have, say, just 200MiB of RAM to play with and no virtual memory, so fragmentation becomes a very big concern.

e.g. After going in and out of your game/main menu 100 times, maybe you've got 100MiB of memory free, but it's so damn fragmented that the largest contiguous unallocated block is only 1MiB!

To debug these issues, you want to have a visual display of the address space, so you can see where in memory each different module is allocating its memory.

For performance analysis, you'll want per-frame, per-module stats like number of allocations, time in malloc, size distributions, lifespan distributions, etc... For keeping memory usage low, you'll want to have reports on current, average and min/max memory usage per system. Ideally you'd be able to easily track how these numbers change over the weeks, so you get a heads-up when someone implements a memory-hogging new system.

If possible, it would also be useful to be able to traverse an ownership tree between allocations, to help spot memory that isn't needed by the game but is still allocated anyway, like assets that are loaded but no longer required at this point in the game.


The memory tools in these engines will be quite complex, and most likely only available to and used by big console game developers.

Apart from the ones that come built into the big engines, the only stand-alone tool/middleware combo that I know of in this niche is Elephant/Goldfish.