

Member Since 06 Aug 2009

#5244656 Which version should I choice to build my ocean?

Posted by Lightness1024 on 05 August 2015 - 08:43 AM

You don't want to write half a page because .... you are lazy ?

It will be tough implementing a nice ocean system if you are that lazy.

Choose: either you don't care about sharing what you do, or you care and you write your half page. Half a page is nothing. At least read the papers you linked: how many pages do they have? How many months do you think it took those researchers to do their work AND, cherry on top, write about it and share it publicly?

Meanwhile, what are you doing ?

#5239838 54x50 tiles map searching for neighbours - takes extremely long

Posted by Lightness1024 on 11 July 2015 - 07:39 PM

You're all wrong, you should use Morton codes (Z-order) for better cache coherency on adjacent cells :P
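For what it's worth, here is a minimal sketch of 2D Morton encoding; the bit-interleaving trick is standard, and the function names are mine:

```cpp
#include <cstdint>

// Spread the lower 16 bits of v so they occupy the even bit positions.
static uint32_t part1By1(uint32_t v) {
    v &= 0x0000FFFF;
    v = (v | (v << 8)) & 0x00FF00FF;
    v = (v | (v << 4)) & 0x0F0F0F0F;
    v = (v | (v << 2)) & 0x33333333;
    v = (v | (v << 1)) & 0x55555555;
    return v;
}

// Interleave x and y into a 32-bit Morton (Z-order) code: cells that are
// close in 2D end up close in the 1D array, which improves cache locality
// when visiting a cell and its neighbours.
uint32_t mortonEncode2D(uint32_t x, uint32_t y) {
    return part1By1(x) | (part1By1(y) << 1);
}
```

Storing the tile array indexed by `mortonEncode2D(x, y)` instead of `y * width + x` keeps a cell's four neighbours in nearby cache lines.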

#5238364 Job Manager Lock-Free

Posted by Lightness1024 on 04 July 2015 - 11:22 AM

You can only safely go lock-free on structures that work with plain primitive integers. What are you going to do with that? You need to push function objects, so unless you work out a complex permanent memory-pooling technique where your lock-free structures manipulate only indices into that pool, you have no way out.

read this document:


you'll realize that



Also, without a condition variable, lock-free means busy-waiting, so you're going to need some crazy scheme where your job-executor thread is recycled into something that can do useful work during the moments there is no job; otherwise you need to spin and generate heat for nothing. Recycling a thread so that it metamorphoses into your rendering loop, for example, is not something I'd recommend. So why try going lock-free when nothing says you would gain any perf over mutex+CV? Not to mention much easier invariant proofs, better power usage during bubbles, software-engineering sanity, etc...
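The mutex+CV alternative argued for above can be sketched like this; all names are illustrative, not from any particular engine. The worker sleeps when the queue is empty instead of spinning:

```cpp
#include <atomic>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>

// Minimal mutex+CV job queue: the worker blocks (no spinning, no heat)
// until a job arrives or shutdown is requested.
class JobQueue {
public:
    void push(std::function<void()> job) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            jobs_.push(std::move(job));
        }
        cv_.notify_one();
    }
    void shutdown() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            done_ = true;
        }
        cv_.notify_all();
    }
    // Worker loop: drains remaining jobs after shutdown, then returns.
    void run() {
        for (;;) {
            std::function<void()> job;
            {
                std::unique_lock<std::mutex> lock(mutex_);
                cv_.wait(lock, [this] { return done_ || !jobs_.empty(); });
                if (jobs_.empty()) return;  // done_ set and nothing left
                job = std::move(jobs_.front());
                jobs_.pop();
            }
            job();  // execute outside the lock
        }
    }
private:
    std::mutex mutex_;
    std::condition_variable cv_;
    std::queue<std::function<void()>> jobs_;
    bool done_ = false;
};
```

Usage: start a worker with `std::thread t([&]{ q.run(); });`, push jobs from any thread, then `q.shutdown(); t.join();`.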

#5237052 Beginner Advice: 2D Sidescroller Graphics

Posted by Lightness1024 on 27 June 2015 - 01:57 AM

Nowadays you don't need to be a programmer to make a game. I saw a presentation about Unreal Engine Blueprints by an artist who made a game about somebody crash-landing on an alien planet who has to assemble contraptions from the stuff around him to survive.

You could check that out.

Or you could use another engine like Unity and code in callbacks, update-entity style. That's C# (or JavaScript), pretty nice for developing fast.

Or you could use a C++ library like SFML; it is not really an engine, but it is enough for a scroller. You could also use a dedicated 2D engine, and it seems there are zillions of those.



Or make everything from scratch, which is not necessarily evil either, because you know everything that is happening; you control your game from the ground up, and when we are talking about small games like a scroller, this is a good-thing™. Especially if you are in the category of people who (like me) have difficulty using existing systems because we don't know 100% of what they do under the hood. Personally it used to cause me mind-freezes, lol. Now I've come to realize that people usually write stuff the way we'd expect for a given purpose, so I can guess much better how to use foreign code. But I needed experience DOING this stuff before I could use it.


So, if you're a flexible person who is not blocked by opaque things and can leverage existing libraries, go that way; if you feel you need to be in control of every line of code, start from scratch: it will be a wonderful lesson for the future.


You can check my platformer demo; everything is open source, so you can really see how everything is done, from scratch to game.

#5237047 what is best way to check a code is more effiecient and runs faster than others

Posted by Lightness1024 on 27 June 2015 - 01:16 AM

I wrote a piece of an "article" as an answer on Stack Overflow about benchmarking; it can be found here:


Of course only the first part is of any interest to you.


Otherwise, I'd say you can always try to predict performance by hand, using a "math" model of the machine, but it is so difficult and tricky that you might as well consider it impossible.

To do a performance analysis on paper, you'd need to know the exact binary form of the program (opcodes, or disassembled opcodes), then you'd need to know how your CPU is going to treat it: which instructions go into which pipeline (which out-of-order execution parallel pipe), and the exact state of the caches, which is hard.


Please consider this article:



The not-crazy-way™ is to measure it. You profile it in situation, or simply time it between start and end. Sampling profilers will give you details about where the hot spots are, which usually means which function is hit the most in your program. Then you can try different implementations for it and profile again: if you lost time, revert the code; if you gained speed, good. I'd say basically that's it.
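The "time it between start and end" approach can be as small as this sketch (the function name is mine); taking the best of several runs reduces scheduler noise:

```cpp
#include <chrono>

// Minimal wall-clock timing harness: run the candidate code `runs` times
// and report the best time in milliseconds. A sketch, not a full
// benchmarking framework (no warm-up control, no statistics).
template <typename Fn>
double bestOfN(Fn&& fn, int runs = 5) {
    using clock = std::chrono::steady_clock;
    double best = 1e300;
    for (int i = 0; i < runs; ++i) {
        auto t0 = clock::now();
        fn();
        auto t1 = clock::now();
        double ms = std::chrono::duration<double, std::milli>(t1 - t0).count();
        if (ms < best) best = ms;
    }
    return best;
}
```

Beware that the optimizer may delete work whose result is unused; write the result to a `volatile` variable or otherwise consume it.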

#5236556 massive allocation perf difference between Win8, Win7 and linux.

Posted by Lightness1024 on 24 June 2015 - 07:46 AM

Hello, maybe you've seen my topic about the open-address hash map I provide; this topic is about a mystery in its performance benchmarking.


I've noticed something plain crazy: between machines and OSes, the performance of the same test is radically in opposition.

Here are my raw results:




(full source code here)

Ok, no graphs; you're all big enough to look at numbers. These results all come from the same code.

I then overloaded operator new to count the number of allocations each method made in the push test.

This is the result:


mem blocks

std vector: 42
reserved vector: 1
*open address: 26
*reserved oahm: 2
std unordered: 32778
std map: 32769

So my conclusion, purely from these figures, is that Windows 8 must have a very different malloc in the common runtime. I took the same binary built by the Visual Studio I had on Win7 and ran it on Win8, and I got the same results as the binary built directly by VS on Win8. So it has to be the CRT DLL. Or the virtual allocator in the kernel has become much faster.


What do you think, and do you think there is a scientific way to really know what is going on ?


Can you believe iterating over a map is 170 times slower on gcc/Linux than on VS12/Win8.1? The heck?? (Actually, for this one I suspect an optimizer joke.)


PS: the 32778 nodes come from the fact that I push using rand().
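The allocation-counting trick described above (overriding the global operator new) can be sketched like this; a real version should also override the aligned and nothrow variants, and the counter name is mine:

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>
#include <vector>

// Count heap allocations by replacing the global operator new/delete.
static std::size_t g_allocCount = 0;

void* operator new(std::size_t size) {
    ++g_allocCount;
    if (void* p = std::malloc(size)) return p;
    throw std::bad_alloc();
}

void operator delete(void* p) noexcept { std::free(p); }
void operator delete(void* p, std::size_t) noexcept { std::free(p); }
```

Usage: reset `g_allocCount` to 0 just before the push test, run it, and read the counter afterwards; e.g. a `std::vector<int>` with a single `reserve()` registers one allocation, while a node-based container registers one per element.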

#5235507 open address hash map

Posted by Lightness1024 on 18 June 2015 - 09:50 AM

Hi guys, I also have a piece of code tooling to give out.



I found that it is not so obvious to find a simple, integrable, open-address hash map for a C++ project without going into crazy setups or extensive build fixing, or god-knows-what™ (licensing issues etc., you name it).

First, let's link two Stack Overflow threads for reference about other libraries that do similar things: https://stackoverflow.com/questions/3300525/super-high-performance-c-c-hash-map-table-dictionary




1. In game development, caring about memory is a huge must. Notably, one very good rule of thumb is to never let fragmentation run rampant.

2. We need associative containers in many algorithms.


So if 1. and 2. met, the child they would have together would be called an open-address hash map.

Vanilla `std::unordered_map` just doesn't cut it, because it is closed-address and therefore allocates per element (node-based containment). That can be mitigated with allocators or something called burst allocation; cf. this document: http://igaztanaga.vosi.biz/allocplus/ So anyway, for those who want to avoid headaches, why not give my hash map a try?
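To make the contrast concrete, here is a minimal open-addressing map with linear probing, int keys only. This is an illustrative sketch, not my library's code: all entries live in one contiguous array, so there is one allocation total instead of one per element, and no deletion or rehashing is handled (the caller must size the capacity above the element count):

```cpp
#include <cstddef>
#include <vector>

// Minimal open-addressing hash map (linear probing, int -> int).
class FlatMap {
    struct Slot { int key = 0; int value = 0; bool used = false; };
    std::vector<Slot> slots_;  // single contiguous allocation
public:
    explicit FlatMap(std::size_t capacity) : slots_(capacity) {}
    void insert(int key, int value) {
        std::size_t i = static_cast<std::size_t>(key) % slots_.size();
        while (slots_[i].used && slots_[i].key != key)
            i = (i + 1) % slots_.size();  // probe the next slot
        slots_[i] = Slot{key, value, true};
    }
    const int* find(int key) const {
        std::size_t i = static_cast<std::size_t>(key) % slots_.size();
        while (slots_[i].used) {
            if (slots_[i].key == key) return &slots_[i].value;
            i = (i + 1) % slots_.size();
        }
        return nullptr;  // hit an empty slot: key is absent
    }
};
```

Compare with `std::unordered_map`: every probe here touches the same array, where the node-based container chases a pointer per element.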

Here is the page



Best to all

#5179320 Thesis idea: offline collision detection

Posted by Lightness1024 on 10 September 2014 - 07:46 AM

Frankly, in this age, if you aren't aware of the whole historical bibliography on a subject, then ANY idea a human can possibly have has already been at least thought about, possibly tried, and potentially published if it has any value.

I hope you're not thinking about a PhD thesis? Because I don't see how any of this world's academies would let someone enter a 3-4 year cycle of research on an idea that sounds of little utility, proposed by a person who has basically next to no knowledge of the field.


Sorry to sound really harsh; I just want to calm the game down. At least go read everything you need to first. Other ideas will come up while reading papers, only for you to realize later that the idea was proposed in a paper two years on, which you will also read, and have another idea, which either happens not to work or is covered by some even later paper; and this cycle goes on until, if you are lucky and clever, you finally get the idea that will actually bring progress to the state of the world's research. BUT, a PhD taking 3-4 years, the chances are high that some other team will publish a very close work before you finish... Yep.


Good luck anyway :-)

#5179314 Deep water simulation

Posted by Lightness1024 on 10 September 2014 - 07:35 AM

Of course that is the reason. You also get another problem that is much worse:


your baked animation will be tiled and repeated!

Not only is it very difficult to MAKE it tileable in space, you must also make it repeatable in TIME, and those are two pretty crazy things to get right.


In the era of shader model 2 (cf. the AMD RenderMonkey ocean sample), water was indeed made using baked animated noise.


Also, about huge textures: think about the bandwidth. Not only is the memory cost great, but today's graphics cards are limited by memory bandwidth rather than raw ALU.

#5168379 How to pack your assests into one binary file (custom file format etc)

Posted by Lightness1024 on 22 July 2014 - 08:42 AM

How about you put all of those into a zip using zlib? It has one of the best licenses out in the wild, and maybe it solves exactly your packing problem, with compression as a cherry on top of the cake.

#5168377 Inter-thread communication

Posted by Lightness1024 on 22 July 2014 - 08:27 AM

For what it's worth, if anybody knows this software:


I'm the co-author of the OpenGL rendering, together with Christophe Riccio (http://www.g-truc.net/).

So basically, the OpenGL preview viewports of Vue have a GUI-thread message producer and a render-thread message consumer queue system, based on a dual std::vector that swaps during consumption (each frame); each message push takes the lock (boost mutex + lock), and so does each consumption before swapping the queue. It just so happens that in common situations more than 200,000 messages are pushed per second, and it is by no means a bottleneck.
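A sketch of that dual-vector swap queue, assuming standard C++ instead of boost (the class and member names are mine, not Vue's):

```cpp
#include <functional>
#include <mutex>
#include <utility>
#include <vector>

// Double-buffered message queue: producers push under a short lock; the
// consumer swaps the two vectors under the same lock, then processes its
// private copy without holding the lock at all.
class SwapQueue {
    std::mutex mutex_;
    std::vector<std::function<void()>> producing_, consuming_;
public:
    void push(std::function<void()> msg) {
        std::lock_guard<std::mutex> lock(mutex_);
        producing_.push_back(std::move(msg));
    }
    // Called once per frame by the consumer (e.g. the render thread).
    void consume() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            std::swap(producing_, consuming_);  // O(1): just pointer swaps
        }
        for (auto& msg : consuming_) msg();  // run outside the lock
        consuming_.clear();                  // keeps capacity for reuse
    }
};
```

The lock is held only for a push_back or a swap, which is why hundreds of thousands of messages per second are not a bottleneck.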

I just mean to say: if you are afraid of 512 locks per frame, there is a serious problem in your "don't do too much premature optimization" thinking process that needs fixing.

I agree that it's total fun to attempt to write a lock-free queue, but if this is production code, it's frankly not worth the risk, and plainly misplaced focus time.


Now, about the filter thing: one day, just to see, I made a filter to avoid useless duplication of messages. It worked, but it was slower than the raw, dumb queue. I don't say it's absolutely going to be the same in your case; just try. But in my case, being clever was being slower.

#5142969 Cascaded shadow map splits

Posted by Lightness1024 on 28 March 2014 - 07:36 PM

The biggest problem with cascaded shadow maps is computing the frustums of the cameras used to render the shadows. There are multiple kinds of policies.

The most common is surely the one that cuts the main camera's view frustum into subparts according to distance and uses a bounding volume of each slice to create an encompassing orthogonal frustum for the shadow camera.
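One common way to choose those slice distances is the "practical split scheme" from the Parallel-Split Shadow Maps paper by Zhang et al., which blends a logarithmic and a uniform distribution. A sketch (the function name is mine):

```cpp
#include <cmath>
#include <vector>

// Practical split scheme: blend logarithmic and uniform slice distances
// between the near and far planes. lambda = 1 is fully logarithmic
// (resolution concentrated near the viewer), lambda = 0 fully uniform.
std::vector<float> computeSplits(float nearZ, float farZ,
                                 int cascades, float lambda) {
    std::vector<float> splits(cascades + 1);
    for (int i = 0; i <= cascades; ++i) {
        float t = static_cast<float>(i) / cascades;
        float logSplit = nearZ * std::pow(farZ / nearZ, t);
        float uniSplit = nearZ + (farZ - nearZ) * t;
        splits[i] = lambda * logSplit + (1.0f - lambda) * uniSplit;
    }
    return splits;  // splits[i]..splits[i+1] bounds cascade i
}
```

Each consecutive pair of values then bounds one cascade's slice of the view frustum.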


There is lost efficiency in this scheme, because taking a bounding volume of a bounding volume means lots of shadow pixels end up off screen and are never used. In other words, you lose resolution in the visible zone.


Therefore some recent solutions use compute shaders to determine the actual pixel-perfect min and max depth perceived in a given image; then you can optimize the slices of the camera frustum to perfection, producing crazy-high-resolution shadows, especially in scenes somewhat enclosed by walls.


There is another very simple policy for shadow frustums: just center the shadow camera on the view camera's position and zoom out in the direction of the light, each cascade zooming out a bit more, thus logically covering more distance in view space. But this has the problem of calculating shadows behind the view, where they could be unnecessary.

I say "could" because you never know when a long oblique shadow must drop from a high-rise building located far behind you; this is why this simple scheme is also popular.


In my opinion, it is your split scheme that fails. You should visualize the shadow zones by sampling red, blue, and green per cascade to obtain this:


Once you get that, debugging will be easy.

#5051207 Man Hours Necessary to Make a Game

Posted by Lightness1024 on 08 April 2013 - 09:03 AM

I was going to say 300 hours, but that would be only the programming part. You need an equivalent amount for the artistic part, and the gameplay part will take a variable amount of time depending on how far you want to fine-tune it. So, roughly put, between 600 and 700 hours will get you there.

You should however drop the C++ and go for C#, unless you want (almost) direct portability to Linux and Mac. C# saves you the time of having to set up and understand Boost and the STL, memory-overrun issues, the dangers of temporary references, and other joys. C++ requires you to code while concentrated and focused at 100%, with rigor (apply RAII, careful design thinking with SOLID, avoid smells...), whereas in C# you can code correctly even after a beer, and the compiler is much, much nicer in its messages; sometimes it even gives you the solution directly. Also, there are virtually no build times, which accelerates the code-to-test cycle. I could go on and on... I know SFML has C# interfacing, but SFML is not an engine, and you will lose time having to do other things like a level editor and such. Though you're in luck: C# has built-in serialization, which will save you tons of time saving and loading your levels and the current game state, even if you do build your editor yourself.

Even with all that, I'd recommend trying to find a platform-game engine with an editor, if you can find one that will not limit you in your project. Because yes, engines are good, BUT a new game may not be doable in an engine that's already done; even with the most generic thinking from the author, you cannot support all future ideas and game paradigms.

Good luck, go for it.

#5049190 Direction to light in parallax/relief mapping

Posted by Lightness1024 on 02 April 2013 - 09:03 AM

Depending on the depth of your displacement, the light is presumably so much farther away that the vector will not be very different. It will make a difference only at very grazing angles AND in deep regions of the depth map AND with lights very close to the parallaxed surface: many conditions, which suggests that the calculation overhead to get the correct vector is not necessary.

If you really want that vector: you know the UV used to fetch the color? Then you also know the UV to read the height map. You just have to take the world position of the plane at UV b and move along the inverse normal by a distance dictated by the heightmap's value (times an artist factor that tunes the depth extent).


To get the world position of b: you can use the rasterized world position passed from the vertex shader through the varyings (which is the world position of a) and displace it according to the ddx/ddy of the depth of the position reprojected into view space, multiplied by the difference in UV rescaled into view space (using the vertex buffer's min/max coordinates and the world matrix to estimate this scale factor).

This one is complicated and imprecise.


Another way would be to determine the vector from a to the impact point.


Or, the one I recommend: find the local position (in plane space) using the UV. Normally the UV should range from 0 to 1 (in float2) and simply be an exact equivalent of the local coordinate in object space. Then you just need to make it 3D by putting 0 in the Z (since it is a plane) and multiply by the world matrix to get the world coordinate. There you go.
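The recommended path above can be sketched in plain C++ with a minimal row-major 4x4 matrix (in a shader this would be a single mul(); all names here are illustrative):

```cpp
#include <array>

struct Vec3 { float x, y, z; };
using Mat4 = std::array<float, 16>;  // row-major 4x4

// Transform a point by an affine matrix (w assumed 1, last row 0 0 0 1).
Vec3 transformPoint(const Mat4& m, Vec3 p) {
    return {
        m[0] * p.x + m[1] * p.y + m[2]  * p.z + m[3],
        m[4] * p.x + m[5] * p.y + m[6]  * p.z + m[7],
        m[8] * p.x + m[9] * p.y + m[10] * p.z + m[11],
    };
}

// UV in [0,1]^2 is taken as the plane's object-space XY; z = 0 lifts it
// to 3D, and the world matrix carries it into world space.
Vec3 uvToWorld(const Mat4& world, float u, float v) {
    return transformPoint(world, Vec3{u, v, 0.0f});
}
```

Note the projection matrix is not involved: it would take you to clip space, and here you want the world-space position.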


There remains the problem of shadowing: you need to evaluate whether that ray is free to reach the light or not. Same principle as what you did to find b.

#5047662 How to update a lot of random tiles in a 2D array?

Posted by Lightness1024 on 28 March 2013 - 09:27 AM

It is far from natural that updating some thousands of things takes that long.

In Extreme Carnage I'm updating the AI of 2000 to 3000 enemies stored in a linked list (not the fastest thing to iterate over...), doing lots of ray casts for each of them, and it runs in real time (>60 FPS).

Stronger even: the Intel ray-tracing library, and many indie demos (on CPU), are able to cast millions of rays per second, and one cast is much heavier than updating a sprite id.

So you definitely have a bug. What is the language? Could you have a garbage-collection issue rather than a looping issue? Can you run on macOS? If yes, use the built-in Instruments profiler. Or valgrind + cachegrind on Linux. Or VTune or AMD CodeAnalyst on Windows.