PeterStock

Member Since 17 Oct 2005
Offline Last Active Apr 29 2016 07:47 AM

#5289215 Data alignment on ARM processors

Posted by PeterStock on 29 April 2016 - 05:00 AM

*((unsigned char*)(&result)) = *offset;

Watch out for compiler optimisations breaking code like this (called type punning). Casting a pointer to a different type and dereferencing it can break the language's aliasing rules. See this link:

http://stackoverflow.com/questions/20922609/why-does-optimisation-kill-this-function/20956250#20956250
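
One way to stay inside the aliasing rules, and to avoid unaligned loads on ARM at the same time, is to copy the bytes with memcpy instead of casting pointers. A minimal sketch, assuming the goal is to read a 32-bit value from an arbitrary offset in a byte buffer (LoadU32 is a hypothetical name, not something from the thread):

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: read a 32-bit value from a possibly unaligned address.
// memcpy is defined for any alignment and does not violate the aliasing rules.
// The result is in the host's byte order.
inline std::uint32_t LoadU32(const unsigned char* p)
{
	std::uint32_t result;
	std::memcpy(&result, p, sizeof result);
	return result;
}
```

Compilers will usually reduce the memcpy to a plain load where the target allows it, so this tends to cost nothing on platforms that handle unaligned access.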


#5284208 Beyond the Infinite

Posted by PeterStock on 30 March 2016 - 04:37 AM

Have you seen Proun?

http://www.proun-game.com/

Might be useful for ideas/inspiration, if you weren't aware of it.

Good luck with it.


#5275236 Clinical studies on overlooking stupid bugs

Posted by PeterStock on 11 February 2016 - 02:18 AM

I think most compilers have a warning setting to catch things like
if (x);
{
	y;
}
Maybe turn up your warning level? Or use some static code checking tool, like PVS-Studio or Cppcheck.

The compiler finds most of my mistakes like this. The worrying time is when it compiles successfully and you know you're missing something, but can't remember what :-)


#5264844 Replay & recorded games

Posted by PeterStock on 04 December 2015 - 03:00 AM

Hopefully those links will change your mind. Otherwise, scroll back up to the Gaffer On Games articles about Deterministic Lockstep, and the other on Floating Point Determinism.


Unless you jump through a lot of very careful hoops, turn off a lot of optimizations, and introduce some slower operations, you can get different results even on the same computer with the same executable and the same exact input.


No, all the cases you mention are caught by:

I use 32-bit float operations (SSE doesn't internally use 80-bit extended precision) with no denormals (FTZ bit set), code floating point atomically (e.g. no x = a + b + c) and use compiler flags to restrict float optimisations. And you can only use + - * / sqrt() operations - not things like sin().


Yes, this prevents the compiler using fancy instructions like fused-multiply-add and reciprocal sqrt. But I don't consider the performance penalty significant, given the benefits of the networking model it enables.

As evidence of my claim of IEEE 754 specifying the exact binary result of calculations, I cite 'What Every Computer Scientist Should Know About Floating-Point Arithmetic' by David Goldberg:

http://docs.oracle.com/cd/E19957-01/806-3568/ncg_goldberg.html

Which says:

"The IEEE standard requires that the result of addition, subtraction, multiplication and division be exactly rounded. That is, the result must be computed exactly and then rounded to the nearest floating-point number (using round to even)."

"One reason for completely specifying the results of arithmetic operations is to improve the portability of software. When a program is moved between two machines and both support IEEE arithmetic, then if any intermediate result differs, it must be because of software bugs, not from differences in arithmetic."

"There is not complete agreement on what operations a floating-point standard should cover. In addition to the basic operations +, -, × and /, the IEEE standard also specifies that square root, remainder, and conversion between integer and floating-point be correctly rounded. It also requires that conversion between internal formats and decimal be correctly rounded (except for very large numbers)."


#5264798 Culling? Duplicate Impulses during Collision Resolution.

Posted by PeterStock on 03 December 2015 - 03:41 PM

It ends up with 2x the velocity it should have. If there were 3 simultaneous collisions, it would have 3x the velocity.

Solution: scale each collision response by a factor of 1/n when you have n simultaneous collisions.

This requires that you find all the collisions first, then do the resolutions as a separate step afterwards, so you know what n is for each object before you start adding any forces.
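
As a rough sketch of the two-step approach described above (the Body and Contact types and the ResolveContacts name are made up for illustration, and each contact is assumed to already know the velocity change it wants to apply):

```cpp
#include <cstddef>
#include <vector>

struct Body    { float vx = 0.0f, vy = 0.0f; };
struct Contact { std::size_t body; float dvx, dvy; };  // velocity change this contact asks for

// Step 1: count the contacts on each body.
// Step 2: apply every response scaled by 1/n for that body.
inline void ResolveContacts(std::vector<Body>& bodies, const std::vector<Contact>& contacts)
{
	std::vector<int> count(bodies.size(), 0);
	for (const Contact& c : contacts)
		count[c.body]++;

	for (const Contact& c : contacts)
	{
		float scale = 1.0f / static_cast<float>(count[c.body]);
		bodies[c.body].vx += c.dvx * scale;
		bodies[c.body].vy += c.dvy * scale;
	}
}
```

With two identical contacts on the same body, each contributes half of its response, so the body ends up with 1x the velocity instead of 2x.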


#5264402 handling data/state in a board game - composition/coupling

Posted by PeterStock on 01 December 2015 - 06:32 AM

Your design looks pretty good to me!

1. When you have a group of interacting 'things', like you say, you have to either make them accessible to each other or tell each one about the other ones it needs to talk to. No easy 3rd way unfortunately!

2. I'd put the function in the place where it needs most access to, then you reduce the above inter-dependency problem, but it can't be completely avoided.

I'd guess that most released games are nothing like as tidy as your design, with plenty of globally-accessible stuff ;-)

When you use composition, you can make the members public without sacrificing the restrictions on access - if you have the appropriate access level for the Board class, making Board a public member of GameData gives no further access to Board, but can mean less boilerplate to access it (depending on your answer to point 1 above).
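
A small sketch of what I mean, with made-up contents for Board and GameData (the 8x8 size and the member names are only assumptions for the example):

```cpp
#include <array>

class Board
{
public:
	int  Get(int x, int y) const  { return cells_[y * 8 + x]; }
	void Set(int x, int y, int v) { cells_[y * 8 + x] = v; }
private:
	std::array<int, 64> cells_{};  // stays private however Board is held
};

struct GameData
{
	Board board;            // public member: no getter boilerplate needed
	int   currentPlayer = 0;
};
```

Code holding a GameData can write data.board.Get(x, y) directly, but data.board.cells_ still won't compile, so Board's own restrictions are unchanged.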


#5264383 Replay & recorded games

Posted by PeterStock on 01 December 2015 - 04:09 AM

If your simulation uses floating point calculations they may not replay perfectly.

ALWAYS remember that floating point math is an approximation.  Every chip and every optimization is allowed some wiggle room within that approximation.


Although you have some valid points, unless you change the floating-point control flags in a non-deterministic way, any single build will replay the same on the same machine. So if the OP is only interested in same-machine, same-build determinism, floating point can be ignored as a source of non-determinism. But yes, a new build could break determinism.

Actually, although you're right that there is wiggle room for floating point implementations, there is also some strictness. Just like C doesn't exactly nail down what an int is, it does make some concrete restrictions. All platforms I know of conform to IEEE 754, so you can get identical float behaviour across platforms and builds. But it takes a lot of effort.

This is a bit off-topic, but this is a useful concise overview of IEEE 754:

http://www.appinf.com/download/FPIssues.pdf

This is a good discussion, with some useful points in the comments, but a long read:

http://www.gafferongames.com/networking-for-game-programmers/floating-point-determinism/

I use 32-bit float operations (SSE doesn't internally use 80-bit extended precision) with no denormals (FTZ bit set), code floating point atomically (e.g. no x = a + b + c) and use compiler flags to restrict float optimisations. And you can only use + - * / sqrt() operations - not things like sin().

But it *is* possible to have cross-platform floating point determinism.
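
To illustrate why something like x = a + b + c has to be split up: float addition is not associative, so the grouping changes the result. A minimal sketch (the function names are hypothetical), written one operation per statement so the order is explicit in the source:

```cpp
// Computes (a + b) + c.
inline float SumLeftFirst(float a, float b, float c)
{
	float ab = a + b;
	float abc = ab + c;
	return abc;
}

// Computes a + (b + c).
inline float SumRightFirst(float a, float b, float c)
{
	float bc = b + c;
	float abc = a + bc;
	return abc;
}
```

With strict single-precision evaluation, SumLeftFirst(1e8f, -1e8f, 1.0f) gives 1.0f but SumRightFirst(1e8f, -1e8f, 1.0f) gives 0.0f, because -1e8f + 1.0f rounds back to -1e8f. The coding style alone isn't enough; you still need the compiler told not to reorder or fuse operations. Options such as /fp:strict (MSVC) or -ffp-contract=off (GCC/Clang) are examples of the kind of flag involved, not necessarily the exact set I use.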


#5264365 Replay & recorded games

Posted by PeterStock on 01 December 2015 - 01:47 AM

If I find a difference... then what? What will the fix likely be...?


Follow the program flow back from the point(s) where that variable gets set, reasoning about the code. In some cases you will need to make a mental leap beyond what the code appears to do. It's particularly painful if you have something like a stray pointer or buffer overflow overwriting a variable that by reasoning about the code, you *know* can't get updated like it does. Be extremely careful about any multithreading. If it's just an uninitialised variable (which is most common) then you're lucky ;-)


#5264359 Replay & recorded games

Posted by PeterStock on 01 December 2015 - 01:10 AM

So in addition to computing a CRC from every sim variable in the game (probably 100.000 or so) each frame, the program must also dump the name and value into a text file. Perhaps I can group the code that logs the field and the code that computes its CRC value together in the same method. This way, I am less likely to miss anything.
I predict that this will hit the frame rate very hard. 
 
Then the workflow will be: CRC diverge detected at run time. Then I compare the text files for the recorded game and the replay. If I find a difference... then what? What will the fix likely be...?


You don't *have* to dump every single variable, and it doesn't *have* to be text formatted - which would make output slower. It just makes it more readable, and the purpose of it is to find your bug, and your time is probably the bottleneck.

What my code does is save (not dump to file) the game state every tick (effectively just a memcpy), and calculate and store the CRC for every state. Compare CRCs and if equal then great, otherwise dump something like the last 10 saved states (in text format) and the next 10 too. And the program forgets old states that have been verified consistent, otherwise you run out of memory fast. This way the file output, which is the slowest bit, only needs to happen when you've found the divergence.

Instead of the CRC, you could just do a compare on the states themselves. If using the CRC method then be aware that things may cancel out (e.g. 2 bool flags different, both flipped) - I maintain an increasing shift to reduce the likelihood. But still: CRCs different => states definitely different; CRCs equal => states *probably* same.
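
To show the cancelling-out problem, here is a rough sketch comparing a plain XOR checksum with one that mixes each byte in at a different bit position. This is only an assumption about what an 'increasing shift' could look like, not my actual code, and neither function is a true CRC:

```cpp
#include <cstddef>
#include <cstdint>

// Plain XOR of all bytes: two flipped flags can cancel each other out.
inline std::uint32_t XorChecksum(const unsigned char* data, std::size_t size)
{
	std::uint32_t sum = 0;
	for (std::size_t i = 0; i < size; i++)
		sum ^= data[i];
	return sum;
}

// Rotates the accumulator by one bit before mixing in each byte,
// so a byte's contribution depends on its position in the state.
inline std::uint32_t RotatingChecksum(const unsigned char* data, std::size_t size)
{
	std::uint32_t sum = 0;
	for (std::size_t i = 0; i < size; i++)
	{
		sum = (sum << 1) | (sum >> 31);  // rotate left by 1
		sum ^= data[i];
	}
	return sum;
}
```

For the two-byte states {0, 1} and {1, 0} (two bool flags, both flipped), XorChecksum returns the same value for both, while RotatingChecksum tells them apart. If you checksum a struct's raw bytes, make sure any padding is zeroed first, otherwise identical states can produce different checksums.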


#5264314 Replay & recorded games

Posted by PeterStock on 30 November 2015 - 03:35 PM

To answer the original questions (sorry - only realised I missed them on re-reading):

Yes, CRC for determining when divergence happens.

Yes, extensive logging to determine what has diverged - I know of no other way. Ideally, the entire simulation state, otherwise you're searching in the dark: Butterfly effect causes ripples of differences and you need the source. I just do a text file dump of [var name, var value] pairs, so I can do a file diff to see easily what's different. Keep in mind you may have missed some simulation state from logging, so the first apparent difference may not be the underlying initial divergence.


#5264294 Replay & recorded games

Posted by PeterStock on 30 November 2015 - 01:31 PM

Yes, that will definitely be problematic - like HappyCoder says, save/restore the binary representation of floating point values.
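
A minimal sketch of the difference, assuming the save file is text and the floats are 32-bit (all function names here are hypothetical). Writing the value with "%f" loses bits, while writing the bit pattern as hex restores the float exactly:

```cpp
#include <cstdint>
#include <cstdio>
#include <cstdlib>
#include <cstring>

// Copy the 32 bits of a float into an integer and back again.
inline std::uint32_t BitsFromFloat(float f) { std::uint32_t b; std::memcpy(&b, &f, sizeof b); return b; }
inline float FloatFromBits(std::uint32_t b) { float f; std::memcpy(&f, &b, sizeof f); return f; }

// Lossy: "%f" keeps only six decimal places.
inline float RoundTripText(float f)
{
	char buf[64];
	std::snprintf(buf, sizeof buf, "%f", f);
	return std::strtof(buf, nullptr);
}

// Exact: the bit pattern is stored as 8 hex digits and read back unchanged.
inline float RoundTripBits(float f)
{
	char buf[16];
	std::snprintf(buf, sizeof buf, "%08x", static_cast<unsigned>(BitsFromFloat(f)));
	return FloatFromBits(static_cast<std::uint32_t>(std::strtoul(buf, nullptr, 16)));
}
```

For example, 0.9999988079f (bit pattern 3f7fffec) is written as "0.999999" by the text version and reads back as a different float, whereas the hex version returns the identical bit pattern.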

It's not actually necessary to use a fixed time step to get repeatable/deterministic behaviour for replays, but it is (much) easier for certain applications (like keeping 2 games in sync over a network).

For variable time steps, you need to make sure you use the same sized time steps during replays, which it sounds like you already are.

Tracking down determinism bugs will be hard, but if you do get to the bottom of them (and there likely will be many - not just one!) then the knowledge you gain from it will likely be useful in the future :-)


#5264253 Replay & recorded games

Posted by PeterStock on 30 November 2015 - 10:44 AM

Determinism is totally possible - you just have to be careful and thorough. Whether it's worth the time investment for you I cannot tell.

Here are some useful links:

http://gafferongames.com/game-physics/fix-your-timestep/
http://gafferongames.com/networked-physics/deterministic-lockstep/


#5264206 Lerp vs fastLerp

Posted by PeterStock on 30 November 2015 - 04:09 AM

This is a useful reference for IEEE floating point:

http://www.appinf.com/download/FPIssues.pdf


#5264204 Lerp vs fastLerp

Posted by PeterStock on 30 November 2015 - 04:00 AM

At small scales, given constant values X and Y and variable 0<=v<=1, FastLerp(X,Y,v) is smoother than Lerp(X,Y,v).

Here's some test code that shows the effect. I print out the hex value of the floats to show it more clearly. Also, note the explicit single-float-operation-per-line coding style - combined with compiler flags to limit floating point optimisation, it means this is the code that gets executed. Without this, my compiler generates code equivalent to FastLerp in both cases.

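#include <stdio.h>

// Types::IntFromFloat (definition not shown) returns the float's 32-bit pattern as an integer.
float Lerp(float x, float y, float v)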
{
//	return x * (1 - v) + y * v;
	float oneMinusV = 1 - v;
	float xOneMinusV = x * oneMinusV;
	float yV = y * v;
	float xOneMinusVPlusYV = xOneMinusV + yV;
	return xOneMinusVPlusYV;
}

float FastLerp(float x, float y, float v)
{
//	return x + (y - x) * v;
	float yMinusX = y - x;
	float yMinusXV = yMinusX * v;
	float xPlusYMinusXV = x + yMinusXV;
	return xPlusYMinusXV;
}

void Test(void)
{
	int i;
	float x = 0.9999988079f;
	float y = 1.0000000000f;
	float delta = 0.0078125f;
	int numIterations = 32;
	printf("%8x\n", Types::IntFromFloat(x));
	printf("%8x\n", Types::IntFromFloat(y));
	printf("%8x\n", Types::IntFromFloat(delta));
	printf("\n");
	float v = 0.0f;
	printf("v = %f\n", v);
	for (i=0; i<numIterations; i++)
	{	float f1 = Lerp(x, y, v);
		float f2 = FastLerp(x, y, v);
		printf("%8x\t%8x\n", Types::IntFromFloat(f1), Types::IntFromFloat(f2));
		v += delta;
	}
	printf("\n");
	v = 0.5f;
	printf("v = %f\n", v);
	for (i=0; i<numIterations; i++)
	{	float f1 = Lerp(x, y, v);
		float f2 = FastLerp(x, y, v);
		printf("%8x\t%8x\n", Types::IntFromFloat(f1), Types::IntFromFloat(f2));
		v += delta;
	}
}
Output:

3f7fffec
3f800000
3c000000

v = 0.000000
3f7fffec        3f7fffec
3f7fffec        3f7fffec
3f7fffec        3f7fffec
3f7fffec        3f7fffec
3f7fffed        3f7fffed
3f7fffed        3f7fffed
3f7fffed        3f7fffed
3f7fffed        3f7fffed
3f7fffed        3f7fffed
3f7fffed        3f7fffed
3f7fffee        3f7fffee
3f7fffee        3f7fffee
3f7fffee        3f7fffee
3f7fffee        3f7fffee
3f7fffee        3f7fffee
3f7fffee        3f7fffee
3f7fffee        3f7fffee
3f7fffef        3f7fffef
3f7fffef        3f7fffef
3f7fffef        3f7fffef
3f7fffef        3f7fffef
3f7fffef        3f7fffef
3f7fffef        3f7fffef
3f7ffff0        3f7ffff0
3f7ffff0        3f7ffff0
3f7ffff0        3f7ffff0
3f7ffff0        3f7ffff0
3f7ffff0        3f7ffff0
3f7ffff0        3f7ffff0
3f7ffff1        3f7ffff1
3f7ffff1        3f7ffff1
3f7ffff1        3f7ffff1

v = 0.500000
3f7ffff6        3f7ffff6
3f7ffff6        3f7ffff6
3f7ffff6        3f7ffff6
3f7ffff6        3f7ffff6
3f7ffff6        3f7ffff7
3f7ffff7        3f7ffff7
3f7ffff7        3f7ffff7
3f7ffff7        3f7ffff7
3f7ffff7        3f7ffff7
3f7ffff8        3f7ffff7
3f7ffff8        3f7ffff8
3f7ffff8        3f7ffff8
3f7ffff8        3f7ffff8
3f7ffff8        3f7ffff8
3f7ffff8        3f7ffff8
3f7ffff8        3f7ffff8
3f7ffff8        3f7ffff8
3f7ffff8        3f7ffff9
3f7ffff9        3f7ffff9
3f7ffff9        3f7ffff9
3f7ffff9        3f7ffff9
3f7ffffa        3f7ffff9
3f7ffffa        3f7ffff9
3f7ffffa        3f7ffffa
3f7ffffa        3f7ffffa
3f7ffffa        3f7ffffa
3f7ffffa        3f7ffffa
3f7ffffa        3f7ffffa
3f7ffffa        3f7ffffa
3f7ffffa        3f7ffffb
3f7ffffa        3f7ffffb
3f7ffffb        3f7ffffb
Look at the values around 3f7ffff9.


#5261943 Using a physics engine on the server

Posted by PeterStock on 13 November 2015 - 03:34 PM

I recently tidied up my client time synchronisation code and I made it so the client stores a buffer of the time differences for 4 seconds. Then the calculation of 'current time difference estimate' is made by a weighted average of these, weighting '0 seconds ago' as 1 and '4 seconds ago' as 0. Then the estimate remains pretty constant and accurate even with a large (200ms) packet jitter.
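
Roughly, the weighted average looks like the sketch below. The class and member names are invented for the example, and it assumes times are in seconds and that the weight falls off linearly with the age of the sample:

```cpp
#include <deque>

struct OffsetSample { double time; double offset; };

class OffsetEstimator
{
public:
	explicit OffsetEstimator(double windowSeconds = 4.0) : window_(windowSeconds) {}

	// Store a new time-difference sample and drop any older than the window.
	void Add(double now, double offset)
	{
		samples_.push_back({now, offset});
		while (!samples_.empty() && now - samples_.front().time >= window_)
			samples_.pop_front();
	}

	// Weighted average: weight 1 for a sample taken now, 0 for one a full window old.
	double Estimate(double now) const
	{
		double weightedSum = 0.0;
		double weightSum = 0.0;
		for (const OffsetSample& s : samples_)
		{
			double weight = 1.0 - (now - s.time) / window_;
			if (weight <= 0.0)
				continue;
			weightedSum += weight * s.offset;
			weightSum += weight;
		}
		return weightSum > 0.0 ? weightedSum / weightSum : 0.0;
	}

private:
	double window_;
	std::deque<OffsetSample> samples_;
};
```

A single late packet then only nudges the estimate, because it is averaged with everything else received in the last 4 seconds.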

It might be a good idea to set your packet delay emulation to emulate jitter too, so you get packets out of order (and dropped and duplicate packets too). From your timing data it looks like you just have a constant delay? Test your code by trying as hard as you can to break it ;-)



