Syncing issues (algorithm description inside)

5 comments, last by hplus0603 7 years, 7 months ago
I'm implementing a snapshot system like the one in Quake 3. Below is the progress I've made:
* The server and client are updating state at 60 frames a second.
* The client sends user commands to the server each frame (so 60 times a second).
* The server sends snapshots of the gamestate to each client every frame (so 60 times a second).
* The server and client are executing the same physics logic for the player.
* Everything is running locally without packet loss.
* The client and server player positions are completely independent (neither depends on or is built on the other), but both execute the same logic and user commands.
So with that algorithm, why might the player position on the server and on the client drift further out of sync the more the player is moved around? Everything should be running in lockstep right now.

The first thing is that messages take variable amounts of time to travel. If I send the 'start moving' message at 12:00pm and a 'stop moving' message at 12:05pm, I will have been moving for 5 minutes. But if the receiver gets the first message at 12:01 and the second message at 12:07 it will have moved me for 6 minutes, putting me in the wrong place.

You can timestamp/frame-stamp your commands, telling the server which frame they happened in, but that means the server has to be able to execute commands as if they had arrived when they were originally sent - doing 'time-travel' in a sense. There are also potential cheating issues here you'd need to look out for.
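To make the 5-minutes-versus-6-minutes example concrete, here is a minimal sketch in Python. Everything in it (the `simulate` and `index_by` helpers, the message fields, the 1-D player) is made up for illustration, and it sidesteps real "time travel" by assuming the simulation only runs once all commands are known.

```python
from collections import defaultdict


def simulate(commands_by_frame, total_frames, speed=1.0):
    """Step a 1-D player frame by frame and return its final position."""
    position, moving = 0.0, False
    for frame in range(total_frames):
        # Apply every command scheduled for this frame before moving.
        for command in commands_by_frame.get(frame, []):
            moving = (command == "start")
        if moving:
            position += speed
    return position


def index_by(messages, field):
    """Group commands by either their 'sent' or their 'arrived' frame."""
    by_frame = defaultdict(list)
    for message in messages:
        by_frame[message[field]].append(message["command"])
    return by_frame


# Hypothetical traffic: each message is delayed by a different amount.
messages = [
    {"sent": 0, "arrived": 1, "command": "start"},
    {"sent": 5, "arrived": 7, "command": "stop"},
]

stamped_position = simulate(index_by(messages, "sent"), 10)     # moves for 5 frames
arrival_position = simulate(index_by(messages, "arrived"), 10)  # moves for 6 frames
print(stamped_position, arrival_position)
```

Applying commands at their stamped frame reproduces what the sender did; applying them on arrival stretches the movement by the difference in delay. A real server cannot wait forever, so it would also need a rewind or a bounded input delay, which this sketch leaves out.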

It's not clear from your model how you accommodate delays in transmission or jitter in transmission time, but if you're not preventing a client from moving before the server has confirmed it, you will need to have something for this.

There are a number of reasons that simulations can diverge. Examples include:

- If there are multiple players, the position of player B on player A's computer when running simulation step X will be different than on the server (because of latency)
- If the server and client have different CPUs, slight implementation differences can make the last bit of some math function's result differ, and the butterfly effect then makes the simulations diverge
- Random generators used for simulation outcomes may end up with different seeds or being called in different order
- If you use a software physics engine like ODE or Bullet, the order that the constraints go into the physics simulation may vary, leading to subtle math differences
- If you use a GPU physics engine like PhysX, you will additionally get math bit mis-matches across different GPUs

Items 2 through 5 can be solved with a carefully constructed simulation engine.
Item 1 is a killer for FPS-type games, because the only real solution is to wait to simulate until all commands/positions of remote players are known, which means that you have a round-trip time of latency between command and action.
This is why RTS games have a "Yes, Sir!" acknowledgement animation when you give units commands -- it hides the round-trip latency.
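Two of the causes in the list above can be reproduced in a few lines of Python. This is only an illustration with made-up values and a hypothetical `roll_outcomes` helper: it shows that the order in which numbers enter a floating-point sum changes the last bit, and that drawing from a seeded generator in a different order hands different values to the same gameplay event.

```python
import random

# Floating-point addition is not associative, so the order in which
# contributions are summed can change the last bit of the result.
forces = [0.1, 0.2, 0.3]
sum_forward = (forces[0] + forces[1]) + forces[2]
sum_reordered = forces[0] + (forces[1] + forces[2])
print(sum_forward == sum_reordered)  # False


def roll_outcomes(seed, order):
    """Draw one random value per named event, in the given order."""
    rng = random.Random(seed)
    return {name: rng.random() for name in order}


# Same seed on both machines, but the events ask for numbers in a
# different order, so each event sees a different value.
server_rolls = roll_outcomes(42, ["damage", "spread"])
client_rolls = roll_outcomes(42, ["spread", "damage"])
print(server_rolls["damage"] == client_rolls["damage"])  # False
```

Neither difference is large on its own frame; the problem is that the next frame is computed from the already-different state.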
enum Bool { True, False, FileNotFound };

It's not clear from your model how you accommodate delays in transmission or jitter in transmission time, but if you're not preventing a client from moving before the server has confirmed it, you will need to have something for this.

Right now there should be no transmission delay. The client and server are running on the same machine. There should be no packet loss.

- If there are multiple players, the position of player B on player A's computer when running simulation step X will be different than on the server (because of latency)
- If the server and client have different CPUs, slight implementation differences where the last bit of some math function is different, and the butterfly effect makes it diverge
- Random generators used for simulation outcomes may end up with different seeds or being called in different order
- If you use a software physics engine like ODE or Bullet, the order that the constraints go into the physics simulation may vary, leading to subtle math differences
- If you use a GPU physics engine like PhysX, you will additionally get math bit mis-matches across different GPUs

None of these situations apply right now.

I suspect there is something wrong with the implementation, because based on the description in my original post, everything should be moving in lockstep right now. I'll continue to debug it. Thanks for your help.

I find it super helpful to build the networking so that it records every packet that comes in, along with the game step at which it arrives and the full payload.
Also record system state, such as the clock value each time through the main loop.
Then, and this is the really important bit, build the reverse: a reader that, instead of reading from a socket and reading the system clock, reads from the file and returns those values to the program.
Now, you suddenly have a fully debuggable system, where you can pause/stop and single step as much as you want, without losing state.
And you can re-play as often as you like with the same state.
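A rough sketch of that recorder/reader pair, in Python. All names here (`LiveSource`, `ReplaySource`, `run_game`, the JSON-lines log format, the `dx` packet field) are assumptions for illustration, and the "socket" and "clock" are faked with iterators so the example runs on its own.

```python
import io
import json


class LiveSource:
    """Reads real inputs and appends each step's inputs to a log file."""

    def __init__(self, read_packets, read_clock, log_file):
        self._read_packets = read_packets
        self._read_clock = read_clock
        self._log = log_file

    def next_step(self, step):
        record = {"step": step, "clock": self._read_clock(),
                  "packets": self._read_packets()}
        self._log.write(json.dumps(record) + "\n")  # one JSON record per step
        return record


class ReplaySource:
    """Returns the logged inputs instead of touching socket or clock."""

    def __init__(self, log_file):
        self._records = [json.loads(line) for line in log_file if line.strip()]
        self._index = 0

    def next_step(self, step):
        record = self._records[self._index]
        self._index += 1
        return record


def run_game(source, steps):
    """Stand-in main loop: the game only sees what the source hands it."""
    position = 0
    for step in range(steps):
        for packet in source.next_step(step)["packets"]:
            position += packet["dx"]
    return position


# Fake socket and clock for the live run.
incoming = iter([[{"dx": 1}], [], [{"dx": 2}, {"dx": -1}]])
fake_clock = iter([0.000, 0.017, 0.033])
log = io.StringIO()

live_result = run_game(LiveSource(lambda: next(incoming),
                                  lambda: next(fake_clock), log), 3)
log.seek(0)
replay_result = run_game(ReplaySource(log), 3)
print(live_result, replay_result)
```

The key design point is that the main loop never calls the socket or the clock directly, so swapping the source is the only change needed to go from a live run to a replay.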

The replay files also make for great QA tools -- run an automated test at top speed without any graphics or delays, and make sure that the events you expect do happen.
And, the final tip of that iceberg, is making record/playback available to players. But that's really just icing on the cake. The amount of time you save in development is the real win!

Hey,

Your players getting out of sync can have multiple causes, as the others said.

I don't think you can solve this without a position sync and smoothly moving the player to his real position.

Even if you get it to the point that your players seem to be synced in a five-minute test run, what about after an hour? A small deviation accumulating over time is enough to limit how long you can play the game.
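As a minimal sketch of such a position sync, in Python: each frame the client moves a fraction of the way toward the last position reported by the server, and snaps outright if the error is too large to hide. The function name, the 1-D position, and the `blend` and `snap_distance` values are all assumptions picked for illustration.

```python
def smooth_correction(client_pos, server_pos, blend=0.1, snap_distance=5.0):
    """Return the client's corrected position for this frame."""
    error = server_pos - client_pos
    if abs(error) > snap_distance:
        # Too far off to correct gradually: jump to the server's value.
        return server_pos
    # Otherwise close a fixed fraction of the remaining error.
    return client_pos + error * blend


# The client has drifted half a unit away from the server's position.
position = 10.0
for _ in range(60):  # one second at 60 frames a second
    position = smooth_correction(position, 10.5)
print(round(position, 3))

# A large error is corrected in a single step.
print(smooth_correction(0.0, 100.0))
```

With a blend of 0.1 the remaining error shrinks by 10% per frame, so small drift disappears within a second without a visible jump.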

Part of the Team working on the game Metatron. Check out our website TubbyKiD UG. If you have any questions feel free to ask. :)

I don't think you can solve this without a position sync and smoothly moving the player to his real position.


It is possible to get 100% sync in lock-step by design, as long as everyone runs the same build of the game on the same CPU architecture.
This is the whole point of "lock-step architecture," used in a lot of RTS games and a small number of other games.
(See for example the article "1500 Archers on a 28.8: Network Programming in Age of Empires and Beyond," which is the classic text on this method.)
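The core rule of lock-step can be sketched in Python as follows: a turn is simulated only once every player's command for that turn has arrived, and commands are applied in a fixed order. The class and method names are hypothetical, positions are integers to keep the math exact, and this is not taken from the article mentioned above.

```python
class LockstepSimulation:
    """Advances one turn at a time, only when all commands are present."""

    def __init__(self, player_ids):
        self.player_ids = set(player_ids)
        self.turn = 0
        self.positions = {player_id: 0 for player_id in player_ids}
        self._pending = {}  # turn -> {player_id: command}

    def submit(self, turn, player_id, command):
        self._pending.setdefault(turn, {})[player_id] = command

    def try_advance(self):
        commands = self._pending.get(self.turn, {})
        if set(commands) != self.player_ids:
            return False  # still waiting for someone's command
        # Apply in sorted player order so every peer does the same math.
        for player_id in sorted(commands):
            self.positions[player_id] += commands[player_id]
        del self._pending[self.turn]
        self.turn += 1
        return True


def run_peer(arrivals):
    """Feed (turn, player, command) tuples in arrival order."""
    sim = LockstepSimulation(["a", "b"])
    for turn, player_id, command in arrivals:
        sim.submit(turn, player_id, command)
        while sim.try_advance():
            pass
    return sim.turn, sim.positions


# Same commands, different arrival order on two peers.
peer_one = run_peer([(0, "a", 1), (0, "b", -2), (1, "a", 3), (1, "b", 4)])
peer_two = run_peer([(0, "b", -2), (1, "b", 4), (0, "a", 1), (1, "a", 3)])
print(peer_one == peer_two)
```

The cost is visible in `try_advance`: a peer that is missing one command cannot move forward at all, which is the round-trip latency that command acknowledgement animations are there to hide.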

This topic is closed to new replies.
