Server appears to lag behind client every few frames

Started by
11 comments, last by hplus0603 5 years, 11 months ago

I have begun adding online functionality to my game, using ENet to handle the networking. I intend to have the server and client run at a fixed timestep, with the server ahead in time sending world state information and the client sending input commands. This seems to be working for the most part, but it is really jittery. The reason is that every 2 or 3 frames the client tries to read from the server and gets nothing: the server somehow manages to get slightly behind for a bit, despite having the same timestep as the client and starting first.

I tried adding artificial delays to the client, but eventually the server would still fall slightly behind. Naturally, I assumed that my simulation code was taking too long, but even after creating a minimal client and server it happened again. I've been bashing my head against this for a while and I'm out of ideas. Perhaps someone more experienced can help me out here?

Here's the code for the server loop:


#include <enet/enet.h>

#include <chrono>
#include <cstdint>
#include <cstdio>
#include <cstring>
#include <vector>

void oneSecondServerThread() {

	ENetAddress address;
	ENetHost* server = nullptr;
	ENetEvent event;
	ENetPacket *packet = nullptr;

	std::vector<ENetPeer*> connectedPeers;

	address.host = ENET_HOST_ANY;
	address.port = 8080;

	server = enet_host_create(&address, 32, 2, 0, 0);

	//wait for the client to connect
	while (connectedPeers.empty()) {

		while (enet_host_service(server, &event, 0) > 0) {

			if (event.type == ENET_EVENT_TYPE_CONNECT) {
				connectedPeers.push_back(event.peer);
			}

		}

	}

	bool running = true;

	//for testing purposes, the timestep is one frame per second

	std::chrono::high_resolution_clock::time_point currentTime = std::chrono::high_resolution_clock::now();
	std::chrono::high_resolution_clock::time_point previousTime = std::chrono::high_resolution_clock::now();

	std::chrono::nanoseconds timestep(1000000000);
	std::chrono::nanoseconds accumulator(0);

	uint8_t packetNumber = 0;

	while (running) {

		previousTime = currentTime;
		currentTime = std::chrono::high_resolution_clock::now();

		std::chrono::nanoseconds delta =
			std::chrono::duration_cast<std::chrono::nanoseconds>(currentTime - previousTime);

		accumulator += delta;

		while (accumulator >= timestep) {

			accumulator -= timestep;

			//check for events
			while (enet_host_service(server, &event, 0) > 0) {

				switch (event.type) {

				case ENET_EVENT_TYPE_CONNECT:

					connectedPeers.push_back(event.peer);
					break;

				case ENET_EVENT_TYPE_DISCONNECT:

					running = false;
					break;

				case ENET_EVENT_TYPE_RECEIVE:

					uint8_t receivedNumber;
					memcpy(&receivedNumber, event.packet->data, 1);
					printf("Client packet %u received\n", receivedNumber);

					enet_packet_destroy(event.packet);
					break;

				}

			}

			//create a packet consisting of a single byte
			packet = enet_packet_create(&packetNumber, 1, ENET_PACKET_FLAG_UNSEQUENCED);

			for (ENetPeer* peer : connectedPeers) {
				enet_peer_send(peer, 0, packet);
			}

			printf("Server packet %u sent\n", packetNumber);
			packetNumber++;

		}

	}

	enet_host_flush(server);
	enet_host_destroy(server);

}

 

And the client:


void oneSecondClientThread() {

	ENetAddress address;
	ENetHost* client = nullptr;
	ENetEvent event;
	ENetPeer *peer = nullptr;
	ENetPacket *packet = nullptr;

	client = enet_host_create(nullptr, 1, 2, 0, 0);

	enet_address_set_host(&address, "localhost");
	address.port = 8080;

	printf("Attempting to connect ...\n");
	peer = enet_host_connect(client, &address, 2, 0);

	bool connected = false;
	while (!connected) {
		
		while (enet_host_service(client, &event, 0) > 0) {

			if (event.type == ENET_EVENT_TYPE_CONNECT) {
				printf("Connection successful!\n");
				connected = true;
			}

		}

	}

	bool running = true;
  
	//like with the server above, the timestep is 1 frame per second

	std::chrono::high_resolution_clock::time_point currentTime = std::chrono::high_resolution_clock::now();

	std::chrono::nanoseconds timestep(1000000000);
	std::chrono::nanoseconds accumulator(0);

	uint8_t packetNumber = 0;


	while (running) {

		std::chrono::high_resolution_clock::time_point previousTime = currentTime;
		currentTime = std::chrono::high_resolution_clock::now();

		std::chrono::nanoseconds delta =
			std::chrono::duration_cast<std::chrono::nanoseconds>(currentTime - previousTime);

		accumulator += delta;

		while (accumulator >= timestep) {

			int receivedCount = 0;

			accumulator -= timestep;

			while (enet_host_service(client, &event, 0) > 0) {

				switch (event.type) {

				case ENET_EVENT_TYPE_DISCONNECT:

					running = false;
					break;

				case ENET_EVENT_TYPE_RECEIVE:

					receivedCount++;
					uint8_t receivedNumber;
					memcpy(&receivedNumber, event.packet->data, 1);
					printf("Server packet %u received\n", receivedNumber);

					enet_packet_destroy(event.packet);
					break;

				}

			}

			//the server is sending frames once per second constantly;
			//if we didn't receive a packet, the server fell behind
			if (receivedCount == 0) {
				printf("Server dropped a frame!\n");
			}

			//create a packet consisting of a single byte
			packet = enet_packet_create(&packetNumber, 1, ENET_PACKET_FLAG_UNSEQUENCED);

			enet_peer_send(peer, 0, packet);
			printf("Client packet %u sent\n", packetNumber);

			packetNumber++;

		}

	}

	enet_peer_reset(peer);
	enet_host_destroy(client);

}

 

The server method is called in its own thread, while the client runs on the main thread. I'm sure I'm missing something obvious, but for the life of me I don't know what.


Edit: Use "code" tags instead of spoiler tags to show quoted code! You get to those with the <> icon in the editor, or using brackets around the word code and /code.

enum Bool { True, False, FileNotFound };

 if a timeout of 0 is specified, enet_host_service() will return immediately if there are no events to dispatch

This means your client and server are sending independently, so due to lag it is entirely possible that messages sent by the server arrive after your client has already checked for them.

If you want the server to stay ahead, you need to add a syncing mechanism, such as lockstep.
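
For illustration, a minimal sketch of such a gate might look like this (the function name, the 100 ms service slice, and the give-up limit are assumptions for this example, not part of the code above): the client refuses to advance its simulation until the server's packet for the expected tick has arrived.

#include <enet/enet.h>
#include <cstdint>

//block until the server's update for expectedTick arrives, or give up
//after roughly maxWaitMs; returns true if it is safe to advance
bool waitForServerTick(ENetHost* client, uint8_t expectedTick, int maxWaitMs) {
	ENetEvent event;
	for (int waited = 0; waited < maxWaitMs; waited += 100) {
		int result = enet_host_service(client, &event, 100); //block up to 100 ms
		if (result < 0) return false; //network error
		if (result == 0) continue;    //nothing yet, keep waiting
		if (event.type == ENET_EVENT_TYPE_RECEIVE) {
			uint8_t tick = event.packet->data[0];
			enet_packet_destroy(event.packet);
			if (tick == expectedTick) return true; //the frame we were waiting for
		} else if (event.type == ENET_EVENT_TYPE_DISCONNECT) {
			return false;
		}
	}
	return false; //the server never delivered this tick in time
}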

14 hours ago, hplus0603 said:

Edit: Use "code" tags instead of spoiler tags to show quoted code! You get to those with the <> icon in the editor, or using brackets around the word code and /code.

I did, but it seemed a bit large, so I thought it would look better in spoiler tags. I will remember this in the future.

7 hours ago, R-type said:

 if a timeout of 0 is specified, enet_host_service() will return immediately if there are no events to dispatch

This means your client and server are sending independently, so due to lag it is entirely possible that messages sent by the server arrive after your client has already checked for them.

If you want the server to stay ahead, you need to add a syncing mechanism, such as lockstep.

Upon testing this, you appear to be one hundred percent correct. I was under the impression that the latency of sending network messages to myself would be so negligible that the client would always have something to read, but apparently it is enough to cause this issue.

So in terms of timeout, how much would you recommend?

Quote

So in terms of timeout, how much would you recommend?

However much time is left before it's time to run the next simulation step.

So, if you run simulation at 100 Hz, remember when the last simulation step started. Then when going to read from the network, calculate how long until 10 ms ahead of that time, and use that for timeout.


nextTime = clock()
forever {
  nextTime += stepLength
  simulationStep()
  //block on the network for whatever time remains in this step
  while ((diff = nextTime - clock()) >= 0) {
    pollNetwork(timeout=diff)
  }
}
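
In ENet terms, that loop might look something like this sketch, assuming a 100 Hz simulation; stepSimulation() and handleEvent() are hypothetical stand-ins for your own simulation step and event dispatch:

#include <enet/enet.h>
#include <chrono>

void stepSimulation();              //hypothetical: advance the game state one tick
void handleEvent(const ENetEvent&); //hypothetical: dispatch one received event

void runFixedTimestepLoop(ENetHost* host) {
	using clock = std::chrono::steady_clock;
	const auto stepLength = std::chrono::milliseconds(10); //100 Hz
	auto nextTime = clock::now();

	for (;;) {
		nextTime += stepLength;
		stepSimulation();

		//instead of spinning with a timeout of 0, sleep inside
		//enet_host_service() for whatever time remains in this step
		for (;;) {
			auto remaining = std::chrono::duration_cast<std::chrono::milliseconds>(
				nextTime - clock::now());
			if (remaining.count() < 0) break;
			ENetEvent event;
			if (enet_host_service(host, &event, (enet_uint32)remaining.count()) > 0) {
				handleEvent(event);
			}
		}
	}
}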

If your simulation runs so slowly that it takes longer than one simulation frame to simulate, you will obviously be in trouble, so you may want to detect that and quit the game, let the user know, or something like that, if that happens a lot in the current game instance. After all, SOMEONE will try to run the machine on an old Pentium MMX 120 MHz they have sitting in a closet.

enum Bool { True, False, FileNotFound };

Generally you don't want to use a timeout, because it will block the network thread; having one thread per player would normally not be efficient.

Also the requirements of your game determine what should happen if a player does not reply on time, which is something you shouldn't handle in your network layer

There can be all kinds of reasons messages could arrive or be sent later:

- Wifi/network is unstable

- Server congestion

- User switching to another application and then coming back

- User has some older hardware

Your game should be able to handle all of these without throwing out players just because of a single lag spike.


On 5/10/2018 at 12:28 AM, R-type said:

Generally you don't want to use a timeout, because it will block the network thread; having one thread per player would normally not be efficient.

I figured as much. But I have no idea what else to do. Everything I try, the server eventually falls behind.

I improved the situation somewhat in my actual game by fixing my interpolation buffer implementation. I render 5 frames behind the server, and yet the client catches up after 20 frames or so. Changing the number of frames behind does not stop the client from catching up, but it does delay it somewhat. All of this is still local to my machine.

It is quite infuriating, as I'm sure anybody making a real-time multiplayer game has solved the issue I'm having, but I'm floundering. Do other games have a non-zero timeout? That certainly would make things easier. If not, what the heck am I doing wrong here?

Quote

Everything I try, the server eventually falls behind.

You need to use a high accuracy clock to determine what the current time is. QueryPerformanceCounter() is fine; GetSystemTime(), GetTickCount(), or timeGetTime() are not as good.

#include <cstdint>
#include <windows.h>

//assumed simulation tick rate; use whatever your game actually runs at
static const int32_t TICKS_PER_SECOND = 60;

int64_t baseValue;
double multiplier;
int32_t baseGameTime;

void InitClock(int32_t timeNowInGameTicks) {
    ::QueryPerformanceCounter((LARGE_INTEGER *)&baseValue);
    int64_t pcc;
    ::QueryPerformanceFrequency((LARGE_INTEGER *)&pcc);
    multiplier = 1.0 / pcc;
    baseGameTime = timeNowInGameTicks;
}

double TimeInSeconds() {
    int64_t pcc;
    ::QueryPerformanceCounter((LARGE_INTEGER *)&pcc);
    return (pcc - baseValue) * multiplier;
}

int32_t TimeInGameTicks() {
    return (int32_t)(TimeInSeconds() * TICKS_PER_SECOND) + baseGameTime;
}

void AddDeltaTicks(int32_t deltaTicks) {
    baseGameTime += deltaTicks;
}

Once you can measure seconds accurately, you should establish a "baseline" timer value, where you know the "baseline" game tick number. Then you derive the tick number you're supposed to be at by measuring the distance in seconds from base time to now, and multiplying by tick rate, and adding the base tick number.

You do not want to re-initialize the base time too often, because each time you do, you may "miss" or "gain" some fraction of a tick. Instead, adjust the output tick clock by full ticks only, by adjusting the baseline tick value.

Now, if the server tells you your input arrived X ticks too late, you should add X+2 to the base game time, and then set a flag that means "don't listen to the server for the next second." If the server tells you your input arrived X ticks too early, you should add 2-X to the base tick value, and again set a flag that means "don't listen to the server for the next second." This is to avoid oscillation in clock correction. The server should only tell you about being too early once you're more than 4 ticks early.

The values 2 and 4 can be tweaked, as can the value "one second," but those are good baseline values for games that send messages 10-60 times a second and simulate 30-120 times a second.
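
As a sketch of how that rule could be wired up, reusing TimeInSeconds() and AddDeltaTicks() from the snippet above (the function name and the shape of the server's report are my assumptions):

double ignoreCorrectionsUntil = 0.0;

//called when the server reports our input arrived ticksOff ticks
//too late (late == true) or too early (late == false)
void OnServerTimingReport(int32_t ticksOff, bool late) {
    if (TimeInSeconds() < ignoreCorrectionsUntil) {
        return; //still settling from the last correction
    }
    if (late) {
        AddDeltaTicks(ticksOff + 2); //jump forward past the deadline, plus margin
    } else {
        AddDeltaTicks(2 - ticksOff); //fall back, keeping a 2-tick margin
    }
    ignoreCorrectionsUntil = TimeInSeconds() + 1.0; //ignore reports for one second
}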

The game simulation loop is simple:

int32_t simulatedTimeInTicks;

void MainLoopSimulate() {
    int32_t ticksNow = TimeInGameTicks();
    if (ticksNow - simulatedTimeInTicks > 10) {
        //more than 10 ticks behind; don't try to catch up all at once
        warning("too big a time jump -- skipping simulation");
        simulatedTimeInTicks = ticksNow;
    }
    while (ticksNow - simulatedTimeInTicks > 0) {
        StepSimulationOneTick();
        simulatedTimeInTicks++;
    }
}

enum Bool { True, False, FileNotFound };

1 hour ago, hplus0603 said:

You need to use a high accuracy clock to determine what the current time is. QueryPerformanceCounter() is fine; GetSystemTime(), GetTickCount(), or timeGetTime() are not as good.

The chrono library's high_resolution_clock can get to nanosecond precision, so I'm not entirely convinced there's a problem there.

As for the rest of your post: I must admit I'm having a little trouble understanding it. It seems like way too much to accomplish something that should be so simple. However, from what I think I understand, you believe my server falls behind and continues to lose time in a downward spiral, right? That's not really what's happening: a frame arrives late, yes, but then the client correctly receives frames again for a few frames, then drops one again later, ad infinitum. In fact, after the initial failure, it fails again every 4 frames. A very distressing pattern.

My interpolation buffer delays this for a bit, but that's all it's done. Perhaps some sleep will help me piece it together.

Quote

chrono library's high_resolution_clock

Sure! Assuming the base clock is well chosen, then that will work fine. I just indicated one possible implementation.

Quote

It seems like way too much to accomplish something that should be so simple.

Distributed systems are never simple.

Real-time systems are never simple.

Distributed, real-time systems (like games) are never simple.

That doesn't mean that you must build ultra-complex solutions, but I don't see what's complex about "establish a baseline, measure time since baseline, divide time into ticks using a consistent mechanism".

Assuming client and server clocks proceed at a rate of one second per second (which is a fair assumption most of the time), this will keep client and server in good sync, once they establish an appropriate offset. The necessary adjustments come from TimeInGameTicks() and MainLoopSimulate(), and the way that ticking the simulation from the main loop may simulate zero, one, or more ticks, based on what the current offset is.

Quote

it fails again every 4 frames

What does your profiler tell you happens during these frames? Do you measure the absolute time (in milliseconds) for each iteration through your main loop, each graphics frame, and each simulation frame, and log outliers?
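
For example, a minimal outlier logger might look like this sketch (the struct name and the 20 ms threshold are arbitrary):

#include <chrono>
#include <cstdio>

struct ScopeTimer {
    std::chrono::steady_clock::time_point start;

    void begin() { start = std::chrono::steady_clock::now(); }

    //log only the outliers, so the log stays readable
    void end(const char* label, double warnMs = 20.0) {
        double ms = std::chrono::duration<double, std::milli>(
            std::chrono::steady_clock::now() - start).count();
        if (ms > warnMs) {
            fprintf(stderr, "%s took %.2f ms\n", label, ms);
        }
    }
};

Wrap each phase of the main loop in begin()/end() calls, and the outliers will point at where the stalls happen.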

enum Bool { True, False, FileNotFound };

This topic is closed to new replies.
