Quote:So in the end nobody will wait for each other's input, so predictions must be made?
There are cases where you can't lock-step without latency. For example, if there's a gold coin on the ground, and you and I are both close to it and both try to pick it up, and picking up is an "instant" action (like avatar movement), then both of our clients will predict that we get the coin. The server breaks the tie (one of us gets the coin) and sends a correction to the other. The window during which you may get corrected is the round-trip time between you and the server.
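A minimal sketch of that flow, in Python. Everything here (the `Client`/`Server` classes, first-request-wins tie-breaking) is an illustrative assumption, not any real engine's API:

```python
class Client:
    """One player's client, optimistically predicting its own actions."""

    def __init__(self, name):
        self.name = name
        self.has_coin = False

    def predict_pickup(self):
        # Optimistic prediction: show the pickup succeeding immediately,
        # without waiting for the server.
        self.has_coin = True

    def apply_correction(self, winner_name):
        # The authoritative result arrives one round-trip later; roll the
        # prediction back if we lost the tie.
        self.has_coin = (self.name == winner_name)


class Server:
    """Authoritative tie-breaker for contested actions."""

    def resolve_pickup(self, requests):
        # Break the tie deterministically; here, the first request to
        # arrive wins (a simplifying assumption).
        return requests[0]


alice, bob = Client("alice"), Client("bob")
alice.predict_pickup()  # both clients instantly show "I got the coin"
bob.predict_pickup()

# Suppose alice's request reached the server first.
winner = Server().resolve_pickup(["alice", "bob"])
for c in (alice, bob):
    c.apply_correction(winner)

print(alice.has_coin, bob.has_coin)  # True False: bob saw a brief mispredict
```

Both players see an instant pickup; only the loser ever sees a correction, and only for about one round trip.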
Another option for solving the same problem (rather than sending corrections and time travel) is to design latencies into actions. If picking up were an action that took 500 milliseconds, the server would break the tie before either of us actually picks up the thing. We've designed our system to keep enough state for time travel that both of us could be on separate continents, playing on a server in a third continent, over dial-up modems, and the system would still work. That takes a bit of memory, but our take is that memory on the clients is cheaper than bits of bandwidth.
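The condition that makes the designed-in latency work can be stated in a few lines. The numbers and function name below are illustrative assumptions, not values from any real system:

```python
PICKUP_DELAY_MS = 500  # action delay chosen by design (the 500 ms above)
ROUND_TRIP_MS = 300    # worst-case client <-> server round trip


def delay_hides_tie_break(action_delay_ms, rtt_ms):
    # If the server's tie-break (one round trip) completes before the
    # action's built-in delay elapses, no client ever has to show a
    # predicted outcome, so no correction or time travel is needed.
    return rtt_ms <= action_delay_ms


print(delay_hides_tie_break(PICKUP_DELAY_MS, ROUND_TRIP_MS))  # True
```

The design trade is explicit: pick the action delay longer than the worst round trip you want to support, and contention on that action is resolved invisibly.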
In our model, for each kind of interaction, you can choose points along this spectrum from instantaneous-with-possible-time-travel through latency-and-guaranteed-consistent. For things that are unlikely to have contention (avatar movement), clearly choose the instant response. For things that involve many players (applying brakes to a train, or pushing the "go" button on an elevator), you sometimes choose to accept some latency, and hide it with feedback animations/sounds.
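One way to picture that per-interaction choice is a policy table, keyed by interaction type. This is a sketch under assumed names; the interaction list and return strings are illustrative:

```python
# Two ends of the spectrum described above.
PREDICT = "instant-with-possible-correction"
CONFIRM = "latent-but-guaranteed-consistent"

# Each interaction picks its own point on the spectrum.
POLICY = {
    "avatar_movement": PREDICT,   # contention is unlikely; predict freely
    "pick_up_item":    PREDICT,   # rare conflicts; corrections are cheap
    "train_brakes":    CONFIRM,   # many players share the outcome
    "elevator_button": CONFIRM,   # hide the wait behind animation/sound
}


def handle_input(action):
    if POLICY[action] == PREDICT:
        return "apply locally now; roll back if the server disagrees"
    return "play feedback animation; apply when the server confirms"


print(handle_input("avatar_movement"))
print(handle_input("elevator_button"))
```

The point of the table is that the choice is per interaction, not global: a single game mixes both strategies freely.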