Time Inconsistency Issue

6 comments, last by choffstein 15 years, 4 months ago
Okay, so to finally teach myself C#, I thought I would take on a market order matching system, or something like that. Basically, I downloaded a lot of daily tick data for the e-mini futures. If you don't know what I'm talking about, don't worry; it isn't an issue.

I want to test some strategies on this data, so my idea was to create a server that would 'host' the data in real time, and a client (or multiple clients, if I so chose) that could connect. This way I would have a simulation server. But I have run into an issue: to test, I don't want to experience 'real time'. If my data set covers an entire day, I don't want to run the program for an entire day; I want to run it at high speed. My problem is, in speeding up the data, how do I keep the 'clients' in sync, now that their connection latency becomes a bottleneck?

My idea was to allow clients to subscribe to the server, so that any time a new tick occurred, the client would get the data. But if the server is reading through the ticks as fast as possible, the speed at which clients can respond puts them seriously behind the curve. By the time a client gets the tick, calculates its action, and responds, the server may be minutes ahead, even if the client only took a couple hundred milliseconds to respond (because the server isn't doing any computations). This becomes even more complicated when I want to have ten or twenty clients running at once.

I had a couple of thoughts:

1) Force the server to wait for the client responses before reading the next tick. Since the clients won't actually affect the data, the server could just wait for either an Action or NoAction packet. This is totally unrealistic, but the simplest solution.

2) Some sort of 'frames per second' requirement: only allow the server to work through X ticks per second, or something similar. This sort of works, but if we allow it to go too fast, the connection lag between clients becomes too large relative to the tick rate, and we run into the initial issue.

Any ideas?
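For reference, a bare-bones sketch of the server shape being described here, with every name (the string ticks, TickServer, the replay interval) invented for illustration rather than taken from the post. It just pushes recorded ticks to every subscribed client; the Thread.Sleep is the "ticks per second" throttle from option 2, and shrinking it is exactly what exposes the latency problem. AcceptClients would run on a background thread while Replay drives the data.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Net;
using System.Net.Sockets;
using System.Threading;

// Bare-bones sketch of the "simulation server" idea: recorded ticks are
// replayed to every subscribed client. All names and the replay interval
// are illustrative assumptions, not from the post.
public class TickServer
{
    private readonly List<StreamWriter> _clients = new List<StreamWriter>();
    private readonly object _lock = new object();

    public void AcceptClients(int port)
    {
        var listener = new TcpListener(IPAddress.Any, port);
        listener.Start();
        while (true)
        {
            var socket = listener.AcceptTcpClient();                   // a client subscribes
            var writer = new StreamWriter(socket.GetStream()) { AutoFlush = true };
            lock (_lock) _clients.Add(writer);
        }
    }

    public void Replay(IEnumerable<string> recordedTicks, TimeSpan interval)
    {
        foreach (var tick in recordedTicks)
        {
            lock (_lock)
                foreach (var client in _clients)
                    client.WriteLine(tick);                            // push tick to all subscribers
            Thread.Sleep(interval);                                    // option 2: throttle the replay
        }
    }
}
```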
As I see it, the basic problem is that the server wants to advance the simulation past any timestep that no client will care about and/or will take no action in. But the server doesn't know which timesteps these are without interrogating the client per step, which defeats the purpose.

First of all, I think (1) is the right basic approach. I disagree that it's inherently unrealistic: If there's a timeout, a client stands to lose if he doesn't get back to you in a decent period of time. Then, just go on to make that system actually realistic.

The simplest solution, I think, is to allow clients to voluntarily "sleep" for a set number of timesteps, during which time they get no updates. If all clients are asleep and the next wakeup isn't for N timesteps, you can auto-advance N-1 steps before waking up the client. Getting a bit more complicated, you can allow clients to specify early wakeup conditions (stock X drops below price Y, for instance) which are checked on the server.

The basic way to make that work, I think, is to give clients a network budget. A client can choose not to sleep, but once he's received his allotted 1000 syncs for the day, that's all he gets; the market closes early for him.
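A minimal sketch of the sleep/wakeup/budget idea above, assuming hypothetical Tick, WakeupCondition, and ClientSession types (none of these names come from the thread). The server auto-advances past any tick no client wants, and each delivered update costs the client part of its budget.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical tick record; field names are illustrative, not from the post.
public record Tick(int Step, string Symbol, decimal Price);

public class WakeupCondition
{
    public string Symbol { get; init; } = "";
    public decimal PriceBelow { get; init; }     // e.g. "stock X drops below price Y"

    public bool IsTriggered(Tick tick) =>
        tick.Symbol == Symbol && tick.Price < PriceBelow;
}

public class ClientSession
{
    public int SleepUntilStep { get; set; }                      // no updates before this step
    public List<WakeupCondition> Wakeups { get; } = new();
    public int SyncBudget { get; set; } = 1000;                  // "network budget" for the day

    public bool WantsUpdate(Tick tick)
    {
        if (SyncBudget <= 0) return false;                       // market closes early for this client
        return tick.Step >= SleepUntilStep
            || Wakeups.Any(w => w.IsTriggered(tick));            // early wakeup, checked on the server
    }
}

public static class Server
{
    // Auto-advance through every tick nobody cares about; only ticks that
    // wake at least one client cost that client part of its budget.
    public static void Run(IEnumerable<Tick> ticks, List<ClientSession> clients)
    {
        foreach (var tick in ticks)
        {
            foreach (var c in clients.Where(c => c.WantsUpdate(tick)))
            {
                c.SyncBudget--;
                // send the tick to this client and (optionally) wait for its reply here
            }
        }
    }
}
```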
Why do clients need to stay in sync? Why is it a problem if one client finishes in 2 minutes while another needs 20?

In the real-time case, each client either has, on average, adequate computing power to keep up with the stream, or it permanently falls behind.

Or is this a question about how the server should implement send(), so it doesn't flood individual clients?
Quote:Original post by Sneftel
As I see it, the basic problem is that the server wants to advance the simulation past any timestep that no client will care about and/or will take no action in. But the server doesn't know which timesteps these are without interrogating the client per step, which defeats the purpose.

First of all, I think (1) is the right basic approach. I disagree that it's inherently unrealistic: If there's a timeout, a client stands to lose if he doesn't get back to you in a decent period of time. Then, just go on to make that system actually realistic.

The simplest solution, I think, is to allow clients to voluntarily "sleep" for a set number of timesteps, during which time they get no updates. If all clients are asleep and the next wakeup isn't for N timesteps, you can auto-advance N-1 steps before waking up the client. Getting a bit more complicated, you can allow clients to specify early wakeup conditions (stock X drops below price Y, for instance) which are checked on the server.

The basic way to make that work, I think, is to give clients a network budget. A client can choose not to sleep, but once he's received his allotted 1000 syncs for the day, that's all he gets; the market closes early for him.


I think I will have to go with 1, which isn't necessarily bad ... it is just unrealistic. The rest of your ideas seem sound ... I will let them rattle around a bit in my head and see if I can come up with why they wouldn't work.

Quote:Original post by Antheus
Why do clients need to stay in sync? Why is it a problem if one client finishes in 2 minutes while another needs 20?

In the real-time case, each client either has, on average, adequate computing power to keep up with the stream, or it permanently falls behind.

Or is this a question about how the server should implement send(), so it doesn't flood individual clients?


Let's say there is one tick per second in real time. This means that if I am trading and my latency is 500ms, it isn't a big deal; I catch the next tick. But now let's say the server is simulating at 1000 ticks per second ... now we have an issue, because I am 500 ticks behind by the time my order gets there. Do you see the issue now? By increasing the speed of the simulation, I create a latency issue for the connecting clients.

Even in the case where the server caters to just a single client, or to multiple clients each with their own time stream, I run into the same issue. Unless the server WAITS for a client response, it may move at unrealistic speeds that make client orders seem INCREDIBLY slow.



Thanks for the input guys.
Quote:Original post by visage

Let's say there is one tick per second in real time. This means that if I am trading and my latency is 500ms, it isn't a big deal; I catch the next tick. But now let's say the server is simulating at 1000 ticks per second ... now we have an issue, because I am 500 ticks behind by the time my order gets there. Do you see the issue now? By increasing the speed of the simulation, I create a latency issue for the connecting clients.


At time T, the server sends out quote(T).
Does quote(T+1) depend on clients in any way whatsoever?

If yes, then you have absolutely no choice but to wait for all clients before sending it out.

If quote(T+1) doesn't depend on clients, then each client may work on its own.

Your data is already stored ("I downloaded a whole lot of daily tick data"), so memory isn't an issue. It would only be a problem if you were generating quotes randomly and wanted all clients to receive the same ones.

Or is your question related only to:
```
C          S
|<- send --|
|          |  10ms
|          |  10ms
|-- done ->|  10ms
|          |  10ms   // network latency / RTT
|<- send --|  10ms
|          |
```
In the above case, while the latency is low in absolute terms, it's still large compared to the actual computation time, and that is the overhead you want to minimize or eliminate.
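If quote(T+1) does depend on the clients, the lockstep case described above might look roughly like this; ClientConnection, SendQuote, and WaitForReply are hypothetical placeholders for the real socket code:

```csharp
using System;
using System.Collections.Generic;

// Lockstep sketch: the server may not emit quote(T+1) until every client
// has answered for quote(T). All types here are illustrative stand-ins.
public record Quote(int Step, decimal Price);

public static class LockstepServer
{
    public static void Run(IReadOnlyList<Quote> quotes, IReadOnlyList<ClientConnection> clients)
    {
        foreach (var quote in quotes)
        {
            foreach (var client in clients)
                client.SendQuote(quote);                        // broadcast quote(T)

            foreach (var client in clients)
                client.WaitForReply(TimeSpan.FromSeconds(1));   // Action or NoAction, with a timeout

            // only now does the simulation advance to quote(T+1);
            // the per-step cost is dominated by the slowest round trip,
            // which is exactly the overhead shown in the diagram above
        }
    }
}

// placeholder client; a real one would wrap a TcpClient / NetworkStream
public class ClientConnection
{
    public void SendQuote(Quote q) { /* write q to the socket */ }
    public void WaitForReply(TimeSpan timeout) { /* blocking read with timeout */ }
}
```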
Quote:Original post by Antheus
At time T, the server sends out quote(T).
Does quote(T+1) depend on clients in any way whatsoever?

If yes, then you have absolutely no choice but to wait for all clients before sending it out.

If quote(T+1) doesn't depend on clients, then each client may work on its own.


Well, originally, I was going to have it depend on the clients ... but just to simplify things, it probably won't.

The issue is that quote(T) and quote(T+1) are separated by an inconsistent time step. Now, for a client to find out what price they got the contract at, they need to place a buy order on the server, which 'fills' orders based on the order book at the time. The idea is that the market continues to move even while the client is calculating, and the client gets filled at whatever level the market is at when the order is submitted. Now, this isn't an issue if I simulate in REAL time. But if I want to speed up the simulation, so that I can run multiple years in the span of a few minutes, the time gap between quote(T) and quote(T+1) becomes minuscule -- so minuscule that the client will never get filled near a price they would get filled at in a realistic simulation.

I think the best solution is to just have the client send back a message on each tick, telling the server if it wants to do anything or not. If nothing, the server can send out the next tick. Then I can just use some sort of random factor to determine what clients get filled at, so they don't necessarily get filled at the next tick.
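A rough sketch of that random-fill idea, assuming a made-up 1-5 tick delay window and illustrative Order/Fill types (nothing here is from the thread beyond the idea itself): the order is filled a few ticks after it was placed, standing in for the latency the sped-up clock no longer models.

```csharp
using System;

// Hypothetical order/fill shapes for illustration only.
public record Order(string Symbol, int Quantity, int PlacedAtTick);
public record Fill(Order Order, decimal Price, int FilledAtTick);

public class FillSimulator
{
    private readonly Random _rng = new Random();
    private readonly decimal[] _tickPrices;        // recorded price per tick index

    public FillSimulator(decimal[] tickPrices) => _tickPrices = tickPrices;

    public Fill FillMarketOrder(Order order)
    {
        // pretend the order arrived 1-5 ticks after it was placed
        int delay = _rng.Next(1, 6);
        int fillTick = Math.Min(order.PlacedAtTick + delay, _tickPrices.Length - 1);
        return new Fill(order, _tickPrices[fillTick], fillTick);
    }
}
```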
Quote:Original post by visage

Now, this isn't an issue if I simulate in REAL time. But if I want to speed up the simulation, so that I can run multiple years in the span of a few minutes, the time gap between quote(T) and quote(T+1) becomes minuscule -- so minuscule that the client will never get filled near a price they would get filled at in a realistic simulation.


Technically, even if you speed the simulation up, you're still on wall-clock time; the computers have effectively just become much slower.

Since the clients don't depend on each other, an accurate solution is quite possible.

The server manages each client independently. The messages it sends out are wrapped in a simulation-time envelope: (server_time, real_data). When a client connects, the server prepares N messages simulating real-time events, each carrying the relevant timestamp (0.1, 0.2, ..., 0.1*N), and sends those N messages over the network to the client.

The client's network handler actively reads those messages and stores them in a local list as they arrive.

Meanwhile, the client's simulation uses its own time step to handle messages. It consumes messages 0.1 and 0.2, sends the server a notification saying "I've completed 0.2", and starts processing the data. The client then reads what it considers the next time step; let's say it consumes 0.3 through 0.5.

The server has now received the client's 0.2 notification, so it removes the first two events from its list and fills the empty slots with the next two events, which it sends to the client.

If the client runs out of messages, the server has fallen behind in wall-clock terms. The client blocks until more messages are available; from the simulation's perspective, nothing has happened. When enough new events arrive over the network, the client unblocks.

The reason for keeping buffers on both the client and the server is to solve the problem of network latency. If you speed up the simulation 1000 times, your 10 ms ping effectively becomes 10 seconds. Buffering and batched sends solve that, unless there's a huge difference in processing rates.

```
C               CQ             S
|-read 0.0-0.2->|<-- send 0.1--|  // send up to 0.5
|*blocked*      |<-- send 0.2--|
|<- got 0.0-0.2-|<-- send 0.3--|
|*work*         |-- done 0.2-->|
|*work*         |<-- send 0.4--|  // receives done 0.2, send up to 0.5 + 0.2
|*work*         |<-- send 0.5--|
|*work*         |<-- send 0.6--|
|-read 0.3-0.5->|<-- send 0.7--|
|<- got 0.3-0.5-|              |
|*work*         |-- done 0.5-->|
|*work*         |              |  // receives done 0.5, send up to 0.7 + 0.3
|*work*         |<-- send 0.8--|
|*work*         |<-- send 0.9--|
....
```
C - client
CQ - client queue
S - server

There is no need for an actual list on the server; it just needs to know the time of the last event it sent and how far ahead it is allowed to send.

CQ is an abstraction of real time on the client. Because the batched sends are asynchronous, network latency becomes a non-issue. The simulation runs as fast as the network allows, but each client determines for itself how fast to advance its virtual wall clock. The server is oblivious to all of that; it just makes sure each client has up to N events queued, as far as network bandwidth allows.

While the simulation isn't advanced based on the wall clock, the client perceives it as if it were. Regardless of whether it blocks while waiting to read, or whether the server is ahead or behind, the client sees the data as if it were arriving in real time; better yet, it decides for itself how far to advance.

Since each client's 'read' is now a local call, the 'network latency' becomes the cost of a function call, which solves the problem of the network running on the wall clock while the simulation isn't.

This of course assumes the client is simulated as processing every message; it's also possible to simulate an underpowered client that misses some events because it can't keep up.

This approach is your option 2), but implemented in a way that solves the problems you mentioned. It's also the basic technique behind network bandwidth throttling.
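A sketch of how this buffered, batched scheme could look, with the network replaced by two in-process queues so only the flow control shows; the window size, ack frequency, and all names are assumptions rather than anything from the post:

```csharp
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;

// Events wrapped in a simulation-time envelope, as described above.
public record SimEvent(double SimTime, decimal Price);

public class BufferedFeed
{
    private const int Window = 5;     // server keeps each client at most this many events ahead

    // these two queues stand in for the network: S -> CQ and C -> S in the diagram
    private readonly BlockingCollection<SimEvent> _toClient = new();
    private readonly BlockingCollection<double> _acks = new();     // "done up to time X"

    public void RunServer(IReadOnlyList<SimEvent> recorded)
    {
        int sent = 0;

        // prime the client's queue with the first Window events
        while (sent < Window && sent < recorded.Count)
            _toClient.Add(recorded[sent++]);
        if (sent == recorded.Count) { _toClient.CompleteAdding(); return; }

        // each time the client reports progress, top its queue back up;
        // no list is needed, only how far we've sent and how far the client has gotten
        foreach (double doneUpTo in _acks.GetConsumingEnumerable())
        {
            int acked = recorded.TakeWhile(e => e.SimTime <= doneUpTo).Count();
            while (sent < acked + Window && sent < recorded.Count)
                _toClient.Add(recorded[sent++]);

            if (sent == recorded.Count) { _toClient.CompleteAdding(); break; }
        }
    }

    public void RunClient()
    {
        const int ackEvery = 2;       // report progress every couple of consumed events
        int processed = 0;

        foreach (var ev in _toClient.GetConsumingEnumerable())    // blocks only if the server lags
        {
            // ... run the trading strategy against ev here; from the strategy's point of
            // view this read is a local call, so network latency has effectively vanished ...
            processed++;
            if (processed % ackEvery == 0)
                _acks.Add(ev.SimTime);                            // "I've completed up to ev.SimTime"
        }
        _acks.CompleteAdding();
    }
}
```

In a real setup RunServer and RunClient would live in separate processes with a socket between them; here the two BlockingCollections play the roles of the network link and the client queue (CQ), so the flow control can be exercised by starting both methods on separate tasks.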
Quote:Original post by Antheus
...very intriguing method...


I like it Antheus. Thanks for the input. I'll do some research on network throttling.

Corey
