Threading against delays from I/O operations - Is it worthwhile?


I'm writing a multi-user (server/client) network game in C++, and performance is critical to me. For debugging and maintenance, my application writes a fair number of log/debug messages at runtime, both to the console and to various log files on the hard disk. It is common knowledge that disk I/O is quite slow compared to processor cycles, so I've been worrying about the delay that writing these log messages may cause.

The most obvious solution that came to mind is to create a separate "logging thread" that takes care of the disk and console output. The main thread would pass log messages to the other thread via a synchronized queue and thus operate without being delayed by the I/O operations.

So much for the theory. It might sound reasonable, but I'm asking myself: is it really worthwhile? If I implement this concept, there won't be any delay from the I/O operations; however, additional time must be spent managing the synchronized queue and running the separate thread. It is hard for me to weigh the pros and cons, as I have no clue what the proportions are. I mean, how slow *is* disk output, really? What about console output? How much extra CPU time is an additional thread going to eat up? And let's not forget the delays caused by synchronization.

I looked around the web but didn't find clear answers to my question, so I had hoped that some people around here may already have experience with this. Have you ever made a separate thread just for logging, maybe even a separate process/component? If not, how did you handle it?

Thanks for the help,
AV
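For illustration, the synchronized-queue design described above can be sketched with modern C++ primitives (std::mutex and std::condition_variable postdate this thread; the class and method names here are purely illustrative, not anyone's actual implementation):

```cpp
// Sketch of a synchronized message queue between a main thread (producer)
// and a logging thread (consumer). Assumes C++11 or later.
#include <condition_variable>
#include <mutex>
#include <queue>
#include <string>

class LogQueue {
public:
    // Called by the main thread: cheap, no disk I/O involved.
    void push(std::string msg) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            messages_.push(std::move(msg));
        }
        ready_.notify_one();  // wake the logging thread if it is waiting
    }

    // Called by the logging thread: blocks until a message arrives.
    std::string pop() {
        std::unique_lock<std::mutex> lock(mutex_);
        ready_.wait(lock, [this] { return !messages_.empty(); });
        std::string msg = std::move(messages_.front());
        messages_.pop();
        return msg;
    }

    bool empty() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return messages_.empty();
    }

private:
    mutable std::mutex mutex_;
    std::condition_variable ready_;
    std::queue<std::string> messages_;
};
```

The logging thread would loop on pop() and write each message to disk and console, while the main thread only ever pays the cost of a lock and a string move.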

Overall I find threading useful. If the thread exists only while there is something to log, it won't make much difference and will prevent your app from bouts of unresponsiveness (if it has to wait a long time for the HDD; not a big consideration otherwise). If the thread is operating all the time (e.g. spinning in a loop), it will grab half your CPU time (bad).

Yeah, responsiveness is a good point. I've seen, for example, that some types of consoles block write operations when the user clicks the mouse inside them. A thread provides safety against that sort of thing.

I would put my log thread to sleep at regular intervals to avoid unnecessarily high CPU consumption. It would wake up, process all pending messages, then go back to sleep, and so on.
When the output calls are blocking, that should also reduce the CPU load, much like sleeping.
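The "wake up, process everything pending, go back to sleep" loop can be sketched like this: a condition variable with a timeout instead of a plain sleep, so the thread also wakes promptly when messages arrive (names here are illustrative, not from the thread):

```cpp
// Producer/consumer pair where the consumer drains the queue in batches.
// Assumes C++11 or later.
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <queue>
#include <string>
#include <vector>

std::mutex log_mutex;
std::condition_variable log_ready;
std::queue<std::string> pending;

// Producer side: cheap enqueue plus a wakeup signal.
void enqueue(std::string msg) {
    {
        std::lock_guard<std::mutex> lock(log_mutex);
        pending.push(std::move(msg));
    }
    log_ready.notify_one();
}

// Consumer side: wait up to `timeout` for work, then drain the whole queue
// in one batch so the lock is held only briefly; the actual disk/console
// output happens after the lock is released.
std::vector<std::string> drain(std::chrono::milliseconds timeout) {
    std::vector<std::string> batch;
    std::unique_lock<std::mutex> lock(log_mutex);
    log_ready.wait_for(lock, timeout, [] { return !pending.empty(); });
    while (!pending.empty()) {
        batch.push_back(std::move(pending.front()));
        pending.pop();
    }
    return batch;
}
```

Compared to a fixed-interval sleep, the wait_for variant consumes essentially no CPU while idle but still reacts immediately to new messages.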


Thanks for the reply.

I think the extra overhead of passing the messages through a synchronized queue can be very small.

To determine whether or not it is worthwhile to log the messages in another thread, you can do a couple tests without having to actually implement it.

First, test the performance of your system with logging turned off completely. Anything you do to optimize message logging will never be better than this. If turning off logging completely does not improve the performance sufficiently, then there is no reason to optimize logging messages. If it does improve the performance, then spending more time on the problem may be justified.

Second, with message logging still turned off, add another thread that simply writes data to disk at about the same rate as your system would if it were logging messages. Make sure the two threads are completely independent. Anything you do to optimize message logging in a separate thread will never be better than this. If the performance of this case is not sufficiently better than the single-threaded performance, then there is no reason to create another thread.

Also, keep in mind that the results depend a lot on the amount of data being logged. You need to get a good idea of how much data is actually going to be logged in the production system.
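A minimal harness for both baseline measurements might look like this (a sketch; the workload you pass in stands for your game loop with logging compiled out, or for the dummy writer thread):

```cpp
// Times an arbitrary piece of work with a monotonic clock. Run it once on
// the no-logging build, once alongside an independent dummy-writer thread,
// and compare the two numbers before committing to a threaded logger.
#include <chrono>
#include <functional>

double seconds_elapsed(const std::function<void()>& work) {
    const auto start = std::chrono::steady_clock::now();
    work();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}
```

steady_clock is the right choice here rather than system_clock, since wall-clock adjustments would corrupt the measurement.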

Unless you write out a LOT (hundreds or thousands) of debug lines per frame, I don't think you're going to gain much by threading your log file I/O. And if you do write out a lot of stuff per frame, consider revising that decision before multi-threading something as trivial as log files.

File I/O already employs some pretty efficient buffering, so not every write to the log actually hits the disk, unless you force it to with flush() or the like. I believe you can even adjust the size of the buffer used for I/O if you want it to hit the disk less often.
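For reference, both stdio and iostreams do let you supply a larger buffer yourself (a sketch; the file names and the 64 KiB size are arbitrary choices, not recommendations from the thread):

```cpp
// Two ways to enlarge a stream's buffer so writes hit the disk less often.
#include <cstdio>
#include <fstream>

// stdio: install a 64 KiB fully-buffered buffer with setvbuf.
bool open_buffered_stdio_log(const char* path) {
    static char buf[64 * 1024];
    FILE* f = std::fopen(path, "w");
    if (!f) return false;
    std::setvbuf(f, buf, _IOFBF, sizeof buf);  // must precede any I/O on f
    std::fputs("log line\n", f);
    std::fclose(f);
    return true;
}

// iostreams: same idea via pubsetbuf, which must be set before open().
bool open_buffered_ofstream_log(const char* path) {
    static char buf[64 * 1024];
    std::ofstream log;
    log.rdbuf()->pubsetbuf(buf, sizeof buf);
    log.open(path);
    if (!log) return false;
    log << "log line\n";
    return true;
}
```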

Write some amount of data to a file and time how long it takes. Divide the amount you wrote by the time it took. Then estimate how much data you will write to your log per unit of time. Divide that by how much you can write per unit of time and that's roughly what percentage of your time you are going to spend writing data to the log file. If that is too high a percentage reduce how much data you write to the log file.
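The back-of-the-envelope estimate above can be sketched as follows (the path and chunk size are placeholders, and OS write caching means the measured rate will flatter the disk):

```cpp
// Writes `total_bytes` to `path` and returns the measured rate in bytes/sec.
// Divide your expected log volume per second by this number to estimate the
// fraction of runtime the game would spend writing the log.
#include <chrono>
#include <cstddef>
#include <fstream>
#include <string>

double measured_write_rate(const std::string& path, std::size_t total_bytes) {
    const std::string chunk(4096, 'x');
    std::ofstream out(path, std::ios::binary);
    const auto start = std::chrono::steady_clock::now();
    std::size_t written = 0;
    while (written < total_bytes) {
        out.write(chunk.data(), static_cast<std::streamsize>(chunk.size()));
        written += chunk.size();
    }
    out.flush();
    const double secs = std::chrono::duration<double>(
        std::chrono::steady_clock::now() - start).count();
    return secs > 0.0 ? written / secs : 0.0;
}
```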

This concern is actually backwards for a log file. The concern with log files is that the application crashes while the relevant data is still in a buffer and as a result you don't have the information needed to recover or diagnose the crash.

Thank you for all those replies so far. The recommendation to test how much data is written and to do some time measuring is reasonable. The only problem is that my game application is highly variable in regard to both CPU load and amount of log data written. So I decided to write a small stand-alone test application and I'd like to share the results.

First of all, I made a shocking discovery along the way:
The C++ standard output stream ("cout") appears to be incredibly slow when used in a thread while another thread is calculating at maximum load.
I have used the following setup:
Windows XP, a C++ application with two threads running simultaneously. Thread 1 does arbitrary calculations in an endless loop (so it consumes all the CPU power it can get). At the same time, thread 2 writes strings to "cout", in the fashion "cout << string;".

The result: thread 2 executes the output commands so slowly that I can easily count each character as it appears on the console. [looksaround]
I have tried this with different runtime libraries: multi- and single-threaded, debug and non-debug (MT, MTd, ML, MLd). The result is always the same.

printf does not suffer from that problem. Replacing "cout << string;" with "printf("%s", string.c_str());" makes console output run at warp speed under the exact same setup.
Isn't that weird? I was totally baffled. Anyway, I simply went with printf and continued my undertaking. File output streams (ofstream) did not show any similar problems either.
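One possible culprit (a guess; whether it explains the behaviour on that particular toolchain is not confirmed anywhere in this thread) is that cout is kept synchronized with C stdio by default, which adds locking and flushing on every operation. The standard provides a switch to turn that off:

```cpp
// Decouple the C++ iostreams from C stdio. After this call, cout no longer
// synchronizes with printf on each operation, which often speeds it up
// dramatically - but only do it if you don't interleave printf and cout.
#include <iostream>

bool speed_up_cout() {
    // Returns the previous synchronization setting.
    return std::ios::sync_with_stdio(false);
}
```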


So then I went on and extended my program. I made an interface to pass log messages from the "main thread" (thread 1) to the "log thread" (thread 2). The main thread would now alternately do a chunk of arbitrary calculations, then send a couple hundred lines of data to the log thread. In this way I hoped to roughly simulate what my game app would be doing.
I had a switch that let me run the exact same scenario without multithreading, so that the output was done directly instead of being passed to a separate thread.

My test results are the following (values are rounded to full seconds, since the millisecond digits varied between runs anyway):

All output data is written to the console:
single thread: 43 seconds --- with multithreading: 29 seconds

All output data is written into an ascii file:
single thread: 40 seconds --- with multithreading: 29 seconds

Please note: these results are in no way representative of other software projects; they might not even be representative of my own application. The results depend entirely on the proportion of CPU load generated versus the amount of log data written. Naturally, I chose a rather large amount of log data compared to the time the CPU spent in my arbitrary calculation routine.
Moreover, hardware (CPU and HD speed) surely makes a difference.
There are many other factors at play, such as how efficiently the synchronization and data passing between the two threads is done. If you doubt the credibility of my results, you are probably right. [wink]
Hence, please don't take my result times too seriously.

AV

If this logging is for debugging, then you shouldn't write it in a separate thread: you'll want every line that has been passed to the logger to be present in the log in the event of a crash, or it's not very useful.

Since you don't want to add to the bugs you're trying to find and fix, I highly recommend you don't put it in a separate thread, but leave it as is. Just tell your users that it is only a debug build and that performance will be better in the finished product.

If the amount of logging is huge, it will slow down other I/O to the same disk regardless of which thread it is in. Better to reduce the amount of logging: stop logging the asserts that just can't fail (by inspection), etc.

You could try using memory-mapped files. It's not very portable (i.e. Windows-specific) but it's very fast - you're basically replacing I/O with calls to memcpy. It also won't help with your console output, but I assume that's small compared to file I/O (at least, many lines/sec to the console is probably not very useful to you...)

I've been meaning to release some of the memory-mapped file logging classes that I've implemented (it's a mix of unmanaged C++ and a managed C++ wrapper, but could easily be used from a completely unmanaged environment) and I might do that in the next few days (I'll post again if I do).

Basically, you'll want to look up the CreateFileMapping and MapViewOfFile APIs: you pre-allocate a chunk of the file, map that chunk into memory, and memcpy to that memory. When it's full, you allocate another chunk of the file and map that into memory.
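For illustration, here is the same pre-allocate/map/memcpy pattern sketched with the POSIX equivalents (open/ftruncate/mmap), since the Windows calls named above play the same roles; this is an illustrative sketch, not the poster's implementation, and a real logger would map page-aligned chunks rather than remapping the whole file on every append:

```cpp
// Appends `len` bytes to the file at `path` by growing it (the
// "pre-allocate" step), mapping it, and memcpy-ing into the mapping.
// The write itself is just a memcpy; the OS flushes dirty pages to
// disk on its own schedule.
#include <cstddef>
#include <cstring>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

bool mmap_append(const char* path, const char* data, std::size_t len) {
    int fd = open(path, O_RDWR | O_CREAT, 0644);
    if (fd < 0) return false;
    struct stat st;
    if (fstat(fd, &st) != 0) { close(fd); return false; }
    const std::size_t old_size = static_cast<std::size_t>(st.st_size);
    const std::size_t new_size = old_size + len;
    // Pre-allocate: grow the file before mapping it.
    if (ftruncate(fd, static_cast<off_t>(new_size)) != 0) { close(fd); return false; }
    void* mem = mmap(nullptr, new_size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (mem == MAP_FAILED) { close(fd); return false; }
    std::memcpy(static_cast<char*>(mem) + old_size, data, len);  // the "I/O"
    munmap(mem, new_size);  // dirty pages are written back by the OS
    close(fd);
    return true;
}
```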

The OS takes care of ensuring the data is all written to the disk (and I've found that it's quite good at it, even in the event of a crash).

If you need more info, let me know and I'll post my code. I actually wrote it at work, so I might have to check with the boss to make sure it's OK, first though...

