File I/O and speed

Started by
4 comments, last by Evil Steve 19 years, 8 months ago
When loading from a file I've seen people do it two ways: load the whole thing into a buffer and then parse it from there, or read it piece by piece with read or whatever. What are the gains of using the buffer method? I know any kind of I/O can be slow, but is there a noticeable gain from loading the whole file in and parsing it that way? I've programmed for a long time, but it's funny, I've never profiled anything and I have no clue what's faster than something else except by basic algorithm analysis.
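For concreteness, the two styles being compared look roughly like this (a sketch only; the Record layout, file format, and function names are made up for illustration):

```cpp
#include <cstdint>
#include <cstdio>
#include <vector>

struct Record { std::uint32_t id; float x, y, z; }; // hypothetical record layout

// Style 1: many small reads -- one fread call per record.
std::vector<Record> LoadIncremental(const char* path)
{
    std::vector<Record> records;
    if (std::FILE* f = std::fopen(path, "rb"))
    {
        Record r;
        while (std::fread(&r, sizeof r, 1, f) == 1)
            records.push_back(r);
        std::fclose(f);
    }
    return records;
}

// Style 2: one big read -- slurp the whole file into a buffer,
// then parse it from memory.
std::vector<Record> LoadBuffered(const char* path)
{
    std::vector<Record> records;
    if (std::FILE* f = std::fopen(path, "rb"))
    {
        std::fseek(f, 0, SEEK_END);
        long size = std::ftell(f);
        std::fseek(f, 0, SEEK_SET);

        std::vector<char> buffer(static_cast<std::size_t>(size));
        if (std::fread(buffer.data(), 1, buffer.size(), f) == buffer.size())
        {
            const Record* first = reinterpret_cast<const Record*>(buffer.data());
            records.assign(first, first + buffer.size() / sizeof(Record));
        }
        std::fclose(f);
    }
    return records;
}
```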
Generally, it shouldn't matter much. Perhaps reading into a buffer is faster, perhaps it's slower. I wouldn't bother about it. Just load everything into memory during the start of your application/game and show a "Please wait while we steal your personal data and transmit it to our server" screen.

And by loading into memory, I mean, do it just the way you like it. There shouldn't be many speed issues, except that allocating the memory on the heap might take a lot of time (allocating memory is slow). And since you allocate memory twice (once for the raw data and once for the processed data, such as a bitmap) and clean up once, I'd say loading into memory is a tad bit slower.

EDIT: HAR HAR! I beat Skizz by 3 seconds [grin]

Toolmaker

Reading the whole file into a big buffer (or, at least, loading big chunks) merely decreases the number of calls to the OS's file I/O system. Calling the I/O system adds a big overhead, so doing it as little as possible will help. However, memory on modern OSes is virtual and can be paged, so loading a big file could cause some other data to be paged to disk, which will slow things down even further.
So, you're best off not worrying about it and just reading files bit by bit. If it really is a problem, then profile it to see whether disk I/O is what's slowing you down or whether it's the processing of the loaded data.
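A quick-and-dirty way to check that split (a sketch, not a substitute for a real profiler; ParseBuffer here is a stand-in for whatever processing your loader actually does):

```cpp
#include <chrono>
#include <cstdio>
#include <vector>

// Stand-in for the real parsing work (here: a trivial checksum).
unsigned ParseBuffer(const std::vector<char>& buffer)
{
    unsigned sum = 0;
    for (char c : buffer)
        sum += static_cast<unsigned char>(c);
    return sum;
}

// Time the raw disk read and the parsing separately,
// so you know which one is actually slow.
void TimeLoad(const char* path)
{
    using Clock = std::chrono::steady_clock;

    auto t0 = Clock::now();

    std::FILE* f = std::fopen(path, "rb");
    if (!f) return;
    std::fseek(f, 0, SEEK_END);
    long size = std::ftell(f);
    std::fseek(f, 0, SEEK_SET);
    std::vector<char> buffer(static_cast<std::size_t>(size));
    std::fread(buffer.data(), 1, buffer.size(), f);
    std::fclose(f);

    auto t1 = Clock::now();
    ParseBuffer(buffer);
    auto t2 = Clock::now();

    auto ms = [](Clock::duration d) {
        return static_cast<long long>(
            std::chrono::duration_cast<std::chrono::milliseconds>(d).count());
    };
    std::printf("read: %lld ms, parse: %lld ms\n", ms(t1 - t0), ms(t2 - t1));
}
```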

Skizz
Both your hard drive's hardware and your OS cache disk reads. When accessing the disk, the computer reads a whole block, not just a couple of bytes. Look up your hard drive's cache size.

Reading data in large chunks does have the advantage that you aren't making as many system calls and that your processing loop isn't interrupted as often.

Of course, if your file is too big to fully fit in memory in the first place, the decision is taken care of for you :) (though you can start playing with memory-mapped files...)
"Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it." — Brian W. Kernighan
If you're reading a file from disk and doing some parsing of it on a Windows platform, I'd suggest using a memory-mapped file. The OS will then load from disk as needed, and performance is good because the OS is quite clever about figuring out how to make this optimal.
Quote: Original post by Anonymous Poster
If you're reading a file from disk and doing some parsing of it on a Windows platform, I'd suggest using a memory-mapped file. The OS will then load from disk as needed, and performance is good because the OS is quite clever about figuring out how to make this optimal.

Seconded. Especially if you're doing sequential access (not skipping around a lot), that will be the most efficient way to do it. Of course it only works on Windows, but there's probably a similar method on other systems. Worst case is that you just end up doing two versions of the function and switching with a #define...
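That #define switch might look something like the sketch below (error handling trimmed): CreateFileMapping/MapViewOfFile on Windows, mmap on POSIX systems. The MapFile helper is a made-up name for illustration.

```cpp
#include <cstddef>

#ifdef _WIN32
#include <windows.h>

// Map a whole file read-only into the address space (Windows flavour).
const char* MapFile(const char* path, std::size_t* outSize)
{
    HANDLE file = CreateFileA(path, GENERIC_READ, FILE_SHARE_READ, nullptr,
                              OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, nullptr);
    if (file == INVALID_HANDLE_VALUE) return nullptr;

    LARGE_INTEGER size;
    GetFileSizeEx(file, &size);

    HANDLE mapping = CreateFileMappingA(file, nullptr, PAGE_READONLY, 0, 0, nullptr);
    const char* view = static_cast<const char*>(
        MapViewOfFile(mapping, FILE_MAP_READ, 0, 0, 0));

    // The handles can be closed once the view exists; the view keeps
    // the mapping alive until UnmapViewOfFile is called on it.
    CloseHandle(mapping);
    CloseHandle(file);

    *outSize = static_cast<std::size_t>(size.QuadPart);
    return view; // release with UnmapViewOfFile(view)
}

#else
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

// Same idea with POSIX mmap.
const char* MapFile(const char* path, std::size_t* outSize)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0) return nullptr;

    struct stat st;
    fstat(fd, &st);

    void* view = mmap(nullptr, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd); // the mapping outlives the descriptor

    *outSize = static_cast<std::size_t>(st.st_size);
    return view == MAP_FAILED ? nullptr
                              : static_cast<const char*>(view); // release with munmap
}
#endif
```

Either way you end up with a plain pointer into the file's contents, so the parsing code itself doesn't need to care which version was compiled in.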
