
# fastest way to read in a file

## Recommended Posts

Is it faster this way: `fopen("data.txt", "r");` or this way: `ifstream fin; fin.open("data.txt", ios::in);`?

##### Share on other sites
Don't know for sure, but I'd say they're about the same. If there is a difference it would be a very small one, small enough not to matter. Disk I/O is extremely slow, so the few clock ticks gained by using one or the other are going to be so minimal that it isn't going to matter which you use.

##### Share on other sites
The biggest speed increase comes from using binary files.

Ben

##### Share on other sites
Opening and closing the same file repeatedly, fopen/fclose was able to do 4,399 per second and ifstream::open/ifstream::close could do 4,342, so if you open around 335,000 files fopen will save you ONE second. So, how many are you opening?

Always one for performance trivia, I just had to try the Windows API CreateFile. It did 7,564 per second, so you would only have to open about 10,500 files to save a second with it.

Edited by - LilBudyWizer on June 5, 2001 3:26:08 PM

##### Share on other sites
I suppose he's asking which way of reading files, C-style or C++ streams, is fastest. Normally you don't open many files at runtime.

If you want to write to a file fast, you have to batch as much data as you can instead of writing many small pieces...

What the hells!

##### Share on other sites
Both the stdio functions and the C++ streams eventually end up calling CreateFile, so I'd say using that directly should always be faster than the two alternatives.

##### Share on other sites
OK, I'll stick with streams then, since I know how to use those.

##### Share on other sites

```cpp
char ucBuffer[1024];   // buffer declaration was missing from the original post
ifstream ifs;
ifs.open("File1.txt", ios::in | ios::binary);
ifs.seekg(0);
for (int i = 0; i < 4096; i++)
    ifs.read(ucBuffer, 1024);
ifs.close();
```

Edited by - LilBudyWizer on June 5, 2001 8:35:24 PM

##### Share on other sites
Remember that the processor is faster than your disk. So, although Win32 API calls will be marginally faster than C stdio calls, which in turn will be marginally faster than C++ iostream calls, 99% of the time the processor will be waiting on the data. So choose whatever method you're most comfortable with, use optimal algorithms, and I'm sure you won't notice the difference.

##### Share on other sites
Please note that if you use fstreams with Visual C++ 6 and the C++ library that comes with it, you're likely to run into a very annoying bug: if you create an fstream (i- or o-) by name, it will be unbuffered, leading to a huge performance hit.

For fixes, see this page. Another option is to use a different STL/IOstreams implementation altogether, and STLport is a good (and free!) alternative in that case.

HTH

##### Share on other sites
What about _open? I suspect that just calls CreateFile in turn, though.

##### Share on other sites
Look back about a month; I asked the same question. If you don't mind using Win32 APIs, I found that this was the fastest way:

```cpp
void CFile::Load(const string& a_Filename, vector<unsigned char>& a_Data)
{
    HANDLE Handle = CreateFile(a_Filename.c_str(), GENERIC_READ,
                               FILE_SHARE_READ, NULL, OPEN_EXISTING,
                               FILE_ATTRIBUTE_NORMAL, NULL);
    DWORD Size = GetFileSize(Handle, NULL);
    HANDLE Mapping = CreateFileMapping(Handle, NULL, PAGE_READONLY,
                                       0, Size, NULL);
    CloseHandle(Handle);   // the mapping keeps the file open
    LPVOID BaseAddress = MapViewOfFile(Mapping, FILE_MAP_READ, 0, 0, 0);
    a_Data.resize(Size);
    memcpy(&a_Data[0], BaseAddress, Size);   // a_Data.begin() is not a pointer
    UnmapViewOfFile(BaseAddress);            // unmap before closing the mapping
    CloseHandle(Mapping);
}
```

I'm not doing this optimally, due to the way my code is structured. Better would be to use the file mapping and seek around in it for the data you need, providing you might not need the whole file.

This method is also optimal for pak file use (which, again, I don't personally use). When I went looking for this I was certain that the OS would know of a faster buffered method of accessing files than one character at a time, like my older method...

For more info, look back for the post where the kind guy answered my question. I think I recall him saying that using this with pak files can net up to a 200% increase in throughput.

HTH

Chris Brodie
http://fourth.flipcode.com

##### Share on other sites
The method gimp mentions looks indeed good for Win32-specific (large) file access. Beware, however, that if you want to map from a specific offset in a file, the offset has to match the system's memory allocation granularity. That is, the offset must be a multiple of the allocation granularity. You can use the GetSystemInfo function to get the allocation granularity.

HTH

##### Share on other sites
I can definitively say, without any doubt, that it all depends. I did some testing. I tried a memory-mapped file, reading the entire file with one ReadFile, reading the entire file with fread, reading 1k blocks with fread, and reading the entire file with overlapped I/O. First I tested with a 4mb file. I got fairly consistent results across all five: roughly 2.8mb/s when it wasn't in cache and 80 to 100mb/s when it was. I have no idea why today I get 80 to 100mb/s against the file cache when yesterday I got 100 to 130mb/s; I'll figure that out later. The exception was the async I/Os, which only got about 30mb/s against cache. I believe the reason is that the I/O never came back pending, so in the end it was just unneeded overhead.

When I switched to a 256mb file, things were a little different. Reading in the entire file with ReadFile or fread dropped to about 1mb/s. File mapping jumped to between 3 and 4mb/s. The fread of blocks went through the roof and hit 20mb/s. The machine only has 224mb of memory, so it can't all fit in cache. I have a hard time believing an EIDE 20gb 10,000rpm drive can deliver data that fast, though, so I have to assume the cache had some bearing. The test isn't really comparable, though, because I'm not sticking the entire file in memory at once with it. The async (overlapped) I/O began to shine, though: it got 5 to 6mb/s. I repeated the memory-mapped and async tests several times. There was a fairly significant variance in both, 3 to 4 on one and 5 to 6 on the other, but the async consistently beat out the memory-mapped. I repeated all the tests a couple of times just to be sure nothing unusual happened, but clearly the memory-mapped and async reads beat out reading the entire file in one read.

I still have to figure out what was up with the freads. The easiest approach is to eliminate memory contention by not loading the entire file in the other two tests. That almost invalidates the async test, since you would have to be able to process the file out of order to use it for record-by-record processing. I also need to get a better handle on how the Windows file cache works now, since I think the last time I tested was on either NT 4.0 or Windows 95. I also need to try a wider variety of scenarios. I have guests coming tonight, though, so that will have to be some other time.

##### Share on other sites
Well, my next question is: why doesn't this work?

```cpp
int dat[] = {0};

// data.txt:
// 0 -10 10
// 0 20 20

while (!fin.eof())
{
    for (x = 0; x < 6; x++)
    {
        cin >> dat[x];   // (presumably meant fin >> dat[x])
    }
}
```

##### Share on other sites
Because your array can only hold 1 int!

```cpp
int dat[] = {0};
```

This is an unusual array declaration. Normally you declare the array's capacity (`int dat[6]`). Because you initialize it, the compiler deduces its capacity from the initializer (one element initialized).

If you want an expanding array, use STL's vector.

What the hells!
