Why is C++ ifstream so slow in VS.NET 2008

Started by
40 comments, last by Kylotan 15 years, 6 months ago
Hi there, I'm trying to read a large log file in C++ using the ifstream library. However, when I run the code below, my CPU usage jumps significantly, e.g. by 60%. Below is my full program. When I rewrite it using C-style I/O, my CPU usage drops significantly and it is lightning fast. I'm using Visual Studio 2008 Team Suite. Now, I've read on this forum that C++ is NOT slower than C; however, I'm finding it extremely challenging to find a simple way, for the beginner/average C++ programmer, to read a file line by line in C++ that is fast and efficient. Can someone please explain why the code below is so slow? Or is it just that C++ is slower than C anyway, and only advanced C++ programmers with in-depth knowledge of the language will be able to produce code that is fast?

#include <iostream>
#include <fstream>
#include <string>

using namespace std;

int main(int argc, char *argv[])
{
	string b;
	cout<<"Hello World\n";

	ifstream inputFile("c:/messages.log", ios::in);
	std::string s;
	s.clear();

	while(!inputFile.eof())
	{
		getline(inputFile, s);
		cout<<s<<endl;
	}

	getline(cin, b);
	return 0;
}

Kind regards, Jr
Are you compiling and running program in Release or Debug configuration?
Can you also post your C style I/O code for comparison (of where the performance is going)
You don't have to check for eof after each getline:

#include <iostream>
#include <fstream>
#include <string>

using std::cout;
using std::cin;
using std::string;

int main(int argc, char *argv[])
{
	string b;
	cout<<"Hello World\n";

	std::ifstream inputFile("c:/messages.txt", std::ios::in);
	string s;
	s.clear();

	while(getline(inputFile,s))
	{
		cout<<s<<"\n";
	}

	getline(cin, b);
	return 0;
}

I don't know how much that will speed it up, though. Seeing the C code would help more.
Quote:Original post by gp343
Now, can someone please help with explaining to me why the below code is so slow? Or is it just that C++ is slower than C anyway and only advanced C++ programmers with in-depth knowledge of the C++ language will be able to produce code that is fast?


The way the code is written is inherently slow; the C++ functions themselves are plenty fast for such a task. However, you have to decide which of three metrics (speed, complexity, storage) you are willing to sacrifice to get the results you want.

The absolute fastest way to get all the data from a file is to use read and load chunks of the file, or even the entire file, into memory. If you do it by chunks, you take on complexity: lines are not all the same size, so a chunk can end in the middle of a line, and you have to realign the text on newlines to process it correctly. If you load the entire file at once, you sacrifice memory, since the whole file has to fit in memory.

The method you are currently using sacrifices speed in exchange for the other two: there is no complexity, since it is a fairly trivial process, and very little storage, since you only hold one line at a time.

What you want to do dictates which of these metrics to sacrifice. If you can afford to load the entire file into memory at once, you will soon see that looping through each line and displaying it to the console is the real slowdown. Outputting to the console takes time, and in most cases you won't ever notice it. However, if the amount of data is large enough, you will certainly see the slowdown.

However, always consider this when you run into these issues: does the current speed of the program make it unable to do its job? If you were trying to load a 500 MB file that took hours to display, the program is unusable and you would need to design something better. If you were only loading a few hundred KB and it took a minute or so longer than your other method, is that speed difference really worth trying to fix? (That is not to say you should never investigate such things, for that is how you learn [wink])

I myself just ran into quite an interesting problem while sending a lot of data over the network. I was using a vector to buffer the data, and it was causing all clients to crawl to a stop and wait for the current client to finish the transfer. I switched to a queue, and the problem literally disappeared and the data blazed through all clients. The problem was not necessarily my code, but the poor choice of a container that was resizing too often and being overburdened.

So, moral of the story: if you notice weird slowdowns, it'll mostly be due to using something that is simply not made for that task. At that point, your best bet is to try some other methods, or to consider why you are doing things that way to begin with, in an attempt to find something faster. Hope that clears up a few things.
Hi Guys,

Below is the equivalent C-style code, and it does run faster than its "C++" counterpart.

I'm no advanced user of C++; however, I would surely love to see a simple example, relevant to this topic, where Mr. Stroustrup's "allegedly fast C++" can be as fast as traditional C. If one cannot leverage the speed of C++ without having advanced knowledge of its internal workings, then one might as well use plain old C for performance-intensive applications and Java if they need their application to be designed in an OO manner?


#include <stdlib.h>
#include <stdio.h>
#include <conio.h>

#define MAX_ARRAY 4096

int main(int argc, char *argv[])
{
	FILE * inputFile;
	char tmpLine[MAX_ARRAY] = {0};

	//Get input file
	if ((inputFile = fopen("c:/messages.log", "r")) == NULL)
	{
		printf("Can't open %s\n", argv[1]);
		exit(1);
	}

	// Loop over the file line by line.
	while ((fgets(tmpLine,MAX_ARRAY, inputFile)) != NULL)
	{
		printf("%s \n", tmpLine);
	}

	fclose(inputFile);
	getch();
	return 0;
}

- Are you compiling as a Release build?
- How long do the C++ and C versions take to run (after removing the std::cout/printf)?

Quote:If one cannot leverage the speed of C++ without having advanced knowledge of its internal workings


In order to leverage the "speed", internal knowledge of C++ is the least of your problems. In-depth familiarity with memory caches, concurrency and algorithms will be mandatory.

Quote:Java if they need their application to be designed in an OO manner?


Yes, that's the correct approach. If you do not have proficient software engineers, using one of the modern languages is preferred.

Not because of performance, but if they have difficulty analyzing performance properties of trivial ifstream, they are not even remotely capable of developing in C++ with all its pitfalls and gotchas, which will not result in slow code, but painfully buggy applications.

Quote:with in-depth knowledge of the C++ language will be able to produce code that is fast?


The bigger issue is classification of the problem. Is it even slow? No numbers, no real data, just some claims that it's "slow". CPU usage is not a measure of slowness.
As said above, we don't really know that it *is* slow yet. However, if it is, I can think of a couple of improvements:
1: #define _SECURE_SCL 0 // disables Microsoft's "safe" extensions (checked iterators). It has to be defined before any standard header is included. I don't know how much of an impact this has on streams, though, or if it's only containers/iterators
2: std::ios::sync_with_stdio(false); // Does what it says on the box. By default, iostreams sync everything with the C I/O. If you don't need that, this may speed things up
3: Use the I/O provided by the platform. Windows has its own functions for accessing files. Use them if you want things to be as efficient as possible.
4: Just read everything into one big buffer in memory, rather than reading a line at a time.
Quote:Original post by gp343
If one cannot leverage the speed of C++ without having advanced knowledge of its internal workings, then one might as well use plain old C for performance intensive application and Java if they need their application to be designed in an OO manner?

There's a common misconception that C++ exists to magically give you extra speed when compared to other OO languages. This is not the case. Rather, C++ exists to magically give you some OO functionality on top of C/Assembly. It's there to make life easier for the high-performance programmer, not to make programs faster for the typical application programmer.
The two programs are not the same. The C++ program gracefully handles lines of an arbitrary length (limited by memory), whereas the C program handles lines of an extremely large - but fixed - length. You must re-write the programs to do the same thing before meaningfully comparing the performance of the two.
