Getting line count in file C++

Started by
11 comments, last by doctorsixstring 19 years, 8 months ago
Is there a fast way to get a line count in a text file? This is how I'm doing it:

void PrintRecordCount()
{
    std::ifstream file("data.txt");
    int RC = 0;
    std::string line;
    while (std::getline(file, line)) {
        RC++;
    }
    std::cout << "RecordCount: " << RC << std::endl;
    file.close();
}

There are 700,000 lines in the file and it takes about a minute or two to come back with the record count. Am I an idiot or something, or can it be done faster?
Generally yes. The Unix utility "wc" has a line-count mode and is open source, I believe. See how they do it.
It looks like that code is reading in all the characters in the file. Maybe you could just do a search for all the newline characters and count them? I'm not sure how fast this would be, however.

- Mike
Code extract from wc.c:
      while ((bytes_read = safe_read (fd, buf, BUFFER_SIZE)) > 0)
      {
          register char *p = buf;

          while ((p = memchr (p, '\n', (buf + bytes_read) - p)))
          {
              ++p;
              ++lines;
          }
          bytes += bytes_read;
      }


You can find this file here:
TextUtils Unix source package (scroll down for the source code download link)

- Mike
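For anyone who wants the same idea in portable C++ instead of the calls wc uses, here is a rough sketch: read the file in large blocks and count the '\n' bytes in each block. The function name count_newlines_buffered and the 64 KB block size are my own choices, not something taken from wc.c, and I haven't timed it against a 700,000-line file.

```cpp
#include <algorithm>  // std::count
#include <cstddef>    // std::size_t
#include <fstream>
#include <string>
#include <vector>

// Counts '\n' bytes by reading the file in fixed-size blocks.
// Returns 0 if the file cannot be opened (a simplification for this sketch).
std::size_t count_newlines_buffered(const std::string& path) {
    std::ifstream file(path, std::ios::binary);
    std::vector<char> buf(64 * 1024);  // block size is an arbitrary choice
    std::size_t lines = 0;

    while (file) {
        file.read(buf.data(), static_cast<std::streamsize>(buf.size()));
        const std::streamsize got = file.gcount();  // bytes actually read
        lines += static_cast<std::size_t>(
            std::count(buf.begin(), buf.begin() + got, '\n'));
    }
    return lines;
}
```

Calling count_newlines_buffered("data.txt") would take the place of the getline loop. Like wc, it counts newline characters, so a last line with no trailing newline is not included.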
Alternately, you might try using streambuf iterators and see if that works fast enough for you. For example:
  std::ifstream ifs("data.txt");
  std::istreambuf_iterator<char> begin(ifs);
  std::istreambuf_iterator<char> end;
  int record_count = std::count(begin, end, '\n');

  std::cout << "RecordCount: " << record_count << std::endl;
I downloaded the source code and opened wc.c. I'm a beginner at this and I can't seem to find where they open the file they're searching through. They have functions I've never even heard of. Are these Windows functions?

SET_BINARY (fd);

while ((bytes_read = safe_read (fd, buf, BUFFER_SIZE)) > 0)
{
    register char *p = buf;

    while ((p = memchr (p, '\n', (buf + bytes_read) - p)))
    {
        ++p;
        ++lines;
    }
    bytes += bytes_read;
}

fd is an int. How is that a file, and how are they putting that file into the buf variable? Any kind of help would be appreciated. Thanks for all the support.
Quote:
register char *p = buf;

Man. How old must that code be?
Quote:Original post by Sneftel
Quote:
register char *p = buf;

Man. How old must that code be?


Yeah, I started crying when I saw that.
Not giving is not stealing.
std::ifstream ifs("data.txt");
std::istreambuf_iterator<char> begin(ifs);
std::istreambuf_iterator<char> end;
int record_count = std::count(begin, end, '\n');

std::cout << "RecordCount: " << record_count << std::endl;

This only gets one less than the number of lines, because the last line doesn't end with a \n newline character. It takes the same amount of time as using getline and counting that way, plus it's a little more confusing.
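The off-by-one can be handled by remembering the last character seen: if the file is not empty and does not end in '\n', the final partial line counts as one more line. A minimal sketch, assuming that is the counting rule you want (it should match what the getline loop reports); count_lines is just a name picked for the example, and this is not expected to be any faster than the snippet above.

```cpp
#include <cstddef>   // std::size_t
#include <fstream>
#include <iterator>  // std::istreambuf_iterator
#include <string>

// Counts lines the way a getline loop would: every '\n' ends a line, and a
// non-empty final line without a trailing '\n' still counts as a line.
// Returns 0 for an empty file or one that cannot be opened.
std::size_t count_lines(const std::string& path) {
    std::ifstream ifs(path, std::ios::binary);
    std::size_t lines = 0;
    char last = '\n';  // an empty file is treated as ending on a line boundary

    for (std::istreambuf_iterator<char> it(ifs), end; it != end; ++it) {
        last = *it;
        if (last == '\n') ++lines;
    }
    if (last != '\n') ++lines;  // unterminated final line
    return lines;
}
```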
You could do worse than something like SLOCCount, which is designed for code and has all sorts of funky statistics. It can do stuff like ignoring whitespace and comments :)

One of the statistics is Cost to Develop - it includes all your maintenance which is why it's so shockingly high :)
-- Jonathan

