Getting line count in file C++

This topic is 4861 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.

Is there a fast way to get the line count of a text file? This is how I'm doing it:

void PrintRecordCount()
{
    std::ifstream file("data.txt");
    int RC = 0;
    std::string line;
    while (std::getline(file, line)) {
        RC++;
    }
    std::cout << "RecordCount: " << RC << std::endl;
    file.close();
}

There are 700,000 lines in the file, and it takes about a minute or two to come back with the record count. Am I an idiot or something? Can it be done faster?

Code extract from wc.c:

while ((bytes_read = safe_read (fd, buf, BUFFER_SIZE)) > 0)
{
    register char *p = buf;

    while ((p = memchr (p, '\n', (buf + bytes_read) - p)))
    {
        ++p;
        ++lines;
    }
    bytes += bytes_read;
}

You can find this file here:
TextUtils Unix source package (scroll down for the source code download link)

- Mike

Alternatively, you might try using streambuf iterators and see if that works fast enough for you. For example:

std::ifstream ifs("data.txt");
std::istreambuf_iterator<char> begin(ifs);
std::istreambuf_iterator<char> end;
int record_count = std::count(begin, end, '\n');

std::cout << "RecordCount: " << record_count << std::endl;

I downloaded the source code and opened wc.c. I'm a beginner at this and I can't seem to find where they open the file they're searching through. They have functions I've never even heard of. Are these Windows functions?

SET_BINARY (fd);

while ((bytes_read = safe_read (fd, buf, BUFFER_SIZE)) > 0)
{
    register char *p = buf;

    while ((p = memchr (p, '\n', (buf + bytes_read) - p)))
    {
        ++p;
        ++lines;
    }
    bytes += bytes_read;
}

fd is an int. How is that a file, and how are they putting that file into the buf variable? Any kind of help would be appreciated. Thanks for all the support.

std::ifstream ifs("data.txt");
std::istreambuf_iterator<char> begin(ifs);
std::istreambuf_iterator<char> end;
int record_count = std::count(begin, end, '\n');

std::cout << "RecordCount: " << record_count << std::endl;

This only gets one less than the number of lines there are, because the last line doesn't have a \n newline character. It takes the same amount of time as using getline and counting that way, plus it's a little more confusing.

You could do worse than something like SLOCCount, which is designed for code and has all sorts of funky statistics. It can do stuff like ignoring whitespace and comments :)

One of the statistics is Cost to Develop - it includes all your maintenance which is why it's so shockingly high :)

Quote:
Original post by Sneftel
Quote:

register char *p = buf;

Man. How old must that code be?

It's not exactly high-tech, so I wouldn't be surprised if it is old. They could also have kept the register keyword in to support older compilers with not-so-good optimizers.

To the OP: Try reading the file in large chunks and scan for newlines.

Quote:
Original post by andyb716
I downloaded the source code and opened wc.c. I'm a beginner at this and I can't seem to find where they open the file they're searching through. They have functions I've never even heard of. Are these Windows functions?


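No - they are not Windows functions. memchr() is a standard C function; safe_read() and SET_BINARY are helpers defined elsewhere in the package's own source, with safe_read() wrapping the Unix read() call. Remember, you are looking at code for Unix utilities (but you shouldn't have too much trouble with a port, or at least figuring out the concepts).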

In basic C, you would open, read/write, and close a file like this (typing this from poor memory, forgive any errors):



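/* needs <stdio.h> and <string.h> */
FILE* file = fopen("file.txt", "r+");   /* "r+": existing file, reading and writing */
if (file != NULL)
{
    char read_data[64];
    /* fread(buffer, element size, element count, stream) returns the number of elements read */
    size_t bytes_read = fread(read_data, 1, sizeof read_data, file);

    /* a seek is required when switching from reading to writing */
    fseek(file, 0, SEEK_END);
    char text[] = "write some text";
    fwrite(text, 1, strlen(text), file);

    fclose(file);
}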

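The wc.c file uses the open() function to open the file - see line 503. open() is the lower-level Unix (POSIX) call: it returns an int file descriptor, which is the fd you asked about, and read() then uses that descriptor to copy bytes from the file into buf. fopen() is the C standard library's buffered layer, which on Unix systems is built on top of open().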

Unless you want to learn, you may be better off downloading a line counting utility. The other day I was about to start writing one, but I decided to check the 'net for existing ones, instead. Here are some I found with a Google search:

Line Counter for MS VC++ 6.0 Projects
Project Line Counter for Visual C++ & Project Line Counter .NET
Line Counter (this one costs $15, but the other ones are free, I think)

These are nice because they interface with your project, so you don't have to explicitly state a folder/file to count.

- Mike

Quote:
Original post by andyb716
std::ifstream ifs("data.txt");
std::istreambuf_iterator<char> begin(ifs);
std::istreambuf_iterator<char> end;
int record_count = std::count(begin, end, '\n');

std::cout << "RecordCount: " << record_count << std::endl;

This only gets one less than the number of lines there are, because the last line doesn't have a \n newline character. It takes the same amount of time as using getline and counting that way, plus it's a little more confusing.


Ooh, nice! I would definitely recommend coding it this way - nice C++ instead of an icky C for() loop search.

- Mike

