C++ : Best way to skip headers in file input/output...

Started by
5 comments, last by Zahlman 18 years, 8 months ago
Hi All,

I've extended the ifstream class with a routine to skip headers, but it appears to me to be somewhat awkward...interested in best practice...here's the code (template specialization...could use inheritance instead):

void IO<ifstream>::skipLines(intu n)
{
    string _null_;
    for ( intu i = 0; i < n; ++i )
    {
        std::getline(*this, _null_);
    }
}

This has become so routine in file i/o that I've built it into the class constructor:

IO<ifstream>::IO(const string & filename, intu n)
{
    openFile(filename); // openFile checks for open errors.
    skipLines(n);
}

Any suggestions for better efficiency? Any better STL approach?

--random
--random_thinker
As Albert Einstein said: 'Imagination is more important than knowledge'. Of course, he also said: 'If I had only known, I would have been a locksmith'.
You could write it up as a stream manipulator, similar to std::ws, std::setw, etc. That way, it would work for any stream, not just your specialized stream.

Given that you mention "skipping headers", are you really trying to skip a given number of lines, or lines beginning with a specific character/set of characters (like # or //) ?
"Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it." — Brian W. Kernighan
Hi Fruny,

I've written a state machine that will ignore comments (C++ style, //) and reassemble the rough data file in neat line-by-line form which is written to a 'state' file, which I then read in line-by-line as data only (except for the state file header which contains a copyright statement and date-stamp). This gives me fine-grained control on the newlines, spaces, delimiters and comments. It also means that I can put the rough data in tabular or line-by-line format and read in strings without using quotes.

Eventually I'll eliminate the 'state' file in favour of a temporary stream buffer but for the moment I'm using it for debugging purposes.

I've done it this way so that at a later date I can integrate a web browser to interface and compose the data files.

What would you suggest as a stream manipulator?

--random
So, for example, when the data is read to the Project struct instance, it is read line by line something like this:

using namespace std;

Project datastruct;
string statefile = "file.st";
intu header_size_ = 3;

Input in(statefile,header_size_); // where Input is a typedef to IO<ifstream>

in.getData(datastruct.record_no); // This is an unsigned int.
in.getData(datastruct.name); // This is a string.
in.getData(datastruct.value); // This is a float.
.
.
.
etc.

where the overloaded getData() method is part of the IO specialization; it reads a line of data into the variable (regardless of type) and moves on to the next line.
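For readers who want to try the idea without the IO class, here is a minimal sketch of a line-per-value reader. The name get_data and its free-function form are assumptions made for illustration; the actual getData() member is not shown in the thread and may work differently.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Hypothetical stand-in for getData(): read one whole line, then parse it
// into `value`. Anything left over on the line is discarded with the line.
template <class T>
bool get_data(std::istream& in, T& value) {
    std::string line;
    if (!std::getline(in, line)) return false;    // no more lines
    std::istringstream parser(line);
    return static_cast<bool>(parser >> value);     // false if the line does not parse as T
}

// Strings take the whole line verbatim, spaces included, so no quoting is needed.
inline bool get_data(std::istream& in, std::string& value) {
    return static_cast<bool>(std::getline(in, value));
}
```

Because the string overload is an ordinary function, it is preferred over the template whenever the target is a std::string.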

--random
Well, to skip a line, the cleanest way is to do

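stream.ignore(std::numeric_limits<std::streamsize>::max(), '\n');

A character count equal to std::numeric_limits<std::streamsize>::max() is handled as a special case meaning "any number of characters". (The count parameter is a std::streamsize, so that is the type whose maximum triggers the special case; you need <limits> for std::numeric_limits.)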
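As a rough illustration of how this replaces the getline-into-a-dummy-string loop, the sketch below skips a fixed number of header lines and then reads the first value. The helper names skip_lines and first_value_after are made up for this example.

```cpp
#include <istream>
#include <limits>
#include <sstream>
#include <string>

// Skip `n` lines without storing them anywhere.
// (skip_lines is a hypothetical helper, not part of the standard library.)
inline std::istream& skip_lines(std::istream& in, unsigned int n) {
    for (unsigned int i = 0; i < n && in; ++i) {
        in.ignore(std::numeric_limits<std::streamsize>::max(), '\n');
    }
    return in;
}

// Usage sketch: skip the header, then read the first integer that follows.
inline int first_value_after(const std::string& text, unsigned int header_lines) {
    std::istringstream in(text);
    int value = 0;
    skip_lines(in, header_lines) >> value;
    return value;
}
```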

A parameterless stream manipulator is simply a function that takes and returns a stream reference. operator>> (operator<< for ostreams) is overloaded to call the function when passed such a function pointer: std::endl is a function, std::cout << std::endl calls std::endl(std::cout).

So in our case:

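// Parameterless manipulator: discards the rest of the current line.
std::istream& skipline(std::istream& is)
{
   is.ignore(std::numeric_limits<std::streamsize>::max(), '\n');
   return is;   // a manipulator must return the stream it was given
}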


Now stream >> skipline will skip a line.
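A small self-contained sketch of that usage follows. The first_words helper is only there to show the manipulator chained between two ordinary extractions; it is not from the thread.

```cpp
#include <istream>
#include <limits>
#include <sstream>
#include <string>

// Parameterless manipulator: discard the rest of the current line.
inline std::istream& skipline(std::istream& is) {
    is.ignore(std::numeric_limits<std::streamsize>::max(), '\n');
    return is;  // returning the stream is what allows chaining
}

// Hypothetical demo: take the first word of each of the first two lines.
inline std::string first_words(const std::string& text) {
    std::istringstream in(text);
    std::string a, b;
    in >> a >> skipline >> b;   // operator>> calls skipline on the stream
    return a + " " + b;
}
```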

Things get slightly more complicated if you want to be able to specify the number of lines to skip, as C++ does not support currying (passing some of the parameters to a function, leaving the others unbound, getting a function in return), though boost::bind gets us halfway. So we need to create a class which, when passed to operator>>, will just do the right thing:

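// Manipulator object: holds the number of lines to skip until it is streamed.
struct skipline
{
   unsigned int nlines;
   skipline(unsigned int n=1) :
      nlines(n)
   {}
};

// Extraction operator (operator>>, since this is an input stream).
template<class CharT, class Traits>
std::basic_istream<CharT, Traits>& operator>>(std::basic_istream<CharT, Traits>& is, const skipline& s)
{
   for(unsigned int n=0; n<s.nlines; ++n)
      is.ignore(std::numeric_limits<std::streamsize>::max(), '\n');
   return is;
}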


Now stream >> skipline(n) will skip n lines.
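Here is a hedged, self-contained variant of the parameterised form. Two details are assumptions added for this sketch rather than part of the post: the type is named skiplines so it cannot clash with a function called skipline, and the constructor is explicit so that a stray `stream >> 3` cannot be silently converted into "skip three lines".

```cpp
#include <istream>
#include <limits>
#include <sstream>
#include <string>

// Parameterised manipulator object: carries the line count until it is streamed.
struct skiplines {
    unsigned int nlines;
    explicit skiplines(unsigned int n = 1) : nlines(n) {}
};

template <class CharT, class Traits>
std::basic_istream<CharT, Traits>&
operator>>(std::basic_istream<CharT, Traits>& is, const skiplines& s) {
    for (unsigned int n = 0; n < s.nlines && is; ++n) {
        // widen/to_int_type keep the delimiter correct for wide streams too
        is.ignore(std::numeric_limits<std::streamsize>::max(),
                  Traits::to_int_type(is.widen('\n')));
    }
    return is;
}

// Usage sketch: skip the header, then read the first integer that follows.
inline int value_after_header(const std::string& text, unsigned int header_lines) {
    std::istringstream in(text);
    int value = 0;
    in >> skiplines(header_lines) >> value;
    return value;
}
```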

As much as you might want to (I did for a while, was about to write it up and then remembered), these two approaches do not mix. If you were to write an overloaded skipline function, either directly performing the operation or, if passed an integer, returning an object which, when streamed, would perform the operation, you would have to disambiguate with a cast which function call you intend when just passing the function pointer, which would be inconvenient considering we're trying to make the syntax as smooth as possible.

I am sure you can build on that model to write manipulators that skip comments, or something similar.
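One possible shape for such a manipulator is sketched below. It assumes comments are whole or trailing "//" runs that end at the newline, as in the C++-style comments mentioned earlier in the thread; skipcomments and sum_values are hypothetical names.

```cpp
#include <istream>
#include <limits>
#include <sstream>
#include <string>

// Hypothetical manipulator: consume whitespace and "//" comments, leaving the
// stream positioned at the next real datum (if there is one).
inline std::istream& skipcomments(std::istream& is) {
    while ((is >> std::ws) && is.peek() == '/') {
        is.get();                       // consume the first '/'
        if (is.peek() != '/') {         // a lone '/' is data, not a comment
            is.unget();
            break;
        }
        is.ignore(std::numeric_limits<std::streamsize>::max(), '\n');
    }
    return is;
}

// Usage sketch: add up every integer in the text, ignoring the comments.
inline int sum_values(const std::string& text) {
    std::istringstream in(text);
    int total = 0;
    int v = 0;
    while (in >> skipcomments >> v) total += v;
    return total;
}
```

At the end of the input the manipulator leaves the stream in a failed state, which is what terminates the read loop.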
"Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it." — Brian W. Kernighan
Thanks Fruny,

Actually, you've given me a lot of ideas regarding parameterless stream manipulators that could be used elsewhere in my code.

I know that a lot of books on C++ recommend getting each line (with getline(...)) and sending it to a 'null' string. But this just didn't seem right to me, although it works. Plus getline(...) is a little tricky in that even with the declaration 'using namespace std;' the compiler must have the std::getline(..) form.

Your suggestion to use stream.ignore(...) is a much more logical approach, I think, and I'll take on board the excellent concepts that you've given.

All the best!

--random
Quote:Original post by Fruny
Well, to skip a line, the cleanest way is to do

stream.ignore(std::numeric_limits<std::streamsize>::max(), '\n');

A character count equal to std::numeric_limits<std::streamsize>::max() is handled as a special case meaning "any number of characters".


I will never understand why they did the interface that way, instead of

stream.ignore(char delim = '\n'); // any number of characters
stream.ignore(int count, char delim); // when you need to specify

Or something along those lines. I mean really, limiting the amount to skip seems like the exceptional case, especially since the data doesn't need to be stored anywhere.

Quote:As much as you might want to (I did for a while, was about to write it up and then remembered), these two approaches do not mix. If you were to write an overloaded skipline function, either directly performing the operation or, if passed an integer, returning an object which, when streamed, would perform the operation, you would have to disambiguate with a cast which function call you intend when just passing the function pointer, which would be inconvenient considering we're trying to make the syntax as smooth as possible.


Yeah, that part sucks too :(

Also reminds me of this one time I tried giving an object a conversion to std::string, and was annoyed to find that I needed a cast in order to use that for output. Solution, redo it as a member stream manipulator, and provide the operator<< overload to invoke the manipulator. Of course, it's cleaner design anyway, but C++ is just really weird overall about what implicit casts will and won't happen. :s
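A minimal sketch of that situation, using a made-up Tag type: the conversion to std::string exists, but `os << tag` only compiles once a dedicated operator<< is supplied, because the standard operator<< for std::string is a template and template argument deduction does not consider user-defined conversions.

```cpp
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical type with an implicit conversion to std::string.
struct Tag {
    std::string text;
    operator std::string() const { return "[" + text + "]"; }

    // Member "manipulator": the one place that knows how to write a Tag.
    std::ostream& print(std::ostream& os) const { return os << "[" << text << "]"; }
};

// Without this overload, `os << tag` is rejected despite the conversion above.
inline std::ostream& operator<<(std::ostream& os, const Tag& tag) {
    return tag.print(os);
}

// Usage sketch: stream a Tag into a string.
inline std::string to_text(const Tag& tag) {
    std::ostringstream out;
    out << tag;
    return out.str();
}
```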
