The Communist Duck

C I/O (fopen, fread) vs C++ I/O (ifstream, ofstream)?


Hey.
NOTE: I do not want to start a flame war here. I'm just interested.

I see people using fopen and fread far more often than the C++ streams. Is there a real reason for this in C++? I've learnt to use std::ifstream and std::ofstream, and find them much easier. Or is it simply because those people have come from C?
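For anyone who hasn't used one of the two, here is a minimal sketch of both styles reading the first 256 bytes of a file; "data.bin" is just a placeholder name and neither version does full error handling.

#include <cstdio>
#include <fstream>
#include <vector>

// C style: fopen/fread/fclose, with the byte count returned by fread.
std::vector<char> read_c_style()
{
    std::vector<char> buffer(256);
    if (FILE* f = std::fopen("data.bin", "rb"))
    {
        size_t got = std::fread(&buffer[0], 1, buffer.size(), f);
        buffer.resize(got);
        std::fclose(f);
    }
    else
    {
        buffer.clear();
    }
    return buffer;
}

// C++ style: ifstream::read, with the byte count queried via gcount().
std::vector<char> read_cpp_style()
{
    std::vector<char> buffer(256);
    std::ifstream f("data.bin", std::ios::binary);
    f.read(&buffer[0], buffer.size());
    buffer.resize(static_cast<size_t>(f.gcount()));
    return buffer;
}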

stdio.h vs fstream...
I think we have had this discussion somewhere before! After being criticized for being an stdio supporter, I did some tests on my Linux machine, and I saw that fstream was way faster. I used fopen() and fscanf() (which is slow compared to other stdio.h functions) vs the << and >> operators for the test. But I am still not sure! I've noticed that many IOI (International Olympiad in Informatics) tasks have a note "Use stdio.h instead of fstream!" at the end of the task description, and I still wonder why.

I have to admit that I still use stdio.h when reading a large amount of data because of that note... :P
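For reference, here is a rough sketch of the kind of comparison described above, reading a long list of integers both ways; the filename is made up, std::clock is only a coarse timer, and real results will vary wildly between compilers and standard library implementations.

#include <cstdio>
#include <ctime>
#include <fstream>
#include <iostream>

int main()
{
    int value = 0;

    // stdio version: fscanf in a loop.
    std::clock_t start = std::clock();
    long sum1 = 0;
    FILE* f = std::fopen("numbers.txt", "r");
    while (f && std::fscanf(f, "%d", &value) == 1)
        sum1 += value;
    if (f) std::fclose(f);
    std::clock_t middle = std::clock();

    // iostream version: operator>> in a loop.
    long sum2 = 0;
    std::ifstream in("numbers.txt");
    while (in >> value)
        sum2 += value;
    std::clock_t finish = std::clock();

    std::cout << "fscanf:   " << (middle - start)  << " clocks (sum " << sum1 << ")\n"
              << "ifstream: " << (finish - middle) << " clocks (sum " << sum2 << ")\n";
    return 0;
}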

The primary reason to prefer C++ streams over C file I/O is type safety. With C-style file I/O the caller is responsible for ensuring that the number and type of all format specifiers and arguments match; any mismatch results in undefined behaviour. There are also potential security concerns when dealing with strings passed from external sources.

C++ streams have none of these problems: it is impossible to mismatch the type or number of arguments. Unfortunately C++ streams can be unwieldy for output, which is why some people prefer boost::format, which arguably provides the best of both worlds (albeit at a performance penalty).
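A tiny illustration of the type-safety point (the value is arbitrary):

#include <cstdio>
#include <iostream>

int main()
{
    double price = 9.99;

    // C: the format string must match the argument types by hand.
    // "%d" with a double is undefined behaviour and typically prints garbage.
    std::printf("price: %d\n", price);

    // C++: the overload of operator<< is selected from the actual type,
    // so there is nothing to mismatch.
    std::cout << "price: " << price << '\n';
    return 0;
}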

Σnigma

1) The fopen style lets you use fscanf. A lot of people are more comfortable with fscanf-style formatting than with the iostream manipulators and flags, though code like boost::format takes some of the pain out of iostreams.

2) fopen-style functions and iostream functions "fail" in different ways. Functions like fread() return the number of bytes read, while iostream::read requires you to call up to three separate functions ("fail", "eof", "gcount") to figure out whether it worked (see the sketch after this list).

3) Many APIs support taking a FILE* as an argument, but not an iostream&.

4) It is slightly easier to replace fopen than it is to replace fstream. Many APIs provide file hooks, where you give it a list of fopen style function pointers to call when dealing with files. You can use these function pointers to provide an interface to your custom Virtual File System. This allows random APIs to open files inside your custom archive format.

5) For the very memory conscious, using the fopen class of functions ensures you can easily know your exact memory footprint, even with formatted input (fscanf). Using iostreams and formatted input (operator >>), it takes more effort to manage your memory usage. On the other hand, fscanf-style input forgoes a lot of the safety that C++ provides, making it easier to accidentally perform buffer overruns.
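To illustrate point 2, a small sketch of a short read handled both ways; the buffer size and filename are arbitrary.

#include <cstdio>
#include <fstream>

int main()
{
    char buffer[1024];

    // C: fread reports how much was actually read in its return value.
    if (FILE* f = std::fopen("data.bin", "rb"))
    {
        size_t got = std::fread(buffer, 1, sizeof(buffer), f);
        std::fclose(f);
        std::printf("fread got %u bytes\n", static_cast<unsigned>(got));
    }

    // C++: read() returns the stream; you query gcount()/eof()/fail()
    // afterwards to find out what actually happened.
    std::ifstream in("data.bin", std::ios::binary);
    in.read(buffer, sizeof(buffer));
    long got = static_cast<long>(in.gcount()); // bytes actually read
    bool hitEof = in.eof();                    // ran out of file
    bool failed = in.fail();                   // also set on a short read
    std::printf("istream::read got %ld bytes (eof=%d, fail=%d)\n",
                got, hitEof ? 1 : 0, failed ? 1 : 0);
    return 0;
}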


I like the C++ form of stuff, and have been redoing my VFS to use iostreams instead of fopen-style functions. The big bonus, in my mind, is being able to use all those nice C++ features to hide the complexity of the code base behind standardized interfaces. Anything that pipes, iterates, or transforms an iostream can operate on my ifstream class. This means cool things, like being able to std::copy from one file to another without having to worry about allocating buffers, setting up loops with error checks, or even knowing that fileA is compressed while fileB is encrypted.
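For example (a minimal sketch with made-up filenames and no error checking), copying one file into another without an explicit loop or temporary buffer:

#include <algorithm>
#include <fstream>
#include <iterator>

int main()
{
    std::ifstream src("fileA.bin", std::ios::binary);
    std::ofstream dst("fileB.bin", std::ios::binary);

    // istreambuf_iterator reads raw chars through the stream buffer, so
    // std::copy moves the whole file without a hand-written read loop.
    std::copy(std::istreambuf_iterator<char>(src),
              std::istreambuf_iterator<char>(),
              std::ostreambuf_iterator<char>(dst));
    return 0;
}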

Quote:

I think we have had this discussion somewhere before! After being criticized for being an stdio supporter, I did some tests on my Linux machine, and I saw that fstream was way faster.

IIRC, iostreams can be slower if the flag that keeps them synchronised with the C stdio functions is left set (it is by default, and mainly affects std::cin/std::cout).
If you replace the streambuf, an fstream can become significantly slower if your streambuf doesn't provide buffering (it will call underflow() far too often).
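If that flag is the culprit, it can be switched off before any I/O is done; a minimal sketch (whether it actually helps depends on the implementation):

#include <iostream>

int main()
{
    // Stop synchronising the C++ standard streams with the C stdio streams.
    // This must be called before any input or output on them.
    std::ios_base::sync_with_stdio(false);

    // Optionally also untie cin from cout so reads don't force flushes.
    std::cin.tie(0);

    int n = 0;
    while (std::cin >> n)
        std::cout << n << '\n';
    return 0;
}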


--edit: wow, I took too long to type all that.

I don't believe there is a significant performance difference unless they are used wrongly. Both implementations are just wrappers over the OS's file system functions.

Personally I use the C fopen / fread functions, as they are easier for me to work with. I often read files entirely into memory and then parse them myself; I find C easier for handling these binary files.
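A minimal sketch of that pattern (the error handling is deliberately sparse; ftell on a pipe or a very large file would need more care):

#include <cstdio>
#include <cstdlib>

// Reads the whole file into a malloc'd buffer; the caller frees it.
// Returns NULL on failure and stores the size through out_size.
char* read_whole_file(const char* path, long* out_size)
{
    FILE* f = std::fopen(path, "rb");
    if (!f)
        return NULL;

    std::fseek(f, 0, SEEK_END);
    long size = std::ftell(f);
    std::fseek(f, 0, SEEK_SET);
    if (size < 0)
    {
        std::fclose(f);
        return NULL;
    }

    char* data = static_cast<char*>(std::malloc(size > 0 ? size : 1));
    long got = data ? static_cast<long>(std::fread(data, 1, size, f)) : 0;
    std::fclose(f);

    if (!data || got != size)
    {
        std::free(data);
        return NULL;
    }
    *out_size = size;
    return data;
}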

Quote:
Original post by KulSeran
I like the c++ form of stuff, and have been redoing my VFS to use iostreams instead of fopen style functions. The big bonus, in my mind, is being able to use all those nice C++ features to hide the complexity of the code base behind standardized interfaces. Anything that pipes, iterates, or transforms an iostream can operate on my ifstream class. This means cool things, like being able to std::copy from one file to another, and not have to worry about allocating buffers, and setting up loops with error checks, or even having to know that fileA is compressed while fileB is encrypted.


Well, you never get around checking errors. Whatever the implementation, you have to handle them: with iostreams that means catching exceptions, with stdio it means checking for a NULL pointer or a return value.
I'm not sure what you mean by std::copy in relation to the file format (compressed or encrypted). For most files you wouldn't need a loop either...

A good reason for or against C++ streams may be that they possibly do stuff you don't need/want. Or, maybe they do exactly what you need/want, as always it depends :-)

For me, they generally add too much bloat for a syntax that I don't like and for features that I don't need in this case. A program using C++ streams can easily be 100k larger than otherwise for apparently the exact same effect. Type safety is nice, but I don't need that when writing stuff to disk. And the syntax... well... it's a matter of taste, but I really hate the stream syntax, it is totally unintuitive for me. I always think of "much smaller than" when I see <<.

Often I don't even use fread/fwrite, but the operating system's low-level read/write or readv/writev calls directly. But, depending on what you do, those may actually be slower... it really depends on the particular case. Sometimes they can be twice as fast, too.
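For example, a POSIX-only sketch of an unbuffered copy (Windows would use ReadFile/WriteFile instead, and the filenames are made up):

#include <fcntl.h>   // open
#include <unistd.h>  // read, write, close

int main()
{
    char buffer[4096];
    int in  = open("input.bin", O_RDONLY);
    int out = open("output.bin", O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (in < 0 || out < 0)
        return 1;

    // Each read()/write() is a system call with no user-space buffering,
    // so many small transfers can be slower than buffered fread/fwrite,
    // while large block transfers can come out faster.
    ssize_t got;
    while ((got = read(in, buffer, sizeof(buffer))) > 0)
        write(out, buffer, static_cast<size_t>(got));

    close(in);
    close(out);
    return 0;
}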

Quote:

Well, you never get around checking errors. Whatever the implementation, you have to handle them. ...
I'm not sure what you mean by std::copy in relation to the file format (compressed or encrypted). For most files you wouldn't need a loop either...

No, you never get around the error checking. But you can reduce it a lot, which is what I was getting at with std::copy. In the case of making a tool to pack a file 'A' into archive 'B', you'd normally have to set up a loop, read and error check, compress and error check, write and error check. If you hide all of that behind C++ constructs, the end user only has to worry about opening 'A' and 'B', adding a compression streambuf to 'B', doing a std::copy from 'A' to 'B', and then error checking. It's fewer steps overall to remember, as most of the error checking is done at a lower level.
You also have to remember the C++ way lets you do nice things like std::copy for reading files into std::vectors, or dumping them to std::cout (which is really just another file).

You can just say stuff like:

std::ifstream ifile("test.txt");
std::copy( std::istreambuf_iterator<char>(ifile.rdbuf()), std::istreambuf_iterator<char>(), std::ostream_iterator<char>(std::cout) );

and in 2 lines of code you've safely printed a file to the console without needing any error checks, loops, temporary buffers, or figuring out the file's size. You could check ifile for errors, but it isn't needed.

Or, even though it is inefficient, you could:

std::ifstream ifile("test.txt");
std::vector<char> textbuffer;
std::copy( std::istreambuf_iterator<char>(ifile.rdbuf()), std::istreambuf_iterator<char>(), std::back_inserter(textbuffer) );

and have your file read into 'textbuffer'. You can safely forgo error checking until the point where you are actually parsing textbuffer. At that point you need to be sure to handle textbuffer.size() being less than actually expected.

Quote:

with iostreams that means catching exceptions, with stdio it means checking for a NULL pointer or a return value.

iostreams don't throw anything by default; you have to enable exceptions on them. So error checking usually means looking at the .fail() and .eof() members, and for fstreams in particular, checking the .is_open() member.
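A minimal sketch of both styles of checking (the filename is made up):

#include <fstream>
#include <iostream>

int main()
{
    // Default behaviour: nothing throws, you inspect the state flags.
    std::ifstream a("data.txt");
    if (!a.is_open())
        std::cerr << "could not open data.txt\n";

    // Opt-in behaviour: ask the stream to throw on failure instead.
    std::ifstream b;
    b.exceptions(std::ifstream::failbit | std::ifstream::badbit);
    try
    {
        b.open("data.txt");
    }
    catch (const std::ifstream::failure& e)
    {
        std::cerr << "open failed: " << e.what() << '\n';
    }
    return 0;
}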

I always use the 'C' API for file IO.

IMNSHO:
printf and the like are vastly superior to streams for ease of use on the programmer's part (MS offers secure flavours of them if that is a concern).
The real-world benefit of the "type safety" gained does not offset the horrible API. The real-world consequence of broken type safety in the 'C' API is garbage printed out (minimal failure, easily detectable, easily correctable).

The IO stream library of C++ is generally regarded as an example of how /not/ to design a reusable library for mass consumption. It made a simple task much harder and offers no tangible benefit over the 'C' API. The argument of 'syntax sugar' doesn't even hold since it's so much more complex to do formatted printing. Whatever you gain in a few {'s is destroyed by the syntax of actual work being done.

You can throw together a C++ class/encapsulating the 'C' API easily enough.

C# offers an actual improvement over the 'C' API while retaining the primary benefits.

As a test, write a function/class that logs CSV-formatted output (with " escaping), date-time stamps the log, and prints out addresses such that they are always 8 characters / 4 bytes long (i.e. you get 0x00045678, not 0x45678).
Then compare the two solutions.

The quaint example of printing a vector of objects is just that, quaint, not useful. When you really want to print a list you need to insert a delimiter, and generally you do not want a bonus delimiter at the end (nor at the beginning).
Here's the 'real-world' code doing a useful vector-stream dump:

template<class OS, class FI>
OS& write_csv(OS& os, FI begin, FI end)
{
    if(begin != end)
    {
        for(FI next = begin++; next != end; ++begin, ++next)
        {
            os << *begin << ",";
        }
        os << *begin;
    }
    return os;
}

You have to stop 'one early' (hence the 'next' iterator) to avoid printing the bonus ','.
And typically you need to process the data in some way, so you really need a version that takes a functor and manipulates the data prior to printing...

Not worth it; just write the loop in place and skip the iostream nonsense.

size_t last = v.size() - 1;
for(int i = 0; i < last; ++i)
{
    printf("%.8X,", v[i]);
}
printf("%.8X,", v[last]);

Quote:
Original post by Shannon Barber
I always use the 'C' API for file IO.

IMNSHO:
printf and the like are vastly superior to streams for ease of use on the programmer's part (MS offers secure flavours of them if that is a concern).
The real-world benefit of the "type safety" gained does not offset the horrible API. The real-world consequence of broken type safety in the 'C' API is garbage printed out (minimal failure, easily detectable, easily correctable).

The IO stream library of C++ is generally regarded as an example of how /not/ to design a reusable library for mass consumption. It made a simple task much harder and offers no tangible benefit over the 'C' API. The argument of 'syntax sugar' doesn't even hold since it's so much more complex to do formatted printing. Whatever you gain in a few {'s is destroyed by the syntax of actual work being done.

You can throw together a C++ class/encapsulating the 'C' API easily enough.

C# offers an actual improvement over the 'C' API while retaining the primary benefits.

As a test, write a function/class that logs CSV-formatted output (with " escaping), date-time stamps the log, and prints out addresses such that they are always 8 characters / 4 bytes long (i.e. you get 0x00045678, not 0x45678).
Then compare the two solutions.

The quaint example of printing a vector of objects is just that, quaint, not useful. When you really want to print a list you need to insert a delimiter, and generally you do not want a bonus delimiter at the end (nor at the beginning).
Here's the 'real-world' code doing a useful vector-stream dump:

template<class OS, class FI>
OS& write_csv(OS& os, FI begin, FI end)
{
    if(begin != end)
    {
        for(FI next = begin++; next != end; ++begin, ++next)
        {
            os << *begin << ",";
        }
        os << *begin;
    }
    return os;
}

You have to stop 'one early' (hence the 'next' iterator) to avoid printing the bonus ','.
And typically you need to process the data in some way, so you really need a version that takes a functor and manipulates the data prior to printing...

Not worth it; just write the loop in place and skip the iostream nonsense.

size_t last = v.size() - 1;
for(int i = 0; i < last; ++i)
{
    printf("%.8X,", v[i]);
}
printf("%.8X,", v[last]);


Let's do a comparison where we actually try and keep the functionality the same, eh?

std::cout << std::hex << std::setfill('0') << std::setw(sizeof(void*)*2);
size_t last = v.size() - 1;
for(int i = 0; i < last; ++i)
{
    std::cout << v[i] << ",";
}
std::cout << v[last];


Best part about my code: It doesn't contain any bugs (hint: yours does). The bug yours contains would be trivial to spot in my code.

Now, let's go the opposite route (also fixing your bugs):
template<class FI>
void write_csv(FI begin, FI end)
{
    if(begin != end)
    {
        FI next = begin;
        for(++next; next != end; ++begin, ++next)
        {
            printf("%.8X,", *begin);
        }
        printf("%.8X", *begin);
    }
}


Of course, that code above assumes that the result of dereferencing begin will be convertible to an integral type which can be handled by printf. While the working C++ version:
template<class OS, class FI>
OS& write_csv(OS& os, FI begin, FI end)
{
    if(begin != end)
    {
        os << std::hex << std::setfill('0') << std::setw(sizeof(void*)*2);
        FI next = begin;
        for(++next; next != end; ++begin, ++next)
        {
            os << *begin << ",";
        }
        os << *begin;
    }
    return os;
}

Does not, and will in fact work for any type that has an operator << defined for it.

Another good thing about the C++ version? When I move to 64bit and my pointers magically grow in size (which they do), I don't have to change my output methods to fix it... while the C versions do need to be changed to maintain consistency...

However, I am NOT a fan of C++ iostreams. boost::format takes care of a lot of the foibles I have with them. One example is the fact that both the C and C++ formatting functions are parameter-order sensitive, which makes localization extremely complex when dealing with a library that uses those functions/objects.
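For example (a sketch assuming Boost.Format is available), positional placeholders let a translated format string reorder its arguments without touching the call site:

#include <boost/format.hpp>
#include <iostream>
#include <string>

int main()
{
    std::string name = "Washu";
    int count = 3;

    // Original word order.
    std::cout << boost::format("%1% has %2% new messages\n") % name % count;

    // A hypothetical translation can swap the placeholders; the arguments
    // are still fed in the same order, so only the string changes.
    std::cout << boost::format("%2% new messages are waiting for %1%\n") % name % count;
    return 0;
}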

[Edited by - Washu on July 10, 2010 3:08:14 PM]

Quote:
Original post by Shannon Barber
printf and the like are vastly superior to streams for ease of use on the programmer's part (MS offers secure flavours of them if that is a concern).
The real-world benefit of the "type safety" gained does not offset the horrible API. The real-world consequence of broken type safety in the 'C' API is garbage printed out (minimal failure, easily detectable, easily correctable).

Maybe on the writing side that's true, but on the reading side the C API is both harder to use and much more prone to security flaws such as buffer overruns. Compare the code necessary to safely read an arbitrarily sized line from a file with the C API vs. the C++ API.
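For instance (a sketch; the C version has to grow its own buffer, which is exactly where the overrun bugs tend to creep in):

#include <cstdio>
#include <cstdlib>
#include <cstring>
#include <fstream>
#include <string>

// C API: grow a buffer by hand until fgets hits a newline or EOF.
// Returns a malloc'd string the caller frees, or NULL on allocation failure.
char* read_line_c(FILE* f)
{
    size_t capacity = 64, length = 0;
    char* line = static_cast<char*>(std::malloc(capacity));
    if (!line)
        return NULL;
    line[0] = '\0';
    while (std::fgets(line + length, static_cast<int>(capacity - length), f))
    {
        length += std::strlen(line + length);
        if (length > 0 && line[length - 1] == '\n')
            break;                          // got a complete line
        capacity *= 2;                      // otherwise grow and keep reading
        char* bigger = static_cast<char*>(std::realloc(line, capacity));
        if (!bigger) { std::free(line); return NULL; }
        line = bigger;
    }
    return line;
}

int main()
{
    // C++ API: std::string grows automatically, nothing to overrun.
    std::ifstream in("input.txt");
    std::string line;
    std::getline(in, line);

    FILE* f = std::fopen("input.txt", "r");
    if (f)
    {
        char* cline = read_line_c(f);
        std::free(cline);
        std::fclose(f);
    }
    return 0;
}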
Quote:
Original post by Washu
Let's do a comparison where we actually try and keep the functionality the same, eh?

If that's what you're trying, I think you're missing a

std::cout << std::hex << std::setfill('0') << std::setw(8);

before the loop.
