v0dKA

Working with large files in ifstream and string

I recently tried to work with an HTML file in C++. The file, as can be expected in a lot of cases, is pretty long: in this case 600 lines, many of which run to over 1000 characters. Since I'm doing a lot of searching in the file, it would be very convenient to treat it as one big string, so I tried to read the entire file (opened using std::ifstream) into a single std::string. When the program runs, it simply stalls.

C++ streams and their workings are still a mysterious topic to me, and I'm clueless about how to deal with errors when they arise in such cases. I did a little digging and found a special flag in std::ifstream called failbit. Apparently, this bit can be tested using std::ifstream::fail() (is this correct?). I tested this function in the loop used to read the file and found that it returns true when the loop reaches line 512 of the file.

Could this be because the file is too big for std::ifstream? Or is it my attempt to save so many characters to the std::string that is causing the problem? Or is there a weird character at line 512 that may cause std::ifstream to fail?

The relevant portion of my source code:
//...
std::ifstream in( "..." );
std::string file_contents;
//...
unsigned long i = 0;
	while( !in.eof() )
	{
		++i;

		char szOneline[ 4096 ];
		in.getline( szOneline, 4095 );

		if( in.fail() )
		{
			std::cout << "Failure at line " << i+1 << std::endl;
			i=0;
			std::cin.get();
			return 0;
		}

		file_contents += szOneline;
		file_contents += '\n';
	}


Is there a more convenient way to work with HTML in C++?

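void ReadFile(std::string const & pFilePath, std::string & pData)
{
    if(pFilePath.empty())
        return;

    // Discard whatever the caller's string held before.
    pData.clear();

    std::ifstream pInputStream(pFilePath.c_str());

    // Leave pData empty if the file could not be opened.
    if(!pInputStream.good())
        return;

    // Read up to the first NUL byte. A text file normally contains none,
    // so this pulls in the whole file with a single call.
    std::getline(pInputStream, pData, '\x00');

    // No explicit close() needed: the ifstream destructor closes the file.
}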



Just tried this with a 7 MB file, worked fine...

According to Josuttis: The C++ Standard Library

Failbit: is set if an operation was not processed correctly but the stream is generally OK. Normally this flag is set as a result of a format error during reading. For example, this flag is set if an integer is to be read but the next character is a letter.
Badbit: is set if the stream is somehow corrupted or if data is lost. For example, this flag is set when positioning a stream that refers to a file before the beginning of the file.

fail() returns true if an error has occurred (failbit or badbit is set).
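
One possible explanation for the failure at line 512, assuming that line is longer than the 4095 characters the buffer allows: the member function istream::getline sets failbit when it fills the buffer before reaching the end of the line. The sketch below reproduces that on a small scale with a std::istringstream standing in for the file. The helper names firstLineThatOverflows and readAllLines are made up for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper. `text` stands in for the file contents so that the
// sketch needs no file on disk. Returns the 1-based number of the first line
// that does not fit in a buffer of `bufferSize` chars, or 0 if every line fits.
std::size_t firstLineThatOverflows(const std::string& text, std::size_t bufferSize)
{
    assert(bufferSize >= 1);
    std::istringstream in(text);
    std::vector<char> buffer(bufferSize);
    std::size_t linesRead = 0;

    // Member getline stores at most bufferSize - 1 characters per call.
    while (in.getline(&buffer[0], static_cast<std::streamsize>(buffer.size())))
        ++linesRead;

    // Running out of input sets eofbit as well as failbit; a line that is
    // too long for the buffer sets failbit alone.
    return (in.fail() && !in.eof()) ? linesRead + 1 : 0;
}

// Hypothetical alternative: the free function std::getline grows the
// std::string as needed, so line length is no longer a failure condition.
std::string readAllLines(const std::string& text)
{
    std::istringstream in(text);
    std::string contents, line;
    while (std::getline(in, line)) {
        contents += line;
        contents += '\n';
    }
    return contents;
}
```

With a 16-char buffer and a 100-character second line, firstLineThatOverflows reports line 2, while readAllLines reads the same text without trouble. Testing the result of getline in the loop condition also avoids looping on !in.eof(), which never becomes true once failbit stops the stream from reading.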

Hope this helps,
EmmetjeGee

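#include <algorithm>
#include <fstream>
#include <iterator>
#include <string>

int main()
{
...

    // A named stream: istream_iterator needs a non-const reference,
    // which a temporary ifstream cannot bind to in standard C++.
    std::ifstream file("filename");
    file >> std::noskipws; // istream_iterator<char> skips whitespace by default

    std::string text;
    std::copy(std::istream_iterator<char>(file), std::istream_iterator<char>(), std::back_insert_iterator<std::string>(text));

...
}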




While the above works, never write code like this. It's too dense, it cascades too many operations without robustness, and it's fairly opaque to most people. I only wrote it to illustrate a number of constructs available for operations like reading an entire file into a string (std::back_insert_iterator, std::istream_iterator, std::copy).

I believe there's also a solution involving filebufs, but I can't quite figure out the semantics just yet.
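
For what it's worth, the stream-buffer route might look something like the sketch below. It relies on the operator<< overload that takes a streambuf pointer and copies everything available from it. The names streamToString and fileToString are invented for this example, and opening the file in binary mode is an assumption made so that the bytes arrive unchanged.

```cpp
#include <fstream>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: drains `in` into a string through its stream buffer.
std::string streamToString(std::istream& in)
{
    std::ostringstream out;
    out << in.rdbuf(); // copies until the source buffer is exhausted
    return out.str();
}

// Hypothetical wrapper for the file case. Returns an empty string if the
// file cannot be opened.
std::string fileToString(const std::string& path)
{
    std::ifstream file(path.c_str(), std::ios::in | std::ios::binary);
    if (!file)
        return std::string();
    return streamToString(file);
}
```

Because this copy is unformatted, whitespace and newlines are preserved and line length plays no role.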

Got this from Drew Benton:


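std::string fileToText( const std::string &filename )
{
    std::ifstream fin(filename.c_str());
    // The extra parentheses around the first argument keep the compiler
    // from parsing this as a function declaration.
    return std::string((std::istreambuf_iterator<char>(fin)),std::istreambuf_iterator<char>());
}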




It has worked on any file I've tried it with...

Quote:
While the above works, never write code like this. It's too dense, it cascades too many operations without robustness, and it's fairly opaque to most people.


std::ifstream file("filename");
file >> std::noskipws; // otherwise istream_iterator<char> drops all whitespace

std::istream_iterator<char> file_begin(file);
std::istream_iterator<char> file_end;

std::string text(file_begin, file_end);
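
One thing to watch for when choosing between the iterator types used in this thread: std::istream_iterator<char> reads through the formatted operator>>, which skips whitespace unless std::noskipws is set on the stream, whereas std::istreambuf_iterator<char> reads raw characters from the stream buffer. A small sketch of the difference, using a std::istringstream in place of a file (readFormatted and readUnformatted are names made up here):

```cpp
#include <istream>
#include <iterator>
#include <sstream>
#include <string>

// Formatted extraction: spaces, tabs and newlines are skipped by default.
std::string readFormatted(std::istream& in)
{
    return std::string(std::istream_iterator<char>(in),
                       std::istream_iterator<char>());
}

// Unformatted extraction: every character in the buffer is kept.
std::string readUnformatted(std::istream& in)
{
    return std::string(std::istreambuf_iterator<char>(in),
                       std::istreambuf_iterator<char>());
}
```

For the input "<p>hello world</p>\n", readFormatted yields "<p>helloworld</p>" and readUnformatted returns the text unchanged, which matters if the HTML is going to be searched afterwards.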
