Is using the trim line function actually faster than just letting the switch statements ignore the unnecessary characters?
While going through the file twice (to find number of verts etc.) may be faster, perhaps keeping the file in memory and using that the second time would be quicker than reading everything off disk twice?
Going through the file twice isn't slowed down by the disk, as I'm sure the file gets cached by the OS / HDD anyway; it's slowed down most by that string stream constructor. And yeah, counting the verts first is way faster, because for a large data set (like lucy), if you don't reserve vector space, the whole thing runs almost five times as long (184 seconds).
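For illustration, here is a minimal sketch of the count-then-reserve idea. It assumes OBJ-style `v x y z` lines; the `Vec3` struct and the `CountVertexLines` / `LoadVertices` helpers are hypothetical names made up for this example, not part of the loader discussed in this thread.

```cpp
#include <cstddef>
#include <cstdio>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical vertex type, for illustration only.
struct Vec3 { float x, y, z; };

// First pass: count lines that start with "v " so we know how much to reserve.
std::size_t CountVertexLines( std::istream& in )
{
    std::size_t count = 0;
    std::string line;
    while( std::getline( in, line ) )
        if( line.size() > 1 && line[0] == 'v' && line[1] == ' ' )
            ++count;
    return count;
}

// Second pass: reserve once, then fill the vector without reallocating.
std::vector<Vec3> LoadVertices( std::istream& in )
{
    std::vector<Vec3> verts;
    verts.reserve( CountVertexLines( in ) );

    in.clear();                      // drop the eof/fail flags first
    in.seekg( 0, std::ios::beg );    // then rewind for the second pass

    std::string line;
    while( std::getline( in, line ) )
    {
        Vec3 v;
        // Assumed line layout: "v x y z". Anything else is skipped.
        if( std::sscanf( line.c_str(), "v %f %f %f", &v.x, &v.y, &v.z ) == 3 )
            verts.push_back( v );
    }
    return verts;
}
```

The function takes any `std::istream`, so the same sketch can be tried on a `std::ifstream` or on a `std::istringstream` holding a few test lines.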
No, it is a little slower. A switch statement works well for ignoring lines that start with a comment character such as # (as in obj files), but a comment like /* this is a comment */ can span multiple lines, sit in the middle of a line, or appear several times in one line, as in /* blah blah blah */ full of /* blah blah blah */ such comments. For that you need something better, and the parser class is meant to parse several types of text files.
So everyone can see, here is the implementation of the parser class I just created. It loads lucy in 10 seconds, compared to the 48 seconds it takes with the previously posted version using string streams.
Parser::Parser( wstring file )
{
    input.open( file );
    ignoring = -1;
    if( !input.is_open() )
        throw ExcFailed( L"[Parser::Parser] Could not open file " + file + L"\n" );
}
void Parser::Ignore( const std::string& start, const std::string& end )
{
    excludeDelims.push_back( start );
    includeDelims.push_back( end );
}
void Parser::Rewind( void )
{
    // Clear the eof/fail flags before seeking: a seek on a failed stream does nothing.
    input.clear();
    input.seekg( 0, ios::beg );
    ignoring = -1;
    line.clear();
}
void Parser::Next( void )
{
    getline( input, line );
    if( !input.good() )
        return;
    if( line.empty() )
    {
        Next();
        return;
    }
    TrimLine( line );
    if( line.empty() )
    {
        Next();
        return;
    }
}
void Parser::GetLine( std::string& _line )
{
    _line = line;
}
void Parser::GetTokens( std::vector<std::string>& tokens )
{
    tokens.clear();
    string buff;
    size_t from = 0;
    while( from < line.length() )
    {
        GetNextToken( buff, from );
        tokens.push_back( buff );
    }
}
void Parser::GetHeader( std::string& header )
{
    header.clear();
    size_t from = 0;
    GetNextToken( header, from );
}
void Parser::GetBody( std::string& body )
{
    body.clear();
    size_t i = 0;
    // Ignore any white spaces at the beginning of the line.
    while( i < line.length() && ( line[i] == ' ' || line[i] == '\r' || line[i] == '\t' ) )
        i++;
    // Ignore the first word
    while( i < line.length() && line[i] != ' ' && line[i] != '\r' && line[i] != '\t' )
        i++;
    body = line.substr( i );
}
void Parser::GetBodyTokens( std::vector<std::string>& bodyTokens )
{
    bodyTokens.clear();
    string buff;
    size_t from = 0;
    // Read and discard the header token.
    GetNextToken( buff, from );
    while( from < line.length() )
    {
        GetNextToken( buff, from );
        bodyTokens.push_back( buff );
    }
}
bool Parser::Good( void )
{
    return input.good();
}
void Parser::TrimLine( string& line )
{
    if( ignoring != -1 )
    {
        // Inside an ignored block: look for the delimiter that ends it.
        size_t incPos = line.find( includeDelims[ignoring] );
        if( incPos != string::npos )
        {
            // Keep only what follows the closing delimiter, so the delimiter
            // itself does not end up in the output.
            line = line.substr( incPos + includeDelims[ignoring].length() );
            ignoring = -1;
            TrimLine( line );
        }
        else
            line.clear();
    }
    else
    {
        for( size_t i = 0; i < excludeDelims.size(); i++ )
        {
            size_t excPos = line.find( excludeDelims[i] );
            if( excPos != string::npos )
            {
                // The tail starts just after the opening delimiter.
                string tail = line.substr( excPos + excludeDelims[i].length() );
                line = line.substr( 0, excPos );
                // If the includeDelim is the end of the line just return the head.
                if( includeDelims[i] == "\n" )
                    return;
                ignoring = i;
                TrimLine( tail );
                line += tail;
                return;
            }
        }
    }
}
void Parser::GetNextToken( string& container, size_t& from )
{
    // Skip leading whitespace.
    while( from < line.length() && ( line[from] == ' ' || line[from] == '\t' || line[from] == '\r' ) )
        from++;
    // Scan to the end of the token. Starting at from (not from + 1) keeps to
    // in bounds when the line ends in whitespace; container is then empty.
    size_t to = from;
    while( to < line.length() && line[to] != ' ' && line[to] != '\t' && line[to] != '\r' )
        to++;
    container = line.substr( from, to - from );
    from = to;
}
Which is a shame, because I think string streams are a really elegant way of parsing and formatting data, but I don't know how to use them in a way that isn't mega slow.
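One possible way around the stream cost, offered as an untimed sketch rather than a measured result: skip the stream entirely and let `std::strtof` (C++11) walk the line buffer directly, so nothing is constructed per line. `ParseFloats` is a hypothetical helper name, not part of the parser class above.

```cpp
#include <cstddef>
#include <cstdlib>
#include <string>

// Hypothetical helper: reads up to maxCount floats from text, starting at
// offset from, and returns how many were actually parsed.
std::size_t ParseFloats( const std::string& text, std::size_t from,
                         float* out, std::size_t maxCount )
{
    if( from > text.length() )
        return 0;
    const char* cursor = text.c_str() + from;
    std::size_t parsed = 0;
    while( parsed < maxCount )
    {
        char* end = 0;
        // strtof skips leading whitespace (including '\t' and '\r') itself.
        float value = std::strtof( cursor, &end );
        if( end == cursor )   // nothing consumed: no more numbers on this line
            break;
        out[parsed++] = value;
        cursor = end;
    }
    return parsed;
}
```

For a vertex line this could be called as `ParseFloats( line, 1, v, 3 )` to skip the leading `v`, with the return value checked against 3 to reject malformed lines.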