[C++] - Parsing: Moving Around In Files

Started by
8 comments, last by caldiar 15 years, 9 months ago
I'm working on writing a parser to convert COLLADA (.dae) files to .xmodel_export files and I've been using a mix of the C++ <string> library and C <string.h>. What I'm wondering is whether there's some sort of function that lets me move back a line to re-read something that has been skipped through the use of getline(). I have several loops that each check for different substrings, but calling getline automatically jumps to the next line in the file and checks it. What I want to be able to do is jump backwards after a loop has finished so I can take another look for a different substring. If there isn't, does anybody have any suggestions on a way to implement what I want to do? Some pseudo-code of what I'm currently doing:

ifstream file(file.txt)
string line

while not end of file
    getline(file, line)
    if strstr(line, "what I want")
        dump it into a vector

next if loop
The problem is that sometimes it will skip lines that I want to check in other loops.
Check out the reference section at cplusplus.com. They may have something useful there.
Why are you using cstrings in combination with std::string?

Anyway, a simple solution:
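ifstream file("file.txt");
string line;
streampos filepos;
while (file)
{
    filepos = file.tellg();     // remember where this line starts
    if (!getline(file, line))
        break;                  // nothing left to read
    if (condition)
        file.seekg(filepos);    // jump back to the start of the line just read
}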
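For something that can be compiled and tried, here is a minimal sketch of the same tellg()/seekg() idea. The function name read_with_one_reread and the marker parameter are made up for illustration, and it works on any std::istream so it can be tried on a string stream instead of a real file. It re-reads each matching line exactly once so that it cannot rewind forever.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: reads every line, and reads each line containing
// `marker` a second time by seeking back to the position saved before the read.
// Returns the lines in the order they were read.
std::vector<std::string> read_with_one_reread(std::istream &in, const std::string &marker)
{
    std::vector<std::string> seen;
    std::string line;
    bool just_rewound = false;   // guards against rewinding to the same line forever
    while (true)
    {
        const std::streampos start = in.tellg();   // where the next line begins
        if (!std::getline(in, line))
            break;                                  // end of input or read error
        seen.push_back(line);
        if (!just_rewound && line.find(marker) != std::string::npos)
        {
            in.clear();        // drop eofbit in case this was the last line
            in.seekg(start);   // go back to the start of the line just read
            just_rewound = true;
        }
        else
        {
            just_rewound = false;
        }
    }
    return seen;
}
```

Calling it on a std::istringstream holding "a\nfoo\nb\n" with the marker "foo" should hand back the "foo" line twice in a row, which is the "take another peek" behaviour being asked for.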
Obviously we can't give the best advice without seeing more of the code, however:

Why can't you simply check all conditions against each getline?

while (getline(file, line))
{
    if (line.find("first case") != string::npos)
    {
        // do first thing
    }
    if (line.find("second case") != string::npos)
    {
        // do second thing
    }
}
Thanks for the replies, guys.

I'm using a mix of C and C++ strings because I started programming C-style and decided halfway through that I wanted to work with std::string instead. The code works, so I don't feel like going to the trouble of making it all C++.

As for checking for everything in a single line, that won't work because I'm looking for data in between two different tags, and then I'm looking for specific lines within those tags.

I need to be able to tell, when there are two of the same tag, which data belongs to which tag.
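One way to keep data from repeated tags apart without moving backwards at all is to remember, while reading forwards, which block is currently open. The following is only a rough sketch of that idea: Block and collect_blocks are hypothetical names, and it assumes the opening and closing tags each sit on their own line and are never nested, which may not hold for a real .dae file.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// One Block per opening tag found; `lines` holds what appeared before its closing tag.
struct Block
{
    std::vector<std::string> lines;
};

// Hypothetical helper: collects the lines between each open_tag/close_tag pair
// in a single forward pass. Assumes tags are on their own lines and not nested.
std::vector<Block> collect_blocks(std::istream &in,
                                  const std::string &open_tag,
                                  const std::string &close_tag)
{
    std::vector<Block> blocks;
    bool inside = false;   // true while between an opening and a closing tag
    std::string line;
    while (std::getline(in, line))
    {
        if (!inside)
        {
            if (line.find(open_tag) != std::string::npos)
            {
                blocks.push_back(Block());   // start a new block for this tag
                inside = true;
            }
        }
        else if (line.find(close_tag) != std::string::npos)
        {
            inside = false;                  // block finished
        }
        else
        {
            blocks.back().lines.push_back(line);   // data belongs to the open block
        }
    }
    return blocks;
}
```

With this, the first occurrence of a tag ends up in blocks[0], the second in blocks[1], and so on, and each block's lines can then be searched as often as needed.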
1. Load the file into RAM as a dynamic array of chars with new[], (reading the size of the file first) then close the file. Build a vector of the positions of all '\n' characters and the eof. Step through this array to find the start index of each line, and the length. Be careful not to overflow. Free the char array when done.
This way would be optimal for disk access.

2. Create a std::vector<std::string>
Initially, read all the lines of the file with getline and add each to the vector with push_back(), then close the file. Parse the strings in the vector. Keep a line number variable when parsing, or a reversible iterator.

3. (Quickest hack - requires fewest changes)
Replace getline with a quick function, my_getline:
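std::vector<std::streampos> pos; // global :(

void my_getline(std::ifstream &file, std::string &line)
{
    pos.push_back(file.tellg());   // remember where this line starts
    std::getline(file, line);
}

void my_goback(std::ifstream &file)
{
    file.clear();                  // drop eofbit if the last read hit the end of the file
    file.seekg(pos.back());        // rewind to the start of the most recently read line
    pos.pop_back();
}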

my_goback() should take you back one line. So to get the line prior to the one you just read, call my_goback() twice, then my_getline(). Or you could make a my_goback(int lines).
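Option 2 from the list above could look roughly like the sketch below. The names read_all_lines and find_line are made up for illustration. Once the lines are in a vector, "going back" is just a matter of searching again from a smaller index, so no seeking is needed.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: reads every line of `in` into memory.
std::vector<std::string> read_all_lines(std::istream &in)
{
    std::vector<std::string> lines;
    std::string line;
    while (std::getline(in, line))
        lines.push_back(line);
    return lines;
}

// Hypothetical helper: returns the index of the first line at or after `from`
// that contains `needle`, or lines.size() if there is no such line.
std::size_t find_line(const std::vector<std::string> &lines,
                      const std::string &needle,
                      std::size_t from)
{
    for (std::size_t i = from; i < lines.size(); ++i)
    {
        if (lines[i].find(needle) != std::string::npos)
            return i;
    }
    return lines.size();
}
```

Each of the original loops would then become a call to find_line with its own substring and starting index, and a later loop can start from index 0 again without touching the file. The cost is that the whole file is held in memory.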



I suspect that key information is missing here. Could you provide a sample of what part of the file looks like, and the problem that it presents?
Quote: Original post by yacwroy
1. Load the file into RAM as a dynamic array of chars with new[], (reading the size of the file first) then close the file. Build a vector of the positions of all '\n' characters and the eof. Step through this array to find the start index of each line, and the length. Be careful not to overflow. Free the char array when done.
This way would be optimal for disk access.

2. Create a std::vector<std::string>
Initially, read all the lines of the file with getline and add each to the vector with push_back(), then close the file. Parse the strings in the vector. Keep a line number variable when parsing, or a reversible iterator.

3. (Quickest hack - requires fewest changes)
Replace getline with a quick function, my_getline:
*** Source Snippet Removed ***
my_goback() should take you back one line. So to get the line prior to the one you just read, call my_goback() twice, then my_getline(). Or you could make a my_goback(int lines).



The 3rd solution will work quite well for what I want.

I actually made a slight change to it because I want to be able to specify the number of lines to go back by.

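// Uses the global `file` stream and the global `pos` vector from the snippet above.
void move_back(int count)
{
    if (count == 0)
        count = 1;   // a count of zero still goes back one line

    // Stop early if there are no saved positions left to go back to.
    for (int i = 0; i < count && !pos.empty(); i++)
    {
        file.seekg(pos.back());
        pos.pop_back();
    }
}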


Thanks again everybody :)

[Edited by - caldiar on July 17, 2008 12:41:08 AM]
It sounds to me like you should take some time to study how parsing is done: convert the input to tokens with a lexer, etc. It will make this a lot easier to implement.
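To give a rough idea of what the lexer step means, here is a very small sketch that splits XML-like text into tag tokens and text tokens. Token and tokenize are hypothetical names, and this is nowhere near a real XML lexer: it assumes there are no comments, no CDATA sections, and no '>' characters inside attribute values.

```cpp
#include <cstddef>
#include <string>
#include <vector>

struct Token
{
    enum Kind { Tag, Text } kind;
    std::string value;   // tag contents without the angle brackets, or the raw text
};

// Hypothetical helper: splits XML-like input into Tag and Text tokens.
// Whitespace-only text between tags is skipped.
std::vector<Token> tokenize(const std::string &input)
{
    std::vector<Token> tokens;
    std::size_t i = 0;
    while (i < input.size())
    {
        if (input[i] == '<')
        {
            std::size_t end = input.find('>', i);
            if (end == std::string::npos)
                end = input.size();          // unterminated tag: take the rest
            Token t = { Token::Tag, input.substr(i + 1, end - i - 1) };
            tokens.push_back(t);
            i = (end < input.size()) ? end + 1 : end;
        }
        else
        {
            std::size_t end = input.find('<', i);
            if (end == std::string::npos)
                end = input.size();          // trailing text with no tag after it
            const std::string text = input.substr(i, end - i);
            if (text.find_first_not_of(" \t\r\n") != std::string::npos)
            {
                Token t = { Token::Text, text };
                tokens.push_back(t);
            }
            i = end;
        }
    }
    return tokens;
}
```

A parser built on top of this would walk the token vector with an index, so looking back or ahead by a token is simple array indexing rather than seeking in the file.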
It's just a small program for personal use. Nothing extravagant.

The main problem was my unfamiliarity with seekg() and tellg().

The data I'm interested in is very small. Only about 2% of the COLLADA file is needed, so it's much easier to just hack this together than to actually write a full-fledged parser capable of isolating any piece of data on request, brewing my coffee, and even cleaning the house :D

This topic is closed to new replies.
