BlackWind

reading a file....


Hi, I have a problem reading a file I want to parse. It has a format like this (it is a .raw file generated by MilkShape 3D):

object1
36.065701 4.730784 1.899187 36.973099 5.891685 1.898429 34.065498 3.886485 1.898370
34.065498 3.886485 1.898370 36.973099 5.891685 1.898429 37.416302 7.511585 1.898335
34.065498 3.886485 1.898370 37.416302 7.511585 1.898335 31.540001 3.410384 1.897815
37.416302 7.511585 1.898335 37.851398 9.623985 1.898234 31.540001 3.410384 1.897815
37.851398 9.623985 1.898234 29.062401 3.400085 1.898075 31.540001 3.410384 1.897815
29.062401 3.400085 1.898075 37.851398 9.623985 1.898234 26.679501 4.336085 1.896880
37.851398 9.623985 1.898234 24.538500 5.668586 1.898248 26.679501 4.336085 1.896880
22.969200 7.943485 1.899194 24.538500 5.668586 1.898248 37.851398 9.623985 1.898234
22.127001 10.779885 1.898426 22.969200 7.943485 1.899194 37.851398 9.623985 1.898234
21.901501 13.360915 1.898337 22.127001 10.779885 1.898426 37.851398 9.623985 1.898234
21.901501 13.360915 1.898337 37.851398 9.623985 1.898234 38.195499 11.935485 1.897456
32.375198 21.947132 1.898189 31.283899 22.036659 1.898314 33.621899 21.475061 1.897475
31.283899 22.036659 1.898314 30.191500 21.883495 1.898425 33.621899 21.475061 1.897475
object2
73.928902 -14.603115 2.261673 74.228600 -13.175915 2.261673 71.259003 -13.976112 2.260420
71.259003 -13.976112 2.260420 74.228600 -13.175915 2.261673 72.029999 -12.199013 2.259457
74.228600 -13.175915 2.261673 75.138298 -10.267612 2.261672 72.669403 -10.115314 2.260881
74.228600 -13.175915 2.261673 72.669403 -10.115314 2.260881 72.029999 -12.199013 2.259457
object3
-12.199013 2.259457 74.228600 -13.175915 2.261673 75.138298 -10.267612 2.261672 72.669403 -10.115314 2.260881
etc....

It has 9 float values per line, and a string on some lines (which lines hold a string and which hold the 9 floats is not known in advance). Before I got a file with strings, I was reading the file like this:
for (int i = 0; i < g_NumberOfVerts; i += 3)
{
    fscanf(fp, "%f %f %f   ", &g_pVertices[i].x, &g_pVertices[i].y, &g_pVertices[i].z);
    fscanf(fp, "%f %f %f   ", &g_pVertices[i+1].x, &g_pVertices[i+1].y, &g_pVertices[i+1].z);
    fscanf(fp, "%f %f %f\n", &g_pVertices[i+2].x, &g_pVertices[i+2].y, &g_pVertices[i+2].z);
}
But now, with a string on some lines, I don't know how to read the file. I need to ignore that string; I only need the float numbers to be stored in my g_pVertices array (in order). How can I do it? It doesn't matter whether it's in C or C++.

ifstream in(filename);
if (!in)
{
    // handle error
}

// %lf expects pointers to double
double x1, y1, z1, x2, y2, z2, x3, y3, z3;

while (!in.eof())
{
    string line;
    getline(in, line);

    int numRead = sscanf(line.c_str(), "%lf %lf %lf %lf %lf %lf %lf %lf %lf",
                         &x1, &y1, &z1, &x2, &y2, &z2, &x3, &y3, &z3);

    // if all the values weren't read in for this line,
    // we might be at a line with the string
    if (numRead == 9)
    {
        // add vertices
    }
}


I've not tried to compile this, but this should work.

Please don't use sscanf() or any of that other stdio.h stuff in C++. There's really no call for it any more.


ifstream in(filename);
if (!in)
{
    // handle error
}

// Also, prefer this idiom for reading all lines of a file:
string line;
while (getline(in, line)) {
    if (stringstream(line) >> x1 >> y1 >> z1 >> x2 >> y2 >> z2 >> x3 >> y3 >> z3) {
        // We successfully read 9 values (three vertices) out of this line, so add them.
    }
    // Otherwise, there was something wrong with the contents of this line;
    // for now, we just ignore these lines.
}


If it doesn't work, I'll need to see the code where you collect the vertices.

Wow!
Thanks a lot Zahlman, it worked perfectly, but now I have 2 questions:

Quote:
Original post by Zahlman
Please don't use sscanf() or any of that other stdio.h stuff in C++. There's really no call for it any more.

1.- Why? Is there a performance issue with stdio?

2.- How does ifstream work? Doesn't it have something like rewind(fileptr) from the FILE* "class"?
I ask because, as you can see, in this program I first need to read how many vertices the file has, so that I can allocate enough memory for my g_pVertices array.
So what I do is:
1.- read the file
2.- get the vertex count
3.- allocate the memory based on the vertex count
4.- read the file again
5.- store the vertex data in my array

The problem is that I have to create a new instance of ifstream:
ifstream in1 <-- to read the file the first time
ifstream in2 <-- to read the file the second time

because if I try to reuse the same one, the second time I try to read the file it doesn't read anything, even if I close and reopen the file.

So, is there any way to solve this problem?

Quote:
I first need to read how many vertices the file has, so that I can allocate enough memory for my g_pVertices array.


You might want to use a std::vector as a temporary to hold your vertex data, then allocate g_pVertices based on the number of elements your vector reports, or you might want to use a std::vector *instead* of g_pVertices ;)

But of course there is probably a way to do what you're trying to do the way you want to do it (rewinding the file stream).
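
The first option (a temporary vector, then one allocation) might look roughly like the sketch below. It is only an illustration under assumptions: `Vec3` is a made-up stand-in for the poster's own vector class, and `readVertices` and `copyToNewArray` are hypothetical names, not anything from the original code.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for the poster's own 3D vector class.
struct Vec3 {
    float x, y, z;
};

// Collect the vertices from every line that starts with nine floats.
// Lines such as "object1" fail to parse and are skipped.
std::vector<Vec3> readVertices(std::istream& in) {
    std::vector<Vec3> vertices;
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream fields(line);
        Vec3 tri[3];
        bool ok = true;
        for (int i = 0; i < 3 && ok; ++i) {
            if (!(fields >> tri[i].x >> tri[i].y >> tri[i].z)) {
                ok = false;
            }
        }
        if (ok) {
            vertices.insert(vertices.end(), tri, tri + 3);
        }
    }
    return vertices;
}

// Allocate a raw array sized from the vector and copy the data across.
// The caller owns the result and must release it with delete[].
Vec3* copyToNewArray(const std::vector<Vec3>& vertices) {
    Vec3* raw = new Vec3[vertices.size()];
    for (std::size_t i = 0; i < vertices.size(); ++i) {
        raw[i] = vertices[i];
    }
    return raw;
}
```

With this shape the file is read only once: the vector grows while reading, and its size() is the vertex count needed for the allocation.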

Quote:

I ask because, as you can see, in this program I first need to read how many vertices the file has, so that I can allocate enough memory for my g_pVertices array.
So what I do is:
1.- read the file
2.- get the vertex count
3.- allocate the memory based on the vertex count
4.- read the file again
5.- store the vertex data in my array


If you used a std::vector, you wouldn't have to worry about this. Since this is still contiguous memory, you can still refer to the raw array data later if you need to (though you shouldn't need to hehe).

edit: arrrr, beaten to it... but I have a link :D

Well, the problem here is that I have my 3DVector class, which handles a lot of 3D math, camera work and other things, and my pVertices array is an instance of that class. I already have a lot of code that uses that kind of data, so modifying all of it would be a big problem (I also use std::vector, but for other things).

Quote:
Original post by BlackWind
Well, the problem here is that I have my 3DVector class, which handles a lot of 3D math, camera work and other things, and my pVertices array is an instance of that class. I already have a lot of code that uses that kind of data, so modifying all of it would be a big problem (I also use std::vector, but for other things).

Well, pVertices probably isn't an instance of your 3DVector class; I'm guessing it's of type 3DVector*.

In any case, I suggest that you go ahead and make the switch to std::vector<3DVector>. Depending on how much code we're talking about here it might take you an hour or two, but the time spent will be well worth it.

First you'll have to do a global replace on the name 'pVertices' (which by the way is a good example of why you shouldn't use Hungarian notation or its derivatives). Direct access via the [] operator can be left as is, but you'll need to change occurrences of 'pVertices' to '&vertices.front()' when passing around the raw data (e.g. to third-party APIs).
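
To illustrate the raw-data hand-off, here is a small sketch. Everything in it is hypothetical: `Vec3` stands in for the poster's class, and `sumOfX` plays the role of some third-party function that wants a pointer and a count.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the poster's vector class.
struct Vec3 {
    float x, y, z;
};

// Hypothetical C-style function that expects a raw pointer and a count,
// as many third-party APIs do.
float sumOfX(const Vec3* data, std::size_t count) {
    float total = 0.0f;
    for (std::size_t i = 0; i < count; ++i) {
        total += data[i].x;
    }
    return total;
}

float sumOfXFromVector(const std::vector<Vec3>& vertices) {
    if (vertices.empty()) {
        return 0.0f;  // front() on an empty vector is undefined behavior
    }
    // Element access with [] is unchanged; only raw-pointer hand-offs differ.
    return sumOfX(&vertices.front(), vertices.size());
}
```

The empty() check is the one thing the raw-pointer version did not need; apart from that, the calling code barely changes.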

I think once you get started you'll find that the switch requires fewer changes to your code than you might guess. Just make sure to back everything up before diving in :)

As for sscanf(), the issue is not performance, but rather robustness, flexibility, and safety (remember, performance is not the end all and be all of programming, game or otherwise, especially not these days).

As for parsing your .raw file, there are any number of ways you can go about it, but you can easily do it using only std::string and getline() (similar to what's been posted already, but with a little extra code to handle the 'object' identifiers).
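
One possible shape for that, as a rough sketch rather than a definitive parser: it assumes each identifier sits on its own line (or at least starts the line) and that data lines hold exactly nine floats. `Vec3`, `MeshObject` and `readObjects` are names invented for this example.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for the poster's vector class.
struct Vec3 {
    float x, y, z;
};

// One named group from the file: the identifier plus its vertices.
struct MeshObject {
    std::string name;
    std::vector<Vec3> vertices;
};

std::vector<MeshObject> readObjects(std::istream& in) {
    std::vector<MeshObject> objects;
    std::string line;
    while (std::getline(in, line)) {
        // Try to pull up to nine floats from the start of the line.
        std::istringstream fields(line);
        float v[9];
        int count = 0;
        while (count < 9 && fields >> v[count]) {
            ++count;
        }
        if (count == 9) {
            if (objects.empty()) {
                objects.push_back(MeshObject());  // data before any identifier
            }
            for (int i = 0; i < 9; i += 3) {
                Vec3 p = { v[i], v[i + 1], v[i + 2] };
                objects.back().vertices.push_back(p);
            }
        } else if (count == 0) {
            // No number at the start: treat the first word as an identifier.
            std::istringstream words(line);
            std::string name;
            if (words >> name) {
                MeshObject next;
                next.name = name;
                objects.push_back(next);
            }
        }
        // Lines with 1 to 8 floats are treated as malformed and ignored.
    }
    return objects;
}
```

Keeping the vertices grouped per object is a design choice for this sketch; if only the flat vertex list matters, the identifier branch can simply be dropped.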

To make it even easier though, grab Boost and use either the Boost Tokenizer or Boost String Algorithms library, perhaps along with boost::lexical_cast, to manage the details of splitting the lines into tokens, trimming the individual tokens if necessary, and performing the appropriate lexical conversions. Note that the libraries mentioned are header-only, so you don't have to build Boost to use them.

The above is not the only (or perhaps even the best) means of parsing your file, but is certainly to be preferred over the C library functions mentioned earlier. And if you get stuck on something, you can always ask :)

Quote:
Original post by jyk
First you'll have to do a global replace on the name 'pVertices' (which by the way is a good example of why you shouldn't use Hungarian notation or its derivatives).


There are plenty more reasons why it's good to use Hungarian notation derivatives though, especially weak ones. Readability is the main one.

Quote:
Original post by BlackWind
Quote:
Original post by Zahlman
Please don't use sscanf() or any of that other stdio.h stuff in C++. There's really no call for it any more.

1.- Why? Is there a performance issue with stdio?


There might be, but performance is almost never relevant when you're talking about I/O. The underlying work of I/O is incredibly slow compared to any normal work done by the CPU itself - clock cycle times are literally millions of times shorter than the hard disk seek time, and there really isn't anything you can do about it. (Of course, for interactive I/O it's even worse, because waiting for a person to type something is yet again much much slower :) )

That said, the iostream classes do take care of buffering I/O automatically, are type-safe, are designed to work with other standard library types, work polymorphically (so you can work with an in-memory stream object just the same way you work with a file stream object or the console; and you never have to worry about distinguishing sprintf() vs. fprintf() - it automatically does the right thing according to which object you're using and what its type is), are newer and better thought out (yet still quite mature), and most importantly, have no realistic disadvantages vs. stdio (unless you really, really like that formatting syntax; in which case, you can get it back - in a way that doesn't cost you type-safety the way the C versions do - with boost::format).
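
A tiny sketch of the "works polymorphically" point, with made-up function names (`countFloats`, `countFloatsInText`): the reading code is written once against std::istream and does not care what kind of stream it is handed.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Works on any input stream: a file, the console, or an in-memory string.
// Reads whitespace-separated floats until the first thing that isn't one.
int countFloats(std::istream& in) {
    int count = 0;
    float value;
    while (in >> value) {
        ++count;
    }
    return count;
}

// Here the source is an in-memory stream; a std::ifstream or std::cin
// would be passed to countFloats in exactly the same way.
int countFloatsInText(const std::string& text) {
    std::istringstream in(text);
    return countFloats(in);
}
```

This is also handy for testing a parser: feed it a string instead of creating a file on disk.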

Quote:
2.- How does ifstream work? Doesn't it have something like rewind(fileptr) from the FILE* "class"?


"How it works" is of course quite complicated, just as how FILE* works. Of course, FILE* isn't a real class as you seem to be aware ;) and rewind() isn't "in" the FILE struct; it's a free function that accepts a pointer to a FILE in order to get at the data in that struct (because that's how they handled these things in C in order to appear "object-oriented").

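Anyway, ifstream does have 'seek' and 'tell' member functions (seekg() and tellg()), and you can seek to the beginning to start over. Note that a stream which has hit end-of-file stays in a failed state and ignores further reads until you call clear() on it, which is most likely why your second pass read nothing. BUT, thinking that you need to do this usually means you are doing something wrong (this is a design issue, not a language issue).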
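
For completeness, rewinding might be sketched like this. The helper names `countLines` and `rewindStream` are invented for the example, and an in-memory stream is used so the sketch needs no file; the same two calls apply to an ifstream.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Reads the stream to the end and reports how many lines it held.
int countLines(std::istream& in) {
    int lines = 0;
    std::string line;
    while (std::getline(in, line)) {
        ++lines;
    }
    return lines;
}

// Rough equivalent of C's rewind(): reset the error flags, then seek.
void rewindStream(std::istream& in) {
    in.clear();                  // drop eofbit/failbit left by the last read
    in.seekg(0, std::ios::beg);  // move the read position back to the start
}
```

Without the clear() call, the seek and every read after it are silently ignored, because the stream is still flagged as failed.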

In your case, what you want to do is just use a std::vector, in order to avoid the need for counting-and-allocation. This container automatically resizes itself to accept additional contents.

Let's say we have a Triangle struct made of three Vertex objects, each of which is a struct of three floats. Then, we simply make a vector of Triangles, and push_back() each Triangle as we create it from the stream. To simplify the syntax, and also provide good factoring for the code, we can overload the operator>> to read in a Vertex or a Triangle (another thing that doesn't work with stdio at all :) ) - then we'll have those functions if we ever need them for something else.


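#include <fstream>
#include <iostream>
#include <sstream>
#include <string>
#include <vector>
using namespace std;

struct Vertex {
    float x, y, z;
};

istream& operator>>(istream& is, Vertex& v) {
    return is >> v.x >> v.y >> v.z;
}

struct Triangle {
    Vertex a, b, c;
};

istream& operator>>(istream& is, Triangle& t) {
    return is >> t.a >> t.b >> t.c;
}

// Wrapped in a function so that the snippet compiles on its own.
vector<Triangle> readTriangles(const char* filename) {
    vector<Triangle> triangles;
    ifstream in(filename);

    if (!in) {
        // handle error
    }

    string line;
    Triangle tri;
    while (getline(in, line)) {
        // A named stream, so the overloaded operator>> also binds before C++11.
        istringstream fields(line);
        if (fields >> tri) {
            triangles.push_back(tri);
        }
    }

    cerr << "I read in " << triangles.size() << " triangles.\n";
    return triangles;
}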

Well, thank you all for the recommendations, I'll see what I can do to change my code. Also, Zahlman, I really liked the last code you posted for reading the file.

But now I have one more question:
Quote:
Original post by jyk
First you'll have to do a global replace on the name 'pVertices' (which by the way is a good example of why you shouldn't use Hungarian notation or its derivatives).

Why shouldn't I use Hungarian notation or its derivatives?


Quote:
Original post by BlackWind
Well, thank you all for the recommendations, I'll see what I can do to change my code. Also, Zahlman, I really liked the last code you posted for reading the file.

But now I have one more question:
Quote:
Original post by jyk
First you'll have to do a global replace on the name 'pVertices' (which by the way is a good example of why you shouldn't use Hungarian notation or its derivatives).

Why shouldn't I use Hungarian notation or its derivatives?


Well, one reason is that it leads to the need - as jyk is pointing out - for global replaces on names: when you change the type, you need to change the name, because otherwise the name says something about the type which isn't true.

Another reason is that it tries to do the compiler's work on behalf of the compiler. If you try to use a pointer as if it weren't a pointer, or a non-pointer as if it were a pointer, the compiler will catch that and complain. (You might need to turn up your warning level to catch all the kinds of mistakes that you find yourself commonly making.)

I could go on, but basically: it's ugly, it's harder to read, it doesn't really add information that isn't already there (because it's easy to look up the type anyway, and if you're constantly looking up types to figure out what you're doing then there's something wrong with your design), it adds extra maintenance work (never good), and it adds a potential source of *dis*information (when you forget the maintenance work, and it *will* happen eventually).

Quote:
Original post by BlackWind
OK, so...
how should I name my variables?


Generally speaking, just apply common sense and you'll be fine. Try to make sure the name explains the purpose of the variable, and put more effort (which usually means longer names) into naming variables that are more important. The reason we use names like 'i' for loop counters is that they're not important at all: the loop isn't about incrementing 'i' a million times, but about doing something to each of a million elements of some container, for example.

A variable name (or any symbol in general, really) should describe as much information about the variable as possible which you cannot also find someplace else - this means no type information, specifically, since you can find the variable's type very easily in any good IDE.

To put it another way, name a variable after why it exists instead of what kind of data it is.
