serratemplar

ifstream, textfile, floats...there must be a better way


I'm trying to load vertices that are stored line by line in a text file (extracted from a .ply). Each line looks something like this: 0.787083 0.762639 0.000000. I admit I'm a huge n00b when it comes to file I/O, but here was my best idea:

string tmp, token;
char buffer[160];
memset(buffer, NULL, sizeof(char) * 160);

f.getline(buffer, 160); tmp = buffer; tmp.append("X");

string::size_type a = tmp.find(" ", 0);
token.assign(tmp, 0, a-1);
float x = (float)atof(token.c_str());

string::size_type b = tmp.find(" ", a+1);
token.assign(tmp, a+1, b-1);
float y = (float)atof(token.c_str());

string::size_type c = tmp.find("X", b+1);
token.assign(tmp, b+1, c-1);
float z = (float)atof(token.c_str());



So far this has resulted in lines like this: 0.78708 0.762639 0.00000 0.000000, with one of the numbers duplicated and some accuracy loss. The accuracy loss is acceptable (the 7th decimal place I can live without, and I bet OpenGL can too), but my inability to parse it is remarkably frustrating. There has got to be an easier way, but so far it has eluded me. If anyone has any ideas, I would really appreciate the input. This seems so simple. =( Thanks in advance.

EDIT: Just noticed I copied over my memset call wrong (because I typed it by hand up there); added in the *160 that's there in my actual code.

Ok, so turns out I misread the specifications on std::string::assign (the last argument is the length and not the next index)...however I'm still curious to know if there is a cleaner way of doing this?
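
For reference, here is a minimal sketch of the find-based approach with the length mistake corrected. It uses substr(), which, like the three-argument assign(), takes a start position and a length rather than an end index. The helper name split_on_spaces is made up for illustration, and it assumes tokens are separated by exactly one space.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not from the original post): splits a line on single
// spaces using find() and substr().
std::vector<std::string> split_on_spaces(const std::string& line) {
    std::vector<std::string> tokens;
    std::string::size_type start = 0;
    while (true) {
        std::string::size_type end = line.find(' ', start);
        if (end == std::string::npos) {
            // No more spaces: the rest of the string is the last token,
            // so no sentinel character needs to be appended.
            tokens.push_back(line.substr(start));
            break;
        }
        // Second argument is a LENGTH (end - start), not an end index.
        tokens.push_back(line.substr(start, end - start));
        start = end + 1;
    }
    return tokens;
}
```

Each token could then be converted to a float; the replies in this thread cover cleaner ways to do the whole job with streams.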

Looks like ApochPiQ already answered your question, but I wrote up a little example, so I'll go ahead and post it:


#include <boost/algorithm/string.hpp>
#include <boost/lexical_cast.hpp>
#include <fstream>
#include <iostream>
#include <sstream>
#include <string>
#include <vector>

using std::cout;
using std::endl;

struct vector { float x, y, z; };

std::ostream& operator<<(std::ostream& os, const vector& v) {
    return os << "(" << v.x << "," << v.y << "," << v.z << ")";
}

std::istream& operator>>(std::istream& is, vector& v) {
    return is >> v.x >> v.y >> v.z;
}

// NOTE: To keep the example concise, I'm not doing any error checking. In
// practice you'll of course want to check that e.g. the file was opened
// successfully.

int main()
{
    // In our first pass we'll ignore the issue of lines entirely, and just
    // stream the vertex data directly from the file in sequence:
    {
        std::ifstream file("test.txt");
        while (!file.eof()) {
            float x, y, z;
            file >> x >> y >> z;
            cout << "(" << x << "," << y << "," << z << ")" << endl;
        }
        cout << endl;
    }

    // This becomes cleaner if we use a proper vector class or struct,
    // along with the appropriate overloaded operators (defined above):
    {
        std::ifstream file("test.txt");
        while (!file.eof()) {
            vector v;
            file >> v;
            cout << v << endl;
        }
        cout << endl;
    }

    // For reading the file a line at a time, you'll want to use the
    // global function getline(), which accepts a std::string (there's no
    // need to use raw character buffers):
    {
        std::ifstream file("test.txt");
        while (!file.eof()) {
            std::string line;
            std::getline(file, line);

            vector v;

            // Now what to do with the line? We could throw it into a
            // stringstream object and perform streaming just as we did
            // before:
            std::istringstream s;
            s.str(line);
            s >> v;
            cout << v << endl;

            // We could also split up the line into tokens, like this:
            std::vector<std::string> tokens;
            boost::split(tokens, line, boost::is_space());

            // And then grab an element at a time using lexical_cast<>():
            v.x = boost::lexical_cast<float>(tokens.at(0));
            v.y = boost::lexical_cast<float>(tokens.at(1));
            v.z = boost::lexical_cast<float>(tokens.at(2));
            cout << v << endl;

            // The above methods are somewhat roundabout given the
            // simplicity of your file format, but are simply intended to
            // introduce you to some of the tools you have available.
        }
        cout << endl;
    }

    // There are other options as well, such as using Boost.Tokenizer or
    // algorithms and/or iterators from the standard library.

    // The choice of method depends largely on your file format. Generally
    // speaking, I'd recommend sticking with the simplest method that will
    // suffice for the file format in question.
}


The above example demonstrates a few different ways you might read in vertex data from a text file, and introduces a few of the different tools available in the SC++L and Boost libraries for streaming, parsing, tokenizing, and so on.

The example has been compiled and tested. However, note that I'm not a C++ guru, so there may be stylistic errors or other oversights. Also note that it's not intended to be comprehensive, but rather just to introduce a few concepts and provide illustrative examples.

In C++, ifstreams are istreams, which means they provide the same kind of interface that std::cin does. That is, you can read stuff with operator>> just fine.
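
To illustrate that point, here is a small sketch of a function written against std::istream&. The name read_vertex is made up for this example; the idea is only that the same function would accept a std::ifstream, a std::istringstream, or std::cin, since all of them are istreams.

```cpp
#include <istream>
#include <sstream>

// Hypothetical helper: reads three floats from any input stream.
// Returns true only if all three reads succeeded.
bool read_vertex(std::istream& in, float& x, float& y, float& z) {
    in >> x >> y >> z;
    return !in.fail();
}
```

Calling it with an open std::ifstream or with a std::istringstream wrapped around one line of text should behave the same way.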

Reading a line at a time and then re-parsing the line is a good idea, actually (it lets you parse things more robustly: if there's an error on one line of the input, you can just throw it away, and the underlying stream doesn't need to be "reset"). But there are two things to be aware of here:

1) Please don't ever use the .getline() member function of the stream again. It's more or less a mistake (or for use when you *need* to work with specific character length limits because of the file format, or when there is some really bizarre optimization concern at work, etc.; for normal applications, these things just don't happen). Instead, use the free function std::getline(). It reads directly into a std::string object, will read a line of any length, and automatically grows the string to accommodate the incoming data.

1a) Don't use memset() in C++. Use std::fill instead. Also, NULL isn't a character; it's supposed to refer to a pointer value, and the fact that it works here is coincidental. The proper spelling for a null byte is either 0 or '\0'. (NULL is a macro that in C++ is normally defined as 0, because defining it as (void*)0 doesn't play very nicely with C++'s type system. Many C++ experts advocate just writing 0 for pointers anyway; at least until C++0x, which is likely to pick up some sort of 'nullptr' construct, likely a keyword.)

1b) You didn't need to clear out the buffer anyway, because the .getline() member function (which we're not going to use anyway; but it's the principle of the thing) would have null-terminated the read-in text, and the values of data beyond the null terminator are inconsequential.

2) We can re-parse the line by wrapping it in yet another kind of input stream: an istringstream. This is another object providing the same basic interface, but using a std::string as its data source (instead of a file or the console).

2a) Even doing things the other way, you didn't really need to append the 'X' to the string and then look for it. You can just use the end of the string as a parameter. Also, .assign() is rarely called for; std::string provides an operator=, and you can create a temporary string with the same parameters that .assign takes. Or, you could have used the free function std::find(), which returns an iterator, and used the two-iterator constructor for std::strings (e.g. token = string(tmp.begin(), find(tmp.begin(), tmp.end(), ' ')), if I'm thinking clearly). But we're not going to do that, because the stream approach is much friendlier, and because

2b) don't use atof(), atoi() etc. unless you really know what you're doing and want their side-effect behaviour, which is that they return 0 when the data cannot be interpreted as the appropriate type. (This gives you no way to determine whether the input was garbage, or was valid and actually corresponded to 0.) And even then, you can wrap the stream approach easily enough to get that behaviour... get modern! :)


std::string tmp;

std::getline(f, tmp);

float x, y, z;

if (!(std::istringstream(tmp) >> x >> y >> z)) {
    // one of the reads failed.
} else {
    // x, y, and z now contain the appropriate values.
}


Yep, it's really that easy. Notice that we can use a temporary stringstream object for this, even. Stringstreams live in the header <sstream>.
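
Putting the pieces together for a whole file, a rough sketch might look like the following. The Vertex struct and the read_vertices name are assumptions made for this example, and silently skipping bad lines is just one possible policy. Unlike atof(), a failed parse here is distinguishable from a line that really contains zeros.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

struct Vertex { float x, y, z; };  // hypothetical type for illustration

// Reads one vertex per line from `in`, skipping lines that do not parse.
std::vector<Vertex> read_vertices(std::istream& in) {
    std::vector<Vertex> result;
    std::string line;
    while (std::getline(in, line)) {   // the read itself is the loop condition
        std::istringstream parser(line);
        Vertex v;
        if (parser >> v.x >> v.y >> v.z) {
            result.push_back(v);
        }
        // Otherwise the line is dropped. Only the per-line stream went
        // bad, so `in` can carry on with the next line.
    }
    return result;
}
```

Passing an open std::ifstream as `in` should read the vertex file directly; a std::istringstream works as well, which is handy for testing.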

(And no, there's nothing you can do about the precision loss - except switching to doubles, and that still doesn't address it properly - native floating-point types just don't work that way; you can't express .787083 exactly in a float or a double for the same reason you can't express "one third" exactly in decimal.)
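
A small sketch to make the precision point concrete: the helper below (its name is made up for this example) parses the same text once as a float and once as a double, and reports the gap between the two stored values. For text like "0.787083" the gap should be tiny but not something you can eliminate; for a value like "0.5", which is a power of two, both types store it exactly.

```cpp
#include <cmath>
#include <sstream>
#include <string>

// Hypothetical helper: parses `text` as a float and as a double, and
// returns the absolute difference between the two stored values.
double float_rounding_error(const std::string& text) {
    float f = 0.0f;
    double d = 0.0;
    std::istringstream(text) >> f;
    std::istringstream(text) >> d;
    return std::fabs(static_cast<double>(f) - d);
}
```

Note that the double is only a closer approximation, not an exact reference, so this measures the float's error relative to the double rather than relative to the decimal text.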

Quote:
Original post by jyk

std::ifstream file("test.txt");
while (!file.eof()) {



Don't iterate over files like this. (It catches the end-of-file one iteration too late; .eof() only triggers after a read has already failed due to the EOF condition.) Instead, use the read as the condition for the while loop; the read operations actually do their work as a side effect, and return the original stream object (by reference - so that chaining will work). In a boolean context, a stream evaluates to true exactly when .fail() is false (which is not quite the same as .good(), since .good() is also false once the end of the file has been reached), so this has the desired effect - attempt the read; if it succeeded, perform another iteration with the just-read-in data.
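
As a minimal sketch of that pattern (the function name count_vertices is invented for this example), the loop below uses the chained read as its condition, so the body only runs when all three values were actually read:

```cpp
#include <istream>
#include <sstream>

// Hypothetical helper: counts complete "x y z" triples in a stream,
// using the read itself as the loop condition.
int count_vertices(std::istream& in) {
    int count = 0;
    float x, y, z;
    while (in >> x >> y >> z) {  // body runs only if all three reads succeeded
        ++count;
    }
    return count;
}
```

With two lines of input followed by a trailing newline, this should count two vertices, whereas a while (!file.eof()) loop would typically run its body a third time with values that were never read.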

Another explanation here.

