alx119

.obj file text parsing


Hi,

I'm trying to load an .obj model into OpenGL and I need to parse the file. The file looks like this:
v 0.000000 3.897879 -1.378193
v -0.309017 3.856391 -1.352228

vt 0.314607 0.256292
vt 0.226357 0.365001

vn -0.156435 0.837235 -0.523990
vn -0.309017 0.806183 -0.504556

f 13/3/45 14/6/46 24/29/47
f 24/29/47 14/6/46 25/31/48

For the v / vt / vn lines, my source code so far is:

    std::vector<glm::vec3> vertex;
    std::vector<glm::vec2> texture;
    std::vector<glm::vec3> normals;

    std::string lineStr;

    // Test getline itself so the last line is not processed twice at end of file.
    while (std::getline(inf, lineStr))
    {
        //std::cout << "LINESTR: " << lineStr << std::endl;

        // compare() is safe on empty or one-character lines.
        if (lineStr.compare(0, 2, "v ") == 0)
        {
            std::stringstream os;
            std::string unused;
            double posx, posy, posz;
            os << lineStr;
            os >> unused >> posx >> posy >> posz;
            std::cout << " Pos X: " << posx << " Pos Y: " << posy << " Pos Z: " << posz << std::endl;

            vertex.push_back(glm::vec3(posx, posy, posz));
        }

        // The same source code is for vt and vn
    }

It seems to work, but I'm having trouble parsing the faces and I don't know exactly how to do it. I could use a lot of "if" statements, but I'm not sure that would be a good idea.
Any advice? :P

Thank you!

Edited by alx119


I used a set of switch statements in my obj parser, like this:

//based on graphics API (GL/VK) defined GHandle
GHandle Asset::ObjRead(IDataReader& stream)
{
   ...
   while(!stream.Eof())
   {
       String::Skip(stream, " \t\r\n");
       switch(stream.Peek())
       {
           case '#': //skip comment
               break;
           case 'v':
           {
               stream.Get();
               switch(stream.Peek())
               {
                   case 't': //process texcoord
                       break;
                   case 'n': //process normal
                       break;
                   default: //process vertex
                       break;
               }
           }
           break;
           case 'f': //process face group
               break;
           case 'g': //process group
               break;
           case 'o': //process object
               break;
           case 'u':
           {
               ...
               if(bufferLength == 6 && memcmp(buffer, "usemtl", 6) == 0) { /* process material use */ }
           }
           break;
           case 'm':
           {
               ...
               if(bufferLength == 6 && memcmp(buffer, "mtllib", 6) == 0) { /* process material include */ }
           }
           break;
       }
   }
   ...

   return vboHandle;
}

My engine relies heavily on streaming rather than on loading the text into arrays, so the parser is set up to be stream-optimized. It reads as few strings as possible, except where they are needed (for the material tags and group/object names, for example), and it uses a strict string comparison that bails out as early as possible.
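
For readers without a custom stream class, the same dispatch idea can be sketched with the standard library. This is not the IDataReader API from the post: ObjData and readObj are hypothetical names, std::array stands in for a vector math type, and face lines are only stored as raw text here.

```cpp
#include <array>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical container for this sketch; not the GHandle/VBO from the post.
struct ObjData {
    std::vector<std::array<float, 3>> positions;  // "v"
    std::vector<std::array<float, 2>> texcoords;  // "vt"
    std::vector<std::array<float, 3>> normals;    // "vn"
    std::vector<std::string> face_lines;          // text after "f", parsed elsewhere
    std::string material_library;                 // "mtllib"
};

// Reads line by line and dispatches on the first character of the keyword.
ObjData readObj(std::istream& stream) {
    ObjData data;
    std::string line;
    while (std::getline(stream, line)) {
        std::istringstream in(line);
        std::string keyword;
        if (!(in >> keyword)) continue;  // blank line
        switch (keyword[0]) {
            case '#':  // comment
                break;
            case 'v': {
                std::array<float, 3> a{};
                if (keyword == "v" && (in >> a[0] >> a[1] >> a[2])) {
                    data.positions.push_back(a);
                } else if (keyword == "vn" && (in >> a[0] >> a[1] >> a[2])) {
                    data.normals.push_back(a);
                } else if (keyword == "vt" && (in >> a[0] >> a[1])) {
                    std::array<float, 2> uv{{a[0], a[1]}};
                    data.texcoords.push_back(uv);
                }
                break;
            }
            case 'f':
                if (keyword == "f") {
                    std::string rest;
                    std::getline(in, rest);
                    data.face_lines.push_back(rest);
                }
                break;
            case 'm':
                if (keyword == "mtllib") in >> data.material_library;
                break;
            default:  // g, o, usemtl and anything else are ignored in this sketch
                break;
        }
    }
    return data;
}
```

Malformed v / vt / vn lines are silently skipped in this sketch; a real loader would probably want to report them.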


I would personally read 'word' by 'word', letting spaces and newlines be the separators.

For the faces, I would simply read an integer, a char, an integer, a char and then an integer again for each vertex of the face, with something like this:

 

	int v, vt, vn; // position, texcoord and normal index of one face vertex
	char dump; // for the slash
	ifs >> v >> dump >> vt >> dump >> vn;
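
A slightly fuller sketch of the same idea, applied to a whole triangle line, might look like the following. FaceVertex and parseTriangle are made-up names, and the function only accepts the exact form "f v/vt/vn v/vt/vn v/vt/vn" from the sample file.

```cpp
#include <array>
#include <sstream>
#include <string>

// One corner of a face: 1-based indices into the v / vt / vn lists.
struct FaceVertex {
    int v = 0, vt = 0, vn = 0;
};

// Parses a triangle line of the exact form "f v/vt/vn v/vt/vn v/vt/vn".
// Returns false if the line does not match that form.
bool parseTriangle(const std::string& line, std::array<FaceVertex, 3>& out) {
    std::istringstream in(line);
    std::string keyword;
    if (!(in >> keyword) || keyword != "f") return false;
    for (FaceVertex& fv : out) {
        char slash1 = 0, slash2 = 0;
        // integer, char, integer, char, integer
        if (!(in >> fv.v >> slash1 >> fv.vt >> slash2 >> fv.vn)) return false;
        if (slash1 != '/' || slash2 != '/') return false;
    }
    return true;
}
```

Checking that the two chars really are slashes costs little and rejects lines such as "f 1 2 3", which would otherwise be misread.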
	

Edited by _Silence_


I had already done something by then. I didn't think of using a char for the slash; that would have been easier. :P Instead, I replaced the slashes in the line with spaces.
Here is my code. :P

 

if (lineStr[0] == 'f')
{
	std::cout << "LINE STR FACE: " << lineStr << std::endl;

	// Replace every slash with a space so operator>> can split the indices.
	for (std::size_t i = 0; i < lineStr.length(); i++)
	{
		if (lineStr[i] == '/')
		{
			lineStr[i] = ' ';
		}
	}

	std::stringstream os;
	std::string unused;

	os << lineStr;

	unsigned int v1, t1, n1, v2, t2, n2, v3, t3, n3;

	os >> unused >> v1 >> t1 >> n1 >> v2 >> t2 >> n2 >> v3 >> t3 >> n3;

	//...
}

 


That certainly works. However, it's not the most efficient way to solve this, since you essentially parse the string twice. You could use a combination of peek(), ignore() and operator>> to do this better. You also have to consider the different combinations of indices. There are four ways a face vertex can be given:

  1. f v v v ...
  2. f v/vt v/vt v/vt ...
  3. f v//vn v//vn v//vn ...
  4. f v/vt/vn v/vt/vn v/vt/vn ...

You can assume that a vertex position is always given and then check with peek() whether the next character is a slash. This way you can distinguish between the different cases. If it's a slash, just use ignore() to skip it. Also, if you want to support even more possibilities, you have to consider negative indices, which count backwards from the most recently defined element. For more in-depth information about the format, you can read the relevant parts of http://paulbourke.net/dataformats/obj/ and http://paulbourke.net/dataformats/mtl/
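
The peek()/ignore() approach described above could be sketched roughly as follows. FaceVertex, readFaceVertex, parseFace and resolveIndex are hypothetical names, and an index of 0 is used here to mean "not given", which is safe because OBJ indices are never 0.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper type: an index of 0 means "not given".
struct FaceVertex {
    int v = 0, vt = 0, vn = 0;
};

// Reads one face vertex in any of the forms v, v/vt, v//vn, v/vt/vn.
bool readFaceVertex(std::istream& in, FaceVertex& out) {
    out = FaceVertex{};
    if (!(in >> out.v)) return false;        // the position index is mandatory
    if (in.peek() != '/') return true;       // form 1: v
    in.ignore();                             // skip the first slash
    if (in.peek() != '/') {
        if (!(in >> out.vt)) return false;   // forms 2 and 4 carry a texcoord
        if (in.peek() != '/') return true;   // form 2: v/vt
    }
    in.ignore();                             // skip the second slash
    return static_cast<bool>(in >> out.vn);  // forms 3 and 4 carry a normal
}

// Parses an "f ..." line with any number of vertices. Parsing stops at the
// first token that is not a valid face vertex; a non-face line yields an
// empty vector.
std::vector<FaceVertex> parseFace(const std::string& line) {
    std::istringstream in(line);
    std::string keyword;
    std::vector<FaceVertex> face;
    if (!(in >> keyword) || keyword != "f") return face;
    FaceVertex fv;
    while (readFaceVertex(in, fv)) face.push_back(fv);
    return face;
}

// Converts a 1-based or negative (relative) OBJ index into a 0-based index.
// `count` is the number of elements of that kind read so far.
// Returns -1 if the index is absent (0) or out of range.
int resolveIndex(int index, int count) {
    if (index > 0 && index <= count) return index - 1;
    if (index < 0 && -index <= count) return count + index;
    return -1;
}
```

Because the line is consumed in a single pass, nothing has to be rewritten before parsing, and v//vn stays distinguishable from v/vt.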

Edited by Batzer


Here is what I did to parse a single vertex:
   

 template < typename VertexT >
 const XMUINT3 OBJReader< VertexT >::ReadOBJVertexIndices() {
     
     const char *token = ReadChars();

     U32 vertex_index = 0;
     U32 texture_index = 0;
     U32 normal_index = 0;

     if (str_contains(token, "//")) {
       // v1//vn1
       const char *index_end = strchr(token, '/');
       if (StringToU32(token, index_end, vertex_index) == TokenResult::Invalid) {
         throw FormattedException(
           "%ls: line %u: invalid vertex index value found in %s.", 
           GetFilename().c_str(), GetCurrentLineNumber(), token);
       }
       if (StringToU32(index_end + 2, normal_index) == TokenResult::Invalid) {
         throw FormattedException(
           "%ls: line %u: invalid normal index value found in %s.", 
           GetFilename().c_str(), GetCurrentLineNumber(), token);
       }
     }
     else if (str_contains(token, '/')) {
       // v1/vt1 or v1/vt1/vn1
       const char *index_end = strchr(token, '/');
       if (StringToU32(token, index_end, vertex_index) == TokenResult::Invalid) {
         throw FormattedException(
           "%ls: line %u: invalid vertex index value found in %s.", 
           GetFilename().c_str(), GetCurrentLineNumber(), token);
       }

       if (str_contains(index_end + 1, '/')) {
         const char *texture_end = strchr(index_end + 1, '/');
         if (StringToU32(index_end + 1, texture_end, texture_index) == TokenResult::Invalid) {
           throw FormattedException(
             "%ls: line %u: invalid texture index value found in %s.", 
             GetFilename().c_str(), GetCurrentLineNumber(), token);
         }
         if (StringToU32(texture_end + 1, normal_index) == TokenResult::Invalid) {
           throw FormattedException(
             "%ls: line %u: invalid normal index value found in %s.", 
             GetFilename().c_str(), GetCurrentLineNumber(), token);
         }
       }
       else if (StringToU32(index_end + 1, texture_index) == TokenResult::Invalid) {
         throw FormattedException(
           "%ls: line %u: invalid texture index value found in %s.", 
           GetFilename().c_str(), GetCurrentLineNumber(), token);
       }
     }
     else if (StringToU32(token, vertex_index) == TokenResult::Invalid) {
       throw FormattedException(
         "%ls: line %u: invalid vertex index value found in %s.", 
         GetFilename().c_str(), GetCurrentLineNumber(), token);
     }

     return XMUINT3(vertex_index, texture_index, normal_index);
   }

Full OBJ/MTL is somewhat difficult to scan quickly because of optional tokens (especially in MTL). What I basically did for all my ANSI file formats is provide, for every basic type, one method that reads a value and one that checks what lies beyond the read head without advancing it. That way lexing and parsing are quite easy and result in small code blocks. Unfortunately, OBJ face definitions had to be different for some reason, so that is the only method where I have such a giant code blob.
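
As a rough illustration of the read/check pair for one basic type (not the actual reader from my code base; TokenReader, HasFloat, ReadFloat and readPosition are names made up for this sketch), an optional token such as the w component of a "v" line can be handled like this:

```cpp
#include <array>
#include <cstddef>
#include <cstdlib>
#include <string>
#include <utility>

// Hypothetical reader over one line of text; all names are illustrative.
class TokenReader {
public:
    explicit TokenReader(std::string line) : line_(std::move(line)) {}

    // Checks whether a float can be read next, without advancing the read head.
    bool HasFloat() const {
        std::size_t pos = pos_;
        float ignored = 0.0f;
        return Scan(line_, pos, ignored);
    }

    // Reads the next float and advances the read head; on failure nothing moves.
    bool ReadFloat(float& value) { return Scan(line_, pos_, value); }

private:
    // strtof skips leading whitespace and reports where it stopped parsing.
    static bool Scan(const std::string& text, std::size_t& pos, float& value) {
        const char* begin = text.c_str() + pos;
        char* end = nullptr;
        const float parsed = std::strtof(begin, &end);
        if (end == begin) return false;  // nothing was consumed
        value = parsed;
        pos += static_cast<std::size_t>(end - begin);
        return true;
    }

    std::string line_;
    std::size_t pos_ = 0;
};

// Reads the payload of a "v" line: x y z plus an optional w that defaults to 1.
bool readPosition(TokenReader& reader, std::array<float, 4>& xyzw) {
    for (std::size_t i = 0; i < 3; ++i) {
        if (!reader.ReadFloat(xyzw[i])) return false;
    }
    xyzw[3] = 1.0f;
    if (reader.HasFloat()) reader.ReadFloat(xyzw[3]);
    return true;
}
```

The check method is what keeps the optional token cheap to handle: the caller decides what to do before anything is consumed.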

Edited by matt77hias

2 hours ago, _Silence_ said:

I would personally read 'word' by 'word', letting spaces and newlines be the separators.

In most of these ANSI file formats you can use " \t\n\r" as your string of delimiter characters. (Note the tab character which you didn't mention but is quite common.) Load the file in memory, parse line by line (e.g. C's fgets) and tokenize the line (e.g. C's strtok_s).
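
A minimal sketch of that tokenizing step in C++, using std::string searches instead of strtok_s (so the input line is not modified), could look like this. The function name tokenize is just a placeholder.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Splits `line` on any of the characters in `delims`, skipping empty tokens.
std::vector<std::string> tokenize(const std::string& line,
                                  const std::string& delims = " \t\n\r") {
    std::vector<std::string> tokens;
    std::size_t begin = line.find_first_not_of(delims);
    while (begin != std::string::npos) {
        // `end` is npos for the last token; substr then takes the remainder.
        const std::size_t end = line.find_first_of(delims, begin);
        tokens.push_back(line.substr(begin, end - begin));
        begin = line.find_first_not_of(delims, end);
    }
    return tokens;
}
```

For a line such as "v\t0.5  1.5 -2.0\r\n" this yields the four tokens v, 0.5, 1.5 and -2.0, regardless of tabs, repeated spaces or a trailing carriage return.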
