csisy

Speed up x parser


Hi

I've written an .x file parser which reads the data into a hierarchy. For each mesh I read positions, position indices, normals, normal indices and texcoords. I then have to "mix" this data to create an array of my vertex structure.
That is the slowest pass of my parser. I downloaded the Crytek-boosted Sponza model (which is ~300k polys), worked on it a little and exported it to a text .x file (~20 MB), and the "mesh-creating" pass was impossibly slow... I am wondering if anyone can help me. Here is the source:



int positionIndex = 0;
Vector3 normal = Vector3::Zero;

int size = m_PositionIndices.size();
for (int i = 0; i != size; i++)
{
    positionIndex = m_PositionIndices[i];
    normal = m_Normals[m_NormalIndices[i]];

    //position, texcoord, normal, tangent, binormal
    VertexNTB vertex = VertexNTB(m_Positions[positionIndex], m_Texcoords[positionIndex], normal, Vector3::Zero, Vector3::Zero);

    //does the vertex already exist? (linear search over every vertex created so far)
    bool exist = false;
    int vertSize = m_Vertices.size();
    for (int k = 0; k != vertSize; k++)
    {
        //vertex exists: reuse its index
        if (m_Vertices[k] == vertex)
        {
            exist = true;
            m_VertexIndices.push_back(k);
            break;
        }
    }

    //if the vertex exists
    if (exist)
    {
        //move to the next index
        continue;
    }

    //create the vertex and add a new index
    m_Vertices.push_back(vertex);
    m_VertexIndices.push_back(m_Vertices.size()-1);
}



Thanks for your help, and sorry for my bad English.

I'm not certain what your call to VertexNTB does, but if it involves any dynamic allocations or other heavy work then it's going to slow you down.

Your most likely trouble spot however is those push_back calls. They will definitely involve quite a lot of dynamic allocation, freeing and moving memory around, and are not suitable for use in inner loops like this. You know up-front that you're always going to have 'size' indices so you should allocate the vector for that before even entering the first loop and then just fill it in using array indexing. For vertices you also know that your number of vertices will never exceed 'size', so likewise allocate that up-front and fill it in using array indexing, then copy it off to a new vector when done.
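
A minimal sketch of the up-front allocation described above. The `Vertex` struct, `IndexedMesh` and `expandCorners` are hypothetical stand-ins, not the poster's types, and deduplication is deliberately left out so that only the allocation pattern is shown: `resize` for the index vector whose final length is known, `reserve` for the vertex vector whose length is only bounded.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for VertexNTB, for illustration only.
struct Vertex {
    float px, py, pz;
    float nx, ny, nz;
};

struct IndexedMesh {
    std::vector<Vertex> vertices;
    std::vector<int> indices;
};

// 'corners' holds one already-assembled vertex per index-buffer entry.
IndexedMesh expandCorners(const std::vector<Vertex>& corners) {
    IndexedMesh mesh;
    const std::size_t size = corners.size();

    mesh.indices.resize(size);    // exactly 'size' indices: fill by array indexing
    mesh.vertices.reserve(size);  // at most 'size' vertices: one allocation up-front

    for (std::size_t i = 0; i != size; ++i) {
        mesh.vertices.push_back(corners[i]);  // cannot reallocate after reserve(size)
        mesh.indices[i] = static_cast<int>(mesh.vertices.size() - 1);
    }
    assert(mesh.vertices.capacity() >= size);
    return mesh;
}
```

Note that reserving only removes reallocation and copying from `push_back`; it does not change how long any search over the existing vertices takes.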

But I don't know the size of these vectors.

For example, I have a cube. One corner of the cube is defined in the x file, like this:

//position
{0, 0, 0}
//normal
{0, 1, 0}
{1, 0, 0}
{0, 0, -1}


In the .x file this is one vertex with 3 normal indices, but I have to "separate" it into 3 vertices.
So I need

//position
{0, 0, 0}
{0, 0, 0}
{0, 0, 0}
//normal

{0, 1, 0}
{1, 0, 0}
{0, 0, -1}



This is a bad example, but I want to show you that I have to create vertices and reindex the index array, so the final size isn't known in advance.

VertexNTB is a plain struct.

EDIT:
Oh, you're right, I was wrong. :) I now call resize on the m_VertexIndices vector, so I only have to call push_back for m_Vertices.
Is there any algorithm that could speed up this part?

Thanks for your help

PS: The DirectX Viewer loads this model in a few seconds.
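
One possible answer to the question above, sketched under stated assumptions. The loop in the first post compares each new vertex against every vertex created so far, so its cost grows roughly with the square of the index count. Because each vertex there is built only from a position index (which also selects the texcoord) and a normal index, a hash map keyed on that index pair can replace the linear search. `Vec3`, `Vertex`, `IndexedMesh` and `buildMesh` are hypothetical names, and the vertex is reduced to position and normal for brevity. Unlike the original value comparison, this version does not merge vertices that have equal values but different indices.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <vector>

struct Vec3 { float x, y, z; };

// Hypothetical stand-in for VertexNTB, reduced to position and normal.
struct Vertex { Vec3 position; Vec3 normal; };

struct IndexedMesh {
    std::vector<Vertex> vertices;
    std::vector<std::uint32_t> indices;
};

// Each distinct (position index, normal index) pair becomes one output vertex.
IndexedMesh buildMesh(const std::vector<Vec3>& positions,
                      const std::vector<Vec3>& normals,
                      const std::vector<std::uint32_t>& positionIndices,
                      const std::vector<std::uint32_t>& normalIndices) {
    assert(positionIndices.size() == normalIndices.size());
    const std::size_t size = positionIndices.size();

    IndexedMesh mesh;
    mesh.indices.resize(size);    // one output index per input index
    mesh.vertices.reserve(size);  // upper bound on the vertex count

    // Maps a packed index pair to the output vertex created for it.
    std::unordered_map<std::uint64_t, std::uint32_t> seen;
    seen.reserve(size);

    for (std::size_t i = 0; i != size; ++i) {
        const std::uint32_t p = positionIndices[i];
        const std::uint32_t n = normalIndices[i];
        // Pack both 32-bit indices into one 64-bit key.
        const std::uint64_t key = (static_cast<std::uint64_t>(p) << 32) | n;

        auto it = seen.find(key);
        if (it == seen.end()) {
            // First time this pair appears: create the vertex.
            const std::uint32_t newIndex =
                static_cast<std::uint32_t>(mesh.vertices.size());
            mesh.vertices.push_back(Vertex{positions[p], normals[n]});
            it = seen.emplace(key, newIndex).first;
        }
        mesh.indices[i] = it->second;
    }
    return mesh;
}
```

For the cube corner example (one position, three normals), this yields three vertices, and any later face that reuses the same pair gets the existing index back from a single average constant-time lookup.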

I'm not familiar with the X file format, but it surprises me to hear that they don't store the number of vertices. If push_back really is your bottleneck, an easy way to improve performance is to use a deque instead of a vector. Copy the contents of the deque to a vector after parsing the file. This still results in multiple allocations though, so it's not perfect. Another method is to do two passes. First determine the number of vertices, allocate the vector, then read the values.

If you really want to know where most of the time is spent, make a test program which only loads an x file and profile it. AMD CodeAnalyst is a decent free profiler:
http://developer.amd...es/default.aspx

This is the slowest part of the parser, but the whole parser is slow :(

Reading the words from the file takes about 10 secs (the file contains ~744k words). I use


std::ifstream stream(path.c_str());
std::string str;

while (stream >> str)
    m_Words.push_back(str);

stream.close();
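
A sketch of one alternative that may reduce the reading time: load the file with a single bulk read and record each word as an offset and length into that buffer, instead of extracting and copying one `std::string` per word. `Token`, `readWholeFile`, `tokenize` and `tokenText` are hypothetical names. The split is on whitespace only, the same way `stream >> str` splits in the default locale. Whether this is actually the bottleneck should be confirmed by measuring.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

// A token is the slice [begin, begin + length) of the file buffer.
struct Token {
    std::size_t begin;
    std::size_t length;
};

// Reads the whole file with one bulk read.
// Returns an empty string if the file cannot be opened or is empty.
std::string readWholeFile(const std::string& path) {
    std::ifstream stream(path.c_str(), std::ios::binary);
    if (!stream) return std::string();
    stream.seekg(0, std::ios::end);
    const std::streamoff size = stream.tellg();
    if (size <= 0) return std::string();
    stream.seekg(0, std::ios::beg);
    std::string text(static_cast<std::size_t>(size), '\0');
    stream.read(&text[0], static_cast<std::streamsize>(size));
    text.resize(static_cast<std::size_t>(stream.gcount()));  // keep only what was read
    return text;
}

// Splits on whitespace without copying any characters.
std::vector<Token> tokenize(const std::string& text) {
    std::vector<Token> tokens;
    const std::size_t n = text.size();
    std::size_t i = 0;
    while (i < n) {
        // Skip whitespace before the next word.
        while (i < n && std::isspace(static_cast<unsigned char>(text[i]))) ++i;
        const std::size_t begin = i;
        // Advance to the end of the word.
        while (i < n && !std::isspace(static_cast<unsigned char>(text[i]))) ++i;
        if (i > begin) tokens.push_back(Token{begin, i - begin});
    }
    return tokens;
}

// Materializes a token as a string only when it is actually needed.
std::string tokenText(const std::string& text, const Token& token) {
    assert(token.begin + token.length <= text.size());
    return text.substr(token.begin, token.length);
}
```

The buffer must outlive the tokens, since they only store offsets into it.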



I've looked at the DirectX Viewer's source code, but it just calls LoadMeshFromX or something like that. I have no idea how they do it faster... :)

Here is what I do:

- read all words from the file
- search for templates (they start with the word "template") and store their names (like "Frame")
- create a root node and read nodes until the end of the file is reached
- the first node's parent is the root node
- get the node's "type" and "name" (types like "Frame", "Mesh", ...)
- watch for '{' and '}' words (I have to know when the current node ends)
- iterate over the words, and if the templates contain the word (e.g. "Frame"), call LoadNode (recursively)
- otherwise add the word to the current node's lines

Now I have the nodes in a hierarchy, but it takes (including the reading) about 20-30 secs... :(

Parsing the loaded nodes takes about 15 secs, and the mesh-creating takes.... a lot. :)

Thanks for your suggestions, I'll try them.
