Speed up x parser

Started by csisy
4 comments, last by csisy 12 years, 4 months ago
Hi

I've written an x parser which reads data into a hierarchy. I read positions, position indices, normals, normal indices and texcoords per mesh. I have to "mix" this data to create an array of my vertex structure.
This is the slowest pass of my parser. I downloaded the Crytek-boosted Sponza model (which is ~300k polys), worked a little with it and exported it to a text x file (~20 MB), and the "mesh-creating" pass was impossibly slow... I am wondering if anyone can help me. Here is the source:



int positionIndex = 0;
Vector3 normal = Vector3::Zero;

int size = m_PositionIndices.size();
for (int i = 0; i != size; i++)
{
    positionIndex = m_PositionIndices[i];
    normal = m_Normals[m_NormalIndices[i]];

    // position, texcoord, normal, tangent, binormal
    VertexNTB vertex = VertexNTB(m_Positions[positionIndex], m_Texcoords[positionIndex], normal, Vector3::Zero, Vector3::Zero);

    // does this vertex already exist?
    bool exist = false;
    int vertSize = m_Vertices.size();
    for (int k = 0; k != vertSize; k++)
    {
        if (m_Vertices[k] == vertex)
        {
            // vertex exists: reuse its index
            exist = true;
            m_VertexIndices.push_back(k);
            break;
        }
    }

    // if the vertex exists, move to the next index
    if (exist)
        continue;

    // create the vertex and add a new index
    m_Vertices.push_back(vertex);
    m_VertexIndices.push_back(m_Vertices.size() - 1);
}



Thanks for your help, and sorry for my bad English.
I'm not certain what your call to VertexNTB does, but if it involves any dynamic allocations or other heavy work then it's going to slow you down.

Your most likely trouble spot however is those push_back calls. They will definitely involve quite a lot of dynamic allocation, freeing and moving memory around, and are not suitable for use in inner loops like this. You know up-front that you're always going to have 'size' indices so you should allocate the vector for that before even entering the first loop and then just fill it in using array indexing. For vertices you also know that your number of vertices will never exceed 'size', so likewise allocate that up-front and fill it in using array indexing, then copy it off to a new vector when done.
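
A minimal sketch of that pre-allocation idea, assuming the same member names as the code above ('scratch' is a hypothetical temporary):

int size = m_PositionIndices.size();
m_VertexIndices.resize(size);         // exactly one output index per input index

std::vector<VertexNTB> scratch;
scratch.reserve(size);                // vertex count can never exceed 'size',
                                      // so push_back never reallocates

for (int i = 0; i != size; i++)
{
    // ... build 'vertex' and search scratch for a duplicate as before ...
    // found at k:  m_VertexIndices[i] = k;
    // not found:   m_VertexIndices[i] = (int)scratch.size();
    //              scratch.push_back(vertex);
}

m_Vertices.swap(scratch);             // hand over the result without a copy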



I'm not certain what your call to VertexNTB does, but if it involves any dynamic allocations or other heavy work then it's going to slow you down.

Your most likely trouble spot however is those push_back calls. They will definitely involve quite a lot of dynamic allocation, freeing and moving memory around, and are not suitable for use in inner loops like this. You know up-front that you're always going to have 'size' indices so you should allocate the vector for that before even entering the first loop and then just fill it in using array indexing. For vertices you also know that your number of vertices will never exceed 'size', so likewise allocate that up-front and fill it in using array indexing, then copy it off to a new vector when done.

But I don't know the size of these vectors.

For example, I have a cube. One corner of the cube is defined in the x file, like this:

//position
{0, 0, 0}
//normal
{0, 1, 0}
{1, 0, 0}
{0, 0, -1}


In the x file this is one vertex with 3 normal indices, but I have to "separate" it into 3 vertices.
So I need:

//position
{0, 0, 0}
{0, 0, 0}
{0, 0, 0}
//normal
{0, 1, 0}
{1, 0, 0}
{0, 0, -1}



This is a bad example, but I want to show that I have to create vertices and reindex the index array, so the final size isn't known in advance.

VertexNTB is a struct.
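
For reference, a minimal sketch of what such a struct might look like; only the five fields are implied by the constructor call above, everything else (the types, the operator==) is an assumption:

struct VertexNTB
{
    Vector3 Position;
    Vector2 Texcoord;   // assumed 2D; could be Vector3 in the real code
    Vector3 Normal;
    Vector3 Tangent;
    Vector3 Binormal;

    VertexNTB(const Vector3& p, const Vector2& t, const Vector3& n,
              const Vector3& tan, const Vector3& bin)
        : Position(p), Texcoord(t), Normal(n), Tangent(tan), Binormal(bin) {}

    // operator== is what the duplicate search in the loop relies on
    bool operator==(const VertexNTB& other) const
    {
        return Position == other.Position
            && Texcoord == other.Texcoord
            && Normal   == other.Normal;   // tangent/binormal are still zero here
    }
};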

EDIT:
Oh, you're right, I was wrong. :) I now call resize on the m_VertexIndices vector, so I only have to call push_back for m_Vertices.
Is there any algorithm that could speed up this part?
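
One candidate algorithm, sketched here as an assumption rather than anything from the thread: replace the inner linear scan with a std::map from vertex to index, which turns each duplicate check into an O(log n) lookup. This assumes VertexNTB can be given a strict ordering (operator<), e.g. a field-by-field lexicographic compare:

#include <map>

std::map<VertexNTB, int> uniqueVertices;  // vertex -> index in m_Vertices

int size = m_PositionIndices.size();
m_VertexIndices.resize(size);

for (int i = 0; i != size; i++)
{
    int positionIndex = m_PositionIndices[i];
    VertexNTB vertex = VertexNTB(m_Positions[positionIndex],
                                 m_Texcoords[positionIndex],
                                 m_Normals[m_NormalIndices[i]],
                                 Vector3::Zero, Vector3::Zero);

    std::map<VertexNTB, int>::iterator it = uniqueVertices.find(vertex);
    if (it != uniqueVertices.end())
    {
        m_VertexIndices[i] = it->second;      // duplicate: reuse its index
        continue;
    }

    int newIndex = (int)m_Vertices.size();    // brand-new vertex
    m_Vertices.push_back(vertex);
    uniqueVertices[vertex] = newIndex;
    m_VertexIndices[i] = newIndex;
}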

Thanks for your help

PS: The DirectX Viewer loads this model in a few seconds.
I'm not familiar with the X file format, but it surprises me to hear that they don't store the number of vertices. If push_back really is your bottleneck, an easy way to improve performance is to use a deque instead of a vector. Copy the contents of the deque to a vector after parsing the file. This still results in multiple allocations though, so it's not perfect. Another method is to do two passes. First determine the number of vertices, allocate the vector, then read the values.
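
A minimal sketch of that deque-then-copy idea:

#include <deque>
#include <vector>

// A deque grows in fixed-size chunks, so push_back never copies the
// elements already stored, unlike a vector reallocation.
std::deque<VertexNTB> parsed;
// ... parsed.push_back(vertex) for every vertex while parsing ...

// One copy into a tightly-sized vector once the final count is known.
std::vector<VertexNTB> vertices(parsed.begin(), parsed.end());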

If you really want to know where most of the time is spent, make a test program which only loads an x file and profile it. AMD CodeAnalyst is a decent free profiler:
http://developer.amd...es/default.aspx
This is the slowest part of the parser, but the whole parser is slow :(

Reading the words from the file takes about 10 seconds (the file contains ~744k words). I use:


std::ifstream stream(path.c_str());
std::string str;

while (stream >> str)
    m_Words.push_back(str);

stream.close();
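
One possible variant of that word-reading loop, pulling the whole file into memory first so the stream is hit only once (whether this actually helps here is something only a profiler can confirm):

#include <fstream>
#include <iterator>
#include <sstream>
#include <string>

std::ifstream stream(path.c_str());
std::string contents((std::istreambuf_iterator<char>(stream)),
                     std::istreambuf_iterator<char>());

std::istringstream words(contents);
std::string str;
while (words >> str)
    m_Words.push_back(str);   // reserving m_Words up-front would also help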



I've seen the dxviewer's source code, but it just calls LoadMeshFromX or something like that. I have no idea how they do it faster... :)

Here is what I do (a sketch of the node loading follows the list):

- read all words from the file
- search for templates (they start with the word "template") and store their names (like "Frame")
- create a root node and start reading nodes until we reach the end of the file
- the first node's parent is the root node
- get the node's "type" and "name" (type like "Frame", "Mesh", ...)
- watch for '{' and '}' words (I have to know when the current node ends)
- while iterating the words, if the templates contain the word (e.g. "Frame"), call LoadNode recursively
- else add the word to the current node's lines
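
A hypothetical sketch of that recursive node loading; the thread never shows the real structures, so every name and field here (Node, LoadNode, the templates set) is an assumption for illustration only:

#include <set>
#include <string>
#include <vector>

struct Node
{
    std::string Type, Name;          // e.g. Type = "Frame", Name = "Root"
    std::vector<std::string> Lines;  // raw words belonging to this node
    std::vector<Node*> Children;
};

// Loads one node starting at words[pos] and advances pos past the
// node's closing '}'. Real .x data can also contain reference blocks
// like "{ SomeName }", which would need extra brace handling.
Node* LoadNode(const std::vector<std::string>& words,
               const std::set<std::string>& templates,
               size_t& pos)
{
    Node* node = new Node();
    node->Type = words[pos++];       // template name, e.g. "Mesh"
    if (words[pos] != "{")
        node->Name = words[pos++];   // optional node name
    pos++;                           // consume '{'

    while (pos < words.size() && words[pos] != "}")
    {
        if (templates.count(words[pos]))  // a known template opens a child
            node->Children.push_back(LoadNode(words, templates, pos));
        else
            node->Lines.push_back(words[pos++]);
    }
    pos++;                           // consume '}'
    return node;
}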


Now I have the nodes in a hierarchy, but it costs (with reading) about 20-30 seconds... :(

Parsing the loaded nodes takes about 15 seconds, and the mesh creation takes... a lot. :)

If push_back really is your bottleneck, an easy way to improve performance is to use a deque instead of a vector. Copy the contents of the deque to a vector after parsing the file.

If you really want to know where most of the time is spent, make a test program which only loads an x file and profile it. AMD CodeAnalyst is a decent free profiler:
http://developer.amd...es/default.aspx


Thanks for your suggestions, I'll try them.

This topic is closed to new replies.
