
Speed up x parser


#1 csisy  Members

Posted 14 December 2011 - 02:13 PM

Hi

I've written an .x parser which reads the data into a hierarchy. I read positions, position indices, normals, normal indices and texcoords per mesh. I have to "mix" this data to create an array of my vertex structure.
This is the slowest pass of my parser. I downloaded the Crytek version of the Sponza model (which is ~300k polys), worked with it a little and exported it to a text .x file (~20 MB), and the "mesh-creating" pass was impossibly slow... I am wondering if anyone can help me. Here is the source:


int positionIndex = 0;
Vector3 normal = Vector3::Zero;

int size = m_PositionIndices.size();
for (int i = 0; i != size; i++)
{
    positionIndex = m_PositionIndices[i];
    normal = m_Normals[m_NormalIndices[i]];

    // position, texcoord, normal, tangent, binormal
    VertexNTB vertex = VertexNTB(m_Positions[positionIndex], m_Texcoords[positionIndex], normal, Vector3::Zero, Vector3::Zero);

    // does the vertex already exist?
    bool exist = false;
    int vertSize = m_Vertices.size();
    for (int k = 0; k != vertSize; k++)
    {
        if (m_Vertices[k] == vertex)
        {
            exist = true;
            m_VertexIndices.push_back(k);
            break;
        }
    }

    // if the vertex exists, move to the next index
    if (exist)
        continue;

    // create the vertex and add a new index
    m_Vertices.push_back(vertex);
    m_VertexIndices.push_back(m_Vertices.size() - 1);
}

#2 mhagain  Members

Posted 14 December 2011 - 04:58 PM

I'm not certain what your call to VertexNTB does, but if it involves any dynamic allocations or other heavy work then it's going to slow you down.

Your most likely trouble spot however is those push_back calls. They will definitely involve quite a lot of dynamic allocation, freeing and moving memory around, and are not suitable for use in inner loops like this. You know up-front that you're always going to have 'size' indices so you should allocate the vector for that before even entering the first loop and then just fill it in using array indexing. For vertices you also know that your number of vertices will never exceed 'size', so likewise allocate that up-front and fill it in using array indexing, then copy it off to a new vector when done.
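A minimal sketch of that pre-allocation idea (the struct, function name and sizes here are hypothetical stand-ins, not the original code):

```cpp
#include <vector>

struct Vertex { float x, y, z; }; // hypothetical stand-in for VertexNTB

// Pre-allocate both output arrays once 'size' is known, then fill them
// without any per-element reallocation.
std::vector<int> buildVertexData(std::size_t size, std::vector<Vertex>& vertices)
{
    std::vector<int> vertexIndices(size); // single allocation, filled by index
    vertices.reserve(size);               // worst case: every vertex is unique

    for (std::size_t i = 0; i < size; ++i)
    {
        vertices.push_back(Vertex{ float(i), 0.0f, 0.0f }); // never reallocates
        vertexIndices[i] = int(vertices.size() - 1);
    }
    return vertexIndices;
}
```

Because capacity is reserved up front, the push_back calls inside the loop only construct elements; they never trigger the grow-copy-free cycle.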

It appears that the gentleman thought C++ was extremely difficult and he was overjoyed that the machine was absorbing it; he understood that good C++ is difficult but the best C++ is well-nigh unintelligible.

#3 csisy  Members

Posted 15 December 2011 - 03:42 AM

I'm not certain what your call to VertexNTB does, but if it involves any dynamic allocations or other heavy work then it's going to slow you down.

Your most likely trouble spot however is those push_back calls. They will definitely involve quite a lot of dynamic allocation, freeing and moving memory around, and are not suitable for use in inner loops like this. You know up-front that you're always going to have 'size' indices so you should allocate the vector for that before even entering the first loop and then just fill it in using array indexing. For vertices you also know that your number of vertices will never exceed 'size', so likewise allocate that up-front and fill it in using array indexing, then copy it off to a new vector when done.

But I don't know the size of these vectors.

For example, I have a cube. One corner of the cube is defined in the x file, like this:
//position
{0, 0, 0}
//normal
{0, 1, 0}
{1, 0, 0}
{0, 0, -1}

In the x file this is one vertex with three normal indices, but I have to "separate" it into three vertices.
So I need:
//position
{0, 0, 0}
{0, 0, 0}
{0, 0, 0}
//normal

{0, 1, 0}
{1, 0, 0}
{0, 0, -1}

This is a contrived example, but I want to show that I have to create vertices and reindex the index array, so the final size isn't known in advance.

VertexNTB is a struct.

EDIT:
Oh, you're right, I was wrong. I now call resize for the m_VertexIndices vector, so I only have to call push_back for m_Vertices.
Is there any algorithm that could speed up this part?

PS: The DirectX Viewer loads this model in a few seconds.
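For the duplicate-vertex search specifically, one standard trick is to keep a map from vertex to index alongside the vertex array, turning the linear scan into an O(log n) lookup (or O(1) with a hash map). A sketch, using a hypothetical position-only key rather than the real VertexNTB:

```cpp
#include <map>
#include <tuple>
#include <vector>

// Hypothetical stand-in for VertexNTB: key only on the fields that define
// vertex equality (position alone here, for brevity).
using VertexKey = std::tuple<float, float, float>;

// Return the index of an existing identical vertex, or append a new one.
// The map lookup replaces the linear scan over the vertex array.
int indexOf(std::map<VertexKey, int>& seen,
            std::vector<VertexKey>& vertices,
            const VertexKey& v)
{
    auto it = seen.find(v);
    if (it != seen.end())
        return it->second;                 // duplicate: reuse the old index

    vertices.push_back(v);                 // first occurrence: append
    int index = int(vertices.size() - 1);
    seen.emplace(v, index);
    return index;
}
```

With a ~300k-polygon mesh this changes the duplicate search from roughly n²/2 comparisons to about n log n, which is typically the difference between minutes and well under a second.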

#4 Rene Z  Members

Posted 15 December 2011 - 06:55 AM

I'm not familiar with the X file format, but it surprises me to hear that they don't store the number of vertices. If push_back really is your bottleneck, an easy way to improve performance is to use a deque instead of a vector. Copy the contents of the deque to a vector after parsing the file. This still results in multiple allocations though, so it's not perfect. Another method is to do two passes. First determine the number of vertices, allocate the vector, then read the values.
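The deque suggestion in a minimal sketch (the element type is a placeholder):

```cpp
#include <deque>
#include <vector>

// A deque grows in fixed-size chunks and never moves existing elements,
// so appending during parsing avoids vector's reallocate-and-copy growth.
// After parsing, copy once into a contiguous vector for later use.
std::vector<int> packToVector(const std::deque<int>& parsed)
{
    return std::vector<int>(parsed.begin(), parsed.end());
}
```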

If you really want to know where most of the time is spent, make a test program which only loads an x file and profile it. AMD CodeAnalyst is a decent free profiler:
http://developer.amd...es/default.aspx

#5 csisy  Members

Posted 15 December 2011 - 06:57 AM

This is the slowest part of the parser, but the whole parser is slow.

Reading the words from the file takes about 10 seconds (the file contains ~744k words). I use:

std::ifstream stream(path.c_str());
std::string str;

while (stream >> str)
    m_Words.push_back(str);

stream.close();
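One common way to cut this down is to read the whole file with a single stream operation and tokenize it in memory, instead of paying per-word stream overhead against the file. A sketch (tokenizing an in-memory string so it stands alone; with a real file you would first do `std::ostringstream buf; buf << stream.rdbuf();` and tokenize `buf.str()`):

```cpp
#include <sstream>
#include <string>
#include <vector>

// Slurp-then-tokenize: split an in-memory copy of the file on whitespace.
std::vector<std::string> tokenize(const std::string& contents)
{
    std::vector<std::string> words;
    words.reserve(contents.size() / 4); // rough guess to limit regrowth
    std::istringstream stream(contents);
    std::string word;
    while (stream >> word)
        words.push_back(word);
    return words;
}
```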

I've seen the dxviewer's source code, but it just calls LoadMeshFromX or something like that. I have no idea how they can do it faster...

Here is what I do:
- read all words from the file
- search for templates (starting with the word "template") and store their names (like "Frame")
- create a root node and start reading nodes until we reach the end of the file
- the first node's parent is the root node
- get the node's "type" and "name" (type like "Frame", "Mesh", ...)
- watch for the '{' and '}' words (I have to know when we reach the end of the current node)
- while iterating the words, if the templates contain the word (e.g. "Frame"), call LoadNode (recursively)
- else add the word to the current node's lines
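The brace-matching step above can be done with a simple depth counter over the word array; a sketch (the function name is hypothetical, and it assumes scanning starts at or before the node's opening brace):

```cpp
#include <string>
#include <vector>

// Return the index one past the '}' that closes the node whose '{' is the
// first opening brace at or after 'start'. Tracks nesting with a counter.
std::size_t skipNode(const std::vector<std::string>& words, std::size_t start)
{
    int depth = 0;
    for (std::size_t i = start; i < words.size(); ++i)
    {
        if (words[i] == "{")
            ++depth;
        else if (words[i] == "}" && --depth == 0)
            return i + 1; // matching close found: the node ends here
    }
    return words.size(); // unbalanced input: consume everything
}
```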

Now I have the nodes in a hierarchy, but it costs (including the reading) about 20-30 seconds...

Parsing the loaded nodes takes about 15 seconds, and the mesh creation takes... a lot.

#6 csisy  Members

Posted 15 December 2011 - 06:59 AM

If push_back really is your bottleneck, an easy way to improve performance is to use a deque instead of a vector. Copy the contents of the deque to a vector after parsing the file.

If you really want to know where most of the time is spent, make a test program which only loads an x file and profile it. AMD CodeAnalyst is a decent free profiler:
http://developer.amd...es/default.aspx

Thanks for your suggestions, I'll try them.