Jacob Jingle

Using Intel Threading Building Blocks to add parallelism to a recursive class

Jacob Jingle    226
What would be the best way to add parallelism (using TBB) to the class below? Is it a good candidate for parallelism, given that it's set up like a tree? Or should I maintain two vectors, one holding the tree as below and the other a flat layout?
struct Directory
{
     void Add()
     {
          C = A * B;

          // Recurse into every child directory.
          std::for_each(m_vChild.begin(), m_vChild.end(),
               [](std::tr1::shared_ptr<Directory> dir){ dir->Add(); });
     }

     Matrix A, B, C;
     std::vector<std::tr1::shared_ptr<Directory>> m_vChild;
};

Thx ahead of time.

Antheus    2409
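shared_ptr keeps its reference count thread-safe, so every copy of it (for example, passing it by value, as the lambda above does) incurs an interlocked increment, plus a matching decrement when the copy is destroyed. This is undesirable.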
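To make that cost visible, here is a minimal sketch. It assumes std::shared_ptr from C++11 rather than the std::tr1 version in the question, and the helper names are made up for illustration. Taking the pointer by const reference avoids the copy, so the count is never touched:

```cpp
#include <cassert>
#include <memory>

// Hypothetical helpers, only here to expose the reference-count traffic.

// By value: the parameter is a copy, so the shared count is incremented
// atomically on entry and decremented again when the function returns.
long UseCountByValue(std::shared_ptr<int> p) { return p.use_count(); }

// By const reference: no copy is made, so the count is left alone.
long UseCountByRef(const std::shared_ptr<int>& p) { return p.use_count(); }

void Demo() {
    auto p = std::make_shared<int>(42);
    assert(UseCountByValue(p) == 2);  // caller's pointer + the by-value copy
    assert(UseCountByRef(p) == 1);    // only the caller's pointer
}
```

Applied to the original lambda, that would mean declaring its parameter as a const reference to the shared_ptr instead of a by-value copy.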

Ideally, for good parallelism, the code would be structured for streaming processing.

struct Transform {
     Matrix a, b, c;
};

std::vector<Transform> transforms;

someTreeStructure relations;

In your example there are no dependencies between parents and children, which means there is no reason to cram the matrices and the hierarchy into the same structure. With the hot/cold separation above, all transformations can be processed in parallel, and in a very memory-efficient manner (which could be improved further).
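A rough sketch of what processing the flat array could look like. To keep it self-contained it uses plain std::thread with manual chunking as a stand-in for TBB; with TBB the same loop body would normally go into a tbb::parallel_for over a tbb::blocked_range. The 2x2 Matrix and the MultiplyAll name are assumptions for illustration, not anything from the original code:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical stand-in for the Matrix type in the thread: a 2x2 matrix.
struct Matrix {
    double m[2][2];
};

Matrix operator*(const Matrix& x, const Matrix& y) {
    Matrix r{};  // zero-initialised accumulator
    for (int i = 0; i < 2; ++i)
        for (int j = 0; j < 2; ++j)
            for (int k = 0; k < 2; ++k)
                r.m[i][j] += x.m[i][k] * y.m[k][j];
    return r;
}

struct Transform {
    Matrix a, b, c;
};

// Computes c = a * b for every element. The array is split into contiguous
// chunks, one per worker thread. No locking is needed because each element
// is read and written by exactly one thread.
void MultiplyAll(std::vector<Transform>& transforms, unsigned workers) {
    assert(workers > 0);
    const std::size_t n = transforms.size();
    const std::size_t chunk = (n + workers - 1) / workers;  // ceil(n / workers)
    std::vector<std::thread> pool;
    for (std::size_t begin = 0; begin < n; begin += chunk) {
        const std::size_t end = std::min(n, begin + chunk);
        pool.emplace_back([&transforms, begin, end] {
            for (std::size_t i = begin; i < end; ++i)
                transforms[i].c = transforms[i].a * transforms[i].b;
        });
    }
    for (std::thread& t : pool) t.join();
}
```

Spawning threads on every call is wasteful, which is one of the things a task scheduler like TBB's takes care of; the point here is only that the flat layout makes the work trivially divisible.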

How to store the tree depends on the use case. If the transforms are stored externally, there is no need for shared pointers: just use raw pointers or even indices into the transform array. Unless the hierarchy needs to be modified frequently, this is not a problem.

If the hierarchy does need to be modified, then keeping a rich, linked hierarchy separately might be better, serializing it into the structures above on each change.
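As a sketch of that serialization step, assuming a hypothetical Node type for the editable hierarchy and a parent-index array in place of someTreeStructure (both are my own illustrative choices, as is the 2x2 Matrix):

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stand-in for the Matrix type in the thread.
struct Matrix {
    double m[2][2] = {};
};

struct Transform {
    Matrix a, b, c;
};

// The "rich" linked hierarchy that is convenient to edit.
struct Node {
    Transform transform;
    std::vector<std::unique_ptr<Node>> children;
};

// The flat form used for processing.
struct FlatScene {
    std::vector<Transform> transforms;  // hot data, contiguous
    std::vector<int> parent;            // parent[i] is an index, -1 for the root
};

// Pre-order walk: a node is always appended before its children,
// so parent[i] < i holds for every non-root entry.
void Flatten(const Node& node, int parentIndex, FlatScene& out) {
    const int self = static_cast<int>(out.transforms.size());
    out.transforms.push_back(node.transform);
    out.parent.push_back(parentIndex);
    for (const auto& child : node.children) {
        assert(child != nullptr);
        Flatten(*child, self, out);
    }
}

// Rebuilds the flat form from scratch; call it after the hierarchy changes.
FlatScene Flatten(const Node& root) {
    FlatScene out;
    Flatten(root, -1, out);
    return out;
}
```

Rebuilding everything on each change is the simplest option and is fine when edits are rare. The indices replace the shared pointers entirely, so walking the relations never touches a reference count.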

Jacob Jingle    226
Quote:
Original post by Antheus
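shared_ptr keeps its reference count thread-safe, so every copy of it (for example, passing it by value, as the lambda above does) incurs an interlocked increment, plus a matching decrement when the copy is destroyed. This is undesirable.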

Damn, didn't think about that.

Quote:
Original post by Antheus
Ideally, for good parallelism, the code would be structured for streaming processing.

struct Transform {
Matrix a,b,c;
};

vector<Transform> transforms;

someTreeStructure relations;

In your example there are no dependencies between parents and children, which means there is no reason to cram the matrices and the hierarchy into the same structure. With the hot/cold separation above, all transformations can be processed in parallel, and in a very memory-efficient manner (which could be improved further).

How to store the tree depends on the use case. If the transforms are stored externally, there is no need for shared pointers: just use raw pointers or even indices into the transform array. Unless the hierarchy needs to be modified frequently, this is not a problem.

If the hierarchy does need to be modified, then keeping a rich, linked hierarchy separately might be better, serializing it into the structures above on each change.

Thx for the thoughtful answer.

I really wish I had thought about all of this (as well as some of the things in the Intel docs) way before starting my first simple game. I would have done things entirely differently, with a task scheduler, pipelines, etc.

You really have to take this stuff into account from the start. [Giant pain in the arse to work it in later]
