• Advertisement
Sign in to follow this  

Cache coherence and containers

This topic is 1246 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.

If you intended to correct an error in the post then please contact us.

Recommended Posts

Hellof folks, following question:

 

Say, I have a vector and a map that references entries in this vector:

 

std::map< int* > mymap;

std::vector< int > myvector;

mymap.insert( &vector[0] ); ...
mymap.insert( &vector );

 

Now I write an algorithm that walks through the map, dereferences the pointers and processes the elements.

Can I get a speed improvement through cache coherency by making sure the vector and the map have the same sorting criterion so elements are accessed in the same order in the map and the vector when I walk through the map?

 

Regards

Share this post


Link to post
Share on other sites
Advertisement

We still need more information about how you are using the map to give you a helpful answer.

 

Do you really need to put the pointers to the vector elements in the map? You are trying to setup the map so that you can iterate over to give the same sequence as the vector. In doing this, you propose to store a pointer to each element, this means that you need to do a pointer dereference for every element in the vector. Can you not simply iterate over the vector and then the map or vice versa? Can the elements in the map not be added to the vector?

 

You mention cache coherence so I'm assuming this is for performance purposes? Have you got a performance problem that needs resolving? Have you used a profiler to ensure you are 'optimising' the correct part of your program?

Share this post


Link to post
Share on other sites

This was just for example. The actual map looks somewhat like this:

 

std::map< double, SomeStructIMade* > mymap;

std::vector< SomeStructIMade > myvector;

 

I need the sorting and fast insertion/deletion from the map, that's why I use it. I iterate over the values in the map and access the pointers from that and during walkthrough I may insert new key/value pairs into the map.

Share this post


Link to post
Share on other sites

Now I write an algorithm that walks through the map, dereferences the pointers and processes the elements.
Can I get a speed improvement through cache coherency


Yes, by getting rid of std::map and std::set entirely. Walking the vector in strictly-increasing order is better, sure, but you're still going to be walking the absolutely atrocious std::map.

While it's not a drop-in replacement for every situation, consider using Boost's flat_map instead of std::map (same algorithmic complexity, much faster lookup and iteration in practice due to its cache-friendly contiguous layout, and only marginally slower insert/delete if your data set is small and your keys/values are trivially copyable).

Also, just profile your results and find out if something is faster or not. Even if you get an answer here there's a good chance it's complete and utter nonsense (including this one) given how contextually-dependent anything performance-related is.

Share this post


Link to post
Share on other sites

Regardless of whether or not you end up with that solution, storing pointers (or iterators) to anything in an std::vector is a Bad Idea, since modifying the vector could potentially invalidate some (or all) of them. If you absolutely need to indirectly reference elements in an vector, use indices instead. That's not a perfect solution either though, since you'd still need to manually fix them up if elements are inserted to or removed from the middle of the vector, so you should avoid this approach if you can.

Edited by Zipster

Share this post


Link to post
Share on other sites

This was just for example. The actual map looks somewhat like this:

 

std::map< double, SomeStructIMade* > mymap;

std::vector< SomeStructIMade > myvector;

 

I need the sorting and fast insertion/deletion from the map, that's why I use it. I iterate over the values in the map and access the pointers from that and during walkthrough I may insert new key/value pairs into the map.

You're probably better off using a sorted vector. Try it and profile.

 

Although you may think that a vector is slow in insertion and deletions, in actual fact it can be faster because finding which element to remove is that much faster.

Share this post


Link to post
Share on other sites

The vector is basically my input set. The map is used to store pointers because a) fast find b) one entry from the vector can be referenced multiple times and I can reference othe stuff from different sources c) I can insert more entries into the map while walkthrough. This is why fast_map isn't an option for me as it invalidates the iterator at insertion. My next thought would be to use a sorted vector for the base set and a map only for the new entries, then look which container has the next element according to my sorting criterion.

 

Is std::sort the best thing for sorting a vector?

Share this post


Link to post
Share on other sites
I don't think you understand something very important.

Your strategy does not work. If the vector reallocates at any time, which it is free to do under certain circumstances/operations, then your map will be full of bogus pointers. A crash would be lucky, but you might also be cursed with massive state and/or data corruption.


Why don't you just store the index of the item in the vector instead?


[edit] You could help out immensely by describing what you're trying to do in detail. There might be a vastly better solution out there entirely. Edited by ApochPiQ

Share this post


Link to post
Share on other sites
Sign in to follow this  

  • Advertisement