
Help with optimizations


Here is the test: you have a std::vector full of random numbers, some of which are repeated several times. Find the most efficient way to create another vector that contains each number only once, ordered from lowest to highest.

std::sort followed by std::unique, done. (For better performance you might want to use unique_copy to avoid O(n^2) complexity).
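
A minimal sketch of the sort-then-unique suggestion, assuming the values are plain ints. The helper name `sorted_unique` is made up for illustration and does not come from the thread:

```cpp
#include <algorithm>
#include <vector>

// Returns a sorted copy of the input with duplicates removed.
// The argument is taken by value, so the caller's vector is left untouched.
std::vector<int> sorted_unique(std::vector<int> values)
{
    std::sort(values.begin(), values.end());
    // std::unique keeps the first element of each run of equal values and
    // returns the new logical end; erase() then trims the leftover tail.
    values.erase(std::unique(values.begin(), values.end()), values.end());
    return values;
}
```

std::unique only collapses adjacent duplicates, which is why the sort has to come first.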

Or just create a set from the data:

std::set<int>(input.begin(), input.end());

The set doesn't allow duplicates, and uses a binary search tree to sort from low to high.
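
Since the original question asked for a vector as the result, the set can be copied back out once it is built. A small sketch, assuming int values; `unique_via_set` is a hypothetical helper name:

```cpp
#include <set>
#include <vector>

// Builds a std::set from the input (which drops duplicates and keeps the
// elements ordered), then copies the ordered elements into a new vector.
std::vector<int> unique_via_set(const std::vector<int>& input)
{
    std::set<int> seen(input.begin(), input.end());
    return std::vector<int>(seen.begin(), seen.end());
}
```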


[quote]
std::sort followed by std::unique, done. (For better performance you might want to use unique_copy to avoid O(n^2) complexity).
[/quote]


std::unique has O(n) complexity.


[quote]
Or just create a set from the data:

std::set<int>(input.begin(), input.end());

The set doesn't allow duplicates, and uses a binary search tree to sort from low to high.
[/quote]


std::set definitely appears to be the best way to do it. Now I need a way to get a vector of the locations of values within the first vector that match the given set.

Example: {5,3,2,1,2,5}
Search: 5
Result: {0,5}
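
One straightforward way to get that result is a single pass over the original vector, collecting every index whose value matches. This is only a sketch with int values, and `find_positions` is a hypothetical helper name:

```cpp
#include <cstddef>
#include <vector>

// Returns the indices of every element of `values` equal to `wanted`,
// in ascending order. Runs in O(n) for a vector of n elements.
std::vector<std::size_t> find_positions(const std::vector<int>& values, int wanted)
{
    std::vector<std::size_t> positions;
    for (std::size_t i = 0; i < values.size(); ++i)
    {
        if (values[i] == wanted)
        {
            positions.push_back(i);
        }
    }
    return positions;
}
```

Calling it with the vector {5,3,2,1,2,5} and the value 5 gives the indices 0 and 5, matching the example above.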


[quote]
Or just create a set from the data:

std::set<int>(input.begin(), input.end());

The set doesn't allow duplicates, and uses a binary search tree to sort from low to high.
[/quote]


I know this is going to sound stupid but how do I read an item from the set? Tried this but it does not work:

DWORD foo= *(theSet.begin() + n);

EDIT: This works:

DWORD foo= (DWORD)(&theSet+(n*sizeof(DWORD)));

However I assume there must be a better way?...

You know, if you actually had a question you'd probably be better off phrasing your thread title as something other than a pissing contest. In any case, how you read elements from a set depends on how you want to use it. If you're iterating through the elements then you'd use a standard iterator loop:

for (SetType::iterator i = my_set.begin(); i != my_set.end(); ++i) {
    // do something with *i
}
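
If the n-th element in sorted order is really what is wanted, the `begin() + n` attempt fails because set iterators are bidirectional rather than random access. std::advance steps the iterator forward one element at a time instead. A sketch, where `set_element_at` is a hypothetical helper and `unsigned long` stands in for DWORD:

```cpp
#include <cstddef>
#include <iterator>
#include <set>

// Returns the n-th smallest element of the set (0-based).
// Assumes n < s.size(); stepping past end() would be undefined behaviour.
// Cost is O(n) because the iterator is advanced one node at a time.
unsigned long set_element_at(const std::set<unsigned long>& s, std::size_t n)
{
    std::set<unsigned long>::const_iterator it = s.begin();
    std::advance(it, static_cast<std::ptrdiff_t>(n));
    return *it;
}
```

If indexed access is needed often, copying the set into a vector once is likely cheaper than advancing an iterator on every lookup.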


[quote]
You know, if you actually had a question you'd probably be better off phrasing your thread title as something other than a pissing contest. In any case, how you read elements from a set depends on how you want to use it. If you're iterating through the elements then you'd use a standard iterator loop:

for (SetType::iterator i = my_set.begin(); i != my_set.end(); ++i) {
    // do something with *i
}
[/quote]



Yeah, I already had code that did that and knew there must be a way to optimize it. I should have just titled it something like "optimize this." Anyway, I was basically trying to create separate index buffers for each subset of my mesh given an attribute buffer.

Here is what I have now:

// Create index buffers: one list of faces per distinct attribute value.
std::set<DWORD> usedAttributes(attributes.begin(), attributes.end());
std::vector<std::vector<Ovgl::Face>> index_subsets;
index_subsets.resize(usedAttributes.size());
for( unsigned int i = 0; i < attributes.size(); i++ )
{
    // s counts the position of *j within the set, i.e. the subset index.
    unsigned int s = 0;
    for( std::set<DWORD>::iterator j = usedAttributes.begin(); j != usedAttributes.end(); ++j)
    {
        if( attributes[i] == *j )
        {
            index_subsets[s].push_back(faces[i]);
        }
        s++;
    }
}



See anything I can do to optimize that?


[quote]
std::unique has O(n) complexity.
[/quote]


Oh yeah, true that. I blame being tired for thinking of such a naive implementation.


[quote]
EDIT: This works:

DWORD foo= (DWORD)(&theSet+(n*sizeof(DWORD)));
[/quote]
You're getting (un)lucky. That behaviour is totally undefined. You're basically taking the address of the set, offsetting that pointer by n*sizeof(DWORD) set-sized steps in memory, and casting the resulting ADDRESS to a DWORD. It makes no sense.


[quote]
See anything I can do to optimize that?
[/quote]
You're doing a "linear" search on the set (the set might be implemented as a tree, so you're not necessarily searching memory linearly). A simple improvement is to use std::set::find() rather than iterating through it.

You might find that the time taken to build the set (lots of memory allocations!) and search the set (following pointers!) is actually larger than just linearly searching the original vector, duplicates and all. Vector's memory contiguity could win the day, depending on the size of the data.

Consider testing this with the various types of meshes you expect to load (and maybe stress test it with some larger ones). You might be surprised at the performance of an extremely "naive" implementation here.
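
One possible middle ground, offered only as a sketch and not as anything measured on the poster's data: keep the distinct attribute values in a sorted vector and binary-search it with std::lower_bound, so the position found doubles as the subset index. Here `AttributeId` stands in for DWORD, `Face` is a hypothetical stand-in for Ovgl::Face, and `build_index_subsets` is a made-up name. The code assumes `faces` has at least as many elements as `attributes`.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

typedef unsigned long AttributeId;          // stand-in for DWORD
struct Face { unsigned int a, b, c; };      // hypothetical stand-in for Ovgl::Face

// Groups faces[i] by attributes[i]. Subsets are ordered by ascending
// attribute value, the same order the std::set version produces.
std::vector<std::vector<Face> > build_index_subsets(
    const std::vector<AttributeId>& attributes,
    const std::vector<Face>& faces)
{
    // Sorted list of distinct attribute values, stored contiguously.
    std::vector<AttributeId> used(attributes);
    std::sort(used.begin(), used.end());
    used.erase(std::unique(used.begin(), used.end()), used.end());

    std::vector<std::vector<Face> > subsets(used.size());
    for (std::size_t i = 0; i < attributes.size(); ++i)
    {
        // Binary search: O(log k) for k distinct values. The value is always
        // present because `used` was built from `attributes` itself.
        std::size_t s = static_cast<std::size_t>(
            std::lower_bound(used.begin(), used.end(), attributes[i]) - used.begin());
        subsets[s].push_back(faces[i]);
    }
    return subsets;
}
```

Whether this beats the set-based loop or a plain linear scan depends on the mesh sizes involved, so it is worth timing all three.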

There is a sweet spot for using a tree structure in most programs, and std::set and std::map are some of the more common tree structures used in C++.

Too few elements, and the overhead of memory allocation per node in the tree will kill you. Too many elements, and cache coherency becomes a serious problem, unless you want to go to a lot of work writing a custom allocator. (And believe me, writing an allocator that gets good locality of reference for an arbitrary tree is a bitch of a problem.)


As with all things performance related: profile heavily, or you're basically just pissing into the wind. What is applicable to your very specific case may not be at all the same as the relevant solutions from everyone else's experience. Even algorithmic-level optimization isn't always a clear winner on modern processing architectures.


(Also, I fixed your title.)

You guys had the answer in the very first reply. Sort the vector and then use the std::unique function. This will run much, much faster than any implementation using std::set. (But feel free to profile anyway!)
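
For anyone who wants to take up the invitation to profile, a rough timing harness might look like the sketch below. It assumes a C++11 compiler for <chrono> and <random>. `Timing`, `make_input`, `time_sort_unique` and `time_set` are hypothetical names. It makes no claim about which approach wins, since that depends on the data, the compiler and the machine.

```cpp
#include <algorithm>
#include <chrono>
#include <cstddef>
#include <random>
#include <set>
#include <vector>

struct Timing { std::size_t unique_count; double milliseconds; };

// Generates `count` pseudo-random ints in [0, max_value] from a fixed seed,
// so repeated runs time the same data.
std::vector<int> make_input(std::size_t count, int max_value)
{
    std::mt19937 rng(12345);
    std::uniform_int_distribution<int> dist(0, max_value);
    std::vector<int> values(count);
    for (std::size_t i = 0; i < count; ++i) values[i] = dist(rng);
    return values;
}

// Times sort + unique. The copy happens when the argument is passed,
// before the clock starts, so only the algorithm itself is measured.
Timing time_sort_unique(std::vector<int> values)
{
    std::chrono::steady_clock::time_point start = std::chrono::steady_clock::now();
    std::sort(values.begin(), values.end());
    values.erase(std::unique(values.begin(), values.end()), values.end());
    std::chrono::steady_clock::time_point stop = std::chrono::steady_clock::now();
    Timing t = { values.size(),
                 std::chrono::duration<double, std::milli>(stop - start).count() };
    return t;
}

// Times building a std::set from the same data.
Timing time_set(const std::vector<int>& values)
{
    std::chrono::steady_clock::time_point start = std::chrono::steady_clock::now();
    std::set<int> seen(values.begin(), values.end());
    std::chrono::steady_clock::time_point stop = std::chrono::steady_clock::now();
    Timing t = { seen.size(),
                 std::chrono::duration<double, std::milli>(stop - start).count() };
    return t;
}
```

Run both on inputs shaped like the real workload, with optimizations enabled, and repeat each measurement several times before drawing conclusions.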


[quote]
You guys had the answer in the very first reply. Sort the vector and then use the std::unique function. This will run much, much faster than any implementation using std::set. (But feel free to profile anyway!)
[/quote]


For objects with very heavy copy constructors, such as meshes or shaders, it can be faster to use a set which does no copying during the sort/uniqueness-guarantee operations.

The best algorithmic complexity you can achieve is O(n) space, O(n) time (a counting sort, which assumes the values fall in a bounded range). If you want O(1) space you'll have to settle for O(n log n) time (an in-place O(n log n) comparison sort).

The fastest way depends on the size of your data set and the implementations of the contained type's copy constructors, move constructors and swap functions.
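
A sketch of the counting idea, under the assumption that every value is a non-negative int no larger than a known `max_value`. Values outside that range are skipped here, and `unique_by_counting` is a hypothetical name. Strictly speaking the cost is O(n + k) time and O(k) extra space for a range of k possible values.

```cpp
#include <cstddef>
#include <vector>

// Marks which values occur, then reads the marks back in ascending order.
// No comparison sort is involved; the output is sorted by construction.
std::vector<int> unique_by_counting(const std::vector<int>& input, int max_value)
{
    std::vector<bool> present(static_cast<std::size_t>(max_value) + 1, false);
    for (std::size_t i = 0; i < input.size(); ++i)
    {
        const int v = input[i];
        if (v >= 0 && v <= max_value)   // ignore values outside the assumed range
        {
            present[static_cast<std::size_t>(v)] = true;
        }
    }

    std::vector<int> result;
    for (std::size_t v = 0; v < present.size(); ++v)
    {
        if (present[v]) result.push_back(static_cast<int>(v));
    }
    return result;
}
```

This only pays off when the range of values is small relative to the number of elements; for arbitrary 32-bit values the table of marks would be impractically large.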
