Help with an operation on std::set?

Started by
10 comments, last by iMalc 18 years, 9 months ago
Hi everyone. This might show my lack of STL knowledge, but here goes... Today, I had a simple operation I wanted to perform. I eventually just wrote it the long way how I knew it would work, but I'm hoping that there's a simpler way (It just feels like there should be). What I have is three std::set's. The first set (lets call it setA), is a master set of all items. The second set (lets call it setB), is a set containing all of the items that I wish to remove from setA. The third set (lets call it setC), is a set containing all of the items that I wish to add to setA. I wish to essentially perform (in psuedocode): setA -= setB setA += setC If you know what I mean. Now, what I first tried (stupid me) was:
setA.erase(setB.begin(), setB.end());
setA.insert(setC.begin(), setC.end());
The second part on it's own is fine (so I don't need help with that part), but it seems you can't use erase with iterators from another set :-(, so the first part is bad. Actually I ended up with a set with less items than size() thought it contained! It would be nice if the documentation I had made it clear that I couldn't do this, I mean the reverse works fine using insert()..., but that's another matter. Alright, now I know about 'set_difference' etc. However that essentially performs the operation: setD = setA - setB After which I have to assign setD to setA. Obviously not very efficient! What I am looking for is something similiar to that, where it simply removes from setA, items that are in setB. My current option is to write a for-loop to iterate through setB, and erase each item from setA. Anyone else have any better ideas, or comments?
"In order to understand recursion, you must first understand recursion."
My website dedicated to sorting algorithms
Advertisement
EDIT: Removed a totally unworking piece of code. Apologies for posting something so stupid.

[Edited by - Sharlin on July 1, 2005 4:12:10 AM]
Quote:Original post by Sharlin
Try this out: (untested)
setA.erase(setA.find(*setB.begin()), setA.find(*setB.end()));

Basically, because sets are sorted by definition, the elements corresponding to those of setB form a continuous subsequence inside setA, so they can be erased in one pass.
Oooh, I didn't think of that!

Does anyone have anything to say about how the efficiency of that relates to this?...
for (someTypeOfSet::iterator it = setB.begin(); it != setB.end(); ++it)    setA.erase(*it);
Okay, I'm off to bed. Will check back in the morning.
"In order to understand recursion, you must first understand recursion."
My website dedicated to sorting algorithms
Sharlin, that won't work. Assume A contains the numbers 1 to 10. Assume B only contains the numbers 1 and 10. Your method will remove all numbers from A, while you want to have a set containing the numbers 2 to 9.

Regarding the OP, what's wrong with:
for (MyIterator it = setB.begin(); it != setB.end(); ++it)   setA.erase(*it);


Edit: Apart from the fact that I was a bit too slow, consider the following code as an alternative if setA.size() < setB.size():
for (MyIteratir it = setA.begin(); it != setA.end();){   if (setB.find(*it) == setB.end())      ++it;   else      it = setA.erase(it);}
Sharlin's code absolutly would not work. For starters, you can't dereference the end iterator (use back() on the container to get the last element). Secondly, the logic of what it removes is totally wrong.


What you are looking for are the set algorithms in the STL, in your case I think what you are after is probably set_difference and merge (or inplace_merge). Take a look at what is available.

Alternitivly, you could use remove_if. The performance, however, is going to be pretty terrible.
Quote:Original post by BitMaster
Sharlin, that won't work. Assume A contains the numbers 1 to 10. Assume B only contains the numbers 1 and 10. Your method will remove all numbers from A, while you want to have a set containing the numbers 2 to 9.


Woah, indeed. What the hell was I thinking again...

Quote:Original post by iMalc
Obviously not very efficient!


I don't think its as bad as it may appear to be, you can make doing both operations together better & efficient by using a single variable temp, swap (& clear) operation, and a custom pooled allocator to speed up allocations [grin]:

#include <cmath>#include <algorithm>#include <iterator>#include <set>#include <vector>#include <iostream>#include <boost/pool/pool_alloc.hpp>int main() {  typedef std::set<int, std::less<int>, boost::fast_pool_allocator<int> > seti;   seti i1, i2, i3;   seti::size_type N = 10;   std::generate_n(std::inserter(i1, i1.begin()), N, std::rand);   N = 6;   std::generate_n(std::inserter(i2, i2.begin()), N, std::rand);   N = 4;   std::generate_n(std::inserter(i3, i3.begin()), N, std::rand);   seti temp;   std::set_difference( i1.begin(), i1.end(),			i2.begin(), i2.end(),			std::inserter(temp, temp.begin()));   temp.swap(i1);   temp.clear();   std::set_union(i1.begin(), i1.end(), i3.begin(), i3.end(),		std::inserter(temp, temp.begin()));   temp.swap(i1);   std::copy(i1.begin(), i1.end(),       std::ostream_iterator<int>(std::cout, ", "));}


Note that you don't need to make the "temp" a set container as the result will be sorted but this way we can use a cheap swap/clear op & possibly reuse of previous allocation so if all is well for doing the difference & intersection together give us a cost of just one extra allocation.

Quote:Original post by iMalc
What I am looking for is something similiar to that, where it simply removes from setA, items that are in setB.

My current option is to write a for-loop to iterate through setB, and erase each item from setA. Anyone else have any better ideas, or comments?


The other alternative is:

#include <cmath>#include <algorithm>#include <iterator>#include <set>#include <vector>#include <iostream>#include <boost/pool/pool_alloc.hpp>#include <boost/bind.hpp>int main() {    typedef std::set<int, std::less<int>, boost::fast_pool_allocator<int> > seti;    typedef seti::size_type (seti::*memfunc)(const seti::key_type&);    seti i1, i2, i3;    seti::size_type N = 10;    std::generate_n(std::inserter(i1, i1.begin()), N, std::rand);    N = 6;    std::generate_n(std::inserter(i2, i2.begin()), N, std::rand);    N = 4;    std::generate_n(std::inserter(i3, i3.begin()), N, std::rand);    std::for_each(i2.begin(), i2.end(),	boost::bind(static_cast<memfunc>(&seti::erase), boost::ref(i1), _1));    seti temp;    std::merge( i1.begin(), i1.end(),		i2.begin(), i2.end(),		std::inserter(temp, temp.begin()));    temp.swap(i1);	    std::copy(i1.begin(), i1.end(),                std::ostream_iterator<int>(std::cout, ", "));}


Profile and see which one is better.
std::set<int> tmp( C);std::set_difference( A.begin(), A.end(), B.begin(), B.end(), std::inserter( tmp, tmp.end()));tmp.swap( A);


if you're allowed to trash C you can get rid of the temporary.

edit: beaten by the kid
HardDrop - hard link shell extension."Tread softly because you tread on my dreams" - Yeats
Quote:Original post by snk_kid
Profile and see which one is better.


Okay so i did it myself [grin], its a very brief profile on VC++ 8.0 mind you.

The version with set operations on average was: 0.00513417 (sec or msec i forget heh).

The other version, on average was: 0.00489294 (sec or msec)

So the one using set operations isn't bad at all.
@snk_kid: would you mind running my variation to? It does basicly the same thing as your first one without going the roundabout way using two set_functions.
HardDrop - hard link shell extension."Tread softly because you tread on my dreams" - Yeats

This topic is closed to new replies.

Advertisement