C++: problem erasing vector items

Started by
16 comments, last by Alessandro 11 years, 7 months ago

std::set_intersection can do that.

std::vector<int> results;
std::set_intersection(someValues.begin(), someValues.end(), someValuesToCheck.begin(), someValuesToCheck.end(), std::back_inserter(results));

someValuesToCheck = result;



That would be very useful. Unfortunately can't use it because I'm using different type of sets (someValues is a int, someValuesToCheck is a structure) and from what I've read in the documentation the back_inserter expects to compare two equal type of sets.
Thanks anyway, this will come useful in other occasions...
Advertisement

It sounds relatively straightforward, in pseudo code: erase(remove_if(someValuesToCheck, not( ismember(someValues)))).

What part is causing you trouble?


Thanks, I'm going to try this...
someValues.erase( std::remove_if( someValues.begin(), someValues.end(),
[&]( int val )
{
return (std::find( someValuesToCheck.begin(), someValuesToCheck.end(), val ) == someValuesToCheck.end() );
} ),
someValues.end() );


or


struct myPred
{
myPred(const std::vector<int>& toCheck )
:someValuesToCheck(toCheck)
{}
bool operator()(int val)
{
return (std::find( someValuesToCheck.begin(), someValuesToCheck.end(), val ) == someValuesToCheck.end() );
}
const std::vector<int>& someValuesToCheck;
};
someValues.erase( std::remove_if( someValues.begin(), someValues.end(), myPred(someValuesToCheck) ), someValues.end() );


(Just realized I got someValues and someValuesToCheck backwards, but you should get the idea)

What are you actually trying to remove? As the others have said, if you want to erase all but a given ID, then use the following:

int id = /* determine id */;
myVec.erase(std::remove_if(myVec.begin(), myVec.end(), myInversePred(id)), myVec.end());


You can also use the std::not1 predicate to invert an existing predicate:

#include <functional>

int id = /* determine id */;
myVec.erase(std::remove_if(myVec.begin(), myVec.end(), std::not1(myPred(id))), myVec.end());



I'm trying to use the std::not1 as suggested but I've troubles coding the predicate.

Here is the code:

[source]std::vector<myStruct::myStructLib> myVector;

for (int f=0; f<tempValues.size(); f++)
{
int faceID=tempValues.at(f);
myVector.erase(std::remove_if(myVector.begin(), myVector.end(), std::not1(myPredicate(faceID))), myVector.end());
}

// predicate
struct myPredicate : std::unary_function<int,bool>{
bool operator()(const myStruct::myStructLib &obj) {
return obj.faceID==id;
}
int id;
};[/source]

When I compile I get several errors:
error: no type named 'argument_type' in 'struct myPredicate'
error: no match for call to '(std::unary_negate<myPredicate>) (myStruct::myStructLib&)'

[quote name='Brother Bob' timestamp='1345740634' post='4972644']
std::set_intersection can do that.

std::vector<int> results;
std::set_intersection(someValues.begin(), someValues.end(), someValuesToCheck.begin(), someValuesToCheck.end(), std::back_inserter(results));

someValuesToCheck = result;



That would be very useful. Unfortunately can't use it because I'm using different type of sets (someValues is a int, someValuesToCheck is a structure) and from what I've read in the documentation the back_inserter expects to compare two equal type of sets.
Thanks anyway, this will come useful in other occasions...
[/quote]
The function allows the two sets to be of different types, as long as each of the two types are less than comparable with itself, and that the type of the first set is less than comparable with the type of the second set (alternatively you provide a predicate to less than compare an element of the first set with an element of the second set). I should have realized that from your first post that the types were different, sorry, so I didn't mention more about that.

That is, if the first set is of type T1, and the second set is of type T2, then you need the following operators:

  1. bool operator<(T1, T1)
  2. bool operator<(T2, T2)
  3. bool operator<(T1, T2), or a predicate bool pred(T1, T2)

Operator 1 or 2 is the built in int operator (depending on which one is the integer of course), and the other one you have to define as a sorting criteria for your structure. Operator 3 can either be an operator or a predicate you pass a parameter to the set intersect function.

edit: And I should add now that since you probably shouldn't define an arbitrary operator < for your class for 1 or 2 above, and since that you cannot provide that operator as an explicit predicate, you probably shouldn't go this route. It requires an operator that may not be obvious and unambiguous, so you should not define it.
@Brother Bob

std::set_intersection requires sorted ranges, which the OP hasn't definitively stated (though the example data is sorted, that could be human convenience).

@Alessandro
The following compiles for me:

#include <vector>
#include <functional>
#include <algorithm>

struct Example {
int id;
};

struct myPredicate : std::unary_function<Example, bool> {

myPredicate(int id) : id(id) {
}

bool operator()(const Example &obj) const {
return obj.id == id;
}

private:
int id;
};

void foo(std::vector<Example> &examples, const std::vector<int> &identifiers) {
for (unsigned i = 0 ; i < identifiers.size(); ++i) {
int id = identifiers;
examples.erase(std::remove_if(examples.begin(), examples.end(), std::not1(myPredicate(id))), examples.end());
}
}

Your code was incorrectly using std::unary_function (the first type parameter must match the parameter type of the operator().

Note that there are two simple ways of implementing this (naive) algorithm, essentially swapping the inner and outer loops. These have different performance characteristics, depending on the relative sizes of the two containers. You want the inner loop to be the smaller one, if you can predict this in advance (i.e. if there are always less identifiers than objects, then the identifiers should be the inner loop).

If this is a performance sensitive application and you are frequently attempting to look up an item by ID, you may also want to consider storing the data in a more search friendly manner, e.g. a binary tree or hash map based on identifier. You could also sort the list of identifiers, this way you can quickly binary search for a given element.

If both ranges are sorted, then you can use some of the more advanced C++ algorithms to help you, as Brother Bob is suggesting. You would need, of course, to balance the cost of lookups against the iteration cost (if you frequently visit the entire collection) and the insert/removal cost (depending on frequency of insertion/removal, and in particular if you need random access insertion).

for (int f=0; f<tempValues.size(); f++)
{
int faceID=tempValues.at(f);
myVector.erase(std::remove_if(myVector.begin(), myVector.end(), std::not1(myPredicate(faceID))), myVector.end());
}

// predicate
struct myPredicate : std::unary_function<int,bool>{
bool operator()(const myStruct::myStructLib &obj) {
return obj.faceID==id;
}
int id;
};


I don't get it. Unless tempValues is empty or contains only one element, you can just as well call myVector.clear() and be done with it. As soon as this for-loop has executed the second time, your vector will always be empty.

At this point it's not a matter of syntax, it's a matter of the whole approach being simply wrong. The only place where a loop should iterate over all tempValues is inside your predicate (and even there I'd prefer find).


struct IDcontainedIn
{
IDcontainedIn(const Container& c) : container(c) {}

const Container& container;

bool operator()(ObjectType obj)
{
return find(container.begin(), container.end(), obj.id) != container.end();
}
};


myVector.erase(std::remove_if(myVector.begin(), myVector.end(), std::not1(containedIn(tempValues))), myVector.end());
f@dzhttp://festini.device-zero.de

I don't get it. Unless tempValues is empty or contains only one element, you can just as well call myVector.clear() and be done with it. As soon as this for-loop has executed the second time, your vector will always be empty.

At this point it's not a matter of syntax, it's a matter of the whole approach being simply wrong. The only place where a loop should iterate over all tempValues is inside your predicate (and even there I'd prefer find).


That's right, exactly what's happening.

The solution I used is the one suggested by krippy2k8, which uses std::find into the predicate:


struct myPred
{
myPred(const std::vector<int>& toCheck )
:someValuesToCheck(toCheck)
{}
bool operator()(int val)
{
return (std::find( someValuesToCheck.begin(), someValuesToCheck.end(), val ) == someValuesToCheck.end() );
}
const std::vector<int>& someValuesToCheck;
};
someValues.erase( std::remove_if( someValues.begin(), someValues.end(), myPred(someValuesToCheck) ), someValues.end() );


Nonetheless, it was very interesting to read rip-off, Brother Bob and all other folks suggestions. I'm very new to vectors and very confused about those. Hopefully I learned something new today.

This topic is closed to new replies.

Advertisement