Sign in to follow this  
King of Men

Tricky problem with std::sort

Recommended Posts

King of Men    394
I have a class Actor. I am trying to sort an std::vector<Actor*>. During this sort I get a segmentation violation, apparently when one of the elements in my vector suddenly points to something totally different. I will try to post all the relevant code:
// Creation and filling of my vector

  static std::vector<Actor*> buyers;
  buyers.clear();

  // allActors is a static std::vector<Actor*>
  for (unsigned int i = 0; i < allActors.size(); i++) {
    allActors[i]->myDeals[resource].clear();
    
    int price = allActors[i]->myPrices[resource];
    if (price == 0) continue;  // Not trading in this good.
    else if ((price < 0) && (allActors[i]->myResources[resource] > 0)) sellers.push_back(allActors[i]);
    else {
      buyers.push_back(allActors[i]);
      log("Added allActors " + stringify(i) + " " + stringify(allActors[i]));
    }
  }

  log("doTrade " + stringify(resource) + " Line 1");

  if (buyers.size() == 0) return;

  // Sort by price. 
  tradeResource = resource;

  // Static vector for debugging purposes
  theSorts = &buyers;
  std::sort(buyers.begin(), buyers.end(), ActorDescendingSorter());
The ActorDescendingSorter is just a functor for sorting Actor*:
  struct ActorDescendingSorter {
    // Comparator to sort Actor pointers by price, from highest to lowest, for a resource. 
    inline bool operator() (Actor* a1, Actor* a2) {
      // Note negation! This is so sort will return a list from highest to lowest. 
      checkActors();
      std::cout << (int) a1 << " " << (int) a2 << std::endl;
      return (!((*a1) < (*a2)));
    }
  };
You will note that it now contains some debugging code. checkActors is just a static method to print out the contents of the list being sorted:
void Actor::checkActors () {
  for (std::vector<Actor*>::iterator i = theSorts->begin(); i < theSorts->end(); ++i) {
    std::cout << "Checking : " << (*i) << " " << (int)(*i) << std::endl;
  }
}
You should note that Actor overwrites the << operator to print out its name, and I like printing addresses as ints. Now finally, we can look at the output:
164240728 164231640
Checking : London Cloth 164228968
Checking : London 1 164231640
Checking : London 2 164231904
Checking : London 3 164232168
Checking : Kent 1 164232432
Checking : Sussex 1 164232696
Checking : Sussex 2 164232960
Checking : Wessex 1 164233224
Checking : Wessex 2 164233488
Checking : Essex 1 164233752
Checking : Essex 2 164234016
Checking : Bergen 1 164234280
Checking : Bergen 2 164234544
Checking : Trondhjem 164239936
Checking : Nordland 164240200
Checking : Ostlandet 164240464
Checking : Stavanger 164240728
164240728 164228968
Checking : London Cloth 164228968
Checking : London 1 164231640
Checking : London 2 164231904
Checking : London 3 164232168
Checking : Kent 1 164232432
Checking : Sussex 1 164232696
Checking : Sussex 2 164232960
Checking : Wessex 1 164233224
Checking : Wessex 2 164233488
Checking : Essex 1 164233752
Checking : Essex 2 164234016
Checking : Bergen 1 164234280
Checking : Bergen 2 164234544
Checking : Trondhjem 164239936
Checking : Nordland 164240200
Checking : Ostlandet 164240464
Checking : Stavanger 164240728
164240728 5433
Segmentation fault
So clearly, what's going on is that the sorter is going happily along, comparing Actor pointers. (Actually they are subclasses, Actor being abstract.) Then it comes to the point where it tries to compare the Actor named Stavanger to something totally random, and weirdness results. I should mention that this behaviour disappears if I have fewer Actor objects. Remove just one, and it works! More accurately, remove just one of the ones that go into the list to be sorted. The magic number appears to be 17. I'm puzzled and confused. Help me?

Share this post


Link to post
Share on other sites
iMalc    2466
Quote:
Original post by King of Men
I have a class Actor. I am trying to sort an std::vector<Actor*>. During this sort I get a segmentation violation, apparently when one of the elements in my vector suddenly points to something totally different. I will try to post all the relevant code:

    inline bool operator() (Actor* a1, Actor* a2) {
// Note negation! This is so sort will return a list from highest to lowest.
checkActors();
std::cout << (int) a1 << " " << (int) a2 << std::endl;
return (!((*a1) < (*a2)));
}
Here's one bug I was able to spot straight away, after having fixed this very issue a few months back.
The return statement is wrong. That is not a valid comparator for sorting because it does not conform to strict weak ordering. It must only return true when *a1 is less than *a2. That includes not returning true when they are equal. Currently if they are equal it also returns true.
You need to change it to return *a2 < *a1 instead.
In short, never negate the comparison, reverse it instead.
This might also fix your seg-fault problem.

Share this post


Link to post
Share on other sites
LorenzoGatti    4442

inline bool operator() (Actor* a1, Actor* a2) {
//...
return (!((*a1) < (*a2)));
}

This method looks right, but it calls operator< on Actor instances, which is likely to be the origin of the segmentation fault.
Complex objects like your Actor might have NULL or dangling pointers that should be left alone; without code, it's hard to guess.

Share this post


Link to post
Share on other sites
LorenzoGatti    4442
Quote:
Original post by iMalc
Here's one bug I was able to spot straight away, after having fixed this very issue a few months back.
The return statement is wrong. That is not a valid comparator for sorting because it does not conform to strict weak ordering. It must only return true when *a1 is less than *a2. That includes not returning true when they are equal. Currently if they are equal it also returns true.
You need to change it to return *a2 < *a1 instead.
In short, never negate the comparison, reverse it instead.
This might also fix your seg-fault problem.

Definitely a bug, the function isn't correct as I stated; but while incorrect boolean logic could lead to infinite loops or to not sorted or not stably sorted results in std::sort, I don't see how it can cause segmentation faults. Did it happen to you?

Share this post


Link to post
Share on other sites
iMalc    2466
Quote:
Original post by LorenzoGatti

inline bool operator() (Actor* a1, Actor* a2) {
//...
return (!((*a1) < (*a2)));
}

This method looks right, but it calls operator < on Actor instances, which is likely to be the origin of the segmentation fault.
Complex objects like your Actor might have NULL or dangling pointers that should be left alone; without code, it's hard to guess.
No, it is NOT right, as I just explained. The comparator must not return true for items that are equal. Putting my explanation another way; The opposite of (a < b) is (a >= b), NOT (a > b). In fact the checked implementation of the SC++L in VS2005 will complain about this very bug at run-time.
You are right that it could be helpful to see the Actor operator less-than.

Quote:
Definitely a bug, the function isn't correct as I stated; but while incorrect boolean logic could lead to infinite loops or to not sorted or not stably sorted results in std::sort, I don't see how it can cause segmentation faults. Did it happen to you?
I don't have enough code to run the program. Although it might not directly be the cause of the seg fault, in general one bug can easily lead to a worse fault later on.

Actually, here is another bug:
  for (std::vector<Actor*>::iterator i = theSorts->begin(); i < theSorts->end(); ++i) {
You are not supposed to use less-than on an iterator. You should always use != when comparing with the end() iterator.

Share this post


Link to post
Share on other sites
King of Men    394
Thank you for the bug-finding. I fixed those issues, and this seems to move my crash to a different, later sort; but it still occurs. Here is my operator code:


inline bool Actor::operator< (Actor &a) {
return (myPrices[tradeResource] < a.myPrices[tradeResource]);
}


Looking more closely at the printouts just before my crash, I found a possible clue. As you recall, every time the comparison functor is called, it prints out the list being sorted. Here is the output:


Checking : Ostlandet agri 10 165143720
Checking : Kent agri 10 165144064
Checking : Sussex agri 10 165144408
Checking : Sussex dairy 10 165144752
Checking : Essex agri 10 165145096
Checking : Wessex agri 10 165145440
Checking : Essex wool 10 165145784
Checking : Wessex wool 10 165146128
Checking : Bergen fish 10 165147160
Checking : Stavanger fish 10 165147504
Checking : Trondhjem ore 10 165147848
Checking : London Cloth 5 165146472
Checking : London Tools 5 165146816
Checking : Nordland ore 10 165148456
165148456 165143720

Checking : Ostlandet agri 10 165143720
Checking : Kent agri 10 165144064
Checking : Sussex agri 10 165144408
Checking : Sussex dairy 10 165144752
Checking : Essex agri 10 165145096
Checking : Wessex agri 10 165145440
Checking : Essex wool 10 165145784
Checking : Wessex wool 10 165146128
Checking : Bergen fish 10 165147160
Checking : Stavanger fish 10 165147504
Checking : Trondhjem ore 10 165147848
Checking : London Cloth 5 165146472
Checking : London Tools 5 165146816
Checking : Nordland ore 10 165148456
165148456 165146816

Checking : Ostlandet agri 10 165143720
Checking : Kent agri 10 165144064
Checking : Sussex agri 10 165144408
Checking : Sussex dairy 10 165144752
Checking : Essex agri 10 165145096
Checking : Wessex agri 10 165145440
Checking : Essex wool 10 165145784
Checking : Wessex wool 10 165146128
Checking : Bergen fish 10 165147160
Checking : Stavanger fish 10 165147504
Checking : Trondhjem ore 10 165147848
Checking : London Cloth 5 165146472
Checking : London Tools 5 165146816
Checking : London Tools 5 165146816
165148456 165146472

Checking : Ostlandet agri 10 165143720
Checking : Kent agri 10 165144064
Checking : Sussex agri 10 165144408
Checking : Sussex dairy 10 165144752
Checking : Essex agri 10 165145096
Checking : Wessex agri 10 165145440
Checking : Essex wool 10 165145784
Checking : Wessex wool 10 165146128
Checking : Bergen fish 10 165147160
Checking : Stavanger fish 10 165147504
Checking : Trondhjem ore 10 165147848
Checking : London Cloth 5 165146472
Checking : London Cloth 5 165146472
Checking : London Tools 5 165146816
165148456 165147848


The format is name, current price, address. You'll note that near the end, the actor named "Nordland ore" disappears, replaced by two copies of "London Tools". Then, at the last sort, one of those copies is replaced by a "London Cloth", making two copies of that. Clearly this is not correct behaviour, although the list does end up sorted. Can anyone suggest what may cause this?

Share this post


Link to post
Share on other sites
King of Men    394
Ah, wait, I see what's going on - just ignore the second half of my last post. The sorter will be overwriting one of the copies of "London Cloth", and then it won't do any more compares so we don't get to see the final, sorted list. So that's not the problem.

Share this post


Link to post
Share on other sites
King of Men    394
Never mind, I found the bug - an uninitialised pointer in a totally different part of the code. Thanks for helping, though, it did improve my program. :)

Share this post


Link to post
Share on other sites
Washu    7829
Quote:
Original post by iMalc
Actually, here is another bug:
  for (std::vector<Actor*>::iterator i = theSorts->begin(); i < theSorts->end(); ++i) {
You are not supposed to use less-than on an iterator. You should always use != when comparing with the end() iterator.

Dunno who told you that you're not supposed to use less than on an iterator, but they were lying to you. It is perfectly legal and OK to use any of the relational operators on random access iterators (of which the vector iterators and deque iterators are a member of). It may not necessarily be useful to do so, for instance in this case it serves the same purpose of !=, but it does no harm.

Share this post


Link to post
Share on other sites
iMalc    2466
Quote:
Original post by Washu
Quote:
Original post by iMalc
Actually, here is another bug:
  for (std::vector<Actor*>::iterator i = theSorts->begin(); i < theSorts->end(); ++i) {
You are not supposed to use less-than on an iterator. You should always use != when comparing with the end() iterator.

Dunno who told you that you're not supposed to use less than on an iterator, but they were lying to you. It is perfectly legal and OK to use any of the relational operators on random access iterators (of which the vector iterators and deque iterators are a member of). It may not necessarily be useful to do so, for instance in this case it serves the same purpose of !=, but it does no harm.
Okay sorry, I have been misinformed.
However using != is still a good idea in a for-loop termination condition because it can allow changing the container type to a set/list etc without having to change any other code.

Share this post


Link to post
Share on other sites

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now

Sign in to follow this