Jump to content

  • Log In with Google      Sign In   
  • Create Account

#ActualCornstalks

Posted 29 June 2013 - 03:16 PM

std:set does faster lookups for a specific object.

std::set is faster at determining if a specific object is in the set (because they're sorted, so a binary search can be done). However, for iterating over, std::vector is faster, because it's a contiguous block of memory that's more cache friendly. The thing is though, to lookup an object in a std::set doesn't make a lot of sense, because you have to pass it the object to look up (which means you already have the object to begin with...).
 

When you use a vector, which is basically an array, a lookup involves scanning the entire array. If you have 100 objects in the array, a lookup will take 100 operations (go over 100 objects until you find what you wanted).

It won't necessarily take 100. On average, it will take about 50 (assuming a randomly sorted array/vector and finding a random element (also assuming no duplicates)). Worst case is 100 comparisons. Best case is 1 comparison.

 

Erasing is even worse, since in a vector, erasing something from the middle involves scanning the entire array to find it, and then copying the tail of the array over the "empty" space caused by deleting.

Note that he's already iterating through the entire array, so the only extra cost of an erase() is the compacting of the data structure. In his case, there isn't really an added cost for finding the element to erase. Not necessarily. He has an iterator to the vector. Calling erase() is O(1). He's already iterating over the whole vector. The call to erase is not linear, though. It's constant time. [edit: that was embarrassing; thanks SiCrane]

 

Anyway, @Moonkis: unless I'm misreading/misunderstanding that stack trace, it's actually crashing on ++it. The problem is that erase() (for both vector and set) will return end() if you just erased the last element. Then the loop will try to ++it (before checking it != end()), which is undefined behavior. You need to check if it == end() (and NOT do ++it if it is).

 

Basically, this can be caused if your program follows this execution order:

  1. auto it = objects.begin()
  2. it != objects.end()
  3. execute loop body
  4. it = objects.erase(it); // Let's say you erased the last element and now erase() returns end()
  5. ++it // It will crash here, because it == end()!
  6. it != objects.end()
    1. Go to #3 if true
    2. Break out of the loop if false

 

The fact that it crashes with std::set and not std::vector is just dumb luck (or perhaps it's unlucky?). It's undefined behavior.

 

Also, for how your using objects, (assuming you aren't doing more with it), using std::vector will be faster. std::set is slower when it comes to iterating over, like you're doing in that loop. However, use the data structure that fits your needs best. If profiling reveals it to be too slow and problematic, then try to optimize.


#4Cornstalks

Posted 29 June 2013 - 02:41 PM

std:set does faster lookups for a specific object.

std::set is faster at determining if a specific object is in the set (because they're sorted, so a binary search can be done). However, for iterating over, std::vector is faster, because it's a contiguous block of memory that's more cache friendly. The thing is though, to lookup an object in a std::set doesn't make a lot of sense, because you have to pass it the object to look up (which means you already have the object to begin with...).
 

When you use a vector, which is basically an array, a lookup involves scanning the entire array. If you have 100 objects in the array, a lookup will take 100 operations (go over 100 objects until you find what you wanted).

It won't necessarily take 100. On average, it will take about 50 (assuming a randomly sorted array/vector and finding a random element (also assuming no duplicates)). Worst case is 100 comparisons. Best case is 1 comparison.

 

Erasing is even worse, since in a vector, erasing something from the middle involves scanning the entire array to find it, and then copying the tail of the array over the "empty" space caused by deleting.

Not necessarily. He has an iterator to the vector. Calling erase() is O(1). He's already iterating over the whole vector. The call to erase is not linear, though. It's constant time.

 

Anyway, @Moonkis: unless I'm misreading/misunderstanding that stack trace, it's actually crashing on ++it. The problem is that erase() (for both vector and set) will return end() if you just erased the last element. Then the loop will try to ++it (before checking it != end()), which is undefined behavior. You need to check if it == end() (and NOT do ++it if it is).

 

Basically, this can be caused if your program follows this execution order:

  1. auto it = objects.begin()
  2. it != objects.end()
  3. execute loop body
  4. it = objects.erase(it); // Let's say you erased the last element and now erase() returns end()
  5. ++it // It will crash here, because it == end()!
  6. it != objects.end()
    1. Go to #3 if true
    2. Break out of the loop if false

 

The fact that it crashes with std::set and not std::vector is just dumb luck (or perhaps it's unlucky?). It's undefined behavior.

 

Also, for how your using objects, (assuming you aren't doing more with it), using std::vector will be faster. std::set is slower when it comes to iterating over, like you're doing in that loop. However, use the data structure that fits your needs best. If profiling reveals it to be too slow and problematic, then try to optimize.


#3Cornstalks

Posted 29 June 2013 - 02:33 PM

std:set does faster lookups for a specific object.

std::set is faster at determining if a specific object is in the set (because they're sorted, so a binary search can be done). However, for iterating over, std::vector is faster, because it's a contiguous block of memory that's more cache friendly. The thing is though, to lookup an object in a std::set doesn't make a lot of sense, because you have to pass it the object to look up (which means you already have the object to begin with...).
 

When you use a vector, which is basically an array, a lookup involves scanning the entire array. If you have 100 objects in the array, a lookup will take 100 operations (go over 100 objects until you find what you wanted).

It won't necessarily take 100. On average, it will take about 50 (assuming a randomly sorted array/vector and finding a random element (also assuming no duplicates)). Worst case is 100 comparisons. Best case is 1 comparison.

 

Erasing is even worse, since in a vector, erasing something from the middle involves scanning the entire array to find it, and then copying the tail of the array over the "empty" space caused by deleting.

Not necessarily. He has an iterator to the vector. Calling erase() is O(1). He's already iterating over the whole vector. The call to erase is not linear, though. It's constant time.

 

Anyway, @Moonkis: unless I'm misreading/misunderstanding that stack trace, it's actually crashing on ++it. The problem is that erase() (for both vector and set) will return end() if you just erased the last element. Then the loop will try to ++it (before checking it != end()), which is undefined behavior. You need to check if it == end() (and NOT do ++it if it is).

 

Basically, this can be caused if your program follows this execution order:

  1. auto it = objects.begin()
  2. it != objects.end()
  3. execute loop body
  4. it = objects.erase(it); // Let's say you erased the last element and now erase() returns end()
  5. ++it // It will crash here, because it == end()!
  6. it != objects.end()
    1. Go to #3 if true
    2. Break out of the loop if false

 

The fact that it crashes with std::set and not std::vector is just dumb luck (or perhaps it's unlucky?). It's undefined behavior.

 

Also, for how your using objects, (assuming you aren't doing more with it), using std::vector will be faster. std::set is slower when it comes to iterating over, like you're doing in that loop. However, use the data structure that fits your needs best. If profiling reveals it to be too slow and problematic, then try to optimize.


#2Cornstalks

Posted 29 June 2013 - 02:32 PM

std:set does faster lookups for a specific object.

std::set is faster at determining if a specific object is in the set (because they're sorted, so a binary search can be done). However, for iterating over, std::vector is faster, because its a contiguous block of memory that's more cache friendly. The thing is though, to lookup an object in a std::set doesn't make a lot of sense, because you have to pass it the object to look up (which means you already have the object to begin with...).
 

When you use a vector, which is basically an array, a lookup involves scanning the entire array. If you have 100 objects in the array, a lookup will take 100 operations (go over 100 objects until you find what you wanted).

It won't necessarily take 100. On average, it will take about 50 (assuming a randomly sorted array/vector and finding a random element (also assuming no duplicates)). Worst case is 100 comparisons. Best case is 1 comparison.

 

Erasing is even worse, since in a vector, erasing something from the middle involves scanning the entire array to find it, and then copying the tail of the array over the "empty" space caused by deleting.

Not necessarily. He has an iterator to the vector. Calling erase() is O(1). He's already iterating over the whole vector. The call to erase is not linear, though. It's constant time.

 

Anyway, @Moonkis: unless I'm misreading/misunderstanding that stack trace, it's actually crashing on ++it. The problem is that erase() (for both vector and set) will return end() if you just erased the last element. Then the loop will try to ++it (before checking it != end()), which is undefined behavior. You need to check if it == end() (and NOT do ++it if it is).

 

Basically, this can be caused if your program follows this execution order:

  1. auto it = objects.begin()
  2. it != objects.end()
  3. execute loop body
  4. it = objects.erase(it); // Let's say you erased the last element and now erase() returns end()
  5. ++it // It will crash here, because it == end()!
  6. it != objects.end()
    1. Go to #3 if true
    2. Break out of the loop if false

 

The fact that it crashes with std::set and not std::vector is just dumb luck (or perhaps it's unlucky?). It's undefined behavior.

 

Also, for how your using objects, (assuming you aren't doing more with it), using std::vector will be faster. std::set is slower when it comes to iterating over, like you're doing in that loop. However, use the data structure that fits your needs best. If profiling reveals it to be too slow and problematic, then try to optimize.


#1Cornstalks

Posted 29 June 2013 - 02:28 PM

std:set does faster lookups for a specific object.

std::set is faster at determining if a specific object is in the set (because they're sorted, so a binary search can be done). However, for iterating over, std::vector is faster, because its a contiguous block of memory that's more cache friendly. The thing is though, to lookup an object in a std::set doesn't make a lot of sense, because you have to pass it the object to look up (which means you already have the object to begin with...).
 

When you use a vector, which is basically an array, a lookup involves scanning the entire array. If you have 100 objects in the array, a lookup will take 100 operations (go over 100 objects until you find what you wanted).

It won't necessarily take 100. On average, it will take about 50 (assuming a randomly sorted array/vector and finding a random element (also assuming no duplicates)). Worst case is 100 comparisons. Best case is 1 comparison.

 

Erasing is even worse, since in a vector, erasing something from the middle involves scanning the entire array to find it, and then copying the tail of the array over the "empty" space caused by deleting.

Not necessarily. He has an iterator to the vector. Calling erase() is O(1). He's already iterating over the whole vector. The call to erase is not linear, though.

 

Anyway, @Moonkis: unless I'm misreading/misunderstanding that stack trace, it's actually crashing on ++it. The problem is that erase() (for both vector and set) will return end() if you just erased the last element. Then the loop will try to ++it (before checking it != end()), which is undefined behavior. You need to check if it == end() (and NOT do ++it if it is).

 

Basically, this can be caused if your program follows this execution order:

  1. auto it = objects.begin()
  2. it != objects.end()
  3. execute loop body
  4. it = objects.erase(it); // Let's say you erased the last element and now erase() returns end()
  5. ++it // It will crash here, because it == end()!
  6. it != objects.end()
    1. Go to #3 if true
    2. Break out of the loop if false

 

The fact that it crashes with std::set and not std::vector is just dumb luck (or perhaps it's unlucky?). It's undefined behavior.


PARTNERS