• Create Account

Posted 17 May 2013 - 06:31 PM

I wanted to do a quick performance (very unscientific though) test on std::vector and std::list and I was very surprised by the results.  Maybe it's just me not understanding or knowing how each implements everything internally.

My surprise came when I sorted a vector vs. sorted a equally size linked list.  I understand that std::sort requires a random access iterator that std::list doesn't provide which is why their is a list::sort method.  Though I thought that using std::sort would require a swap internally (not sure why I do think that, I'm sure I should be doing more research about the internals of each) which wold require each and every object to be copied over.  Now the test object I use is just a very basic struct that "simulates" a Rectangle that has a length, width, x position, and y position, all stored as doubles.  I overloaded the < operator in it to do the sort and it should sort it based on area (just went ahead and calculated the length*width instead of storing an area member variable, not trying to make this scientific or find performances).

In my test I store 1,000,000 rectangles in a vector a list and set the length, width, xPosition, and yPosition to random numbers for each element in the vector.  I then set each elements length, width, xPosition, and yPosition in the list to the vectors corresponding element.  Calculate the time for each.  Maybe I should use a more scientific approach to this, as I am only using std::clock_t and CLOCKS_PER_SEC to calculate.

When doing this I am very surprised at the performance gap between the two!

First, here's my code to do this quick test, although it isn't anything special or scientific

// Tests Sorting Lists and Vectors Performance

#include <iostream>
#include <vector>
#include <list>
#include <algorithm>
#include <ctime>

struct Rectangle
{
double length;
double width;
double xPosition;
double yPosition;

bool operator<(const Rectangle& rect) const
{
return (length*width) < (rect.length*rect.width);
}
};

int main()
{
const int NUM_ELEMENTS = 1000000;
std::clock_t startTime;
std::clock_t stopTime;

std::vector<Rectangle> myVector(NUM_ELEMENTS);
std::vector<Rectangle>::iterator vectIter;

// Insert rectangle data in vector
startTime = clock();
for(vectIter = myVector.begin(); vectIter != myVector.end(); ++vectIter)
{
vectIter->length = 1.0*rand()/RAND_MAX;
vectIter->width = 1.0*rand()/RAND_MAX;
vectIter->xPosition = 1.0*rand()/RAND_MAX;
vectIter->yPosition = 1.0*rand()/RAND_MAX;
}
stopTime = clock();
double totalTime = (stopTime-startTime)/(double) CLOCKS_PER_SEC;
std::cout << "Inserted rectangles in vector in: " << totalTime <<std::endl;

std::list<Rectangle> myList(NUM_ELEMENTS);
std::list<Rectangle>::iterator listIter;

// Insert rectangle data in list
startTime = clock();
for(vectIter = myVector.begin(), listIter = myList.begin(); listIter != myList.end() && vectIter != myVector.end(); ++listIter, ++vectIter)
{
listIter->length = vectIter->length;
listIter->width = vectIter->width;
listIter->xPosition = vectIter->xPosition;
listIter->yPosition = vectIter->yPosition;
}
stopTime = clock();
totalTime = (stopTime-startTime)/(double) CLOCKS_PER_SEC;
std::cout << "Inserted rectangles in list in: " << totalTime <<std::endl;

// sort the vector using std::sort algoorithm
startTime = clock();
std::sort(myVector.begin(), myVector.end());
stopTime = clock();
totalTime = (stopTime-startTime)/(double) CLOCKS_PER_SEC;
std::cout << "Time for sorting a vector: " << totalTime << std::endl;

// sort the list
// can't use std::sort, use the list member function
startTime = clock();
myList.sort();
stopTime = clock();
totalTime = (stopTime-startTime)/(double) CLOCKS_PER_SEC;
std::cout << "Time for sorting a list: " << totalTime << std::endl;

return 0;
} 

The output/performance I get

Inserted Rectangles in Vector in: 1.597

Inserted Rectangles in List in: 2.989

Time for sorting a vector: 3.384

Time for sorting a list: 111.252

What's with the HUGE perforamance decrease in sorting that list?  I understand list is slow at random access and that is what std::vector is strong at, but that big a performance hit when using a list?  I think list::sort uses a merge sort, which is O(NlogN), I believe.  What would std::sort be using in this case?  is their any guarantee or way to see what implementation it is using (maybe I didn't research enough), but does the standard guarantee what it will use? Is their a worst case Big-Oh run time that std::sort guarantees?  Average Big-Oh run time?

I understand that this isn't a scientific test nor would I use something like that in a real world code, but is their something just crazy wrong with my code that I'm missing that is causing the list to be that much slower?  I do still understand where benefits of using a list would come in, but if this is correct I can totally understand why I've always been told "look at std::vector first."  Is this correct?  Maybe it's because since the object I am sorting in each is very basic and if I was using a larger and more advanced class that would take longer to copy and results could be different?

Interested in any comments on this.

Note: this is just curiosity and I am not trying to pre-optimize current real world code that I have or anything.  Just my Computer Science brain being curious.

Posted 17 May 2013 - 06:24 PM

I wanted to do a quick performance (very unscientific though) test on std::vector and std::list and I was very surprised by the results.  Maybe it's just me not understanding or knowing how each implements everything internally.

My surprise came when I sorted a vector vs. sorted a equally size linked list.  I understand that std::sort requires a random access iterator that std::list provides which is why their is a list::sort method.  Though I thought that using std::sort would require a swap internally (not sure why Id o think that, I'm sure I should be doing more research about the internals of each) which wold require each and every object to be copied over.  Now the test object I use is just a very basic struct that "simulates" a Rectangle that has a length, width, x position, and y position, all stored as doubles.  I overloaded the < operator in it to do the sort and it should sort it based on area (just went ahead and calculated the length*width instead of storing an area member variable, not trying to make this scientific or find performances).

In my test I store 1,000,000 rectangles in a vector a list and set the length, width, xPosition, and yPosition to random numbers for each element in the vector.  I then set each elements length, width, xPosition, and yPosition in the list to the vectors corresponding element.  Calculate the time for each.  Maybe I should use a more scientific approach to this, as I am only using std::clock_t and CLOCKS_PER_SEC to calculate.

When doing this I am very surprised at the performance gap between the two!

First, here's my code to do this quick test, although it isn't anything special or scientific

// Tests Sorting Lists and Vectors Performance

#include <iostream>
#include <vector>
#include <list>
#include <algorithm>
#include <ctime>

struct Rectangle
{
double length;
double width;
double xPosition;
double yPosition;

bool operator<(const Rectangle& rect) const
{
return (length*width) < (rect.length*rect.width);
}
};

int main()
{
const int NUM_ELEMENTS = 1000000;
std::clock_t startTime;
std::clock_t stopTime;

std::vector<Rectangle> myVector(NUM_ELEMENTS);
std::vector<Rectangle>::iterator vectIter;

// Insert rectangle data in vector
startTime = clock();
for(vectIter = myVector.begin(); vectIter != myVector.end(); ++vectIter)
{
vectIter->length = 1.0*rand()/RAND_MAX;
vectIter->width = 1.0*rand()/RAND_MAX;
vectIter->xPosition = 1.0*rand()/RAND_MAX;
vectIter->yPosition = 1.0*rand()/RAND_MAX;
}
stopTime = clock();
double totalTime = (stopTime-startTime)/(double) CLOCKS_PER_SEC;
std::cout << "Inserted rectangles in vector in: " << totalTime <<std::endl;

std::list<Rectangle> myList(NUM_ELEMENTS);
std::list<Rectangle>::iterator listIter;

// Insert rectangle data in list
startTime = clock();
for(vectIter = myVector.begin(), listIter = myList.begin(); listIter != myList.end() && vectIter != myVector.end(); ++listIter, ++vectIter)
{
listIter->length = vectIter->length;
listIter->width = vectIter->width;
listIter->xPosition = vectIter->xPosition;
listIter->yPosition = vectIter->yPosition;
}
stopTime = clock();
totalTime = (stopTime-startTime)/(double) CLOCKS_PER_SEC;
std::cout << "Inserted rectangles in list in: " << totalTime <<std::endl;

// sort the vector using std::sort algoorithm
startTime = clock();
std::sort(myVector.begin(), myVector.end());
stopTime = clock();
totalTime = (stopTime-startTime)/(double) CLOCKS_PER_SEC;
std::cout << "Time for sorting a vector: " << totalTime << std::endl;

// sort the list
// can't use std::sort, use the list member function
startTime = clock();
myList.sort();
stopTime = clock();
totalTime = (stopTime-startTime)/(double) CLOCKS_PER_SEC;
std::cout << "Time for sorting a list: " << totalTime << std::endl;

return 0;
} 

The output/performance I get

Inserted Rectangles in Vector in: 1.597

Inserted Rectangles in List in: 2.989

Time for sorting a vector: 3.384

Time for sorting a list: 111.252

What's with the HUGE perforamance decrease in sorting that list?  I understand list is very slow at random access and that is what std::vector is strong at, but that big a performance hit when using a list?  I think list::sort uses a merge sort, which is O(NlogN), I believe.  What would std::sort be using in this case?  is their any guarantee or way to see what implementation it is using (maybe I didn't research enough), but does the standard guarantee what it will use? Is their a worst case Big-Oh run time that std::sort guarantees?  Average Big-Oh run time?

I understand that this isn't a scientific test nor would I use something like that in a real world code, but is their something just crazy wrong with my code that I'm missing that is causing the list to be that much slower?  I do still understand where benefits of using a list would come in, but if this is correct I can totally understand why I've always been told "look at std::vector first."  Is this correct?

Interested in any comments on this.

Note: this is just curiosity and I am not trying to pre-optimize current real world code that I have or anything.  Just my Computer Science brain being curious.

PARTNERS