I wanted to run a quick (admittedly very unscientific) performance test on std::vector and std::list, and I was very surprised by the results. Maybe it's just me not understanding how each is implemented internally.
My surprise came when I sorted a vector versus an equally sized linked list. I understand that std::sort requires a random-access iterator, which std::list doesn't provide; that's why there is a list::sort member function. I had assumed that std::sort would need to swap elements internally (I'm not sure why I think that; I should probably do more research on the internals of each), which would require every object to be copied. The test object I use is a very basic struct that simulates a rectangle, with a length, width, x position, and y position, all stored as doubles. I overloaded its < operator to sort by area (I just compute length*width on the fly instead of storing an area member; again, not trying to be scientific or chase performance).
In my test I store 1,000,000 rectangles in a vector and a list. I set the length, width, xPosition, and yPosition of each element in the vector to random numbers, then copy those values into the corresponding elements of the list, timing each step. Maybe I should use a more rigorous approach, since I'm only measuring with std::clock_t and CLOCKS_PER_SEC.
When I do this, I am very surprised at the performance gap between the two!
First, here's my code for this quick test, although it isn't anything special or scientific:
// Tests sorting performance of std::vector and std::list
#include <iostream>
#include <vector>
#include <list>
#include <algorithm>
#include <ctime>
#include <cstdlib> // for rand() and RAND_MAX

struct Rectangle
{
    double length;
    double width;
    double xPosition;
    double yPosition;

    bool operator<(const Rectangle& rect) const
    {
        return (length*width) < (rect.length*rect.width);
    }
};

int main()
{
    const int NUM_ELEMENTS = 1000000;
    std::clock_t startTime;
    std::clock_t stopTime;

    std::vector<Rectangle> myVector(NUM_ELEMENTS);
    std::vector<Rectangle>::iterator vectIter;

    // Insert rectangle data in vector
    startTime = clock();
    for (vectIter = myVector.begin(); vectIter != myVector.end(); ++vectIter)
    {
        vectIter->length = 1.0*rand()/RAND_MAX;
        vectIter->width = 1.0*rand()/RAND_MAX;
        vectIter->xPosition = 1.0*rand()/RAND_MAX;
        vectIter->yPosition = 1.0*rand()/RAND_MAX;
    }
    stopTime = clock();
    double totalTime = (stopTime - startTime)/(double) CLOCKS_PER_SEC;
    std::cout << "Inserted rectangles in vector in: " << totalTime << std::endl;

    std::list<Rectangle> myList(NUM_ELEMENTS);
    std::list<Rectangle>::iterator listIter;

    // Copy the same rectangle data into the list
    startTime = clock();
    for (vectIter = myVector.begin(), listIter = myList.begin();
         listIter != myList.end() && vectIter != myVector.end();
         ++listIter, ++vectIter)
    {
        listIter->length = vectIter->length;
        listIter->width = vectIter->width;
        listIter->xPosition = vectIter->xPosition;
        listIter->yPosition = vectIter->yPosition;
    }
    stopTime = clock();
    totalTime = (stopTime - startTime)/(double) CLOCKS_PER_SEC;
    std::cout << "Inserted rectangles in list in: " << totalTime << std::endl;

    // Sort the vector using the std::sort algorithm
    startTime = clock();
    std::sort(myVector.begin(), myVector.end());
    stopTime = clock();
    totalTime = (stopTime - startTime)/(double) CLOCKS_PER_SEC;
    std::cout << "Time for sorting a vector: " << totalTime << std::endl;

    // Sort the list; can't use std::sort, so use the member function
    startTime = clock();
    myList.sort();
    stopTime = clock();
    totalTime = (stopTime - startTime)/(double) CLOCKS_PER_SEC;
    std::cout << "Time for sorting a list: " << totalTime << std::endl;

    return 0;
}
The output I get:
Inserted rectangles in vector in: 1.597
Inserted rectangles in list in: 2.989
Time for sorting a vector: 3.384
Time for sorting a list: 111.252
What's with the HUGE performance decrease in sorting the list? I understand that a list is slow at random access, which is exactly what std::vector is strong at, but that big a performance hit? I believe list::sort uses a merge sort, which is O(N log N). What would std::sort be using in this case? Is there any guarantee, or any way to see, which algorithm an implementation uses (maybe I didn't research enough)? Does the standard guarantee a worst-case big-O running time for std::sort? An average-case big-O running time?
I understand that this isn't a scientific test, nor something I would use in real-world code, but is there something crazy wrong with my code that I'm missing that is causing the list to be that much slower? I do still understand where the benefits of using a list would come in, but if this is correct I can totally understand why I've always been told "look at std::vector first." Is this correct? Or is it because the object I am sorting is very basic, and a larger, more expensive-to-copy class would give different results?
Interested in any comments on this.
Note: this is just curiosity; I am not trying to prematurely optimize any real-world code I have. Just my Computer Science brain being curious.