polyfrag

Comparison of STL list performance


I'm confused by the question. What does C90 have to do with STL (which is a C++ thing)?

 

And what STL list are you talking about that's different from MSVS 2012's list implementation? (That's an STL list too, but maybe different from what you're comparing to.)

 

You'll have to give more details if you want an answer.


https://github.com/polyf/cppvsc

 

https://github.com/polyf/cppvsc/blob/master/2c/test/test.c

https://github.com/polyf/cppvsc/blob/master/2cpp/test/test.cpp

 

Compares milliseconds elapsed pushing, iterating and calling, and popping between the two implementations.
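For context, the core of such a C++-side test boils down to something like this (a rough sketch for illustration, not the actual code in the repo; 'Widget', the element count, and the timing details are placeholders):

[source]
#include <chrono>
#include <cstdio>
#include <list>

struct Widget { int value; };

/* stand-in for whatever is called on each element during iteration */
static void call(const Widget& w) { (void)w; }

int main() {
    const int g = 3000000;
    std::list<Widget> l;

    auto t0 = std::chrono::steady_clock::now();
    for (int i = 0; i < g; ++i)
        l.push_back(Widget());                /* push */

    auto t1 = std::chrono::steady_clock::now();
    for (std::list<Widget>::const_iterator it = l.begin(); it != l.end(); ++it)
        call(*it);                            /* iterate + call */

    auto t2 = std::chrono::steady_clock::now();
    while (!l.empty())
        l.pop_front();                        /* pop */

    auto t3 = std::chrono::steady_clock::now();

    typedef std::chrono::milliseconds ms;
    std::printf("push %lld, iterate+call %lld, pop %lld ms\n",
                (long long)std::chrono::duration_cast<ms>(t1 - t0).count(),
                (long long)std::chrono::duration_cast<ms>(t2 - t1).count(),
                (long long)std::chrono::duration_cast<ms>(t3 - t2).count());
    return 0;
}
[/source]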

Edited by polyfrag


Using a pointer to the data held in each list node made all the difference: now the C++ version is as slow as the C90 version.

 

Test 2: STL list vs. C linked list      

C90 g=3000000 passed push,iterate+call,pop in 109,0,47 ms  

CPP g=3000000 passed push,iterate+call,pop in 47,16,15 ms  

CPP with pointer data g=3000000 passed push,iterate+call,pop in 109,0,47 ms
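In other words, the difference between the two C++ rows comes down to what the list stores per node, roughly like this (an illustrative sketch; 'Widget' stands in for the payload type):

[source]
#include <list>

struct Widget { int value; };

void example() {
    /* one allocation per element: the Widget lives inside the list node */
    std::list<Widget> by_value;
    by_value.push_back(Widget());

    /* two allocations per element: one for the node, one for the Widget it
       points to, plus an extra pointer hop on every access */
    std::list<Widget*> by_pointer;
    by_pointer.push_back(new Widget());

    delete by_pointer.back();
}
[/source]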

Edited by polyfrag


I'm not sure if I have this exactly right, but I think std::list is not really implemented as a linked list, so your comparison to linked lists makes no sense.

 

It does act like a doubly-linked list, but under the hood it uses a little trick to reduce the number of pointer dereferences that a straightforward doubly-linked list implementation would require.

 

Instead, I think std::list uses multiple buckets (memory buffers that each hold several elements), the same as std::string and other sequence types. Rather than keeping "Next" and "Prev" pointers in each element (or Node), it only has to keep track of the memory buckets, and any operations it does on elements inside a bucket don't need the "Next" and "Prev" pointers to move between elements, because those elements are contiguous - they're accessed simply by index.

 

This is just a simple view. In reality, std::list is much more complicated than this.



There's no good reason in 2015 to switch to C from C++. Anyone claiming otherwise is an old-school programmer who hasn't paid much attention to compiler or language technology in the last decade.

 

Are you talking about game development or general development? Because there are plenty of reasons people use C instead of C++ in 2015; it's just not generally the case in game development, where we have to handle a lot of information and resources, and the facilities C++ has over C are very handy there.

 

But yeah, people who use C don't do it to outperform std::list, because that would be a complete waste of time. Instead of switching to C, I'd rather spend that time doing what EA did and making my own version of the STL containers, with the features I find superfluous dropped out...



Because there are plenty of reasons people use C instead of C++ in 2015,

 

And those reasons have always been those reasons: if you have them, you were never using C++ in the first place, hence no need to switch. C has its place, but I struggle to think of an instance where it makes sense to switch to C where C++ has been used.


Behold the power of templates. To achieve similar performance in C, you'd have to write a separate list-node for each small data type you wanted to use, and the whole set of linked-list functions to operate on it.

 

Not necessarily. I can conceive of doing some sort of intrusive container using zero-length arrays. That might seem a little smelly, and you'd have to pass the size of the data type into the node allocator you were using, but I think you could make it work. Totally off the top of my head; no compiler has touched this code, so it may contain typos:

 

[source]
#include <stdlib.h>   /* malloc, free */
#include <string.h>   /* memcpy */

/* intrusive linked list node - note the zero-length array.
   (Zero-length arrays are a common compiler extension; strict C99 would use
   a flexible array member instead, i.e. 'char data[];') */
typedef struct _IntrusiveNode {
    struct _IntrusiveNode* next;
    char data[0];
} INTRUSIVE_LIST;

INTRUSIVE_LIST* push_uninitialized_node(INTRUSIVE_LIST* head, size_t data_size) {
    INTRUSIVE_LIST* new_head = malloc(sizeof(INTRUSIVE_LIST) + data_size);
    new_head->next = head;
    return new_head;
}

INTRUSIVE_LIST* push_node(INTRUSIVE_LIST* head, void* data, size_t data_size) {
    INTRUSIVE_LIST* new_head = malloc(sizeof(INTRUSIVE_LIST) + data_size);
    new_head->next = head;
    memcpy(new_head->data, data, data_size);
    return new_head;
}

/* helper macros */
#define INTRUSIVE_LIST_PUSH(h, d) push_node((h), (d), sizeof(*(d)))
#define INTRUSIVE_LIST_PUSH_UNINITIALIZED(h, s) push_uninitialized_node((h), sizeof(s))

/* usage */
void example(void) {
    INTRUSIVE_LIST* head = INTRUSIVE_LIST_PUSH_UNINITIALIZED(NULL, int);

    /* look, I can even mix and match data types! */
    float data = 0.0f;
    head = INTRUSIVE_LIST_PUSH(head, &data);
}
[/source]

 

Granted, you now have a memory copy step if you want it initialized, but you could elide that if you just wanted to emplace an uninitialized node on the list (as I've shown). Oh, and you don't have the type safety of the templated version.

 

Can anyone find anything wrong with this?

 

edit: But really, depending on your platform, you shouldn't be using linked lists, anyway.

Edited by Oberon_Command



Instead, I think std::list uses multiple buckets (memory buffers that each hold several elements), the same as std::string and other sequence types. ...

 

I believe you're thinking of std::deque, which acts somewhat like a vector but isn't guaranteed to be contiguous, and which guarantees constant complexity for pushing and popping at the front and back. As for std::list, I'm not 100% sure, but I believe Visual Studio's implementation, and most other implementations, use a fairly standard doubly-linked list. Visual Studio has a debugging setting to show the raw structure of objects in the watch window, if you want to see who is right.
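As a rough picture (the names are made up for illustration, not MSVC's actual internals), a node in a typical doubly-linked std::list implementation looks something like:

[source]
template <typename T>
struct ListNode {
    ListNode* prev;  /* link to the previous node */
    ListNode* next;  /* link to the next node */
    T value;         /* the element itself, stored in the same allocation */
};
[/source]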


I certainly don't mean to contradict any of the good points raised for real applications, but the particular synthetic "benchmark" posted is much, much simpler, and can be pretty nicely explained by comparing the running time of the following for-loops.

// one allocation and one free per node, as std::list does
for (int i = 0; i < 1000*1000*10; i++) {
  delete (new int);
}

// two allocations and two frees per node, as the custom C list does (node + data)
for (int i = 0; i < 1000*1000*10; i++) {
  free(malloc(4));
  free(malloc(4));
}


What do you mean, Erik? I think there would be more allocations/deallocations in the C++ version due to the copy constructor (if the compiler doesn't compile that out). There are also if statements in pushing, iterating, and popping.

The timings were shakier this time around, varying by as much as 90 ms, but I took the best of 5 tries.

 

By the way, zero-length arrays seem to make the custom C linked list implementation as fast as the STL version. Only one allocation is required for the list node together with its data, and one deallocation. Plus there's maybe no jumping around in memory through a data pointer, allowing it all to be fetched into the cache.

Edited by polyfrag



I think there would be more allocations/deallocations in the C++ version due to the copy constructor (if the compiler doesn't compile that out).

 

In your custom implementation, you first malloc() the 'Widget', and then you malloc a 'ListLink' to store it in, 2 allocations for each node.

std::list will create an internal type that is 'sizeof(Widget) + sizeof(ListLink)' and do one single allocation per node.

That's why you saw the same performance when you changed the std::list to store pointers allocated with new: that version also has 2 allocations per node, one for 'new Widget' and one internal node allocation of 'sizeof(pointer) + sizeof(ListLink)'.

 

Try timing a loop with 10 million allocations only, without any list, and I think you will find that it's not too far from the time of the allocation step in your list test.
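Something like the following would isolate the allocation cost (a rough sketch using std::chrono; a real test might need to actually use the allocated pointers so the optimizer can't drop the loop):

[source]
#include <chrono>
#include <cstdio>

int main() {
    auto t0 = std::chrono::steady_clock::now();
    for (int i = 0; i < 1000 * 1000 * 10; ++i)
        delete (new int);                 /* allocation + deallocation only */
    auto t1 = std::chrono::steady_clock::now();

    long long elapsed =
        std::chrono::duration_cast<std::chrono::milliseconds>(t1 - t0).count();
    std::printf("10 million new/delete pairs: %lld ms\n", elapsed);
    return 0;
}
[/source]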

 

If you want to count the number of allocations for a C++ implementation, you can override operator new and increment a counter before calling malloc from it.

#include <cstdlib>
#include <new>

int g_countNew = 0;
int g_countDelete = 0;

void* operator new(std::size_t size) {
  ++g_countNew;
  void* mem = std::malloc(size);
  if (!mem)
    throw std::bad_alloc(); // operator new must not return null
  return mem;
}

void operator delete(void* mem) {
  ++g_countDelete;
  std::free(mem);
}



What do you mean, Erik? I think there would be more allocations/deallocations in the C++ version due to the copy constructor (if the compiler doesn't compile that out). There are also if statements in pushing, iterating, and popping.

 

The cost of a single branch to check a null pointer during iteration is minuscule compared to the cost of calling new/malloc/delete/free. And things like the copy constructor for a simple type don't matter: assuming you're just operating on simple structures or built-in types (not even necessarily POD), any "copies" created for temporaries live on the stack. Again, for the purposes of this benchmark, that's fast enough to be irrelevant. The only thing that matters in this situation is how many times you're hitting the system allocator to allocate and free nodes and/or payload.


I thought this was about iteration... Why are we talking about multiple allocations and copy constructors now? It's obvious that allocations are slow, since they usually end up going through the OS, which runs an entire memory manager (or even just the heap manager, which is probably worse) that needs to do a bit of work before it can hand out the memory. Worst case (although this is probably not the case here), there is no available physical memory and it has to swap something else to disk to free some.

 

That's why I mentioned that std::list uses buckets - they also speed up the allocation (even more than just the iteration), if that is the new concern here. It's a really bad idea to ever allocate memory in a loop anyway. If you know how much memory you need beforehand, you should allocate it all at once and use it. If you don't know how much you need, then just allocate a reasonable initial amount and add more when you need it. And by "reasonable", I mean "more than you think you need" in Release builds, and "as little as you can work with" in Debug builds, so you can catch any problems with your memory allocation during development. std::list does this perfectly for you.

 

And if you need to copy entire lists of data element by element, then your entire design is probably flawed to begin with.

 

(Too much coffee. Going to bed...)

Edited by tonemgub


std::list does not allocate in buckets; it allocates per-element. You may be thinking of std::deque.

Maybe you're right about std::list, but I don't think std::string allocates every character separately. And by "allocate" I mean that it reserves the memory using the internal allocator, not that it also initializes it. So initialization (via copy constructor) has nothing to do with allocation. Maybe I'll check it out sometime, though. To me it seems a bit much that std::list would use additional "next" and "prev" pointers for every element, when it could speed up performance simply by using buckets.

 

I also found this: http://www.gamedev.net/topic/292334-how-does-stdlist-allocate-its-items/ .

And this: http://accu.org/index.php/journals/1308

Edited by tonemgub

