
## Linked List vs std::vector


### #1Hawkblood  Members

Posted 06 April 2014 - 08:42 AM

What is it about std::vector that makes it so much faster than a linked list? This question was asked by someone else in an earlier post but not stated quite like this and I don't think a solution/reason was actually concluded. Here is what I used to test this out:

The declaration:

```cpp
vector<int> testingVector;

struct LLTEST {
    LLTEST() { next = NULL; last = start = NULL; }
    int val;
    LLTEST *next;
    LLTEST *start, *last;   // head/tail bookkeeping, used on the global LLtest object
    void Destroy(void) {
        if (next != NULL) next->Destroy();
        delete this;
    }
    void Append(int val);
    void LLDestroy(void);
} LLtest;

void LLTEST::Append(int val) {
    LLTEST *tmp = new LLTEST;   // one heap allocation per element
    tmp->val = val;
    if (last == NULL) {         // it's the first one
        start = last = tmp;
    } else {
        last->next = tmp;
        last = tmp;
    }
}

void LLTEST::LLDestroy(void) {
    if (start != NULL) start->Destroy();   // recursively deletes every node
    start = last = NULL;
}
```



first test with Linked List:

```cpp
GE->Timer.Update();
__int64 ct = GE->Timer.TotalTime;

for (int q = 0; q < 100000; q++) {
    LLtest.Append(q);
}
LLtest.LLDestroy();

GE->Timer.Update();
tp1 = float(GE->Timer.TotalTime - ct);
```



tp1 gives me roughly 26500 in total time.

Now for the std::vector:

```cpp
GE->Timer.Update();
__int64 ct = GE->Timer.TotalTime;

for (int q = 0; q < 100000; q++) {
    testingVector.push_back(q);
}
testingVector.clear();

GE->Timer.Update();
tp1 = float(GE->Timer.TotalTime - ct);
```



tp1 gives me around 2700... That's an order of magnitude difference! What in my LL is wrong enough to cause this much of a difference?

### #2fastcall22  Moderators

Posted 06 April 2014 - 09:12 AM

EDIT: What rip-off said

Edited by fastcall22, 06 April 2014 - 09:21 AM.


### #3rip-off  Moderators

Posted 06 April 2014 - 09:16 AM

POPULAR

In this specific instance, there are probably two primary factors.

One is probably memory allocation / deallocation. Using something like "new" and "delete" involves invoking another algorithm on another data structure (the heap). The linked list calls new for every insertion and delete for every element, whereas std::vector<> uses amortised growth: it allocates a certain sized backing store, and every time that fills up, it might double the size when allocating again. So each time it needs to grow, it essentially calls new[], copies the current contents across, and delete[]s the old storage. Cleanup for std::vector in this case is a single delete[]. This means that very few (de)allocations are performed; most elements end up being constructed in already reserved space. The downside is that when you're finished inserting elements, the vector may contain reserved memory that won't end up being used (there are techniques to trim this, though, if necessary).

One way to avoid this affecting your test is to use a custom allocator that is much simpler.

Another factor is cache performance. The linked list uses a lot more memory, as each element consists of the data plus a pointer (assuming you can drop the unused "prev" pointer). Memory is slow relative to the processor, so when the list is being destroyed, there are likely far more cache misses as the elements are cleaned up. In addition, it is possible that the default heap allocator will not give you a cache friendly memory layout, even when the allocations are done in a block like this. If your allocations end up scattered all over memory, then this makes the scenario worse, as lots of data that gets optimistically loaded into the cache cannot be used (cache is designed with linear access in mind).

Now, this is for this specific test. It is possible that in other tests, particularly ones involving removing elements at an arbitrary location in the container, or tests involving an object which is itself expensive to copy, the linked list could perform similarly to, or even better than, std::vector, at least at a "sweet spot" of data set size. Algorithmic complexity theory teaches that linked lists are much better at this, but for large data sets on modern computers the cache effects become so dominant that std::vector will probably win past a certain point.

It really depends. This is a very artificial test, most applications do not behave this way. There will be a mixture of insertions, deletions, and iteration. These operations may have a nice pattern which can be exploited.

Finally, you can use std::list<> for a pre-made linked list, rather than rolling your own.

Edited by rip-off, 06 April 2014 - 09:18 AM.

### #4plainoldcj  Members

Posted 06 April 2014 - 10:10 AM

POPULAR

This issue is covered by a lot of good talks.
See the talks by Stroustrup, or http://channel9.msdn.com/Events/Build/2014/2-661 by Sutter, which uses the same example.

### #5Hawkblood  Members

Posted 06 April 2014 - 12:11 PM

So what I got out of the replies is "NEVER USE LINKED LISTS". I can't see any reason to use them with the given information of your replies. I was initially worried about the vector resizing each time I add new data, but I can just use vector::resize(...) to initialize it if I know I'm going to have a lot of elements.....

### #6cdoubleplusgood  Members

Posted 06 April 2014 - 01:29 PM

> So what I got out of the replies is "NEVER USE LINKED LISTS". I can't see any reason to use them with the given information of your replies.

There are advantages in specific scenarios. E.g. an iterator to a list element does not become invalid when other elements are inserted or removed. Or, as rip-off said, elements of lists do not get copied / moved when other elements are inserted or deleted.

### #7ApochPiQ  Moderators

Posted 06 April 2014 - 01:50 PM

The advantages should be considered in light of implementation complexity, not performance. A naive linked list will basically never outperform a cache-coherent data structure. However, if iterator invalidation poses problems for you, or excessive copying on insert/remove is a problem, they can be useful.

As far as over-copying: profile first, and profile after using a list. Never assume that one solution is faster than another without testing it.
Wielder of the Sacred Wands

### #8Hawkblood  Members

Posted 06 April 2014 - 04:01 PM

How about something more specific-- a particle system where particles are created and destroyed on a regular basis. I think performance would trump any other concerns, right?

### #9ApochPiQ  Moderators

Posted 06 April 2014 - 04:46 PM

So? There are many systems where performance is more important than other factors. The existence of such systems in no way changes the answers provided in this thread.
Wielder of the Sacred Wands

### #10Hodgman  Moderators

Posted 06 April 2014 - 05:41 PM

I work as an engine consultant, which involves a lot of optimization work -- the first thing on my mind *isn't* "how can I make the CPU do this work faster", it's "how can I get this data to/from the CPU faster"!
CPUs are incredibly fast these days, but memory is incredibly slow. In relative terms, e.g. bytes moved per CPU clock, memory is actually getting slower every year, so our focus is continually shifting away from counting CPU cycles and towards micro-optimizing the data layouts of our structures ;(

> How about something more specific-- a particle system where particles are created and destroyed on a regular basis. I think performance would trump any other concerns, right?

For that kind of usage, a pool is common. You can allocate a large contiguous array of objects, but use a trick where when an object isn't in use, it is reinterpreted as a node in a linked list - called a free list.
To spawn a new particle/etc, you just remove the first node from the free list and cast it to your object type. To release an object, you cast it to a node and push it onto the front of the list.
When properly implemented, this is one of the fastest possible memory allocators around, plus it lets you run your system-update loop on a large contiguous array, which is great for cache performance.

### #11Bacterius  Members

Posted 06 April 2014 - 06:18 PM

Linked lists have applications, for instance in kernels, where it may be physically impossible to get all the list items into a nice preallocated array in memory (for instance, in interrupt handlers, where the relevant pieces of data may be strewn across arbitrarily many address spaces throughout various processes and threads). In that case a linked list or other indirect data structure may be the only option available. But if you have the ability to modify your data layout so that it is laid out contiguously and coherently in memory, you will see really big gains using an array data structure because of cache coherency: the memory access pattern is predictable, and that is what CPUs are good at, rather than fetching random bits of memory scattered all over the place. In most cases you can do that, which is why linked lists are not too popular: they have a rather large constant factor that typically makes them worse than simple arrays despite any theoretical advantages they may have, and that constant factor is growing over time as memory becomes slower and slower compared to CPU speed.

“If I understand the standard right it is legal and safe to do this but the resulting value could be anything.”

### #12Hawkblood  Members

Posted 06 April 2014 - 06:46 PM

Thank you for your responses. I'm not so advanced that I understand all the terms given, but I have very specific reasons for using a list/vector. I want to store the data in as fast a system as possible for objects like particle systems.

### #13Satharis  Members

Posted 06 April 2014 - 07:50 PM

The simplest rule is to prefer an array-based collection like vector unless you gain some practical benefit from a list. It really comes down to the old standby of "guessing isn't profiling": you shouldn't worry overly about speed in code that seems sensible unless profiling shows it to be a bottleneck.

That is kind of the entire point of performance tuning: there are a billion optimizations you can make to code, but a few targeted changes may yield more performance gains than a thousand small ones, which is why profiling is important.

But in regards to the basic question, it mainly comes down to the cache. Due to physical limitations in CPU design, cache can be expected to become even more important in the future, much as adding more cores became the main improvement a few years ago. Cache can make things that sound much slower in theory be much faster in practice.

### #14King Mir  Members

Posted 10 April 2014 - 09:03 PM

> So what I got out of the replies is "NEVER USE LINKED LISTS". I can't see any reason to use them with the given information of your replies. I was initially worried about the vector resizing each time I add new data, but I can just use vector::resize(...) to initialize it if I know I'm going to have a lot of elements.

The advantage of a linked list on a modern PC is not performance, but the ability for iterators and addresses to remain valid when the container changes. So yes, you should prefer a dynamic array when you can, but there are cases when using a list is simpler. In some cases you can use a vector with a reserved size, but that's not such a good solution when the typical case has only a few elements. Note, if you just need addresses to remain valid on resize, a deque may be more suitable.

### #15Rattrap  Members

Posted 10 April 2014 - 10:07 PM

http://channel9.msdn.com/Events/Build/2014/2-661

I watched this earlier this week. The stuff he talked about regarding prefetching is pretty interesting, especially in regards to insertion/deletion and accessing an array/vector (contiguous memory) vs a linked list (random memory).

In the case of an insert into a linked list, you have to traverse the list until you find the node you wish to insert after, create the new node, then update the "bookkeeping" in the existing nodes. Since the element locations are random, plus have the bookkeeping overhead, they don't cache well.

In the case of random inserts into an array/vector, you still have to traverse the elements until you find where you want to insert the new one. The difference this time is that the processor's prefetcher can pretty accurately guess that you are going to be accessing the next memory locations and try to have the elements available in the processor's caches, which speeds up locating the insertion point. Insertion still requires copying all of the elements after the new location, but since this again is a linear operation, the prefetcher can make this run very fast in the processor cache.

This is not a guarantee for all situations, but in cases where very linear operations are taking place, the array/vector is going to win almost every time.

"I can't believe I'm defending logic to a turing machine." - Kent Woolworth [Other Space]
