Linked List vs std::vector


Linked lists have their applications, for instance in kernels, where it may be physically impossible to gather all the list items into a nice preallocated array (in interrupt handlers, say, the relevant pieces of data may be strewn across arbitrarily many address spaces belonging to various processes and threads). In that case a linked list or other indirect data structure may be the only option available. But if you have the ability to modify your data layout so that it is contiguous and coherent in memory, an array data structure will give you really big gains because of cache coherency: the memory access pattern is predictable, and predictable access is what CPUs are good at, as opposed to fetching random bits of memory scattered all over the place. In most cases you can do that, which is why linked lists are not very popular. They carry a rather large constant factor that typically makes them worse than simple arrays despite any theoretical advantages, and that constant factor keeps growing as memory becomes slower and slower relative to CPU speed.
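To make the access-pattern difference concrete, here is a minimal sketch (the Particle struct and the field being summed are purely illustrative): the loop body is identical for both containers, but the vector version walks contiguous memory while the list version chases one pointer per element.

```cpp
#include <list>
#include <vector>

struct Particle { float x, y, z; };  // hypothetical payload

// Contiguous storage: elements sit back to back, so the hardware
// prefetcher can stream them into cache ahead of the loop.
float sum_vector(const std::vector<Particle>& ps) {
    float total = 0.0f;
    for (const Particle& p : ps) total += p.x;
    return total;
}

// Node-based storage: each step follows a pointer to a node that
// may live anywhere on the heap, so prefetching is far less effective.
float sum_list(const std::list<Particle>& ps) {
    float total = 0.0f;
    for (const Particle& p : ps) total += p.x;
    return total;
}
```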

“If I understand the standard right it is legal and safe to do this but the resulting value could be anything.”


Thank you for your responses. I'm not so advanced that I understand all the terms given, but I have very specific reasons for choosing between a list and a vector: I want to store the data in as fast a structure as possible for things like particle systems.

The simplest rule is to prefer an array-based collection like vector unless you gain some practical benefit from a list. That said, it comes down to the old standby of "guessing isn't profiling": don't worry overtly about speed in code that seems sensible unless profiling shows it to be a bottleneck.

That's really the whole point of performance tuning: there are a billion optimizations you can make to any codebase, but a few well-chosen changes may yield more performance than a thousand small ones, which is why profiling is important.

But in regards to the basic question, it mainly comes down to the cache. Due to physical limitations in CPU design, cache can be expected to become even more important in the future, much as adding more cores became the main avenue of improvement a few years ago. Cache can make things that sound much slower in theory run much faster in practice.

So what I got out of the replies is "NEVER USE LINKED LISTS". I can't see any reason to use them given the information in your replies. I was initially worried about the vector reallocating each time I add new data, but I can just use vector::reserve(...) to preallocate capacity if I know I'm going to have a lot of elements.
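For what it's worth, reserve() is the member that preallocates capacity without constructing elements, whereas resize() changes the element count and constructs them. A minimal sketch, with the element count and the Particle struct purely hypothetical:

```cpp
#include <vector>

struct Particle { float x, y, z; };  // hypothetical payload

int main() {
    std::vector<Particle> particles;

    // Preallocate capacity up front so the push_back loop below
    // never has to reallocate and copy the whole buffer.
    particles.reserve(100000);

    for (int i = 0; i < 100000; ++i)
        particles.push_back(Particle{0.0f, 0.0f, 0.0f});
}
```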

The advantage of a linked list on a modern PC is not performance, but the guarantee that iterators and addresses remain valid when the container changes. So yes, you should prefer a dynamic array when you can, but there are cases where using a list is simpler. In some cases you can use a vector with a reserved size, but that's not such a good solution when the typical case has only a few elements. Note that if you just need addresses to remain valid on growth, a deque may be more suitable.
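A small sketch of the invalidation rules being described (the values are placeholders):

```cpp
#include <deque>
#include <list>
#include <vector>

int main() {
    std::vector<int> v{1, 2, 3};
    int* pv = &v.front();
    v.push_back(4);   // may reallocate: pv is now potentially dangling

    std::list<int> l{1, 2, 3};
    int* pl = &l.front();
    l.push_back(4);   // node-based: pl is still valid

    std::deque<int> d{1, 2, 3};
    int* pd = &d.front();
    d.push_back(4);   // deque growth leaves existing elements in place,
                      // so pd remains valid (iterators, however, are
                      // invalidated)
}
```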


http://channel9.msdn.com/Events/Build/2014/2-661

I watched this earlier this week. The material on prefetching is pretty interesting, especially in regards to insertion/deletion and access patterns for an array/vector (contiguous memory) versus a linked list (scattered memory).

In the case of an insert into a linked list, you have to traverse the list until you find the node you wish to insert after, create the new node, then update the "bookkeeping" pointers in the existing nodes. Since the node locations are effectively random, and each node carries that bookkeeping overhead, lists don't cache well.

In the case of a random insert into an array/vector, you still have to traverse the elements until you find where you want to insert. The difference is that the processor's prefetcher can pretty accurately guess that you are going to access the next memory locations, and it tries to have those elements waiting in the processor's caches, which speeds up locating the insertion point. Insertion still requires shifting all the elements after the new location, but since that too is a linear operation, the prefetcher keeps it running fast out of cache.

This is not a guarantee for all situations, but in cases where very linear operations are taking place, the array/vector is going to win almost every time.
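A sketch of the two insert patterns just described, with placeholder values: both containers pay a linear search for the insertion point, but the vector search streams through contiguous memory while the list search chases pointers; only the final fix-up differs.

```cpp
#include <algorithm>
#include <list>
#include <vector>

int main() {
    std::vector<int> v{1, 2, 4, 5};
    std::list<int>   l{1, 2, 4, 5};

    // Vector: linear, cache-friendly search for the insertion point,
    // then a shift of everything after it (also linear and contiguous).
    auto vit = std::lower_bound(v.begin(), v.end(), 3);
    v.insert(vit, 3);

    // List: linear, pointer-chasing search for the insertion point,
    // then an O(1) pointer fix-up ("bookkeeping") on the neighbours.
    auto lit = std::find_if(l.begin(), l.end(),
                            [](int x) { return x >= 3; });
    l.insert(lit, 3);
}
```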

"I can't believe I'm defending logic to a turing machine." - Kent Woolworth [Other Space]

