Linked List vs std::vector



#1 Hawkblood   Members   -  Reputation: 725


Posted 06 April 2014 - 08:42 AM

What is it about std::vector that makes it so much faster than a linked list? Someone asked a similar question in an earlier post, but it wasn't phrased quite like this and I don't think a conclusive reason was ever reached. Here is what I used to test this out:

 

The declaration:

#include <vector>

std::vector<int> testingVector;
struct LLTEST{
	LLTEST(){last=start=NULL;}
	struct LINKLIST{
		LINKLIST(int v,LINKLIST*p){val=v;prev=p;next=NULL;if (p!=NULL) p->next=this;}
		LINKLIST *prev,*next;
		int val;
		void Destroy(void){//recursively delete this node and everything after it
			if (next!=NULL) next->Destroy();
			delete this;
		}
	};
	LINKLIST *start,*last;
	void Append(int val);
	void LLDestroy(void);
}LLtest;
void LLTEST::Append(int val){
	if (last==NULL){//it's the first one
		start=new LINKLIST(val,NULL);
		last=start;
	}
	else{
		LINKLIST *tmp=new LINKLIST(val,last);
		last=tmp;
	}
}
void LLTEST::LLDestroy(void){
	if (start!=NULL) start->Destroy();
	start=last=NULL;//allow the list to be reused after destruction
}

first test with Linked List:

	GE->Timer.Update();
	__int64 ct=GE->Timer.TotalTime;

	for (int q=0;q<100000;q++){
		LLtest.Append(q);
	}
	LLtest.LLDestroy();
	GE->Timer.Update();
	tp1=float(GE->Timer.TotalTime-ct);

tp1 gives me roughly 26500 in total time.

 

Now for the std::vector:

	GE->Timer.Update();
	__int64 ct=GE->Timer.TotalTime;
	for (int q=0;q<100000;q++){
		testingVector.push_back(q);
	}
	testingVector.clear();

	GE->Timer.Update();
	tp1=float(GE->Timer.TotalTime-ct);

tp1 gives me around 2700... That's roughly an order of magnitude (10x) difference! What in my LL is wrong enough to cause this much of a difference?

 




#2 fastcall22   Crossbones+   -  Reputation: 4461


Posted 06 April 2014 - 09:12 AM

EDIT: What rip-off said.

Edited by fastcall22, 06 April 2014 - 09:21 AM.


#3 rip-off   Moderators   -  Reputation: 8737


Posted 06 April 2014 - 09:16 AM

In this specific instance, there are probably two primary factors.

One is probably memory allocation / deallocation. Using something like "new" and "delete" involves invoking another algorithm on another data structure (the heap). The linked list calls new for every insertion and delete for every element, whereas std::vector<> uses amortised growth: it allocates a certain-sized backing store, and every time that fills up it might double the size when allocating again. So each time it needs to grow, it essentially calls new[], copies the current contents across, and delete[]s the old storage. Cleanup for std::vector in this case is a single delete[]. This means that very few (de)allocations are performed; most elements end up being constructed in already reserved space. The downside is that when you're finished inserting elements, the vector may hold reserved memory that won't end up being used (there are techniques to trim this, though, if necessary).
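
A minimal sketch to observe this growth (the exact growth factor is implementation-defined, often 1.5x or 2x; the output is only illustrative):

#include <cstddef>
#include <cstdio>
#include <vector>

int main(){
	std::vector<int> v;
	std::size_t lastCapacity = 0;
	for (int q = 0; q < 100000; ++q){
		v.push_back(q);
		if (v.capacity() != lastCapacity){ // a reallocation happened
			lastCapacity = v.capacity();
			std::printf("size=%zu capacity=%zu\n", v.size(), lastCapacity);
		}
	}
	return 0;
}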

One way to avoid this affecting your test is to use a custom allocator that is much simpler.

Another factor is cache performance. The linked list uses a lot more memory, as each element consists of the data plus a pointer (assuming you can drop the unused "prev" pointer). Memory is slow relative to the processor, so when the list is being destroyed there are likely far more cache misses as the elements are cleaned up. In addition, it is possible that the default heap allocator will not give you a cache-friendly memory layout, even when the allocations are done in a block like this. If your allocations end up scattered all over memory, that makes the scenario worse, as lots of data that gets optimistically loaded into the cache cannot be used (the cache is designed with linear access in mind).

Now, this is for this specific test. It is possible that in other tests, particularly ones involving removing elements at an arbitrary location in the container, or tests involving an object which is itself expensive to copy, the linked list could perform similarly to, or even better than, std::vector, at least at a "sweet spot" of data set size. Algorithmic complexity theory teaches that linked lists are much better at this, but for large data sets on modern computers the cache effects become so dominant that std::vector will probably win past a certain point.

It really depends. This is a very artificial test; most applications do not behave this way. There will be a mixture of insertions, deletions, and iteration, and these operations may have a pattern which can be exploited.

Finally, you can use std::list<> for a pre-made linked list, rather than rolling your own.
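
For example, a minimal sketch of the same append test using std::list instead of the hand-rolled list (same element count as above, timing code omitted):

#include <list>

int main(){
	std::list<int> testingList;
	for (int q = 0; q < 100000; ++q){
		testingList.push_back(q);	// one node allocation per element, like the custom list
	}
	testingList.clear();	// walks the list and frees every node
	return 0;
}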

Edited by rip-off, 06 April 2014 - 09:18 AM.


#4 plainoldcj   Members   -  Reputation: 848


Posted 06 April 2014 - 10:10 AM

This issue is covered by a lot of good talks. See the one by Stroustrup, or http://channel9.msdn.com/Events/Build/2014/2-661 by Sutter, which uses the same example.

#5 Hawkblood   Members   -  Reputation: 725


Posted 06 April 2014 - 12:11 PM

So what I got out of the replies is "NEVER USE LINKED LISTS". Given the information in your replies, I can't see any reason to use them. I was initially worried about the vector resizing each time I add new data, but I can just use vector::resize(...) to pre-size it if I know I'm going to have a lot of elements...
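
For reference, std::vector::reserve pre-allocates capacity without changing the element count, so a push_back loop won't reallocate; resize additionally creates that many value-initialized elements. A minimal sketch of the reserve approach:

#include <vector>

int main(){
	std::vector<int> v;
	v.reserve(100000);	// one allocation up front, size() stays 0
	for (int q = 0; q < 100000; ++q){
		v.push_back(q);	// no reallocation until the reserved capacity is exceeded
	}
	return 0;
}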



#6 cdoubleplusgood   Members   -  Reputation: 848


Posted 06 April 2014 - 01:29 PM


So what I got out of the replies is "NEVER USE LINKED LISTS". Given the information in your replies, I can't see any reason to use them.

There are advantages in specific scenarios. E.g. an iterator to a list element does not become invalid when other elements are inserted or removed. And, as rip-off said, list elements do not get copied/moved when other elements are inserted or deleted.
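
A minimal sketch of that difference (the vector case is only undefined behaviour if the push_back actually reallocates):

#include <cstdio>
#include <list>
#include <vector>

int main(){
	std::list<int> l = {1, 2, 3};
	std::list<int>::iterator lit = l.begin();	// points at 1
	l.push_back(4);	// list iterators stay valid
	std::printf("%d\n", *lit);	// OK, prints 1

	std::vector<int> v = {1, 2, 3};
	std::vector<int>::iterator vit = v.begin();
	v.push_back(4);	// may reallocate; vit must be treated as invalid
	(void)vit;	// dereferencing vit here could be undefined behaviour
	return 0;
}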



#7 ApochPiQ   Moderators   -  Reputation: 16401


Posted 06 April 2014 - 01:50 PM

The advantages should be considered in light of implementation complexity, not performance. A naive linked list will basically never outperform a cache-coherent data structure. However, if iterator invalidation poses problems for you, or excessive copying on insert/remove is a problem, they can be useful.

As for excessive copying: profile first, then profile again after switching to a list. Never assume that one solution is faster than another without testing it.
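
A minimal timing sketch using std::chrono, as a portable alternative to an engine-specific timer (the container and workload here are just placeholders):

#include <chrono>
#include <cstdio>
#include <vector>

int main(){
	std::vector<int> v;

	std::chrono::steady_clock::time_point begin = std::chrono::steady_clock::now();
	for (int q = 0; q < 100000; ++q){
		v.push_back(q);
	}
	std::chrono::steady_clock::time_point end = std::chrono::steady_clock::now();

	long long us = std::chrono::duration_cast<std::chrono::microseconds>(end - begin).count();
	std::printf("push_back loop took %lld microseconds\n", us);
	return 0;
}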

#8 Hawkblood   Members   -  Reputation: 725


Posted 06 April 2014 - 04:01 PM

How about something more specific-- a particle system where particles are created and destroyed on a regular basis. I think performance would trump any other concerns, right?



#9 ApochPiQ   Moderators   -  Reputation: 16401


Posted 06 April 2014 - 04:46 PM

So? There are many systems where performance is more important than other factors. The existence of such systems in no way changes the answers provided in this thread.

#10 Hodgman   Moderators   -  Reputation: 31851


Posted 06 April 2014 - 05:41 PM

I work as an engine consultant, which involves a lot of optimization work -- the first thing on my mind *isn't* "how can I make the CPU do this work faster", it's "how can I get this data to/from the CPU faster"!
CPUs are incredibly fast these days, but memory is incredibly slow. In relative terms (e.g. bytes moved per CPU clock), memory is actually getting slower every year, so our focus is continually shifting away from counting CPU cycles and towards micro-optimizing the data layouts of our structures ;(

How about something more specific-- a particle system where particles are created and destroyed on a regular basis. I think performance would trump any other concerns, right?

For that kind of usage, a pool is common. You allocate a large contiguous array of objects, but use a trick where an object that isn't in use is reinterpreted as a node in a linked list - called a free list.
To spawn a new particle/etc., you just remove the first node from the free list and cast it to your object type. To release an object, you cast it back to a node and push it onto the front of the list.
When properly implemented, this is one of the fastest possible memory allocators around, plus it lets you run your system-update loop over a large contiguous array, which is great for cache performance.
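
A minimal sketch of such a pool, assuming a fixed capacity and a hypothetical, trivially-copyable Particle type (a real implementation would handle construction/destruction and exhaustion more carefully):

#include <cstddef>

struct Particle { float x, y, z, life; };	// hypothetical payload type

class ParticlePool {
	union Slot { Particle particle; Slot* next; };	// unused slots double as free-list nodes
	static const std::size_t Capacity = 1024;
	Slot slots[Capacity];
	Slot* freeHead;	// first unused slot
public:
	ParticlePool() : freeHead(&slots[0]) {
		for (std::size_t i = 0; i + 1 < Capacity; ++i)
			slots[i].next = &slots[i + 1];	// chain every slot into the free list
		slots[Capacity - 1].next = NULL;
	}
	Particle* Spawn(){	// O(1): pop the first free slot
		if (freeHead == NULL) return NULL;	// pool exhausted
		Slot* s = freeHead;
		freeHead = s->next;
		return &s->particle;
	}
	void Release(Particle* p){	// O(1): push the slot back onto the free list
		Slot* s = reinterpret_cast<Slot*>(p);
		s->next = freeHead;
		freeHead = s;
	}
};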

#11 Bacterius   Crossbones+   -  Reputation: 9286


Posted 06 April 2014 - 06:18 PM

Linked lists have applications, for instance in kernels, where it may be physically impossible to get all the list items into a nice preallocated array in memory (for instance, in interrupt handlers, where the relevant pieces of data may be strewn across arbitrarily many address spaces throughout various processes and threads). In that case a linked list or other indirect data structure may be the only option available. But if you have the ability to modify your data layout so that it is laid out contiguously and coherently in memory, you will see really big gains from using an array data structure because of cache coherency: the memory access pattern is predictable, and that is what CPUs are good at, rather than fetching random bits of memory scattered all over the place. In most cases you can do that, which is why linked lists are not too popular: they have a rather large constant factor that typically makes them worse than simple arrays despite any theoretical advantages they may have, and that constant factor grows over time as memory becomes slower and slower relative to CPU speed.




#12 Hawkblood   Members   -  Reputation: 725


Posted 06 April 2014 - 06:46 PM

Thank you for your responses. I'm not advanced enough to understand all the terms used, but I have a very specific reason for using a list/vector: I want to store the data in as fast a structure as possible for things like particle systems.



#13 Satharis   Members   -  Reputation: 1264


Posted 06 April 2014 - 07:50 PM

The simplest rule is to prefer an array-based collection like vector unless you gain some practical benefit from a list. That really comes down to the old standby of "guessing isn't profiling": you shouldn't worry much about speed in code that seems sensible unless profiling shows it to be a bottleneck.

That is the entire point of performance tuning: there are a billion optimizations you can make to code, but a few well-chosen changes may yield more performance gains than a thousand small ones, which is why profiling is important.

But in regard to the basic question, it mainly comes down to the cache. Due to physical limitations of CPU design, cache can be expected to become even more important in the future, much as adding more cores became the better improvement a few years ago. Cache can make things that sound much slower in theory be much faster in practice.

#14 King Mir   Members   -  Reputation: 2050


Posted 10 April 2014 - 09:03 PM

So what I got out of the replies is "NEVER USE LINKED LISTS". Given the information in your replies, I can't see any reason to use them. I was initially worried about the vector resizing each time I add new data, but I can just use vector::resize(...) to pre-size it if I know I'm going to have a lot of elements...

The advantage of a linked list on a modern PC is not performance, but the fact that iterators and addresses remain valid when the container changes. So yes, you should prefer a dynamic array when you can, but there are cases where using a list is simpler. In some cases you can use a vector with a reserved size, but that's not such a good solution when the typical case has only a few elements. Note that if you just need addresses to remain valid on resize, a deque may be more suitable.
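
A minimal sketch of that deque property (pointers and references to existing elements survive push_back/push_front, though iterators do not):

#include <cstdio>
#include <deque>

int main(){
	std::deque<int> d;
	d.push_back(1);
	int* addr = &d[0];	// address of the first element
	for (int q = 0; q < 100000; ++q){
		d.push_back(q);	// existing elements are never moved by push_back
	}
	std::printf("%d\n", *addr);	// still valid, prints 1
	return 0;
}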



#15 Rattrap   Members   -  Reputation: 1788


Posted 10 April 2014 - 10:07 PM


http://channel9.msdn.com/Events/Build/2014/2-661

 

I watched this earlier this week. The material about prefetching is pretty interesting, especially in regard to insertion/deletion and accessing an array/vector (contiguous memory) vs a linked list (random memory).

 

In the case of an insert into a linked list, you have to traverse the list until you find the node you wish to insert after, create the new node, then update the "bookkeeping" in the existing nodes. Since the element locations are effectively random, and there is the bookkeeping overhead on top, they don't cache well.

 

In the case of a random insert into an array/vector, you still have to traverse the elements until you find where you want to insert the new one. The difference this time is that the processor's prefetcher can pretty accurately guess that you are going to access the next memory locations and tries to have those elements available in the processor's caches, which speeds up locating the insertion point. Insertion still requires copying all of the elements after the new location, but since that again is a linear operation, the prefetcher can make it run very fast in the processor cache.

 

This is not a guarantee for all situations, but in cases where very linear operations are taking place, the array/vector is going to win almost every time.
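
A minimal sketch of a middle insertion in each container: both need a linear walk to find the position, but the vector's walk and element shuffling happen over contiguous memory, which is exactly what the prefetcher handles well.

#include <iterator>
#include <list>
#include <vector>

int main(){
	std::vector<int> v;
	std::list<int> l;
	for (int q = 0; q < 1000; ++q){
		v.push_back(q);
		l.push_back(q);
	}

	v.insert(v.begin() + 500, -1);	// shifts the 500 elements after the insertion point

	std::list<int>::iterator it = l.begin();
	std::advance(it, 500);	// linear walk over scattered nodes
	l.insert(it, -1);	// relinks two pointers, no element copying
	return 0;
}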





