[.net] Why is Collections.Generic.List so damn slow?

Started by
28 comments, last by Niksan2 16 years, 8 months ago
I was doing a lot of heavy indexer work in my program and noticed that accessing elements in an Array seemed a hell of a lot faster then accessing them in a List. Anyway I wrote this little test program to time how long they take.

using System;
using Collections = System.Collections.Generic;
using Threading = System.Threading;

namespace Bob
{
	
	public static class Application
	{
		
		[STAThread]
		public static void Main( string[] ArgumentArray )
		{
			int Ticks = 0;
			int[] Array = new int[320];
			Collections.List<int> List = new Collections.List<int>();
			for( int x = 0; x < Array.Length; x++ )
			{
				List.Add( x );
			}
			
			Threading.Thread.Sleep( 1000 );
			Ticks = System.Environment.TickCount;
			for( int i = 0; i < 5000000; i++ )
			{
				for( int j = 0; j < Array.Length; j++ )
				{
					Array[j] = j;
				}
			}
			Console.WriteLine( "Array performance: {0}", System.Environment.TickCount - Ticks );
			
			Threading.Thread.Sleep( 1000 );
			Ticks = System.Environment.TickCount;
			for( int i = 0; i < 5000000; i++ )
			{
				for( int j = 0; j < List.Count; j++ )
				{
					List[j] = j;
				}
			}
			Console.WriteLine( "List performance: {0}", System.Environment.TickCount - Ticks );
			
			Console.ReadLine();
			return;
		}
		
	}
	
}




The results where 2620 for the Array and 13200 for the List. I was under the impression that the .Net 2.0 List class operated similar to a C++ std::vector and that there would be very little(or no) difference between accessing it and an array. I wrote a simple class to mimic the C++ std::vector to see how it could perform.

		public sealed class Vector<T>
		{
			
			private T[] Array = null;
			
			public int Size
			{
				get
				{
					return SizeProperty;
				}
				private set
				{
					SizeProperty = value;
				}
			}
			
			private int SizeProperty = 0;
			
			public T this[int Index]
			{
				get
				{
					return Array[Index];
				}
				set
				{	
					Array[Index] = value;
				}
			}

			public Vector()
			{
				Array = new T[8];
				return;
			}
			
			public Vector( int Reserved )
			{
				Array = new T[Reserved];
				return;
			}
			
			public void Add( T Value )
			{
				Array[Size] = Value;
				Size++;
				if( Size >= Array.Length )
				{
					System.Array.Resize<T>( ref Array, Array.Length << 1 );
				}
				return;
			}
			
			public T[] ToArray()
			{
				if( Size == Array.Length )
				{
					return Array;
				}
				T[] NewArray = new T[Size];
				System.Array.Copy( Array, NewArray, Size );
				return NewArray;
			}
			
		}




Using the same style test as the program above, the result was exactly the same as the Arrays. Can any of the .Net gurus explain how the List class works and why its indexer performance is so bad? I did run the tests multiple times to make sure some other process wasn't hogging the CPU and messing it up and I got around the same results every time. I'm on an Athlon64 at 2Ghz, maybe it was something to do with the way the JIT compiled it. Perhaps other people could run the test program and post their results. Edit: Changed loop size from 10000000 to 5000000 to match other posts. [Edited by - Scet on July 22, 2007 4:18:04 PM]
Advertisement
Have you tried running the tests on a reference type instead of a value type?
Quote:Original post by Headkaze
Have you tried running the tests on a reference type instead of a value type?


Not until now, here are the results from assigning a simple class(5000000 times):

Array: 13719
List: 24515
Vector: 13172

I ran the tests multiple times using 5000000, 10000000 and 20000000 loops, and every time the numbers had the same ratios to each other(10000000 loops equals ~26000 vs ~48000) so I don't think there's some number where the List would overtake them.
Quote:Original post by Headkaze
Have you tried running the tests on a reference type instead of a value type?


The point in the generic list is supposed to be that it handles value types without boxing them, so that shouldn't make a difference.

And I've never benchmarked it, but I'd expected List to perform better than this... Looking forward to see if anyone can come up with explanations :D

(By the way, I assume you are running release builds?)
Quote:Original post by Spoonbender
(By the way, I assume you are running release builds?)


Yes, although the debug version doesn't really change anything other then increasing all the results.

I've run two more tests, one using structs and the other using the same struct as a nullable type(with a ? on the end of the type), both use 5000000 loops.

Results from the struct test. This is weird, I expected it to perform similar to the first one(using an integer) since they're both value types, but the Array beats both the List and Vector.

Array: 4600
List: 12750
Vector: 12700

Results from nullable types test:

Array: 24500
List: 36000
Vector: 30000
Quote:Original post by Spoonbender
The point in the generic list is supposed to be that it handles value types without boxing them, so that shouldn't make a difference.


That's what I thought too, but worth a try I guess.

I'm also interested in any explaination of this.
I'm putting forward a vote for crappy code generation. I broke your code into functions and looked at the ildasm output. And yes, this is release. The functions:
    static void IndexArray(int [] array) {      for (int i = 0; i < 10000; i++)      {        for (int j = 0; j < array.Length; j++)        {          array[j] = j;        }      }    }    static void IndexList(Collections.List<int> list) {      for (int i = 0; i < 10000; i++)      {        for (int j = 0; j < list.Count; j++)        {          list[j] = j;        }      }    }

IndexArray's MSIL:
.method private hidebysig static void  IndexArray(int32[] 'array') cil managed{  // Code size       35 (0x23)  .maxstack  3  .locals init ([0] int32 i,           [1] int32 j)  IL_0000:  ldc.i4.0  IL_0001:  stloc.0  IL_0002:  br.s       IL_001a  IL_0004:  ldc.i4.0  IL_0005:  stloc.1  IL_0006:  br.s       IL_0010  IL_0008:  ldarg.0  IL_0009:  ldloc.1  IL_000a:  ldloc.1  IL_000b:  stelem.i4  IL_000c:  ldloc.1  IL_000d:  ldc.i4.1  IL_000e:  add  IL_000f:  stloc.1  IL_0010:  ldloc.1  IL_0011:  ldarg.0  IL_0012:  ldlen  IL_0013:  conv.i4  IL_0014:  blt.s      IL_0008  IL_0016:  ldloc.0  IL_0017:  ldc.i4.1  IL_0018:  add  IL_0019:  stloc.0  IL_001a:  ldloc.0  IL_001b:  ldc.i4     0x2710  IL_0020:  blt.s      IL_0004  IL_0022:  ret} // end of method Application::IndexArray

IndexList's MSIL:
.method private hidebysig static void  IndexList(class [mscorlib]System.Collections.Generic.List`1<int32> list) cil managed{  // Code size       42 (0x2a)  .maxstack  3  .locals init ([0] int32 i,           [1] int32 j)  IL_0000:  ldc.i4.0  IL_0001:  stloc.0  IL_0002:  br.s       IL_0021  IL_0004:  ldc.i4.0  IL_0005:  stloc.1  IL_0006:  br.s       IL_0014  IL_0008:  ldarg.0  IL_0009:  ldloc.1  IL_000a:  ldloc.1  IL_000b:  callvirt   instance void class [mscorlib]System.Collections.Generic.List`1<int32>::set_Item(int32,                                                                                                        !0)  IL_0010:  ldloc.1  IL_0011:  ldc.i4.1  IL_0012:  add  IL_0013:  stloc.1  IL_0014:  ldloc.1  IL_0015:  ldarg.0  IL_0016:  callvirt   instance int32 class [mscorlib]System.Collections.Generic.List`1<int32>::get_Count()  IL_001b:  blt.s      IL_0008  IL_001d:  ldloc.0  IL_001e:  ldc.i4.1  IL_001f:  add  IL_0020:  stloc.0  IL_0021:  ldloc.0  IL_0022:  ldc.i4     0x2710  IL_0027:  blt.s      IL_0004  IL_0029:  ret} // end of method Application::IndexList

So the List generic version is actually making function calls in the generated code.
It would be interesting to put the ArrayList in for comparison as well.
@SiCrane:

Yes of course there has to be some extra instructions it to be slower. The question is why they're there. I know my Vector class can't do everything the List one can(Remove etc.), but I expected indexers to be at least close in performance.

@Headkaze:

Results from tests simialr to the first one except with an ArrayList

Integers(value type):

Array: 1860
ArrayList: 48400
List: 14500
Vector: 3600

Classes(reference type):

Array: 13640
ArrayList: 32000
List: 24400
Vector: 13600

Yeah, stick to generics.
I found some more tests here

http://www.experts-exchange.com/Programming/Programming_Languages/Dot_Net/Q_22020171.html

System.Diagnostics.Stopwatch looks like a handy way to measure speeds.

This topic is closed to new replies.

Advertisement