Performance Quiz #9: IList<T> List and array speed

A short and sweet quiz with lots of juicy discussion possibilities:

    public int Sum(IList<ushort> indices)
    {
        int result = 0;
        for (int i = 0; i < indices.Count; i++)
            result += indices[i];
        return result;
    }

Considering only the time it takes to do the Sum (i.e. assuming we had already set up the array/list) which gives better performance and why?

    // #1
    ushort[] tmp = new ushort[500000]; // this doesn't count
    Sum(tmp); // this is what we are timing

OR

    // #2
    List<ushort> tmp = new List<ushort>(500000); // this doesn't count
    for (int i = 0; i < 500000; i++) tmp.Add(0); // this doesn't count
    Sum(tmp); // this is what we are timing

What say you gentle readers?

(my solution is now posted here)
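For readers who want to reproduce the comparison themselves, here is a minimal sketch of a Stopwatch harness (an illustration only, not the code behind the posted solution; the Quiz9Harness name is invented, and it assumes a Release build run outside the debugger so the JIT is free to optimize):

    using System;
    using System.Collections.Generic;
    using System.Diagnostics;

    static class Quiz9Harness
    {
        const int Count = 500000;

        // the method under test, exactly as in the quiz
        static int Sum(IList<ushort> indices)
        {
            int result = 0;
            for (int i = 0; i < indices.Count; i++)
                result += indices[i];
            return result;
        }

        static void Main()
        {
            // #1: a plain ushort[] passed as IList<ushort>
            ushort[] arr = new ushort[Count];

            // #2: a List<ushort> holding the same number of elements
            List<ushort> list = new List<ushort>(Count);
            for (int i = 0; i < Count; i++) list.Add(0);

            // warm-up calls so the timed runs don't include JIT time
            Sum(arr);
            Sum(list);

            Stopwatch watch = Stopwatch.StartNew();
            Sum(arr);
            watch.Stop();
            Console.WriteLine("#1 array via IList<ushort>:  {0} ticks", watch.ElapsedTicks);

            watch = Stopwatch.StartNew();
            Sum(list);
            watch.Stop();
            Console.WriteLine("#2 List<> via IList<ushort>: {0} ticks", watch.ElapsedTicks);
        }
    }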

Comments

  • Anonymous
    March 09, 2006
    Mmm, rough.

    I want to say #2, because with #1 you've got an array of type ushort, while in the Sum method you're working with an IList<ushort> - so the array has to be cast to an IList<ushort>.

    Just my intuition talking - I guess I'd need to time it myself to be sure.
  • Anonymous
    March 09, 2006
    The comment has been removed
  • Anonymous
    March 09, 2006
    The comment has been removed
  • Anonymous
    March 09, 2006
    The comment has been removed
  • Anonymous
    March 09, 2006
    Seems like they would perform the same. I expect that once built, List<ushort> is internally nothing but a ushort[].
  • Anonymous
    March 09, 2006
    I believe #1 will be faster.

    With #1 .NET will optimise getting array values by removing any bounds checking. Because of the for loop it knows it will never access values beyond the length of the array.

    The Item property on the List<ushort> in #2 will, however, still do bounds checking for each get.
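    For context, the bounds-check elimination this commenter describes is something the JIT does when the loop limit is the array's own Length and the static type really is ushort[]; a minimal sketch of that pattern looks like this (whether anything similar applies once the array is only seen as an IList<ushort> parameter is exactly what the quiz probes):

        // Sketch: the loop bound is arr.Length on the very array being indexed,
        // and the static type is ushort[], so the JIT can elide the range check
        // on arr[i].
        static int SumDirect(ushort[] arr)
        {
            int result = 0;
            for (int i = 0; i < arr.Length; i++)
                result += arr[i];
            return result;
        }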
  • Anonymous
    March 09, 2006
    Actually, is the bounds-checking optimization done at run time or at compile time? If it is done at compile time, then I don't think it would know that the IList<ushort> is an array.
  • Anonymous
    March 09, 2006
    I think #1 is faster.

    Because both use the same jitted code for Sum(), which means the JIT doesn't make specially optimized code for the array case, and List<ushort> is just a wrapper around ushort[].
    So #1 is faster because it avoids the overhead of List<>.
  • Anonymous
    March 09, 2006
    On second thought, maybe I was wrong...

    In #2, List<ushort> uses optimized code for array indexing. But #1 accesses the array via the interface, which is slower than the optimized code.

    So, hmm, I don't know!
  • Anonymous
    March 09, 2006
    I would assume that #1 is faster, since it has no management overhead, while the List<> does.
    The cost of interface invocation (Count) is something that the List needs to pay as well, since it handles an array internally.
  • Anonymous
    March 09, 2006
    Well, firstly the test code runs slower than necessary - this small change speeds it up by 50% or so:

    public int Sum(IList<ushort> indices)
    {
        int result = 0;
        int n = indices.Count; // cache the Count rather than accessing it on every loop iteration
        for (int i = 0; i < n; i++)
            result += indices[i];
        return result;
    }

    Now, I'm finding that summing an array as an IList<> is about 4 times slower than summing a List<> as an IList<>, and that summing an array directly is 2.5 times faster than summing an IList<>. It's hard to imagine what is gobbling up that factor of 10 in performance.

    Is there some boxing/unboxing going on in there?


  • Anonymous
    March 09, 2006
    I've got the same result as damien's (#1 is 3-4x slower). I thought #1 would at least be faster on a debug build, but it isn't.

    I think accessing an array via the interface is much slower than accessing it directly, or else the optimizer is pretty good at dealing with arrays directly.
  • Anonymous
    March 09, 2006
    I mean, the JIT makes good code for accessing an array directly, even in a debug build.

    I guess the following is what happens.
    #1: accessing the array via the interface (in Sum(); slow, no optimization for arrays)
    vs
    #2: accessing the array directly (in List<>.this[]; very fast, optimized for arrays) + a vtable dispatch to invoke List<>.this[] (very fast)
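    To make the second path concrete, here is a simplified sketch of what a List<ushort> indexer does internally (an illustration of the general shape only, not the actual BCL source): one range check against a private count, then a direct index into the backing ushort[], which is exactly the kind of code the JIT handles well.

        // Simplified sketch of a List<ushort>-style indexer (not the real BCL code).
        class SimplifiedUShortList
        {
            private ushort[] _items;   // the backing array
            private int _size;         // number of elements in use

            public ushort this[int index]
            {
                get
                {
                    // one unsigned comparison covers both index < 0 and index >= _size
                    if ((uint)index >= (uint)_size)
                        throw new System.ArgumentOutOfRangeException("index");
                    return _items[index];   // direct array access
                }
            }
        }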
  • Anonymous
    March 09, 2006
    In Rico's example, #2 is faster, but this is actually twice as fast as #2 ...

    ushort[] tmp = new ushort[500000]; // this doesn't count
    int result = Sum(tmp);

    public int Sum(ushort[] indices)
    {
        int result = 0;
        int n = indices.Length - 1; // cache the Length rather than accessing it on every loop iteration
        for (int i = n; i > -1; i--) // iterating backwards is faster than forwards
            result += indices[i];
        return result;
    }

    I have been experimenting with doing this type of iteration in JavaScript, where every optimization can mean exponential gains in performance. You can check out some of my tests here ... http://geekswithblogs.net/mparsons/archive/2006/03/02/71175.aspx#FeedBack

    Check out the Test Page in my comment.
  • Anonymous
    March 09, 2006
    The 2nd commenter was correct that #2 runs faster, and also that changing the for loop to a foreach loop makes #1 way faster than #2. I'm really not sure why this happens, though. Maybe in certain cases the JIT compiler thinks it has to do bounds checking on the array when it really doesn't have to.
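    For concreteness, the foreach variant being compared would look roughly like this (a sketch of what the commenter presumably measured; it walks the collection through IEnumerator<ushort> rather than the IList<ushort> indexer, so a different code path is exercised):

        // Sketch of the foreach version of Sum: enumeration instead of indexing.
        static int SumForeach(IList<ushort> indices)
        {
            int result = 0;
            foreach (ushort v in indices)
                result += v;
            return result;
        }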
  • Anonymous
    March 10, 2006
    Not to be a stick-in-the-mud, but I kinda think 500k virtual-calls do count.
  • Anonymous
    March 10, 2006
    Great comments so far!  I love this one, it's so juicy :)
  • Anonymous
    March 10, 2006
    I'm with Wesner on this one. The (internal) SZArrayHelper class contains the method bodies that are generated for (1d) arrays. The reason why #1 is more time consuming seems evident when you compare SZArrayHelper.get_Item's implementation to that of List<T>.get_Item. The former includes a cast (incurred once per item in your array), which is where I believe the extra time is spent.
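    In spirit, the per-item work behind the array's IList<ushort> indexer is something like the following (a loose paraphrase for illustration only, not the actual SZArrayHelper source): the helper must first re-establish that the object really is a ushort[] before it can index into it, a step List<ushort>.get_Item never needs.

        // Paraphrase of the extra work on the array-as-IList<ushort> path
        // (illustrative only, not real CLR code).
        static ushort GetItemViaInterface(object thisObject, int index)
        {
            ushort[] array = (ushort[])thisObject; // per-call cast / type check
            return array[index];                   // plus the usual range check
        }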
  • Anonymous
    March 10, 2006
    Rico’s posted another performance quiz on his blog, which means that I get to run the F1 profiler on...
  • Anonymous
    March 10, 2006
    Hey Rico. I ran F1 on this scenario and I got some more data. But to be honest I'm still not 100% sure what these JIT_IsInstanceOfArray and ArrayIsInstanceOfNoGC functions are doing in the array case. I have some guesses, but nothing that I'm sure of. Anyways, my run of the profiler is here:
    http://blogs.msdn.com/ianhu/archive/2006/03/10/548778.aspx
  • Anonymous
    March 10, 2006
    The comment has been removed
  • Anonymous
    March 10, 2006
    Mono has similar results:

    http://www.jprl.com/Blog/archive/development/mono/2006/Mar-10.html
  • Anonymous
    March 11, 2006
    KeithH:

    but Sum() accesses the array via IList<>.this[int], which doesn't return object, so there's no boxing going on there.
  • Anonymous
    March 11, 2006
    I have tested these two with a very simple profiler (Stopwatch) and got dramatic performance differences.

    Array/Generics = 852.33
    The array version is over 800 times slower.

    I have used the following code:
    IList<ushort> gentmp = new List<ushort>(ArraySize);
    ushort[] arrtmp = new ushort[ArraySize];

    public void Run()
    {
        long GenTicks;
        long ArrayTicks;

        Stopwatch watch = Stopwatch.StartNew();

        Sum(gentmp); // this is what we are timing
        watch.Stop();
        GenTicks = watch.ElapsedTicks;

        List<ushort> ArrCopy = new List<ushort>(ArraySize);
        ArrCopy.AddRange(arrtmp);
        watch = Stopwatch.StartNew();

        Sum(arrtmp); // this is what we are timing
        watch.Stop();
        ArrayTicks = watch.ElapsedTicks;

        Console.WriteLine("Array/Generics = {0:F2}", (double)ArrayTicks / (double)GenTicks);
    }

    Perhaps I have an obvious error here which I did not see yet, but the numbers I get seem to be consistent. What is very interesting: if I copy the array completely into a generic list and use that one, I get
    List<ushort> ArrCopy = new List<ushort>(ArraySize);
    ArrCopy.AddRange(arrtmp);
    Sum(ArrCopy);

    Array/Generics = 1.70

    Nearly the same. I think there is massive overhead going on during the access of each element here. If the array is big enough, it is even cheaper to copy the whole thing into a List<> and sum that.

    Yours,
     Alois Kraus
  • Anonymous
    March 11, 2006
    The comment has been removed
  • Anonymous
    March 11, 2006
    One of the problems with this is that if the assembly is marked as debug, or a debugger is attached during startup, the JIT disables optimizations, and that will make a big difference.

    The optimizations Mike Parsons does are very similar to what the JIT can do when it is allowed to optimize.
  • Anonymous
    March 12, 2006
    To sum it up: every array supports IList<T> for its element type. If you pass it via the IList<T> interface to another function, the CLR has special run-time checks embedded to determine whether it is an array or a real generic list, plus an additional type check to ensure type safety. This is all done at run time for each get call, which explains this slowdown of a factor of several hundred. Is this the correct interpretation of Ianhu's profiling run?

    Yours,

    Alois Kraus
  • Anonymous
    March 12, 2006
    I'd think it just depends on what kinds of optimization are used (i.e. available and actually used; an optimization that isn't available yet obviously isn't being used).

    Meanwhile this confuses me:
    for (int i = 0; i < 500000; i++) tmp.Add(0); // this doesn't count

    Even though it doesn't count, even a minimal amount of optimization should make it even less count ^_^
  • Anonymous
    March 13, 2006
    The comment has been removed
  • Anonymous
    March 13, 2006
    The comment has been removed
  • Anonymous
    March 13, 2006
    The comment has been removed
  • Anonymous
    March 13, 2006
    If you happen to know the exact type of the container, it's always useful to let the JIT know it too. It can produce more optimized code. For comparison, I passed a List<> into the following function Sum2, and it appeared to be much faster. No function calls are made inside Sum2(); the JIT inlined them.

    static public int Sum2(List<ushort> indices)
    {
        int result = 0;
        for (int i = 0; i < indices.Count; i++)
            result += indices[i];
        return result;
    }

    Timing:
    Sum(IList<ushort>)
    time: 404
    Sum2(List<ushort>)
    time: 57

    You can call it an "abstraction penalty." The Sum() function is generic: it operates on any container implementing IList<>, whereas Sum2() is specific to one concrete type, List<>.

    Interesting that the abstraction penalty cannot be avoided with a generic function:

    static public int Sum3<T>(T indices) where T : IList<ushort>

    The code generated here is no better than the one for the original Sum() function.
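    Filled out, the Sum3 signature above would read as follows (a sketch; because ushort[] and List<ushort> are both reference types, the JIT shares a single body for such instantiations and still reaches Count and the indexer through the IList<ushort> interface, which is why it fares no better than Sum()):

        // Sketch of the generic variant: the constraint does not give the JIT a
        // concrete type, so the calls remain interface calls.
        static public int Sum3<T>(T indices) where T : IList<ushort>
        {
            int result = 0;
            for (int i = 0; i < indices.Count; i++)
                result += indices[i];
            return result;
        }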
  • Anonymous
    March 16, 2006
    Last week I posted Performance Quiz #9 for discussion. Well the comments have been just awesome...
  • Anonymous
    January 23, 2007
    Last week I posted Performance Quiz #9 for discussion. Well the comments have been just awesome and many