次の方法で共有


The Performance of Arrays

Stop me if you’ve heard this one, but here’s some information about how arrays perform, and a neat trick you can do to (possibly) get some performance back.

Some background

In .NET, arrays of reference types are covariant in their element type, but not safely. Eric, as always, has a post that goes into this more deeply if you want to refresh your memory. The upshot is that if you have a Derived[], you can convert it to a Base[] and use it that way. For instance,

 class Base { }
class Derived : Base { }
 
class Program
{
    static void Main()
    {
        Derived[] derivedArray = new Derived[10];
        // This is the covariant conversion
        Base[] baseArray = derivedArray;
 
        for (int i = 0; i < baseArray.Length; ++i)
        {
            // Putting a Derived into our Base[] is ordinary polymorphism
            baseArray[i] = new Derived();
        }
    }
}

Allowing this conversion from Derived[] to Base[] is not safe because now the compiler can’t help you know if your types match up when you set array elements. In the preceding example, I can put the Derived in the array, but imagine if I had another unrelated derived class, OtherDerived. Putting those things in the array must fail, since the actual array can contain only Deriveds, and OtherDeriveds are not Deriveds. In Eric’s perhaps more folksy example, “you can’t put a Turtle into an array of Giraffes.”

 class Base { }
class Derived : Base { }
class OtherDerived : Base { }
 
class Program
{
    static void Main()
    {
        Derived[] derivedArray = new Derived[10];
        // This is the covariant conversion
        Base[] baseArray = derivedArray;
 
        for (int i = 0; i < baseArray.Length; ++i)
        {
            // Putting a OtherDerived into our Base[] is ordinary
            // polymorphism. However, this is going to cause a runtime
            // exception because the actual array won't be able to
            // store an OtherDerived.
            baseArray[i] = new OtherDerived();
            // Unhandled Exception: System.ArrayTypeMismatchException: 
            //   Attempted to access an element as a type incompatible
            // with the array.
        }
    }
}

Well now to the point I want to address: where did that exception come from? It came from the runtime, which was in a unique position to know what the real type of the array object was, and the real type of the element. It needed to determine if the assignment was allowable and throw if not. Now if you think about this for a minute, you can see that it’s going to have to perform that check for every assignment to an array element, right? The compiler just can’t know if any assignment to a covariant array is ever going to work, and the runtime is always going to have to be there to manage the potential for failure. Can we turn that off? Does it make sense to? What does the check cost you?

A workaround

Well, if it’s doing that check for every assignment to an array, that’s probably going to cost you something. Let’s see if we can get rid of the check to determine what the cost is. Remember I said that arrays are covariant only for reference types? We’re going to use that information. If we have an array of value types, the runtime isn’t going to have to perform that check anymore, since it knows that the IL the compiler emitted wasn’t going to allow assignment of anything but that value type to that array.

So if I want to have an array of reference types, but get the runtime to treat it like an array of value types, what have I got to do? Just wrap the reference type in a value type. There you go. That’s the trick. Let’s see what that does.

I’m going to create a simple value type that merely holds a reference. I am even going to make it generic. And to make this as dead-simple as I can, that’s all I’ll do. You could imagine a potentially easier-to-use version of this code that has implicit conversions and nice constructors, but let’s not do any of that.

 struct Reference<T> where T : class
{
    public T Value;
}

Ok, so, wow, what is that thing? It’s no bigger than the reference we started with. And if I want to make an array of them in order to get rid of covariance, I can. And here’s how I’d use it.

 class Base { }
class Derived : Base { }
 
class Program
{
    static void Main()
    {
        Reference<Base>[] baseArray = new Reference<Base>[10];
 
        for (int i = 0; i < baseArray.Length; ++i)
        {
            baseArray[i] = new Reference<Base> { Value = new Derived() };
        }
    }
}

Now I have an array assignment that doesn’t need the runtime to perform a check anymore, and the reason is because it’s impossible for there to be anything in “baseArray” except for an array of the exact type I said. The struct-ness of Reference<T> prevents the possibility of anyone having performed a covariant conversion as existed in our previous examples. So I got rid of that check, right? But I added the initialization of a value type. Did I win or lose? Let’s do some timing to figure it out.

The experiment

Here’s my code. I’m going to time two different things: First, I’ll put a lot of Deriveds into a Base[], and then I’ll put a lot of Reference<Base>s that hold Deriveds into a Reference<Base>. I’m using huge arrays just because I want to prove to myself that the jitter isn’t going to lift some of these operations out of the loop. And I’m producing Release binaries that will run on the 32-bit framework on my 64-bit machine. I’ll time each example a few times to be sure. Here’s the code.

 class Base { }
class Derived : Base { }
 
struct Reference<T> where T : class
{
    public T Value;
}
 
class Program
{
    const int Count = 1 << 27;
 
    static void Main()
    {
        Stopwatch watch = new Stopwatch();
 
        TestReferenceArray(watch);
        TestReferenceArray(watch);
        TestReferenceArray(watch);
 
        TestValueArray(watch);
        TestValueArray(watch);
        TestValueArray(watch);
    }
 
    static Base[] TestReferenceArray(Stopwatch watch)
    {
        Base[] array = new Base[Count];
        var element = new Derived();
 
        watch.Restart();
        for (int i = 0; i < Count; ++i)
        {
            array[i] = element;
        }
        watch.Stop();
        Console.WriteLine(watch.ElapsedMilliseconds);
 
        return array;
    }
 
    static Reference<Base>[] TestValueArray(Stopwatch watch)
    {
        Reference<Base>[] array = new Reference<Base>[Count];
        var element = new Derived();
 
        watch.Restart();
        for (int i = 0; i < Count; ++i)
        {
            array[i] = new Reference<Base> { Value = element };
        }
        watch.Stop();
        Console.WriteLine(watch.ElapsedMilliseconds);
 
        return array;
    }
}

Have a seat. Here’s what it prints out.

2507

2471

2474

595

607

593

Wow! That’s a huge time difference, and a big win. It’s FOUR TIME FASTER! I wouldn’t blame you for thinking that you have to go to your code base right now and eliminate arrays of reference types from it; but hold on a second. That’s not the whole story.

Issue #1: Are the types exact?

In the example that I timed, I was putting a Derived into arrays with element type Base. The types weren’t the same, and therefore the check that the runtime performed had to be complicated by doing something like walking an inheritance hierarchy. But what if I were putting Bases into the Base array. What do the times look like then? (I won’t replicate the code here; just replace “new Derived()” with “new Base()” everywhere).

805

775

771

593

593

599

Oh, that’s much more reasonable. The lesson here is that for my scenario, if I’m not making use of polymorphism, I have much less to be worried about.

Issue #2: Which runtime exactly?

I said that I was doing tests on the 32-bit framework on my 64-bit machine. I ensured that by building my binaries “x86,” which is the default for VS2010. What if I instead build them “Any CPU” so that I can run the test on the 64-bit framework? Under those conditions, I am swapping out the whole stack that my tests run on for a whole different runtime. Let’s see what the numbers are then.

2102

2057

2055

789

784

784

Ok, still faster, but not by as big a factor. What about for the exact type case?

861

848

846

792

790

817

Nearly the same! So my example seems to be not as big a deal for 64-bit processes. I am not sure why. There is another oddity here that I can’t explain: when I build Debug binaries, the Reference<T> trick makes my example about three times SLOWER. I have no idea why.

Issue #3: My example is not your scenario

I tried to make the thing I was timing as simple as possible. But I’m using relatively a lot of array assignments. Do you do that much? Probably not. I’m not doing any I/O and not waiting on any user interaction, are you? I also have a giant array and I’m clearly blowing out the CPU cache in my for loops. Does your code do that? I’m assigning arrays in a tight loop with a known upper bound. Are you? Does the jitter lift the bounds checking out of the loop, and how would that check compare with the assignment compatibility check?

My point is that as with any performance timing, there are a lot of variables and a lot of considerations. If you want to make your program perform better using this trick, you need to try it and then time it yourself. I am not saying that .NET arrays are slower than they need to be or unsuited for your application. But it’s a fun experiment to try; how many milliseconds faster can you make your scenario, and is it worth the cost? Were you using the covariance of arrays in your code, and therefore ineligible to even consider this trick?

Why?

Why does C# have these unsafe covariant arrays to begin with? I was not here at the time, but I can confidently say that the CLR needed them because they needed to support languages that have array covariance. One example of such a language (in fact, probably the interesting example) is Java. C# has them both because the CLR has them and because it was a clear goal to that if users were porting existing code to C#, they were met with as few hurdles as possible. Also, there would have been complications with supporting both covariant arrays and invariant arrays in the same runtime; imagine the confusion users and compiler-writers would face trying to interop between the two. The language design notes from 1999 bear out these claims.

Comments

  • Anonymous
    June 11, 2010
    The comment has been removed
  • Anonymous
    June 11, 2010
    The comment has been removed
  • Anonymous
    June 14, 2010
    Also worth noting that type checking is not applicable to some common cases where the JIT can prove that declared type of variable and actual array type match. For example, if you have a local array variable, and initialize or assign it with the result of a new[] expression - and never pass it as a ref or out parameter to anything.
  • Anonymous
    June 27, 2010
    That is a Neat Post Chris.Interesting to learn one new optimization technique if in such a scenario.Thank you.
  • Anonymous
    June 28, 2010
    Interesting post. It really helps a lot.
  • Anonymous
    June 29, 2010
    Is this really a workaround for the type checking? It looks to me like you've done the same as starting with an array of Base in the first place. All of your value type arrays would accept new Reference<Base>{ Value = new OtherDerived;} which is what is excluded by the type check.Isn't this is analogous to initially declaring a Base[]? If there is no covariant array assignment, does the type check still occur?
  • Anonymous
    June 29, 2010
    Really optimization is very big concern, Thanks for doing experiment with that.
  • Anonymous
    June 30, 2010
    I don't think there is any need to initializate your value instances (Reference<T>). These will already be initialised by the runtime to the default state. Hence one can further optimise this by removing the value type initialization, and simply use assignment.vis:           array[i] = new Reference<Base> { Value = element };becomes:           array[i].Value = element;This in term renders a further performance improvement (of roughly a third on my machine).
  • Anonymous
    July 01, 2010
    Great post, however the use case presented here is not typical. All elements in the arrays reference the same object. Developers typically use each element in an array to store a unique object.If you change lines 34 and 50 to array[i] = new Derived(); and array[i].Value = new Derived();, you’ll find that the regular covariant arrays outperform the Reference<Base> arrays (by a factor of 3 on my machine).I don’t know if this proves that regular covariant arrays are optimal in the typical use case or that it’s just more expensive to create new objects that are assigned to reference fields of a generic struct.
  • Anonymous
    July 02, 2010
    Was there a change in the C# compiler related to this concept?  It seems that the C# 3.0 compiler with VS2008 runs the Reference<T> version about twice as slow as the standard version, while VS2010 and C# 4.0 has the Reference<T> version 2 to 3 times as fast.
  • Anonymous
    July 02, 2010
    It was only an issue when VS2008 was compiled for AnyCPU.  I checked again and VS2008 worked the same when it was compiled as x86.  But VS2010 seemed to run fine as x86 and x64.
  • Anonymous
    July 06, 2010
    Wow, I treid this and got similar results as original author on x86, but on x64 the numbers for TestValueArray were 3-4 times slower!  (I'm using VS2008, Windows 7 64 bit).
  • Anonymous
    July 09, 2010
    Thanks for this post, I finally understand the whole unsafe covariance with arrays thing. :)
  • Anonymous
    July 09, 2010
    @Darren, in a broad sense, for arrays of reference types, the type check always occurs. Given any array reference, in general, the runtime doesn't know whether some covariant conversion occured in the past. (I say generally because obviously it is possible for the JIT to know otherwise in some limited scenarios, such as the ones Jon and Pavel mention).@Mike B, good call. I wish I had written that code instead. =)@Sunny, I intentionally factored the intialization out of the loop because it's not what I wanted to measure. I am surprised if you're seeing this trick doing the opposite of what's intended in a Release build; I suspect you ran this in Debug, am I right?@Matthew, the C# 3 compiler produces more awkward temp management in IL than the C# 4 compiler for the code I posted. For Mike B's improvement, the IL is the same. If you're running your binaries on the 2.0 runtime and the 4.0 runtime respectively, it's hard to say what is attributable to the compiler and what to the runtime.
  • Anonymous
    July 09, 2010
    The comment has been removed
  • Anonymous
    July 15, 2010
    @Mike Barray[i].Value = element;Correct me if I am wrong in my understanding but as Reference<T> is value type would not your above optimisation fail because actually it is setting the value on a copy. The below code given in the example is correct.array[i] = new Reference<Base> { Value = element };
  • Anonymous
    July 15, 2010
    @Mike Barray[i].Value = element;Correct me if I am wrong in my understanding but as Reference<T> is value type would not your above optimisation fail because actually it is setting the value on a copy. The below code given in the example is correct.array[i] = new Reference<Base> { Value = element };
  • Anonymous
    July 18, 2010
    @mjfAlas, not as I understand it. The element held within the array is held "in place". Therefore, by setting the Value property of the element, we are impacting the original object. When one assigns an array element using the constructor, as per the article and as you have done here... the constructor will create a new instance of Reference<Base> and set the Value property on it, the assignement will then perform a value copy to the array's element.
  • Anonymous
    July 19, 2010
    I translated the entire code to Visual Basic .NET on Framework 3.5 (using VB 2008 Express and with the improvement of "array(i).value = element") and the Reference(of Derived) version was indeed 5 times faster. And just as observed earlier with C#, in the Debug version the Reference tric version was slower.And when I tried assigning new objects for every array element the Refererence version was two times slower compared to the standard array version, just as Sunny Ahuwanya observed earlier. Therefore it looks like these effects happen in the CLR and are not compiler issues.I also tried the code assigning more realistic objects (I used Labels). As was to be expected the amount of  time taken to instantiate the Label objects is much larger than instantiating the trivial Derived objects and any performance difference due to clever array techniques became negligible.It's an interesting technique, but unfortunately it does not look as if it is worth the trouble in realistic scenarios.
  • Anonymous
    July 19, 2010
    I translated the entire code to Visual Basic .NET on Framework 3.5 (using VB 2008 Express and with the improvement of "array(i).value = element") and the Reference(of Derived) version was indeed 5 times faster. And just as observed earlier with C#, in the Debug version the Reference tric version was slower.And when I tried assigning new objects for every array element the Refererence version was two times slower compared to the standard array version, just as Sunny Ahuwanya observed earlier. Therefore it looks like these effects happen in the CLR and are not compiler issues.I also tried the code assigning more realistic objects (I used Labels). As was to be expected the amount of  time taken to instantiate the Label objects is much larger than instantiating the trivial Derived objects and any performance difference due to clever array techniques became negligible.It's an interesting technique, but unfortunately it does not look as if it is worth the trouble in realistic scenarios.
  • Anonymous
    July 24, 2010
    What I really dislike about arrays in .NET is that you cannot encapsulate a fixed-size array into an object, say DIM A(&O2) AS INTEGER.  (Of course, such arrays would have to be passed by value always.)