Did you know that the C# foreach statement is your enemy in game development?

Performance in game development is paramount: every microsecond counts when everything has to happen in less than 33 ms (ideally 16 ms). The C# foreach statement is one of the worst performance killers, and here I explain why.
When a developer lands in the games industry, they have to change their mindset about performance. In this industry we have to perform a lot of operations in less than 33 milliseconds (30 FPS, frames per second), possibly tuning the logic and the art assets to achieve 60 FPS on standalone (Windows/Linux/Mac) and consoles (Xbox One/PS4). That means rendering the scene content and computing physics and game logic in no more than 16 milliseconds! Not an easy task, and that's why in our industry every CPU tick counts.
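
The 33 ms and 16 ms figures are simply one second divided by the target frame rate. A minimal sketch of that arithmetic follows; the `FrameBudget` class and its method name are hypothetical, introduced only for illustration.

```csharp
public static class FrameBudget
{
    // Time available for one frame, in milliseconds, at a given target frame rate.
    // 1 second = 1000 ms, so the budget is 1000 / FPS.
    public static double Milliseconds(int framesPerSecond)
    {
        return 1000.0 / framesPerSecond;
    }
}
```

Calling `FrameBudget.Milliseconds(30)` gives roughly 33.3 and `FrameBudget.Milliseconds(60)` roughly 16.7, which are the budgets quoted above, rounded down.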

So, what about the foreach statement? Well, this one is really bad: it can cost hundreds of CPU ticks just to let the programmer write less code! You think I'm exaggerating here? Let's look at some code as proof.

Let's open Visual Studio (I'm using VS2015 Enterprise to produce the compiled code below), create a simple C# console app, and write the following very simple code:
    using System;
    using System.Collections.Generic;

    public class DemoRefType
    {
        public List<Object> intList = new List<Object>();
        public void Costly()
        {
            Object a = 0;
            foreach (int x in intList) // casts each element to int
                a = x;
        }
        public void Cheap()
        {
            Object a = 0;
            for (int i = 0; i < intList.Count; i++) // plain indexed loop
                a = intList[i];
        }
    }


That's an easy one, right? These two methods do the same job, but one costs a lot more in terms of CPU ticks... let's see why. I use ILSpy (http://ilspy.net/) to look into the compiled code, so let's analyze the IL (intermediate language) I get after Visual Studio builds it.

Let's start with the Cheap method:
.method public hidebysig 
    instance void Cheap () cil managed 
{
    // Method begins at RVA 0x2140
    // Code size 36 (0x24)
    .maxstack 2
    .locals init (
        [0] int32 i
    )

    IL_0000: ldc.i4.0
    IL_0001: stloc.0
    IL_0002: br.s IL_0015
    // loop start (head: IL_0015)
        IL_0004: ldarg.0
        IL_0005: ldfld class [mscorlib]System.Collections.Generic.List`1<object> performanceDemo.DemoRefType::intList
        IL_000a: ldloc.0
        IL_000b: callvirt instance !0 class [mscorlib]System.Collections.Generic.List`1<object>::get_Item(int32)
        IL_0010: pop
        IL_0011: ldloc.0
        IL_0012: ldc.i4.1
        IL_0013: add
        IL_0014: stloc.0
        IL_0015: ldloc.0
        IL_0016: ldarg.0
        IL_0017: ldfld class [mscorlib]System.Collections.Generic.List`1<object> performanceDemo.DemoRefType::intList
        IL_001c: callvirt instance int32 class [mscorlib]System.Collections.Generic.List`1<object>::get_Count()
        IL_0021: blt.s IL_0004
    // end loop

    IL_0023: ret
} // end of method DemoRefType::Cheap


So, nothing odd in the above, it's pretty much what I would expect: a simple loop and a straight move of reference value, nothing more.

Now let's look at the IL we get from the Costly method:
.method public hidebysig 
    instance void Costly () cil managed 
{
    // Method begins at RVA 0x20ec
    // Code size 53 (0x35)
    .maxstack 1
    .locals init (
        [0] valuetype [mscorlib]System.Collections.Generic.List`1/Enumerator<object>
    )

    IL_0000: ldarg.0
    IL_0001: ldfld class [mscorlib]System.Collections.Generic.List`1<object> performanceDemo.DemoRefType::intList
    IL_0006: callvirt instance valuetype [mscorlib]System.Collections.Generic.List`1/Enumerator<!0> class [mscorlib]System.Collections.Generic.List`1<object>::GetEnumerator()
    IL_000b: stloc.0
    .try
    {
        IL_000c: br.s IL_001b
        // loop start (head: IL_001b)
            IL_000e: ldloca.s 0
            IL_0010: call instance !0 valuetype [mscorlib]System.Collections.Generic.List`1/Enumerator<object>::get_Current()
            IL_0015: unbox.any [mscorlib]System.Int32
            IL_001a: pop

            IL_001b: ldloca.s 0
            IL_001d: call instance bool valuetype [mscorlib]System.Collections.Generic.List`1/Enumerator<object>::MoveNext()
            IL_0022: brtrue.s IL_000e
        // end loop

        IL_0024: leave.s IL_0034
    } // end .try
    finally
    {
        IL_0026: ldloca.s 0
        IL_0028: constrained. valuetype [mscorlib]System.Collections.Generic.List`1/Enumerator<object>
        IL_002e: callvirt instance void [mscorlib]System.IDisposable::Dispose()
        IL_0033: endfinally
    } // end handler

    IL_0034: ret
} // end of method DemoRefType::Costly


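Well, well, well... it's many lines longer and it contains some quite nasty logic. As we can see, it obtains a generic enumerator (IL_0006) that finally gets disposed (IL_0026 to IL_002e). In this listing the enumerator is a valuetype held in a local, so it is not a heap allocation by itself; the load on the GC (Garbage Collector) shows up when the same loop runs over an interface such as IEnumerable<T>, where the enumerator gets boxed. Is that it? Not really! We can also see (IL_0015) the nasty unbox.any operation, which comes from declaring the loop variable as int over a List<Object>: every element goes through a type check and a copy. Please also note how the loop is wrapped in a try block with a finally handler, so that the enumerator is disposed even if something goes wrong (here, most likely an invalid cast)... not really code we would write in the first place, and still we get it just by using a foreach.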
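
To make the IL easier to read, here is a rough hand-written C# equivalent of what the compiler emits for the foreach in Costly. It is a sketch, not the compiler's exact output: the `ForeachExpansion` class and the `Expanded` method are hypothetical names, and the method returns `a` only so that the result can be inspected.

```csharp
using System;
using System.Collections.Generic;

public static class ForeachExpansion
{
    // Approximately what "foreach (int x in list) a = x;" turns into.
    public static object Expanded(List<object> list)
    {
        object a = 0;
        // GetEnumerator() on List<T> returns a struct, stored in a local (IL_0006, IL_000b).
        List<object>.Enumerator e = list.GetEnumerator();
        try
        {
            while (e.MoveNext())          // IL_001d
            {
                int x = (int)e.Current;   // get_Current + unbox.any (IL_0010, IL_0015)
                a = x;                    // assigning an int to object boxes it again
            }
        }
        finally
        {
            e.Dispose();                  // runs even if the cast above throws
        }
        return a;
    }
}
```

If the list contains anything that is not a boxed int, the cast throws an InvalidCastException and the finally block still disposes the enumerator, which is exactly the shape of the try/finally in the IL.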

So, imagine having a few of these in your game logic executing at every frame... obviously it's never simple code like in this example, so it will be way nastier than the result shown above.

We already struggle so much to keep our games above 30 FPS while presenting beautiful artwork (really costly to render) and a lot of nice VFX (visual effects, definitely costly), and we all love to rely on the underlying physics engine to improve the overall gaming experience. All that costs quite a lot, so when it comes to the game logic we have to write, every clock cycle is valuable and we cannot afford to waste any of them. So let's remember two rules of thumb:
  • Language helpers that make it easier to code come with a performance cost
  • Always verify the efficiency of your coding habits by looking into the generated IL code
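
Reading the IL tells you what the code does; timing it tells you what it costs on your target machine. The following is a minimal sketch of such a measurement using System.Diagnostics.Stopwatch. The `LoopTiming` class, its method names, the list size and the iteration count are all assumptions made for illustration, and the numbers it prints will vary with hardware, runtime and build configuration, so none are quoted here.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

public static class LoopTiming
{
    // Runs the action 'iterations' times and returns the elapsed Stopwatch ticks.
    public static long Measure(Action action, int iterations)
    {
        action(); // warm-up call, so JIT compilation is not part of the measurement
        Stopwatch watch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
            action();
        watch.Stop();
        return watch.ElapsedTicks;
    }

    public static long SumWithForeach(List<object> list)
    {
        long sum = 0;
        foreach (int x in list)      // enumerator plus an unbox per element
            sum += x;
        return sum;
    }

    public static long SumWithFor(List<object> list)
    {
        long sum = 0;
        for (int i = 0; i < list.Count; i++)
            sum += (int)list[i];     // same unbox, but no enumerator
        return sum;
    }

    // Call this from your own entry point to print both timings.
    public static void Run()
    {
        var list = new List<object>();
        for (int i = 0; i < 100000; i++)
            list.Add(i);

        long foreachTicks = Measure(() => SumWithForeach(list), 100);
        long forTicks = Measure(() => SumWithFor(list), 100);
        Console.WriteLine("foreach: " + foreachTicks + " ticks, for: " + forTicks + " ticks");
    }
}
```

Both loops here perform the same cast, so the difference that remains is the cost of the enumerator itself. Measure a release build outside the debugger, since debug builds distort this kind of comparison.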
In the game industry we are all aiming at improving gamers' experiences, making them as immersive as technically possible. Gamers are quite demanding, so let's make sure that we always keep performance testing at the top of our coding practice, because losing even one frame per second can be a failure factor from a market perspective.
© Copyright Giuseppe "Pino" De Francesco - 2016