

Did you know that the C# foreach statement is your enemy in game development?

Performance in game development is paramount: every microsecond counts when everything has to fit in less than 33 ms per frame (ideally 16 ms). The C# foreach statement is one of the worst performance killers, and here I explain why.
When a developer lands in the games industry, they have to change their mindset about performance. In this industry we have to perform a lot of operations in less than 33 milliseconds (30 FPS, frames per second), possibly tuning the logic and the art assets to achieve 60 FPS on standalone (Windows/Linux/Mac) and consoles (Xbox One/PS4), and that means rendering the scene content and computing physics and game logic in no more than 16 milliseconds! Not really an easy task, which is why in our industry every CPU tick counts.

So, what about the foreach statement? Well, this one is really bad: it can burn hundreds of CPU ticks just to let the programmer write less code! You think I'm exaggerating here? Let's have a look at some code to see the evidence.

Let's open Visual Studio (I'm using VS2015 Enterprise to produce the compiled code below), create a simple C# console app, and write the following very simple code:
    using System;
    using System.Collections.Generic;

    public class DemoRefType
    {
        public List<Object> intList = new List<Object>();
        public void Costly()
        {
            Object a = 0;
            foreach (int x in intList)   // enumerator-based loop
                a = x;
        }

        public void Cheap()
        {
            Object a = 0;
            for (int i = 0; i < intList.Count; i++)   // index-based loop
                a = intList[i];
        }
    }


That's an easy one, right? At first sight these two methods perform the same job (the only difference is that Costly declares its loop variable as int, so each element is also cast from Object), but one costs a lot more in terms of CPU ticks... let's see why. I use ILSpy (http://ilspy.net/) to look into the compiled code, so let's analyze the IL (intermediate language) I get after Visual Studio builds it.

Let's start with the Cheap method:
.method public hidebysig 
    instance void Cheap () cil managed 
{
    // Method begins at RVA 0x2140
    // Code size 36 (0x24)
    .maxstack 2
    .locals init (
        [0] int32 i
    )

    IL_0000: ldc.i4.0
    IL_0001: stloc.0
    IL_0002: br.s IL_0015
    // loop start (head: IL_0015)
        IL_0004: ldarg.0
        IL_0005: ldfld class [mscorlib]System.Collections.Generic.List`1<object> performanceDemo.DemoRefType::intList
        IL_000a: ldloc.0
        IL_000b: callvirt instance !0 class [mscorlib]System.Collections.Generic.List`1<object>::get_Item(int32)
        IL_0010: pop
        IL_0011: ldloc.0
        IL_0012: ldc.i4.1
        IL_0013: add
        IL_0014: stloc.0
        IL_0015: ldloc.0
        IL_0016: ldarg.0
        IL_0017: ldfld class [mscorlib]System.Collections.Generic.List`1<object> performanceDemo.DemoRefType::intList
        IL_001c: callvirt instance int32 class [mscorlib]System.Collections.Generic.List`1<object>::get_Count()
        IL_0021: blt.s IL_0004
    // end loop

    IL_0023: ret
} // end of method DemoRefType::Cheap


So, nothing odd in the above; it's pretty much what I would expect: a simple loop that reads a reference through the list indexer, nothing more.

Now let's have a look at what we get in IL from the Costly method:
.method public hidebysig 
    instance void Costly () cil managed 
{
    // Method begins at RVA 0x20ec
    // Code size 53 (0x35)
    .maxstack 1
    .locals init (
        [0] valuetype [mscorlib]System.Collections.Generic.List`1/Enumerator<object>
    )

    IL_0000: ldarg.0
    IL_0001: ldfld class [mscorlib]System.Collections.Generic.List`1<object> performanceDemo.DemoRefType::intList
    IL_0006: callvirt instance valuetype [mscorlib]System.Collections.Generic.List`1/Enumerator<!0> class [mscorlib]System.Collections.Generic.List`1<object>::GetEnumerator()
    IL_000b: stloc.0
    .try
    {
        IL_000c: br.s IL_001b
        // loop start (head: IL_001b)
            IL_000e: ldloca.s 0
            IL_0010: call instance !0 valuetype [mscorlib]System.Collections.Generic.List`1/Enumerator<object>::get_Current()
            IL_0015: unbox.any [mscorlib]System.Int32
            IL_001a: pop

            IL_001b: ldloca.s 0
            IL_001d: call instance bool valuetype [mscorlib]System.Collections.Generic.List`1/Enumerator<object>::MoveNext()
            IL_0022: brtrue.s IL_000e
        // end loop

        IL_0024: leave.s IL_0034
    } // end .try
    finally
    {
        IL_0026: ldloca.s 0
        IL_0028: constrained. valuetype [mscorlib]System.Collections.Generic.List`1/Enumerator<object>
        IL_002e: callvirt instance void [mscorlib]System.IDisposable::Dispose()
        IL_0033: endfinally
    } // end handler

    IL_0034: ret
} // end of method DemoRefType::Costly


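Well, well, well... it's many lines longer and it contains some quite nasty logic. As we can see, it obtains a generic enumerator (IL_0006) that gets disposed in a finally block (IL_0026 to IL_002e), and every iteration now goes through two method calls, MoveNext() and get_Current(), instead of a single indexer call. Note that List<T>.Enumerator is a value type (the IL says valuetype), so in this particular case the enumerator itself is not allocated on the heap; the load on the GC (Garbage Collector) appears when the same loop runs over a collection typed as an interface such as IEnumerable<T>, because the enumerator then gets boxed. Is that it? Not really! We can also see (IL_0015) the nasty unbox.any operation: because the loop variable is declared as int while the list holds Object, every single element is type-checked and unboxed, and an element that is not a boxed int throws an InvalidCastException. Please also note how the whole loop is wrapped in a try block, with the finally clause guaranteeing the Dispose() call in case something happens (mostly an invalid cast): not really code we would write in the first place... and still we get it just by using a foreach.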
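To make the IL above easier to read, here is a minimal hand-written sketch of roughly what the compiler expands such a foreach into, next to the indexed equivalent. The names ForeachExpansion, SumWithExpandedForeach and SumWithFor are hypothetical and only serve the illustration, and the methods sum the elements so that the loops have an observable result; the real compiler output may differ in detail.

```csharp
using System.Collections.Generic;

public static class ForeachExpansion
{
    // Hand-written approximation of: foreach (int x in items) sum += x;
    public static int SumWithExpandedForeach(List<object> items)
    {
        int sum = 0;
        List<object>.Enumerator e = items.GetEnumerator(); // struct enumerator
        try
        {
            while (e.MoveNext())          // first call per element
            {
                int x = (int)e.Current;   // second call, plus the unboxing cast
                sum += x;
            }
        }
        finally
        {
            e.Dispose();                  // runs even if the cast above throws
        }
        return sum;
    }

    // Indexed version: one indexer call per element, no enumerator, no try/finally.
    public static int SumWithFor(List<object> items)
    {
        int sum = 0;
        for (int i = 0; i < items.Count; i++)
        {
            sum += (int)items[i];         // the same cast, written explicitly
        }
        return sum;
    }
}
```

Both methods return the same total for a list of boxed ints; the difference is only in how much work happens around each element.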

So, imagine having a few of these in your game logic executing at every frame... obviously it's never simple code like in this example, so it will be way nastier than the result shown above.

We already struggle so much to keep our games above 30 FPS while presenting beautiful artwork (really costly to render) and a lot of nice VFX (visual effects, definitely costly), and we all love to rely on the underlying physics engine to improve the overall gaming experience. All that costs quite a lot, so when it comes to the game logic we have to write, every clock cycle and CPU tick is valuable and we cannot afford to waste any of them. So let's remember two rules of thumb:
  • Language helpers that make it easier to code come with a performance cost
  • Always verify the efficiency of your coding habits by looking into the generated IL code
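Alongside reading the IL, it helps to measure. The following is a minimal timing sketch built on System.Diagnostics.Stopwatch; the class LoopTiming and its methods are hypothetical names, and the list size and repeat count are arbitrary assumptions. Results vary with the runtime, the JIT and the hardware, so treat it as a starting point rather than a proper benchmark harness.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

public static class LoopTiming
{
    public static long SumFor(List<int> items)
    {
        long sum = 0;
        for (int i = 0; i < items.Count; i++)
            sum += items[i];
        return sum;
    }

    public static long SumForeach(List<int> items)
    {
        long sum = 0;
        foreach (int x in items)
            sum += x;
        return sum;
    }

    // Times 'repeats' runs of the given loop after one warm-up call.
    public static TimeSpan Measure(Func<List<int>, long> loop, List<int> items, int repeats)
    {
        loop(items); // warm-up, so JIT compilation is not part of the measurement
        Stopwatch watch = Stopwatch.StartNew();
        for (int r = 0; r < repeats; r++)
            loop(items);
        watch.Stop();
        return watch.Elapsed;
    }

    // Prints the elapsed time of both loop styles over the same data.
    public static void Report()
    {
        var items = new List<int>();
        for (int i = 0; i < 1000000; i++)
            items.Add(i);

        Console.WriteLine("for:     {0} ms", Measure(SumFor, items, 100).TotalMilliseconds);
        Console.WriteLine("foreach: {0} ms", Measure(SumForeach, items, 100).TotalMilliseconds);
    }
}
```

Calling LoopTiming.Report() from the console app's Main, in a Release build, prints one timing per loop style; running it on the target platform matters more than the absolute numbers.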
In the game industry we are all aiming at improving gamers' experiences, making them as immersive as technically possible. Gamers are quite demanding, so let's make sure that we always keep performance testing at the top of our coding practices, because losing even one frame per second can be a failure factor from a market perspective.
© Copyright Giuseppe "Pino" De Francesco - 2016
