ooperman asked:
How can I speed up .NET app load time when the machine is freshly booted?

I have a Windows Forms app written in C# that takes approximately 12 seconds to load on a fast machine (P4 2.4 GHz, 1 GB memory, Windows XP Pro) when run for the first time after booting.  After I've run it once, the next run takes about 6 seconds.  I made the following optimizations:
1.  Added all strongly-named assemblies to the GAC on the target machines (mainly Infragistics and ComponentOne controls) so that the CLR doesn't re-verify their strong-name signatures every time the app is loaded.
2.  NGen'd ALL of the application's assemblies and dependent assemblies.

These improvements dropped the 6-second time to about 3.5 seconds, which is very acceptable.  However, after a fresh boot, and after waiting for the hard drive to stop chirping and the CPU to settle down, it again takes 12 seconds to load the application.  The second time I run it, I'm back to about 3.5 seconds.  The problem is that this application is typically run only once after booting, so the subsequent load times are largely irrelevant.

This is the fastest of my target machines.  I have lower-end machines (233MHz PII/128MB memory) that take a full minute to load even after the optimizations.

The Compuware DevPartner profiler shows that roughly 50% of the time is spent in obvious user interface code (Infragistics, ComponentOne, System.Drawing, System.Windows.Forms, etc.).

The CLR JIT performance counter shows that without NGen, roughly 5,700 methods are JITted every time I start this app, and another 100 or so when shutting it down (???).  Even when I NGen every assembly, I still take a hit for about 500 to 600 JITted methods.

Am I just seeing the results of disk caching?  The disk is being hit quite a bit the first time I run it, but hardly at all the subsequent times.  The performance continues to be good, even after logout/login (same user) as long as the machine is not rebooted.  I have observed this behavior on both machines.

On another machine (PIII 1 GHz with plenty of memory), the load time after booting is always about 25 seconds, but about 8-10 seconds on subsequent invocations.

Am I just stuck with slow initial load times for .NET apps, or are there more optimizations I can do?

Thanks,
Tim.
_TAD_:
The biggest difference between Java and .NET is when things are compiled to machine code.

Java and .NET both compile their assemblies to an intermediate language (IL).  Java then converts the IL to machine code as needed; .NET converts ALL of the IL to machine code at startup.

Not that you particularly care about Java, but this should give you some sense as to when things happen in .NET.


So little things like having your function calls inlined can help, as well as NGen and strong names (which you've done).    Also, are you running these tests in DEBUG or RELEASE mode?  As I'm sure you're aware, DEBUG adds a ton of overhead even if you are not running in the IDE.
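If you want to double-check from code whether a particular assembly was built with /debug, a rough sketch (just reflection over DebuggableAttribute; the path argument is whatever assembly you want to inspect) looks like this:

using System;
using System.Diagnostics;
using System.Reflection;

class BuildModeCheck
{
    static void Main(string[] args)
    {
        // args[0] = path to the assembly to inspect
        Assembly asm = Assembly.LoadFrom(args[0]);
        object[] attrs = asm.GetCustomAttributes(typeof(DebuggableAttribute), false);

        if (attrs.Length == 0)
        {
            Console.WriteLine("No DebuggableAttribute - looks like a release build.");
        }
        else
        {
            DebuggableAttribute dbg = (DebuggableAttribute) attrs[0];
            // a debug build typically has the JIT optimizer disabled
            Console.WriteLine("JIT tracking enabled:   " + dbg.IsJITTrackingEnabled);
            Console.WriteLine("JIT optimizer disabled: " + dbg.IsJITOptimizerDisabled);
        }
    }
}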


Let's see... also, how are you calling your functions?   This doesn't affect load time per se, but it can affect performance.   Fully qualified names are slower IF the library is not loaded as a pre-req.

That is...

using System;
using System.Data;
...
    for (int i = 0; i < 1000; i++)
    {
        DataTable dt = new DataTable();
    }

is MUCH faster than

using System;
...
    for (int i = 0; i < 1000; i++)
    {
        System.Data.DataTable dt = new System.Data.DataTable();
    }





Here is a list of other performance considerations... again, it's not really pertinent to your question exactly, but you may find it useful.
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dndotnet/html/dotnetperftechs.asp
Hi _TAD_

Actually, I don't think the two samples you posted are different. The IL produced in both cases is EXACTLY the same.
Can you elaborate on how one can be faster than the other?

I'm under the impression that the using directive is just syntactic sugar to save you from typing full type names. It's used ONLY at compile time.
If anything, compilation is marginally slower when NOT using full type names, because the compiler has to resolve the symbol by prepending each using directive to try to form a full type name and look it up in the assemblies passed in for compilation.

Vasco


They will produce identical IL code; the savings comes in when the IL is JIT-compiled and executed.


Earlier, when I said that .NET converts ALL IL code to machine code, I was speaking at the method level.


Here's how it works (as I understand it).

You load a class... the class-level references are loaded into memory (e.g. the System.Data library).

While that class is active (on the heap), that library stays in memory.

Now a method is called: the ENTIRE method's IL is converted to machine code, and any library references are loaded into memory.  These references stay in memory for the lifetime of the method.   Of course, this changes when you are inlining functions: inlined functions are simplified and treated as if they were part of the calling function.

So simply wrapping my above example in a couple of function calls won't make a difference either.     In order to test the differences, you have to wrap my examples in two different functions, and those functions (once compiled) need to be larger than 32 bytes of IL (so they are not inlined).  Then time a few thousand repetitions of each and you will see a significant difference in execution time.
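If you don't want to rely on the 32-byte heuristic, and your framework version has MethodImplOptions.NoInlining, you can also just tell the JIT not to inline the test methods.  A sketch (the method names are made up):

using System.Runtime.CompilerServices;

public class InlineTest
{
    // NoInlining keeps the JIT from folding this method into its caller,
    // so its per-method compile/load cost stays visible in the timings.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public void ClassLevelReferences()
    {
        // loop using "using System.Data;" style references goes here
    }

    [MethodImpl(MethodImplOptions.NoInlining)]
    public void FullyQualifiedReferences()
    {
        // loop using System.Data.DataTable style references goes here
    }
}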


Here's another question I answered where a performance issue was cleared up simply by moving the library references around.

https://www.experts-exchange.com/questions/20883952/Access-AS-400-db-from-NET-too-slow.html


Of course I'm not suggesting that using global references is the cure-all for performance issues.  I am merely suggesting that if you are going to use a library over and over across many different functions, it's probably a good idea to make the library reference global (class level).  It uses more RAM and holds the library in memory longer, but there is less load-time overhead.  What's more important, RAM or time?  It really depends on the scenario: taking up 100 KB of RAM for the life of your program to save 30 ms is not worth it, but taking 1 MB of RAM to save 6 seconds on every database call... that's worth it to me (considering I sometimes make "chatty" calls to a database... a bad programming habit, but I'm of the 'on demand' attitude).
Hi

_TAD_, I don't mean to be picky :), but I still don't understand how two identical assemblies (differing only in MVID) that have the same manifest can be JIT-compiled differently.

Again, at the end of the day, .NET needs the full type name of every type to compile it, and the using directive has zero impact on how you import an assembly (note there's no using statement in IL); it merely saves you from typing the full name every time and, in the process, makes the code more readable. (The using statement is also used for IDisposable, but that's outside this context.)
Assembly referencing is something you specify with /r: or with al.exe. I can't say for 100% sure, but I even think the compiler discards references to assemblies you don't use. (That's what actually happens every time you compile with the .rsp file that references a slew of assemblies: your target assembly's manifest doesn't list every reference, only the ones whose types you actually use in your code.)

Can you send a sample? I really don't understand how the exact same IL can behave differently.

What's a global reference? (When you reference an assembly, you're saying that during compilation the compiler should also look at that assembly to find the needed types. Isn't it always global?)

thanks for the answer,

Vasco

Vasco, good questions!  I too shared your reservation about where a library is referenced and why it would make a difference.  After all, the IL for System.Data.DataTable and for "using System.Data; ... DataTable" is exactly the same.

In file size, it makes no difference.  However... in execution time it *can*.




Here are two classes that do exactly the same thing... they compile to the same IL code and everything:

<High Performance>

using System;
using System.Data;
using System.IO;

namespace PerformanceTesting
{
      public class HighPerf
      {
            public HighPerf()
            {
            }
            
            public void ExecuteLoop()
            {
                  DataTable dt;
                  MemoryStream ms;

                  for(int i=0;i<10000;i++)
                  {
                        dt = new DataTable();
                        for(int j=0;j<10;j++)
                              dt.Columns.Add(new DataColumn("Col" + j));
                        ms = new MemoryStream(new byte[10]);
                        dt.Dispose();
                        ms.Close();
                  }
            }      
      }
}



<Low Performance>
namespace PerformanceTesting
{
      public class LowPerf
      {
            public LowPerf()
            {
            }
            
            public void ExecuteLoop()
            {
                  System.Data.DataTable dt;
                  System.IO.MemoryStream ms;
                  
                  for(System.Int32 i=0;i<10000;i++)
                  {
                        dt = new System.Data.DataTable();
                        for (System.Int32 j=0;j<10;j++)
                              dt.Columns.Add(new System.Data.DataColumn("col"+j));
                        ms = new System.IO.MemoryStream(new byte[10]);
                        dt.Dispose();
                        ms.Close();
                  }                  
            }      
      }
}






Then in my forms designer I have two buttons and two data grids (high perf and low perf)



            private void button1_Click(object sender, System.EventArgs e)
            {
                  HighPerf hp;
                  ArrayList lst = new ArrayList();

                  for (int i=0;i<5;i++)
                  {
                        hp = new HighPerf();

                        long start = DateTime.Now.Ticks;
                        hp.ExecuteLoop();
                        long end = DateTime.Now.Ticks;

                        lst.Add(end-start);
                  }
                  lst.Sort();
                  BindSource(lst,dataGrid1);
            }

            private void button2_Click(object sender, System.EventArgs e)
            {
                  LowPerf lp;
                  ArrayList lst = new ArrayList();

                  for (int i=0;i<5;i++)
                  {
                        lp = new LowPerf();

                        long start = DateTime.Now.Ticks;
                        lp.ExecuteLoop();
                        long end = DateTime.Now.Ticks;

                        lst.Add(end-start);
                  }
                  lst.Sort();
                  BindSource(lst,dataGrid2);
            }


            private void BindSource(ArrayList lst, DataGrid grid)
            {
                  DataTable dt = new DataTable();
                  dt.Columns.Add(new DataColumn());
                  
                  DataRow dr;
                  for (int i=0;i<lst.Count;i++)      
                  {
                        dr = dt.NewRow();
                        dr[0] = lst[i];
                        dt.Rows.Add(dr);
                  }

                  grid.DataSource = dt;
            }


Now press each button and compare the times (you may want to run this several times).


You will find that the LOW performance model will always be just a bit slower than the HIGH performance model.  

However, note that the FASTEST "low perf" times will be faster than the SLOWEST "high perf" times.

On my computer (512 MB RAM, 2 GHz), the high-perf version consistently tracked in the 135,000 to 145,000 tick range,

while the low-perf version consistently tracked in the 139,000 to 149,000 tick range.


And my example only uses three small class libraries (System, System.Data, and System.IO {not reading/writing to disk}).

What if you have a class that uses a dozen different libraries with dozens of different objects?
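One caveat about the numbers above: DateTime.Now typically only updates every 10-15 ms on Windows, and 140,000 ticks is roughly 14 ms, so these timings sit right at the edge of the clock's resolution.  If you want something finer on .NET 1.x (there's no Stopwatch class yet), here's a hedged sketch of a P/Invoke timer around QueryPerformanceCounter:

using System;
using System.Runtime.InteropServices;

public class HiResTimer
{
    [DllImport("kernel32.dll")]
    static extern bool QueryPerformanceCounter(out long value);

    [DllImport("kernel32.dll")]
    static extern bool QueryPerformanceFrequency(out long value);

    long start, stop, freq;

    public HiResTimer()
    {
        QueryPerformanceFrequency(out freq);
    }

    public void Start() { QueryPerformanceCounter(out start); }
    public void Stop()  { QueryPerformanceCounter(out stop); }

    // elapsed time in milliseconds
    public double ElapsedMs
    {
        get { return (stop - start) * 1000.0 / freq; }
    }
}

Then wrap the ExecuteLoop() calls in Start()/Stop() instead of reading DateTime.Now.Ticks.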




Interesting side note:

change the code from:

long start = DateTime.Now.Ticks;
lp.ExecuteLoop();
long end = DateTime.Now.Ticks;


to:

long start = DateTime.Now.Ticks;
lp.ExecuteLoop();
lp.ExecuteLoop();
lp.ExecuteLoop();
long end = DateTime.Now.Ticks;


For both high perf and low perf, the performance is nearly a dead heat, with the low-performance version a bit faster (???).

There must be some additional optimization going on when re-executing the same function/method.  Since the JIT converts a method's IL to machine code the first time the method is called (not on every call), that conversion cost is only paid once, and the repeat calls run the already-compiled code.

Pretty nifty either way... the same IL code executes at different speeds based on whether the library is referenced at the class level or at the function/method level.
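If you want to keep that one-time cost out of the measurement, a simple tweak to the button handlers is an untimed warm-up call before starting the clock, roughly:

// untimed warm-up: pays the one-time JIT / assembly-load cost up front
lp.ExecuteLoop();

// this now measures only the "steady state" execution
long start = DateTime.Now.Ticks;
lp.ExecuteLoop();
long end = DateTime.Now.Ticks;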


I found that changing the code slightly:

   private void button2_Click(object sender, System.EventArgs e)
          {
               LowPerf lp;
               ArrayList lst = new ArrayList();

               for (int i=0;i<5;i++)


...


change the loop to


               for (int i=0;i<15;i++)



It takes a lot longer, but you will see that all of the numbers with the fully qualified names are pretty consistent.    The other way (not fully qualified), it gets "out of the gate" a little slower, but over time it has better performance times.

So I guess my initial statement still stands... fully qualified names can be slower than libraries declared at the class level.  Of course, if you have (for example) only ONE call to a particular library and it only uses ONE class, then that 1/100 of a ms delay is probably tolerable so that you are not loading and retaining that library in memory for the duration of your program.
SOLUTION from vascov

ASKER CERTIFIED SOLUTION
ooperman (Asker):
Actually, this discussion has been very informative and I appreciate the input from both of you.  To follow up, I completely disconnected the user interface in this application so I could benchmark the business logic.  It turns out that 60 to 80% of the time is spent building the user interface, most of which is spent loading Infragistics controls.  I am now focusing on optimizing our use of the controls and looking for opportunities to speed things up using Assembly.LoadFrom in certain places.
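Roughly what I have in mind for the Assembly.LoadFrom part (the file and type names below are just placeholders) is to pull in a rarely-used assembly only when its feature is first invoked:

using System;
using System.Reflection;
using System.Windows.Forms;

public class ReportLauncher
{
    // kept as a plain object so the reports assembly isn't
    // referenced (and therefore loaded) until it's actually needed
    private object reportForm;

    public void ShowReports()
    {
        if (reportForm == null)
        {
            // "Reports.dll" and "MyApp.Reports.ReportForm" are hypothetical names
            Assembly asm = Assembly.LoadFrom("Reports.dll");
            Type formType = asm.GetType("MyApp.Reports.ReportForm");
            reportForm = Activator.CreateInstance(formType);
        }

        ((Form) reportForm).Show();
    }
}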

We are loading a lot of assemblies:  6 that we have written, 8 from Infragistics, and 3 from ComponentOne.

I should (hope to!) be able to use the DevPartner profiler and the .NET CLR performance counters to further optimize the managed code.  I wanted to be sure I wasn't missing something obvious about the initial loading of .NET apps in general.
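In case anyone wants to watch the same counter from code rather than perfmon, reading the JIT counter looks roughly like this (I'm assuming the usual ".NET CLR Jit" category, and that the instance name is the process name without ".exe"):

using System;
using System.Diagnostics;

class JitCounterDump
{
    static void Main()
    {
        // "MyWinFormsApp" is a placeholder for the actual process name
        PerformanceCounter jitted = new PerformanceCounter(
            ".NET CLR Jit", "# of Methods Jitted", "MyWinFormsApp");

        Console.WriteLine("Methods JITted so far: " + jitted.NextValue());
    }
}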

Thanks for your input!


Have you considered multi-threading your application?

If the time is spent loading *visible* controls... well, then you have to streamline the controls themselves.  However, if you are loading a ton of controls behind the scenes that are not used right away, you may be able to load them in a separate thread into memory and then render those objects when an event occurs that displays the hidden controls.
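A hedged sketch of that "render when an event occurs" idea, deferring a heavy control until its tab is first shown (the control and tab names are made up, and the control itself is still created on the UI thread):

// build the heavy grid only when its tab is first selected,
// instead of in InitializeComponent at startup
private bool detailTabBuilt = false;

private void tabControl1_SelectedIndexChanged(object sender, System.EventArgs e)
{
    if (tabControl1.SelectedTab == detailTabPage && !detailTabBuilt)
    {
        DataGrid grid = new DataGrid();
        grid.Dock = DockStyle.Fill;
        detailTabPage.Controls.Add(grid);
        detailTabBuilt = true;
    }
}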



vascov>  I, like you, was a doubter when it came to libraries and the IL code, etc., etc., but a few experiments convinced me.  Either I'm missing something, or we have a disconnect somewhere.  I'd like to continue this discourse (and invite others for their opinions).


https://www.experts-exchange.com/questions/20932502/Does-it-matter-where-I-add-my-library.html

When I get a little time, I'll post a few more comments/observations.  I'd like your input.


Thanks!