I'm working on an ASP.NET C# application that will process most of the database reads and joins in memory using List.Find methods.
I wanted to see if normalizing the database would make a difference in performance in terms of how fast the List.Finds would operate.
So I created three types of class objects:
TypeBoth = ID, Type, Text (Type = Type1 or Type2)
Type1 = ID, Text
Type2 = ID, Text
Then I populated three Lists with randomly generated Text fields:
ListBoth (alternating Type1 and Type2; number of objects = 2X)
ListType1 (number of objects = X)
ListType2 (number of objects = X)
Then I generated "index arrays" (because that's what the application will do) that contain randomly selected IDs from the Lists for Type1 and Type2, and randomly selected ID + Type pairs for TypeBoth.
For speed testing, I set up routines that use each index array entry to Find the associated object in the List. These routines use Find statements like these. For TypeBoth:
wrkTypeBoth = wrkList.Find(i => (i.ID == wrkIndex) && (i.Type == wrkType)); // <== note, two conditions
For Type1 and Type2:
wrkType1 = wrkList.Find(i => i.ID == wrkIndex); // <== note, one condition
wrkType2 = wrkList.Find(i => i.ID == wrkIndex); // <== note, one condition
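For reference, here is a minimal sketch of the setup described above. The property types (int ID, string Type), the helper names (BuildListBoth, BuildListType1, FindBoth, FindType1), and the deterministic Text values are assumptions for illustration only; the real test used randomly generated text, and Type2 is handled the same way as Type1.

```csharp
using System;
using System.Collections.Generic;

// Assumed shapes for the three classes described above.
public class TypeBoth { public int ID { get; set; } public string Type { get; set; } public string Text { get; set; } }
public class Type1 { public int ID { get; set; } public string Text { get; set; } }
public class Type2 { public int ID { get; set; } public string Text { get; set; } }

public static class FindSketch
{
    // Builds 2 * x items, alternating Type1 and Type2.
    // Assumption: each ID appears once per type, so ID + Type identifies an item.
    public static List<TypeBoth> BuildListBoth(int x)
    {
        var list = new List<TypeBoth>(2 * x);
        for (int id = 0; id < x; id++)
        {
            list.Add(new TypeBoth { ID = id, Type = "Type1", Text = "t1-" + id });
            list.Add(new TypeBoth { ID = id, Type = "Type2", Text = "t2-" + id });
        }
        return list;
    }

    // Builds x items for the separate Type1 list.
    public static List<Type1> BuildListType1(int x)
    {
        var list = new List<Type1>(x);
        for (int id = 0; id < x; id++)
            list.Add(new Type1 { ID = id, Text = "t1-" + id });
        return list;
    }

    // Two conditions per Find, as in the TypeBoth routine.
    public static TypeBoth FindBoth(List<TypeBoth> wrkList, int wrkIndex, string wrkType)
    {
        return wrkList.Find(i => (i.ID == wrkIndex) && (i.Type == wrkType));
    }

    // One condition per Find, as in the Type1 and Type2 routines.
    public static Type1 FindType1(List<Type1> wrkList, int wrkIndex)
    {
        return wrkList.Find(i => i.ID == wrkIndex);
    }
}
```

Find returns null for these reference types when no item matches, so a missing ID is easy to detect in the test routines.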
I then placed these routines in test loops. The test loop for TypeBoth had twice as many objects in its List as either Type1 or Type2, but the test loop for Type1 + Type2 ran two index read loops, i.e.:
=> find 200 indexes from 500 objects in ListBoth, two conditions per find
=> find 100 indexes from 250 objects in ListType1, one condition per find
=> find 100 indexes from 250 objects in ListType2, one condition per find
I counted all the Finds in both test loops to make sure I was doing the same number of Finds in either case. I checked that the Lists and index arrays had the expected number of records before running the speed test. I used Stopwatch to capture the elapsed times for both test loops.
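The timing protocol can be sketched roughly as follows. This is not my exact code: Row, BuildRows, RandomIds, and TimeFinds are hypothetical names, the fixed seed is an assumption made so that runs are repeatable, and passing null for the types array stands in for the one-condition (ID only) routines.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

public class Row { public int ID; public string Type; }

public static class TimingSketch
{
    // Builds x rows per type name, with IDs 0..x-1 (hypothetical layout).
    public static List<Row> BuildRows(int x, params string[] typeNames)
    {
        var rows = new List<Row>(x * typeNames.Length);
        for (int id = 0; id < x; id++)
            foreach (string t in typeNames)
                rows.Add(new Row { ID = id, Type = t });
        return rows;
    }

    // Picks n random IDs in [0, x); the seed keeps the index array repeatable.
    public static int[] RandomIds(int n, int x, int seed)
    {
        var rng = new Random(seed);
        var ids = new int[n];
        for (int k = 0; k < n; k++) ids[k] = rng.Next(x);
        return ids;
    }

    // Runs every lookup `cycles` times; returns elapsed Stopwatch ticks
    // and reports the number of successful Finds through `found`.
    public static long TimeFinds(List<Row> list, int[] ids, string[] types,
                                 int cycles, out int found)
    {
        found = 0;
        var sw = Stopwatch.StartNew();
        for (int c = 0; c < cycles; c++)
        {
            for (int k = 0; k < ids.Length; k++)
            {
                int wrkIndex = ids[k];
                string wrkType = types?[k]; // null means "match on ID only"
                Row hit = wrkType == null
                    ? list.Find(i => i.ID == wrkIndex)
                    : list.Find(i => (i.ID == wrkIndex) && (i.Type == wrkType));
                if (hit != null) found++;
            }
        }
        sw.Stop();
        return sw.ElapsedTicks;
    }
}
```

In this sketch the TypeBoth loop is one TimeFinds call with 200 ID + Type pairs against BuildRows(250, "Type1", "Type2"), and the Type1 + Type2 loop is the sum of two calls with 100 IDs each against 250-row lists. Running each call once untimed beforehand may reduce the influence of JIT compilation on the first measurement.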
Across a range of record counts, index sizes, and test cycles, TestLoop12 (the Type1 + Type2 loop) always took approximately half the elapsed time of TestLoopBoth.
This tells me that normalizing the data so that the Finds can run with only one condition instead of two conditions creates a significant performance benefit.
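One check that might help separate the two possible causes (the extra condition versus the longer List): List.Find is documented as a linear search, so counting how many items each Find examines shows how much of the difference could come from List length alone. The sketch below is an assumption-laden illustration; Rec, Build, and CountExamined are hypothetical names, and the ID layout (each ID once per type) is assumed.

```csharp
using System;
using System.Collections.Generic;

public class Rec { public int ID; public string Type; }

public static class ScanCountSketch
{
    // Builds x records per type name, interleaved by ID (assumed layout).
    public static List<Rec> Build(int x, params string[] typeNames)
    {
        var list = new List<Rec>(x * typeNames.Length);
        for (int id = 0; id < x; id++)
            foreach (string t in typeNames)
                list.Add(new Rec { ID = id, Type = t });
        return list;
    }

    // Returns how many items List.Find examined before it stopped.
    // Passing null for wrkType means "match on ID only".
    public static long CountExamined(List<Rec> list, int wrkIndex, string wrkType)
    {
        long examined = 0;
        list.Find(i =>
        {
            examined++; // one predicate call per item visited
            return i.ID == wrkIndex && (wrkType == null || i.Type == wrkType);
        });
        return examined;
    }
}
```

With this layout, looking up the last ID + Type pair in a 500-item combined list examines 500 items, while looking up the last ID in a 250-item list examines 250. Comparing these counts against the measured times would indicate whether the second condition itself adds measurable cost.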
My question has two parts:
A) Does this agree with what other people know about the C# List.Find method?
B) Are there any flaws in my test protocol or conclusions?
Any help with this would be appreciated.