Solved

Interface Implementation causing performance issues

Posted on 2010-09-17
28
512 Views
Last Modified: 2013-12-16
I have refactored a section of code that originally looped through a bunch of objects (1 loop, all the objects where the same object type called "Element").

The new code I have refactored now has implemented advanced object oriented methods, I now have 2 classes that each implement a single interface called ISearchComponent.

the ISearchComponent has a method called "search".

the two classes that implement the interface are

NodeComposite (this holds a List of ISearchComponents, each item is looped through when "search" is called and search is called on each of the items in the list)

LeafComponent (this object is responsible for determining if its true or not, ie, does this text appear in a document?)


A search can consist of multiple nodes and leafs in a large hierarchy, this new refactored code is also highly optimised in terms of business logic because of the way its built, if certain boolean criteria are not fulfilled (ie one Leaf under a "AND" NodeComposite returns false, then it immediately escapes out of the loop and returns false) and so forth, so logically, you would think that this should be much much faster.

It turns out, whatever the problem is, its not, in fact its 4 times slower.  Running the same search on the old version runs in 1 minute 6 seconds, on the new version is runs at 4 minutes 30 seconds, an obviously huge performance difference.

In terms of the code, it does exactly the same as the previous, except now its structured in a tree, there are more than 1 loop because its recursive and theres 2 classes and an interface.

Could it actually be the case that because I have used interfaces and a composite design pattern that its made it slower?
0
Comment
Question by:recruitit
  • 17
  • 8
  • 3
28 Comments
 

Author Comment

by:recruitit
ID: 33700205
also, the search being done is on roughly 6,500 documents, this code is a secondary search layer for Lucene Indexing (as requirements for more advanced search functionality were required)
0
 
LVL 8

Expert Comment

by:Gururaj Badam
ID: 33700370
This is what I've understood so far. Can you tell how you're searching and what's the search criteria?
public interface ISearchComponent
    {
        bool Search();
    }

    public class LeafComponent : ISearchComponent
    {
        public bool Search()
        {
            throw new NotImplementedException();
        }
    }

    public class NodeComposite : ISearchComponent
    {
        private ISearchComponent childNode;

        public bool Search()
        {
            throw new NotImplementedException();
        }
    }

Open in new window

0
 
LVL 8

Expert Comment

by:Gururaj Badam
ID: 33700374
This is what I've understood so far. Can you tell what you're searching for and it's criteria?
public interface ISearchComponent

    {

        bool Search();

    }



    public class LeafComponent : ISearchComponent

    {

        public bool Search()

        {

            throw new NotImplementedException();

        }

    }



    public class NodeComposite : ISearchComponent

    {

        private ISearchComponent childNode;



        public bool Search()

        {

            throw new NotImplementedException();

        }

    }

Open in new window

0
 
LVL 16

Expert Comment

by:kris_per
ID: 33700404


Your new code need to be this much slower. Having a look at the code might help to look for something else as well.

Some of the following things might make the app slow:
casting (boxing/unboxing).
virtual functions
frequent exceptions that are swallowed (caught and ignored)
large string construction
any late binding
Check out this link for performance considerations in .net => http://msdn.microsoft.com/en-us/library/ff647790.aspx

Also check if the same node (both composite and component) nodes are searched multiple times by having a temp count to keep track of number of times it's search is called and finally printing out the counts and look for counts > 1)

0
 
LVL 16

Expert Comment

by:kris_per
ID: 33700407

Sorry; the first line in the last comment should be:

Your new code need NOT to be this much slower.
0
 
LVL 16

Expert Comment

by:kris_per
ID: 33700545

You can also find which methods take more time using Visula Studio Profiler. Here is a link on using VS profiler => http://www.geekzone.co.nz/vs2008/6265
0
 

Author Comment

by:recruitit
ID: 33701242
I have attached all the relevant code files.

Just to clarify, the code below is the original code of the "new version", ie the code I have attached is the refactored code of the original, the example I have given above is from an attempt to make the code below work better, ie I compressed all LeafComponents into one singular Leaf Class and I compressed all Node Classes into the NodeComposite class hoping to achieve some sort of performance gain as it eliminated abstract method calls, but anyway the code below is the code that runs 4 times slower than the original code, the primary funtion we are testing is the search function implemented by all the components below
AndComposite.cs
ISearchComponent.cs
LeafComponent.cs
NodeComposite.cs
NotComponent.cs
OrComposite.cs
PhraseComponent.cs
ProximityComponent.cs
TermComponent.cs
XorComposite.cs
0
 

Author Comment

by:recruitit
ID: 33701259
The NotComponent uses the decorator design pattern, and the rest is a mix of composition and template.
0
 

Author Comment

by:recruitit
ID: 33701283
Sorry for the lack of comments, if you have any questions let me know :)
0
 

Author Comment

by:recruitit
ID: 33702119
Just to clarify again, im only really looking for your opinion on the search function, is there anything in there that you can see that would not be very optimised or would cause slowdown?  (ignoring the regex get matches, as that is identical to the previous version).
0
 
LVL 8

Expert Comment

by:Gururaj Badam
ID: 33704331
Like you can see the area of concern can only be


    foreach (IInequality ie in inequalities)
    {
        if (!ie.equate(hits))
            return false;
    }

Which is not exactly this loop, but the number of entries in inequalities and what's the equate method doing. I didn't care to download all files except few interested ones. It will be nice if you can tell us how many entries can be expected in inequalities and post the equate method logic here.
0
 

Author Comment

by:recruitit
ID: 33714217
in this particular case there are no items in the inequalities list

there are 5 inequality objects, less than, less than or equal, greater than, equal to etc

>
<
>=
<=
==

and the inequality also has a parameter set by the user that is used to compare with the hits, for example, if hits are > user critera, ie 3 > 4 then as a greater than inequality this would return false

but like I said in these tests there are no inequalities in the search so it should skip that loop entirely
0
 

Author Comment

by:recruitit
ID: 33714225
I think that I am going to have to smash all these objects into a singular object, I cant see any other way of getting around the abstraction because I think thats what is causing the issue, its going to be a horrible coding technique but I cant see any other way of escaping it.
0
 

Author Comment

by:recruitit
ID: 33714242
ill keep multiple copies of the different versions of code but just as an experiment to see if it solves the issue I am going to combine them into one object, since then the CLR does not have to calculate which object the calls are being made to which can apparently be a massive performance hit if your making many many function calls like I am doing.
0
How to improve team productivity

Quip adds documents, spreadsheets, and tasklists to your Slack experience
- Elevate ideas to Quip docs
- Share Quip docs in Slack
- Get notified of changes to your docs
- Available on iOS/Android/Desktop/Web
- Online/Offline

 
LVL 8

Expert Comment

by:Gururaj Badam
ID: 33714265
I too had a similar situation where the code refactor actually introduce performance issue. I was able to pin-point the reason by debugging line by line to see where it was taking more time. And hence finally understood the reason. I'll also encourage you to do the same and see if if (!ie.equate(hits)) take more than a second to return.

In my case the problem was lying with Extension Method I've added on a collection. Likewise, see if you have added any new code and removing that (add stub) will make any difference.
0
 

Author Comment

by:recruitit
ID: 33714337
I have just tested the performance using a singular object named search component and it made zero difference, I am going to do as you suggest and test the inequality loop and see if that make any difference
0
 

Author Comment

by:recruitit
ID: 33714419
I have just removed the inequality loops, I removed some extra function calls and placed the code inline, and its running that the same speed, this time 5 minutes 11 seconds, just did a test again on the previous version and it ran in 1 minute 13 seconds, I just dont get what could possibly be slowing it down a whole 4 minutes!

There are no virtual methods or abstracts classes or interfaces anymore, the only thing that remains different is that has more for loops than the first version, and has a composition structure, like I said, the previous version only has 1 loop
0
 
LVL 8

Expert Comment

by:Gururaj Badam
ID: 33714423
You mean nested for loops?

That's obvious performance hit then, isn't it?
0
 

Author Comment

by:recruitit
ID: 33714424
I guess the point is, even though the search is doing the exact same quantity of searching, does splitting up one large loop into many tiner loops effect performance?
0
 
LVL 8

Expert Comment

by:Gururaj Badam
ID: 33714426
I will more interested in seeing the difference b/w older loop structure and new one.
0
 

Author Comment

by:recruitit
ID: 33714560
I just replaced the Lists with Arrays and that made zero difference, I can post the old loop code if you want but its not too pretty :), the old Count method is basically the same as the Search method in the new code, the old code could not handle brackets / complex hierarchy searching but the new system can, the old count method worked on a single plain if that makes any sense

ie.

wordA or wordB and wordC or WordD

but in the new code it was designed for bracketing

(((wordA or wordB) and wordC or wordD) or someothercriteria) or anotherSetOfCritera

etc, as you can see in the criteria above it is multi tiered
public string Count(string Content, bool ReturnAll)

        {

            // instantiate

            sbSearchHits = new StringBuilder("");

            iHits = 0;

            bOrOpen = false;



            int CritCount = sc.Elements.Count;



            SearchElement se;



            for (int s = 0; s < sc.Elements.Count; s++)

            {

                se = (SearchElement)sc.Elements[s];

                

                if (se.NewOr)

                { iHits = 0; }



                switch (se.MatchType)

                {

                    case MatchType.Within:

                        MatchCollection mc = Regex.Matches(content,

                            se.RegexString.ToString(),

                            RegexOptions.Singleline | RegexOptions.IgnoreCase);



                        int nWithin = se.Within;

                        int wHits = 0;



                        foreach (Match m in mc)

                        {

                            if ((m.Groups.Count > 1) && (Regex.Matches(m.Groups[1].Value, @"\b\w+\b").Count <= nWithin))

                            {

                                ++wHits;

                            }

                        }



                        switch (se.Occur)

                        {

                            case Occur.MUST:  //And

                                if ((iHits <= 0) && (bOrOpen)) { if(!ReturnAll)return string.Empty; }//goto nextdoc; }

                                iHits = 0;

                                bOrOpen = false;

                                if (wHits == 0) { if (!ReturnAll)return string.Empty; }//goto nextdoc; }



                                sbSearchHits.Append(wHits);

                                break;

                            case Occur.EITHER:  //Or

                                bOrOpen = true;

                                iHits = iHits + wHits;

                                sbSearchHits.Append(wHits);

                                if ((iHits == 0) && (s + 1 == CritCount)) { if (!ReturnAll)return string.Empty; }//goto nextdoc; }

                                break;

                        }

                        break;



                    case MatchType.Normal:



                        int nHits = Regex.Matches(content,

                            se.RegexString.ToString(),

                            RegexOptions.IgnoreCase | RegexOptions.Singleline).Count;



                        switch (se.Occur)

                        {

                            case Occur.MUST: //And

                                if ((iHits <= 0) && (bOrOpen)) { if (!ReturnAll)return string.Empty; }//goto nextdoc; }

                                iHits = 0;

                                bOrOpen = false;

                                if (nHits == 0) { if (!ReturnAll)return string.Empty; }//goto nextdoc; }



                                sbSearchHits.Append(nHits);

                                break;

                            case Occur.EITHER: //Or

                                bOrOpen = true;

                                iHits += nHits;

                                sbSearchHits.Append(nHits);

                                if ((iHits == 0) && (s + 1 == CritCount)) { if (!ReturnAll)return string.Empty; }//goto nextdoc; }

                                break;

                        }

                        break;

                }



                if (s + 1 != CritCount)

                {

                    sbSearchHits.Append(":");

                    if ((iHits == 0) && (bOrOpen) && ((SearchElement)sc.Elements[s + 1]).NewOr)

                    {

                        { if (!ReturnAll)return string.Empty; }//goto nextdoc; }

                    }

                }

            }

            

            return sbSearchHits.ToString();

        }

Open in new window

0
 

Author Comment

by:recruitit
ID: 33714625
I have also just done a few tests regarding looping , ie, breaking up a big loop into many smaller loops to see if the there is any performance loss by using smaller loops and it made zero difference, I tested a loop that ran 1,000,000 cycles into a loop that did 1,000 cycles and called a function that containes a for loop of 1,000 cycles, and they run at the same speed, so I am really lost here, but there must be something different,
0
 

Author Comment

by:recruitit
ID: 33714939
Something must obviously be wrong, I have eliminated so much code, and not a single thing I have done has made a bit of difference to the speed, it just doesnt make sense, I mean the only thing I can see differently is having this collection of collections composition structure, can that actually cause performance problems?

I have reviewed and analysed the code at least 40 times now comparing it with the previous version and I cant honsetly see why the new one should run any slower!
0
 

Author Comment

by:recruitit
ID: 33714951
Is there any lists out there that show overhead breakdowns? ie, it takes this much time to initialise a for loop, a class, a method, whatever?
0
 
LVL 8

Expert Comment

by:Gururaj Badam
ID: 33714976
Just a suggestion - Rather working on same code I suggest you to implement a new code with old and new model and perform some similar operation to see if there's any performance issue. I know we're doing this in a brute force method now.
0
 

Author Comment

by:recruitit
ID: 33715044
Ok, I have found the issue.  I was analysing the original code and discovered that the number of search results the original code is looping through is only 720, where in the new version it has to loop through over 8000 results, this is because I seem to have screwed up something thats completely unrelated to the code I have written and displayed here!  My god, its always the little things that are a total pain in the rear!  So basically what was happening is a bit of an illusion, the previous code only has to process 720 results compared to the later which is processing over 8000, thus we can see why its taking so much longer!  I am going to quickly attempt to fix this issue, (I am SO glad its not the new codes fault! lol)
0
 
LVL 8

Accepted Solution

by:
Gururaj Badam earned 500 total points
ID: 33715055
Finally, it's good that you found the issue. One suggestion, when refactoring do it pieces and iterative.

Step 1 - Refactor piece of code
Step 2 - Test
Step 3 - Jump to Step 1
0
 

Author Comment

by:recruitit
ID: 33716515
Problem is solved, although its somewhat of an issue related problem, and probably not much use to anyone else, but always check the quantity of data your testing on and never ever assume something is working the way it should, test everything!!
0

Featured Post

How to run any project with ease

Manage projects of all sizes how you want. Great for personal to-do lists, project milestones, team priorities and launch plans.
- Combine task lists, docs, spreadsheets, and chat in one
- View and edit from mobile/offline
- Cut down on emails

Join & Write a Comment

Before You Read The Article Please make sure you understand these two concepts: Variable Scope (http://www.php.net/manual/en/language.variables.scope.php) and Property Visibility (http://www.php.net/manual/en/language.oop5.visibility.php).  And to …
It was really hard time for me to get the understanding of Delegates in C#. I went through many websites and articles but I found them very clumsy. After going through those sites, I noted down the points in a easy way so here I am sharing that unde…
Here's a very brief overview of the methods PRTG Network Monitor (https://www.paessler.com/prtg) offers for monitoring bandwidth, to help you decide which methods you´d like to investigate in more detail.  The methods are covered in more detail in o…
This video demonstrates how to create an example email signature rule for a department in a company using CodeTwo Exchange Rules. The signature will be inserted beneath users' latest emails in conversations and will be displayed in users' Sent Items…

708 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

17 Experts available now in Live!

Get 1:1 Help Now