Better Approach for full text ranking

Posted on 2007-09-30
Last Modified: 2013-11-26
I have a program that builds a database of keywords associated with email threads.  Users select sets of keywords that uniquely identify an email thread, and the program then runs every email against the database of these "sets of keywords" to produce a ranking of the threads the email might be related to.

Now I notice that my searches are getting slower and slower.  My algorithm is very simple: for each keyword, run an InStr function against the email text and count the number of keywords in each set that match.  Because the number of defined keyword sets keeps increasing, this is taking longer and longer.

Even if I branch this out to multiple threads or "mothball" old keyword sets, it's clear to me that I need a different approach if I expect performance to be acceptable in the long term.  I marvel at the searches that Google does: quick and thorough, with ranking, all on a few keywords.  How can I extend this kind of functionality to my application?  I am writing in Visual Studio, and my keyword sets are stored in a disconnected ADO.NET DataSet, so I don't have a real database backend with full-text capabilities.
Question by:tmesias
    LVL 5

    Assisted Solution

    Maybe you are reading records in a loop. If so, I think you can extract every word from the email's body, insert the words into a temporary table, and join that table to the threads' keyword list.

    I think this will perform well. To get better performance you can read the entire thread list into memory and compare there, but that can take a lot of memory for large lists.
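
    A minimal in-memory sketch of the "extract the words, then join" idea, written in Python for illustration rather than in the asker's .NET code. All names and sample data here are hypothetical. Note one difference from the InStr approach: this matches whole words only, so multi-word keywords or partial-word matches would need extra handling.

```python
import re
from collections import defaultdict

def tokenize(text):
    """Lowercase the text and return the set of distinct words in it."""
    return set(re.findall(r"[a-z0-9']+", text.lower()))

def build_index(keyword_sets):
    """Map each keyword to the ids of the keyword sets that contain it."""
    index = defaultdict(set)
    for set_id, keywords in keyword_sets.items():
        for keyword in keywords:
            index[keyword.lower()].add(set_id)
    return index

def rank_threads(email_text, index):
    """Count matched keywords per keyword set, highest count first."""
    scores = defaultdict(int)
    for word in tokenize(email_text):
        # Only keyword sets that actually contain this word are touched.
        for set_id in index.get(word, ()):
            scores[set_id] += 1
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))

# Hypothetical sample data: keyword-set id -> keywords.
keyword_sets = {
    "thread-invoice": {"invoice", "payment", "overdue"},
    "thread-server": {"server", "outage", "payment"},
}
index = build_index(keyword_sets)
ranking = rank_threads("The invoice is overdue; please send payment.", index)
print(ranking)
```

    The index is built once and reused for every email, so the work per email depends on the number of distinct words in the email rather than on the total number of keyword sets.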
    LVL 14

    Assisted Solution

    Can't you use a WHERE clause in your query to the database rather than doing it in .NET? Or you can use filters on your recordsets (DataTable).
    LVL 16

    Accepted Solution

    I'd use SQL Server Express to store the data.  Then you can just use a WHERE clause with a LIKE filter.
    Alternatively, you can put trace statements in your code to identify where the hang-up is.
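
    A rough sketch of what the WHERE ... LIKE query might look like, assuming a hypothetical table `keyword_sets(set_id, keyword)`. It uses Python's built-in SQLite in place of SQL Server Express so that it runs as-is; on SQL Server the string concatenation would be written with `+` rather than `||`.

```python
import sqlite3

# In-memory SQLite database standing in for SQL Server Express (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE keyword_sets (set_id TEXT, keyword TEXT)")
conn.executemany(
    "INSERT INTO keyword_sets VALUES (?, ?)",
    [
        ("thread-invoice", "invoice"),
        ("thread-invoice", "payment"),
        ("thread-invoice", "overdue"),
        ("thread-server", "server"),
        ("thread-server", "outage"),
        ("thread-server", "payment"),
    ],
)

def rank_threads_sql(conn, email_text):
    """Return (set_id, hits) rows for keywords found in the email text."""
    rows = conn.execute(
        """
        SELECT set_id, COUNT(*) AS hits
        FROM keyword_sets
        WHERE ? LIKE '%' || keyword || '%'
        GROUP BY set_id
        ORDER BY hits DESC, set_id
        """,
        (email_text,),  # passed as a parameter, never concatenated into SQL
    )
    return rows.fetchall()

result = rank_threads_sql(conn, "The invoice is overdue; please send payment.")
print(result)
```

    This is a substring match like InStr, so it still tests every keyword row against each email; it moves the work into the database but does not by itself reduce it. Keywords containing `%` or `_` would be treated as wildcards unless escaped.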
    LVL 7

    Assisted Solution

    Google (and in fact every web search engine) uses full-text search. Read about it here:
