

Better Approach for full text ranking

Posted on 2007-09-30
Medium Priority
Last Modified: 2013-11-26
I have a program that builds a database of keywords associated with email threads.  Users select sets of keywords that uniquely identify an email thread; the program then runs every email against the database of these "sets of keywords" to rank the threads the email might belong to.

Now I notice that my searches are getting slower and slower.  My algorithm is very simple: for each keyword, run an InStr function against the email text and count the number of times a set of keywords matches.  Because the number of defined keyword sets keeps increasing, this is taking longer and longer.

Even if I branch this out to multiple threads or "mothball" old keyword sets, it's clear to me that I need a different approach if I expect performance to stay acceptable over the long term.  I marvel at the searches that Google does: quick and thorough, with ranking, all from a few keywords.  How can I bring this kind of functionality to my application?  I am writing in Visual Studio, and my keyword sets are stored in a disconnected ADO.NET DataSet, so I don't have a real database backend with full-text capabilities.
Question by:tmesias

Assisted Solution

fmonroy earned 150 total points
ID: 19988820
Maybe you are reading records in a loop. If so, I think you can extract every word from the email's body, insert the words into a temporary table, and join that data to the threads' list.

I think this will perform well. For better performance you can read the entire threads' list into memory and compare there, but that can take a lot of memory for large lists.
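A minimal sketch of the in-memory variant of this idea, written in Python for brevity rather than in the asker's .NET code. The names `tokenize`, `rank_threads`, and the sample keyword sets are hypothetical. It splits the email into a set of distinct words once, then intersects that set with each thread's keyword set. Note that this matches whole words only, which is stricter than a substring test such as InStr.

```python
import re

def tokenize(text):
    """Lowercase the text and return the set of distinct words in it."""
    return set(re.findall(r"[a-z0-9']+", text.lower()))

def rank_threads(email_text, keyword_sets):
    """Score each thread by how many of its keywords appear in the email."""
    words = tokenize(email_text)  # the email is scanned only once
    scores = {}
    for thread_id, keywords in keyword_sets.items():
        hits = len(words & keywords)  # set intersection replaces per-keyword scans
        if hits:
            scores[thread_id] = hits
    # Highest score first; ties are broken by thread id for a stable order.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))

# Hypothetical sample data, standing in for the keyword-set table.
keyword_sets = {
    "invoice-thread": {"invoice", "payment", "overdue"},
    "server-thread": {"server", "outage", "reboot"},
}
email = "The payment for the overdue invoice was sent before the server reboot."
ranking = rank_threads(email, keyword_sets)
```

This still visits every keyword set for every email, but each visit is a hash-based intersection instead of a repeated scan of the email text.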

Assisted Solution

by:Jai S
Jai S earned 150 total points
ID: 19988837
Can't you use a WHERE clause in your query to the database rather than doing it in .NET? Or you can use filters on your recordsets (DataTable).

Accepted Solution

RobertRFreeman earned 600 total points
ID: 19992160
I'd use SQL Server Express to store the data.  Then you can just use a WHERE clause with a LIKE filter.
Alternatively, you can put trace statements in your code to identify where the hangup is.
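
A rough sketch of the LIKE approach, using Python's built-in `sqlite3` module as a stand-in for SQL Server Express (an assumption made so the example runs without a server). The table name `keywords`, its columns, and the function `rank_with_like` are hypothetical. SQLite concatenates strings with `||`, whereas T-SQL would use `+`.

```python
import sqlite3

# In-memory database standing in for the SQL Server Express store.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE keywords (thread_id TEXT, keyword TEXT)")
conn.executemany(
    "INSERT INTO keywords VALUES (?, ?)",
    [
        ("invoice-thread", "invoice"),
        ("invoice-thread", "payment"),
        ("invoice-thread", "overdue"),
        ("server-thread", "server"),
        ("server-thread", "outage"),
        ("server-thread", "reboot"),
    ],
)

def rank_with_like(conn, email_text):
    """Count, per thread, the keywords that occur as substrings of the email."""
    rows = conn.execute(
        "SELECT thread_id, COUNT(*) AS hits FROM keywords "
        "WHERE ? LIKE '%' || keyword || '%' "  # email text LIKE %keyword%
        "GROUP BY thread_id ORDER BY hits DESC, thread_id",
        (email_text,),
    )
    return rows.fetchall()

email = "The payment for the overdue invoice was sent before the server reboot."
ranking = rank_with_like(conn, email)
```

A LIKE pattern with a leading wildcard still has to test every keyword row against the email, so this mainly moves the scan into the database rather than removing it. Keywords containing `%` or `_` would be treated as wildcards unless escaped.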

Assisted Solution

photowhiz earned 600 total points
ID: 19992597
Google (and in fact all web search engines) use full-text search. Read about it here: http://en.wikipedia.org/wiki/Full_text_search
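
The core data structure behind full-text search is an inverted index: a map from each word to the items that contain it. A minimal sketch of how that could apply here, in Python for brevity (the names `build_index`, `rank_email`, and the sample data are hypothetical, and real engines add stemming, weighting, and on-disk storage that this leaves out):

```python
import re
from collections import Counter, defaultdict

def build_index(keyword_sets):
    """Map each keyword to the set of thread ids whose keyword set contains it."""
    index = defaultdict(set)
    for thread_id, keywords in keyword_sets.items():
        for keyword in keywords:
            index[keyword.lower()].add(thread_id)
    return index

def rank_email(email_text, index):
    """Look up each distinct email word in the index and tally hits per thread."""
    words = set(re.findall(r"[a-z0-9']+", email_text.lower()))
    scores = Counter()
    for word in words:
        for thread_id in index.get(word, ()):  # words that are not keywords cost one lookup
            scores[thread_id] += 1
    # Highest score first; ties are broken by thread id for a stable order.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))

# Hypothetical sample data, standing in for the keyword-set table.
keyword_sets = {
    "invoice-thread": {"invoice", "payment", "overdue"},
    "server-thread": {"server", "outage", "reboot"},
}
index = build_index(keyword_sets)  # built once, reused for every email
email = "The payment for the overdue invoice was sent before the server reboot."
ranking = rank_email(email, index)
```

With this layout the work per email depends on the number of words in the email and the threads they hit, not on the total number of keyword sets defined, which is the growth the question describes. In .NET the same structure could be held in a Dictionary keyed by keyword.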


Question has a verified solution.
