Solved

search algorithms for searching a body of text

Posted on 2003-12-06
4
325 Views
Last Modified: 2010-04-17
Im looking for an efficent search algorithm which searches a body of text (under 3,000 words) looking for keywords.

What would be the best algorithm to employ.

The body of text may be a XML document (in which case id like to be able to search the XML elements eg search for 'Alan Turing' in the element tag author)

Thanks in adavance for any pointers
0
Comment
Question by:mellowmoose
4 Comments
 

Author Comment

by:mellowmoose
ID: 9888361
To clarify a few points.

The document will be one ive never seen before.

I'd prefer to be able to search the XML b4 I parse it (if the keyowrds dont match then XML doc will not be (parsed)

Ill be using DOM parsing.

Thanks
0
 
LVL 45

Accepted Solution

by:
sunnycoder earned 125 total points
ID: 9888431
Hi mellowmoose,

since you are searching only under 300 words, even linear search should be fast enough on todays hardware

however, if you do wish take the troble of implementing efficient algorithms, I would recommend Aho Corasick algorithm

you can see a demo here
http://www-sr.informatik.uni-tuebingen.de/~buehler/AC/AC1.html

details of the algorithm
http://courses.cs.vt.edu/~algnbio/algnbio_2001/lectures/AhoCorasick.html

Cheers!
Sunny:o)
0

Featured Post

Free Tool: Site Down Detector

Helpful to verify reports of your own downtime, or to double check a downed website you are trying to access.

One of a set of tools we are providing to everyone as a way of saying thank you for being a part of the community.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Suggested Solutions

A short article about a problem I had getting the GPS LocationListener working.
This is about my first experience with programming Arduino.
In this fourth video of the Xpdf series, we discuss and demonstrate the PDFinfo utility, which retrieves the contents of a PDF's Info Dictionary, as well as some other information, including the page count. We show how to isolate the page count in a…

733 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question