Solved

Search algorithms for finding keywords in a body of text

Posted on 2003-12-06
Medium Priority
Last Modified: 2010-04-17
I'm looking for an efficient search algorithm that searches a body of text (under 3,000 words) for keywords.

What would be the best algorithm to employ?

The body of text may be an XML document, in which case I'd like to be able to search within specific XML elements (e.g. search for 'Alan Turing' in the author element).

Thanks in advance for any pointers.
Question by:mellowmoose
4 Comments
 

Author Comment

by:mellowmoose
ID: 9888361
To clarify a few points.

The document will be one I've never seen before.

I'd prefer to be able to search the XML before I parse it (if the keywords don't match, the XML document will not be parsed).

I'll be using DOM parsing.
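
The "search first, parse only on a match" idea could be sketched roughly as follows. This is only an illustration in Python using the standard library's `xml.dom.minidom`; the function name `author_matches` and the default tag name `author` are assumptions made up for this example. Note that a raw-text pre-filter can miss a keyword that is written with entity references or split across child elements.

```python
from xml.dom.minidom import parseString


def author_matches(xml_text, keyword, tag="author"):
    """Return True if `keyword` appears in the text of any <tag> element.

    Hypothetical helper: the name and the default tag are assumptions.
    """
    # Cheap pre-filter: skip DOM parsing when the keyword is absent
    # from the raw document text.
    if keyword not in xml_text:
        return False

    # Only now pay the cost of building the DOM.
    doc = parseString(xml_text)
    for node in doc.getElementsByTagName(tag):
        # Collect the direct text children of the element.
        text = "".join(
            child.data
            for child in node.childNodes
            if child.nodeType == child.TEXT_NODE
        )
        if keyword in text:
            return True
    return False
```

The raw-text check only rules documents out; a keyword found elsewhere in the document (say, in a title element) still has to be confirmed against the element of interest after parsing.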

Thanks

Accepted Solution

by:
sunnycoder earned 500 total points
ID: 9888431
Hi mellowmoose,

Since you are searching fewer than 3,000 words, even a linear search should be fast enough on today's hardware.
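
As a rough sketch of what the linear approach might look like (Python here, purely for illustration; the function name `find_keywords` and the case-insensitive matching are assumptions, not something the question specifies):

```python
def find_keywords(text, keywords):
    """Return the set of keywords that occur somewhere in `text`.

    Hypothetical helper: one substring scan per keyword,
    ignoring case (an assumption for this sketch).
    """
    lowered = text.lower()
    return {kw for kw in keywords if kw.lower() in lowered}
```

Each keyword triggers its own pass over the text, so the cost grows with the number of keywords, which is unlikely to matter for a document this small.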

However, if you do wish to take the trouble of implementing an efficient algorithm, I would recommend the Aho-Corasick algorithm, which finds all keywords in a single pass over the text.

You can see a demo here:
http://www-sr.informatik.uni-tuebingen.de/~buehler/AC/AC1.html

Details of the algorithm:
http://courses.cs.vt.edu/~algnbio/algnbio_2001/lectures/AhoCorasick.html
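
For a feel of how Aho-Corasick fits together, here is a minimal sketch in Python. It is not taken from the pages linked above; the function names `build_automaton` and `search` and the list-based state tables are assumptions for this example, and matching is case-sensitive. It builds a trie of the keywords, adds failure links with a breadth-first pass, then scans the text once.

```python
from collections import deque


def build_automaton(keywords):
    """Build goto/fail/output tables for the given keywords (sketch)."""
    goto = [{}]   # goto[s]: char -> next state; state 0 is the root
    fail = [0]    # fail[s]: state to fall back to on a mismatch
    out = [[]]    # out[s]: keywords that end at state s

    # Phase 1: insert every keyword into the trie.
    for kw in keywords:
        if not kw:
            continue  # ignore empty keywords
        state = 0
        for ch in kw:
            if ch not in goto[state]:
                goto.append({})
                fail.append(0)
                out.append([])
                goto[state][ch] = len(goto) - 1
            state = goto[state][ch]
        out[state].append(kw)

    # Phase 2: breadth-first pass to compute failure links.
    # Depth-1 states keep fail = 0 (the root).
    queue = deque(goto[0].values())
    while queue:
        state = queue.popleft()
        for ch, nxt in goto[state].items():
            queue.append(nxt)
            f = fail[state]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(ch, 0)
            # Inherit keywords that end at the fallback state.
            out[nxt] = out[nxt] + out[fail[nxt]]
    return goto, fail, out


def search(text, automaton):
    """Return (start_index, keyword) for every match, in scan order."""
    goto, fail, out = automaton
    state = 0
    hits = []
    for i, ch in enumerate(text):
        # Follow failure links until a transition on ch exists.
        while state and ch not in goto[state]:
            state = fail[state]
        state = goto[state].get(ch, 0)
        for kw in out[state]:
            hits.append((i - len(kw) + 1, kw))
    return hits
```

For example, `search("ushers", build_automaton(["he", "she", "his", "hers"]))` reports "she" at index 1 and both "he" and "hers" at index 2. The text is read only once no matter how many keywords there are, which is the main advantage over repeating a linear scan per keyword.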

Cheers!
Sunny:o)