I have written a Perl script that searches our website. It currently searches the titles of 150 pages for the search terms. This takes about 30 to 40 seconds, which I think is quite slow.
The method I use at the moment is the serial file (item entry) method, whereby each document is opened and its text searched, then the next document, and so on.
I would like to use the inverted file (term entry) method whereby a list is generated of all the searchable terms and each term has a corresponding list of documents.
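As a rough illustration of the inverted file idea (not the existing script), a minimal Perl sketch might look like the following. The sample page titles, the function names build_index and lookup, and the term => { document => count } layout are all assumptions made up for this example.

```perl
use strict;
use warnings;

# Hypothetical sample data: page path => page title.
my %titles = (
    'index.html'   => 'Welcome to our website',
    'contact.html' => 'Contact our sales team',
    'sales.html'   => 'Sales and special offers',
);

# Build the inverted (term entry) structure:
#   term => { document => number of times the term occurs in its title }
sub build_index {
    my ($docs) = @_;
    my %index;
    for my $doc (keys %$docs) {
        # Lower-case the title and split it into word-like terms.
        for my $term (lc($docs->{$doc}) =~ /(\w+)/g) {
            $index{$term}{$doc}++;
        }
    }
    return \%index;
}

# Look up one term; returns a sorted list of the documents containing it.
sub lookup {
    my ($index, $term) = @_;
    return sort keys %{ $index->{ lc $term } || {} };
}

my $index = build_index(\%titles);
print join(', ', lookup($index, 'sales')), "\n";
```

With a structure like this, a search is a single hash lookup per term rather than one file open per document. The index would only need rebuilding when pages change.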
My questions are as follows, bearing in mind that the CGI program will be written in Perl:-
1). How should the inverted file list be created? (Format; an example would be appreciated.)
2). How do I search the file list using a Perl script and then return the matches?
3). How do I provide some sort of description to go with each document?
4). Any ideas on best-match retrieval using some sort of relevance ranking?
I know that this is quite a lot and probably very difficult, hence I am offering a lot of points.