I currently have a cgi Perl program that can allow a user to search a flat file database. The script then returns the answer in the form of a url, linking them to the site where the matched words were found (i.e. a basic search engine). The problem is, that the database must be created manually.
I require a web robot that can crawl over a given site, following any links found there (to a given depth, and/or, including a given string). The web crawler should then create a flat file database of the parsed html, and the url where the words were found. I could then use my current search engine to allow users to query a site. Are there any robots available that could be easily modified to provide this functionality? I hav contacted some contractors about providing me with one, and was quoted in the region of $600, I would obvciously prefer it if I could find somehting cheaper.
-The robot and search engine will not be used commercially.