Solved

Searching html files

Posted on 1997-07-12
Last Modified: 2013-12-25
I'm trying to write a search engine in C for several WWW pages. The idea is that the search engine will search through these pages and record the number of times the required word is found. It should also return the URLs of the pages where the word was found. I've written an HTML form asking for the search word, but I'm not having much luck opening the required HTML files and searching through them. I don't really know the best way to search through HTML files. Can anyone please help? I originally posted this question in the C programmers' questions area, but was advised to try here instead.
               Phil H.
Question by:ee96m17

Accepted Solution

by:
icd earned 200 total points
I assume you are able to run CGI scripts on your server (always worth asking).

You can find several search engine scripts in C at the following URL.

http://www.cgi-resources.com/

Follow the links to 'Scripts', then 'C', and then 'Search Engines'.

Don't discount scripts written in other languages (such as Perl).

One further point: you have two options. The first is to run the search at the moment the user submits the form. This is OK if there are a small number of pages that are updated frequently.
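As a rough illustration of this first option, here is a minimal sketch in C that counts whole-word, case-insensitive matches in one HTML file while skipping anything inside tags. The function names count_word_in_html and count_word_in_file are my own inventions, not taken from any of the scripts mentioned above. The tag handling is deliberately naive: it assumes no '>' appears inside attribute values, and it does not skip script or comment contents.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: count whole-word, case-insensitive matches of
 * `word` in `html`, ignoring everything between '<' and '>'.
 * Words are runs of letters and digits, so a search word containing
 * punctuation (e.g. "e-mail") will never match. */
size_t count_word_in_html(const char *html, const char *word)
{
    assert(html != NULL && word != NULL);
    size_t wlen = strlen(word);
    size_t count = 0;
    size_t matched = 0;  /* leading chars of the current token equal to word */
    size_t toklen = 0;   /* length of the current token */
    int in_tag = 0;

    if (wlen == 0)
        return 0;

    for (const char *p = html; ; p++) {
        unsigned char c = (unsigned char)*p;
        if (in_tag) {
            if (c == '>')
                in_tag = 0;
            if (c == '\0')
                break;
            continue;
        }
        if (isalnum(c)) {
            if (toklen == matched && matched < wlen &&
                tolower(c) == tolower((unsigned char)word[matched]))
                matched++;
            toklen++;
        } else {
            /* A token just ended: it matches only if every char matched. */
            if (toklen == wlen && matched == wlen)
                count++;
            toklen = matched = 0;
            if (c == '<')
                in_tag = 1;
            if (c == '\0')
                break;
        }
    }
    return count;
}

/* Hypothetical helper: read a whole file into memory and count matches.
 * Returns -1 if the file cannot be opened or memory runs out. */
long count_word_in_file(const char *path, const char *word)
{
    FILE *fp = fopen(path, "rb");
    if (fp == NULL)
        return -1;

    size_t cap = 4096, len = 0, n;
    char *buf = malloc(cap);
    if (buf == NULL) {
        fclose(fp);
        return -1;
    }
    while ((n = fread(buf + len, 1, cap - len - 1, fp)) > 0) {
        len += n;
        if (cap - len - 1 == 0) {          /* buffer full: double it */
            char *tmp = realloc(buf, cap * 2);
            if (tmp == NULL) {
                free(buf);
                fclose(fp);
                return -1;
            }
            buf = tmp;
            cap *= 2;
        }
    }
    buf[len] = '\0';
    fclose(fp);

    long count = (long)count_word_in_html(buf, word);
    free(buf);
    return count;
}
```

A CGI program could call count_word_in_file once for each page it knows about and print the URL of every page where the result is greater than zero. Every request rereads every file, which is why this only suits a small set of pages.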

The second option is probably the most effective: you do what the big Internet search engines do. An independent process periodically scans all your pages and compiles a database of keywords. When the user submits the search form, you can go straight to the database to find the keywords. This is *far* more efficient when the documents change infrequently compared to the number of search requests.
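To give an idea of what such a keyword database might look like, here is a minimal in-memory sketch in C: a hash table mapping each lowercased word to a list of (URL, count) pairs. All names here (Index, Posting, index_add_page, index_lookup) are hypothetical. The sketch assumes the text passed in has already had its HTML tags stripped, and it leaves out saving the index to disk, which the periodic scanning process would need in practice.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

#define NBUCKETS 1024
#define MAX_WORD 63            /* longer words are truncated */

typedef struct Posting {       /* one page containing the word */
    const char *url;           /* not copied: caller keeps it alive */
    unsigned count;
    struct Posting *next;
} Posting;

typedef struct Entry {         /* one distinct word */
    char word[MAX_WORD + 1];
    Posting *postings;
    struct Entry *next;
} Entry;

typedef struct {
    Entry *buckets[NBUCKETS];  /* must start zeroed: Index ix = {0}; */
} Index;

static unsigned hash_word(const char *s)
{
    unsigned h = 5381;
    while (*s)
        h = h * 33 + (unsigned char)*s++;
    return h % NBUCKETS;
}

/* Return the list of pages containing `word`, or NULL if there are none. */
const Posting *index_lookup(const Index *ix, const char *word)
{
    assert(ix != NULL && word != NULL);
    char key[MAX_WORD + 1];
    size_t i;
    for (i = 0; word[i] != '\0' && i < MAX_WORD; i++)
        key[i] = (char)tolower((unsigned char)word[i]);
    key[i] = '\0';
    for (const Entry *e = ix->buckets[hash_word(key)]; e != NULL; e = e->next)
        if (strcmp(e->word, key) == 0)
            return e->postings;
    return NULL;
}

static int index_add_word(Index *ix, const char *key, const char *url)
{
    unsigned h = hash_word(key);
    Entry *e;
    for (e = ix->buckets[h]; e != NULL; e = e->next)
        if (strcmp(e->word, key) == 0)
            break;
    if (e == NULL) {
        e = calloc(1, sizeof *e);
        if (e == NULL)
            return -1;
        strcpy(e->word, key);
        e->next = ix->buckets[h];
        ix->buckets[h] = e;
    }
    /* Pages are indexed one at a time, so if this page already has a
     * posting for the word it is at the head of the list. */
    if (e->postings != NULL && e->postings->url == url) {
        e->postings->count++;
        return 0;
    }
    Posting *p = malloc(sizeof *p);
    if (p == NULL)
        return -1;
    p->url = url;
    p->count = 1;
    p->next = e->postings;
    e->postings = p;
    return 0;
}

/* Split `text` into words and record each one against `url`.
 * Returns 0 on success, -1 if memory runs out. */
int index_add_page(Index *ix, const char *url, const char *text)
{
    assert(ix != NULL && url != NULL && text != NULL);
    char key[MAX_WORD + 1];
    size_t len = 0;
    for (const char *p = text; ; p++) {
        unsigned char c = (unsigned char)*p;
        if (isalnum(c)) {
            if (len < MAX_WORD)
                key[len++] = (char)tolower(c);
        } else {
            if (len > 0) {
                key[len] = '\0';
                if (index_add_word(ix, key, url) != 0)
                    return -1;
                len = 0;
            }
            if (c == '\0')
                break;
        }
    }
    return 0;
}
```

With an index like this, answering a search is a single hash lookup followed by a walk over the matching pages. The page files themselves are not touched at query time.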

I think you will find scripts for both these approaches on the resource I gave above.