Solved

Search Engine

Posted on 2001-07-17
Last Modified: 2013-12-25
How do I write a search script that works like google.com?

It should search through the web pages on my site for pages that contain the search criteria and display the result set 10 at a time, with pagination (next and previous).

Question by:augblay
 
LVL 4

Expert Comment

by:ykf2000
ID: 6290291
I do not think that is possible without a spider gathering information and storing it into a database first.
 
LVL 1

Author Comment

by:augblay
ID: 6290692
I don't know Perl, but with ASP or VB it is possible.

Here is pseudocode for it; I hope someone can help.

grab all files that meet criteria store into an array
grab the page request variable, from the requesting page.
if page request variable is blank set it to 1
if page request variable is 1 then
  starting index variable=0
else
  starting index variable=(page number-1) * page size
end if

get ending index which is starting index + page size

if ending index > array length then
   ending index=array length
end if

loop through the array starting with starting index and ending with ending index, at each loop, printing the content

knowing the array length, the page size and the current page, we can tell whether we are on the first or the last page.
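
The pseudocode above can be sketched roughly as follows. This is a minimal illustration in Python rather than the Perl or ASP discussed in the thread; the function name `paginate`, the default page size of 10, and the sample `matches` list are assumptions made for the example, not part of the original post.

```python
def paginate(items, page, page_size=10):
    """Return one page of items (1-based page number) plus prev/next flags."""
    # A blank or invalid page request falls back to page 1.
    try:
        page = int(page)
    except (TypeError, ValueError):
        page = 1
    page = max(page, 1)

    start = (page - 1) * page_size            # page 1 starts at index 0
    end = min(start + page_size, len(items))  # never run past the array length

    return {
        "page": page,
        "results": items[start:end],
        "has_previous": page > 1,      # show a "previous" link
        "has_next": end < len(items),  # show a "next" link
    }


# 25 hypothetical matching pages, purely for illustration
matches = ["page%d.html" % n for n in range(1, 26)]
second = paginate(matches, "2")
print(second["results"][0], second["has_previous"], second["has_next"])
# prints: page11.html True True
```

The two flags are what the last line of the pseudocode describes: they tell the page whether to render the previous and next links.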


i hope this pseudocode helps

thanks

 
 
LVL 8

Expert Comment

by:bebonham
ID: 6290693
you want a search engine for your site only?

had you thought about how you would rank the pages?

 
LVL 1

Author Comment

by:augblay
ID: 6290714
no, i don't want page ranking,
 
LVL 10

Expert Comment

by:makerp
ID: 6292838
You will not be able to search the pages on every request; it would kill the server, especially if there are a lot of pages. What you need is a process that spiders the web (or just your local server) for files. It would run independently of the web server and store the pages it finds in a database. You would then need a script that lets you search that database, like the one below. Writing a web crawler is hardcore programming, best done in C++ or C.

<%
' simple search engine example that takes a string of space separated keywords and then
' searches a simple single table db with a table called table1 with a name column

Set Con = Server.CreateObject("ADODB.Connection")
Con.Open "test_db","",""
' if there is no search string then display the form
IF(Request("search") = "")THEN
     %>
          <B>Please enter keywords separated by a space and select the AND or OR radio button</B>
          <FORM ACTION="<%=Request.ServerVariables("SCRIPT_NAME")%>" METHOD=POST>
               <INPUT TYPE=TEXT SIZE=80 NAME=search><BR>
               OR<INPUT TYPE=RADIO NAME=s_opt VALUE=OR CHECKED>AND<INPUT TYPE=RADIO NAME=s_opt VALUE=AND><BR><BR>
               <INPUT TYPE=SUBMIT VALUE=Search>
          </FORM>
     <%
ELSE
' else the form has been submitted so lets query

     ' only ever allow AND or OR as the logic operator
     op = UCASE(Request("s_opt"))
     IF op <> "AND" THEN op = "OR"
     ' first split the string on spaces to get our keywords
     keywords = Split(Request("search")," ")
     ' get the upper bound of the keywords array
     no = UBOUND(keywords,1)
     ' now loop through building our string
     s = ""
     FOR i = 0 to no
          ' double up single quotes so a keyword cannot break out of the SQL string
          s = s & " name LIKE '%" & Replace(keywords(i),"'","''") & "%' " & op
     NEXT
     ' chop off the remaining logic operator
     s = LEFT(s, LEN(s) - LEN(op))
     ' response.write the stmt for testing
     Response.Write(Server.HTMLEncode("SELECT * FROM table1 WHERE " & s) & "<BR>")
     ' execute it and then display our results
     Set rs = Con.Execute("SELECT * FROM table1 WHERE " & s)
     DO UNTIL rs.EOF
          Response.Write(Server.HTMLEncode(rs("name") & "") & "<BR>")
          rs.MoveNext
     LOOP
END IF
%>
 
LVL 10

Expert Comment

by:makerp
ID: 6337016
do i assume this is another question that will never get graded!
 
LVL 1

Author Comment

by:augblay
ID: 6342882
I will grade it, but this is not the answer I want. There is a CGI search script that my ISP provided; it searches through the pages and displays all matching pages on one page, so you have to scroll for a long time to view them all. What I am asking for is an example in Perl that mimics the pseudocode I provided. I know it will kill the server, but I asked for the script.
 
LVL 10

Expert Comment

by:makerp
ID: 6343899
For a start you need to recursively scan the directories:

# written by paul maker
#
# recursive search: find(path) recursively scans the given path and displays
# every file whose extension is listed in @extensions. you could add these
# files to an array/database etc
#

use strict;
use warnings;
# make use of the cgi package
use CGI;
# new object please
my $query = CGI->new;
# hard coded array of extensions (we could load this up dynamically from a database etc)
my @extensions = (".htm",".html",".txt",".c",".cpp",".h");

# function to check if a filename has an extension we want
sub isValidExtension
{
        my ($filename) = @_;
        # pull out the extension whatever its length (.c, .htm, .html ...)
        my ($ext) = $filename =~ /(\.[^.\/]+)$/;
        return 0 unless defined $ext;
        foreach my $wanted (@extensions)
        {
                return 1 if lc($ext) eq $wanted;
        }
        return 0;
}
# our recursive function
sub find
{
        my ($input) = @_;
        if(-f $input)
        {
                if(isValidExtension($input))
                {
                        # we might want to slap this in a database etc
                        print($query->escapeHTML($input), "\n<BR>");
                }
        }
        elsif(-d $input)
        {
                opendir(my $dh, $input) or return;
                my @files = readdir($dh);
                closedir($dh);
                foreach my $file (@files)
                {
                        # skip the "." and ".." entries or we would recurse forever
                        next if $file eq "." or $file eq "..";
                        find("$input/$file");
                }
        }
}

# make it a cgi to demo the script
print $query->header();
if(!$query->param())
{
 print("
          <FORM ACTION=search.pl METHOD=POST>
               Enter your path on the server to spider :
               <INPUT TYPE=TEXT NAME=path>
               <BR>
               <INPUT TYPE=SUBMIT VALUE=spider>
          </FORM>
     ");
}
else
{
        print("<B>I</B> found the following files :<BR><BR>");
        # scalar() so param() hands back one value, not a list
        find(scalar $query->param('path'));
        print("<BR><A HREF=search.pl>Play Again</A>");
}
 
LVL 10

Expert Comment

by:makerp
ID: 6343909
Now when you find a file you need to open it and scan the text for your match.

In the above example you enter a path; you will want to remove that and hard-code the path, since you know where you want to start.

This script scans files on the local host, which won't kill your server. If you need to scan files on remote servers you will need to use HTTP, which is another kettle of fish.
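
To illustrate the open-and-scan step, here is a minimal sketch in Python rather than Perl. It walks a directory tree and keeps the files whose text contains the search term. The function name `search_site`, the `EXTENSIONS` set, and the plain case-insensitive substring match are all assumptions made for the example.

```python
import os

# Assumed list of extensions worth searching; adjust for your own site.
EXTENSIONS = {".htm", ".html", ".txt"}


def search_site(root, term):
    """Return paths under root whose text contains term (case-insensitive)."""
    term = term.lower()
    hits = []
    # os.walk handles the recursion into sub-directories for us.
    for folder, _subdirs, files in os.walk(root):
        for name in sorted(files):
            # Skip files whose extension we are not interested in.
            if os.path.splitext(name)[1].lower() not in EXTENSIONS:
                continue
            path = os.path.join(folder, name)
            try:
                with open(path, encoding="utf-8", errors="ignore") as handle:
                    if term in handle.read().lower():
                        hits.append(path)
            except OSError:
                continue  # unreadable file: skip it rather than abort the search
    return hits
```

The returned list is the array the pagination logic would then slice ten at a time. Reading every file on every request is exactly the cost warned about earlier in the thread, so this only suits a small site.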

Paul
 
LVL 10

Accepted Solution

by:
makerp earned 200 total points
ID: 6343916

This is where you will want to scan the file and add it to an array, or print it to the browser. You will also need to convert the filename to a link that will work on your server. This is easy: simply chop the filename at the start of the virtual directory and append the remainder to the server name.

# we might want to slap this in a database etc
print("$input\n<BR>");
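
The filename-to-link conversion could look something like the sketch below, again in Python rather than Perl. The function name `file_to_url` and the sample document root and server name are hypothetical; substitute the values for your own server.

```python
import os
from urllib.parse import quote


def file_to_url(path, doc_root, base_url):
    """Turn a file path under doc_root into a link on base_url."""
    # Chop off the part of the path that leads up to the web root.
    relative = os.path.relpath(path, doc_root)
    # URLs always use forward slashes, whatever the local separator is.
    relative = relative.replace(os.sep, "/")
    # Percent-encode spaces and other characters that are unsafe in a URL.
    return base_url.rstrip("/") + "/" + quote(relative)


# Hypothetical document root and server name, for illustration only
print(file_to_url("/var/www/html/docs/faq.html",
                  "/var/www/html",
                  "http://www.example.com"))
```

On a Unix-style host this turns the sample path into a link ending in `/docs/faq.html` on the example server.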
 
LVL 8

Expert Comment

by:bebonham
ID: 6346838
you think it is too hardcore for perl?

how come...

I always thought perl was king of text processing...

I'm trying to learn c++ now.
 
LVL 10

Expert Comment

by:makerp
ID: 6347926
Here is an ASP script to scan directories. You will need to add the code to open each file and scan it for the text the user has entered in a search box.

<%
' script to make a link for all files under
' virtual dir, you'll have to tidy the output a bit

DIM root

' set this to the root of your chosen dir
root = Server.MapPath("./")

' *******************************************************************************************
' MAIN BIT

' create a filesystem object
Set fso = CreateObject("Scripting.FileSystemObject")

Set fldr = fso.GetFolder(root)

%><H1>All files under this Virtual Dir....</H1><%

' now search the folder, this will recursively call the search on sub-folders
' (use CALL so the parentheses do not force the folder to be passed by value)
CALL searchFolder(fldr)

' **********************************************************************************************
' DEFINITION OF SUBS

' recursive (calls itself) search function
SUB searchFolder(folder)
     ' keep these local so the recursive calls do not trample each other
     DIM fil, fol, file, subfolder, filepath
     ' get files collection from our folder object
     Set fil = folder.files
     ' for each one write out a link with some info
     FOR EACH file IN fil
          ' first chop off the root bit
          filepath = RIGHT(file.path,(LEN(file.path) - LEN(root)) - 1)
          ' swap the separators
          filepath = REPLACE(filepath,"\","/")
          ' write out the link (quoted so a path with spaces does not cut the link short)
          %><A HREF="<%=filepath%>"><%=filepath%> : <%=file.size%> : <%=file.DateLastModified%></A><BR><%
     NEXT
     ' now do the sub-folders
     Set fol = folder.subfolders
     FOR EACH subfolder IN fol
          ' call ourself to search this folder
          %><HR><%
          CALL searchFolder(subfolder)
     NEXT
END SUB
%>

 
LVL 10

Expert Comment

by:makerp
ID: 6347932
For example, to scan a file:

Set fso = CreateObject("Scripting.FileSystemObject")
Set f = fso.OpenTextFile(file.path)
' ReadAll fails on an empty file, so check first
IF NOT f.AtEndOfStream THEN
  str = UCASE(f.ReadAll())
ELSE
  str = ""
END IF
f.Close
IF INSTR(str, UCASE(Request("search_val"))) <> 0 THEN
  ' add page to list
END IF

This code goes just after

FOR EACH file IN fil
 
LVL 1

Expert Comment

by:Moondancer
ID: 6419723
Open today, need more?
Moondancer
Community Support Moderator @ Experts Exchange