Solved

Search Engine

Posted on 2001-07-17
Last Modified: 2013-12-25
How do I write a search script that works like google.com?

It should search through the web pages on my site for pages that match the search criteria and display the result set 10 at a time, with pagination (next and previous).

Question by:augblay
14 Comments
 
LVL 4

Expert Comment

by:ykf2000
I do not think that is possible without a spider gathering information and storing it in a database first.
 
LVL 1

Author Comment

by:augblay
I don't know Perl, but with ASP or VB it is possible.

Here is pseudocode for that; I hope someone can help.

grab all files that meet the criteria and store them in an array
grab the page request variable from the requesting page
if page request variable is blank, set it to 1
if page request variable is 1 then
  starting index variable = 0
else
  starting index variable = (page number - 1) * page size
end if

get ending index, which is starting index + page size

if ending index > array length then
   ending index = array length
end if

loop through the array from starting index up to (but not including) ending index, printing the content at each step

Knowing the array length, the page size and the current page, we can tell whether we are on the first page or the last page.


I hope this pseudocode helps.

Thanks
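
For reference, the pseudocode above can be sketched as follows. This is a minimal illustration in Python rather than the Perl asked for; the function name `paginate`, its parameters and the sample file names are all made up for the example.

```python
def paginate(items, page=1, page_size=10):
    """Return one page of items plus flags for previous/next links."""
    if page < 1:
        page = 1  # a blank or invalid page request falls back to page 1
    start = (page - 1) * page_size
    end = min(start + page_size, len(items))  # clamp to the array length
    return {
        "items": items[start:end],      # empty if the page is past the end
        "has_previous": page > 1,       # show a "previous" link
        "has_next": end < len(items),   # show a "next" link
    }


# 25 made-up matches, 10 per page, so page 3 holds the last 5
results = [f"page{n}.html" for n in range(1, 26)]
third = paginate(results, page=3)
print(third["items"])
print(third["has_previous"], third["has_next"])
```

The two flags are what the last line of the pseudocode describes: they say whether to print the previous and next links.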

 
 
LVL 8

Expert Comment

by:bebonham
You want a search engine for your site only?

Have you thought about how you would rank the pages?
 
LVL 1

Author Comment

by:augblay
No, I don't want page ranking.
 
LVL 10

Expert Comment

by:makerp
You will not be able to search the pages on each request; it would kill the server, especially if there are a lot of pages. What you need is a process that spiders the web (or just your local server) for files. It would run independently of the web server and store the pages it finds in a database. Then you would need a script that lets you search that database, like the one below. Writing a web crawler is hardcore programming, best done in C++ or C.

<%
' simple search engine example that takes a string of space-separated keywords and then
' searches a simple single-table db with a table called table1 with a name column

Set Con = Server.CreateObject("ADODB.Connection")
Con.Open "test_db","",""
' if there is no search string then display the form
IF(Trim(Request("search")) = "")THEN
     %>
          <B>Please enter keywords separated by a space and select the AND or OR radio button</B>
          <FORM ACTION="<%=Request.ServerVariables("SCRIPT_NAME")%>" METHOD=POST>
               <INPUT TYPE=TEXT SIZE=80 NAME=search><BR>
               OR<INPUT TYPE=RADIO NAME=s_opt VALUE=OR CHECKED>AND<INPUT TYPE=RADIO NAME=s_opt VALUE=AND><BR><BR>
               <INPUT TYPE=SUBMIT VALUE=Search>
          </FORM>
     <%
ELSE
' else the form has been submitted so lets query

     ' only ever put AND or OR into the SQL, never the raw form value
     op = "OR"
     IF UCASE(Request("s_opt")) = "AND" THEN op = "AND"
     ' first split the string on spaces to get our keywords
     keywords = Split(Trim(Request("search"))," ")
     ' now loop through building our string
     s = ""
     FOR i = 0 TO UBOUND(keywords)
          ' skip empty entries left by double spaces
          IF keywords(i) <> "" THEN
               ' join clauses with the operator so there is nothing to chop off later
               IF s <> "" THEN s = s & " " & op
               ' double up single quotes so a keyword cannot break out of the string literal
               s = s & " name LIKE '%" & Replace(keywords(i),"'","''") & "%'"
          END IF
     NEXT
     ' now execute the stmt. response.write for testing
     Response.Write(Server.HTMLEncode("SELECT * FROM table1 WHERE " & s) & "<BR>")
     ' exe it and then display our results
     Set rs = Con.Execute("SELECT * FROM table1 WHERE " & s)
     DO UNTIL rs.EOF
          Response.Write(rs("name") & "<BR>")
          rs.MoveNext
     LOOP
     rs.Close
END IF
Con.Close
%>
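
As a side note, the same keyword query can be built with placeholders instead of string concatenation, which avoids quoting problems altogether. The sketch below is in Python with the standard `sqlite3` module, not ASP. It reuses the `table1` and `name` names from the script above; the function name `build_query` and the sample rows are made up for the example.

```python
import sqlite3


def build_query(keywords, operator="OR"):
    """Build a WHERE clause with one LIKE placeholder per keyword."""
    if operator not in ("AND", "OR"):
        raise ValueError("operator must be AND or OR")
    words = keywords.split()  # split() also drops repeated spaces
    if not words:
        raise ValueError("no keywords given")
    where = f" {operator} ".join("name LIKE ?" for _ in words)
    params = [f"%{word}%" for word in words]
    return f"SELECT name FROM table1 WHERE {where}", params


# throwaway in-memory database with made-up rows, just for the demo
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE table1 (name TEXT)")
con.executemany(
    "INSERT INTO table1 VALUES (?)",
    [("perl search",), ("asp paging",), ("o'reilly perl",)],
)

sql, params = build_query("perl o'reilly", "AND")
print(sql)
print(con.execute(sql, params).fetchall())
```

The keywords never become part of the SQL text, so a quote in a search term is just data.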
 
LVL 10

Expert Comment

by:makerp
Do I assume this is another question that will never get graded?
 
LVL 1

Author Comment

by:augblay
I will grade it, but this is not the answer I want. There is a CGI search script that my ISP provided. It searches through the pages and displays all matching pages on one page, so you have to scroll for a long time to view them all. What I am asking for is an example in Perl that mimics the pseudocode I provided. I know it will kill the server, but I asked for the script.
 
LVL 10

Expert Comment

by:makerp
For a start you need to recursively scan the directories:

# written by paul maker
#
# recursive search: this recursively scans the path passed to find(path) and
# prints every file whose extension is listed in @extensions. you could add
# these files to an array/database etc
#
use strict;
use warnings;

# make use of the cgi package
use CGI;
# new object please
my $query = CGI->new;
# hard coded array of extensions (we could load this up dynamically from a database etc)
my @extensions = (".htm",".html",".txt",".c",".cpp",".h");

# function to check if a file extension is one we want
sub isValidExtension
{
        my ($ext) = @_;
        # foreach visits every entry, including the last one
        foreach my $wanted (@extensions)
        {
                if(lc($ext) eq $wanted)
                {
                        return 1;
                }
        }
        return 0;
}
# our recursive function
sub find
{
        my ($input) = @_;
        if(-f $input)
        {
                # take everything from the last dot, so ".c" and ".html" match too
                my ($ext) = $input =~ /(\.[^.\/]+)$/;
                if(defined $ext && isValidExtension($ext))
                {
                        # we might want to slap this in a database etc
                        print("$input\n<BR>");
                }
        }
        elsif(-d $input)
        {
                # lexical handle, so each level of recursion gets its own
                opendir(my $dh, $input) or return;
                my @files = readdir($dh);
                closedir($dh);
                foreach my $file (@files)
                {
                        # skip the current and parent directory entries
                        next if $file eq "." || $file eq "..";
                        find("$input/$file");
                }
        }
}

# make it a cgi to demo the script
print $query->header();
if(!$query->param())
{
 print("
          <FORM ACTION=search.pl METHOD=POST>
               Enter your path on the server to spider :
               <INPUT TYPE=TEXT NAME=path>
               <BR>
               <INPUT TYPE=SUBMIT VALUE=spider>
          </FORM>
     ");
}
else
{
        print("<B>I</B> found the following files :<BR><BR>");
        # demo only: a real script should hard code the path, not take it from the form
        find(scalar $query->param('path'));
        print("<BR><A HREF=search.pl>Play Again</A>");
}
 
LVL 10

Expert Comment

by:makerp
Now when you find a file you need to open it and scan the text for your match.

In the above example you enter a path. You will want to remove that and hard code the path, since you know where you want to start.

This script scans files on the local host, which won't kill your server. If you need to scan files on remote servers you will need to use HTTP, which is another kettle of fish.

Paul
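
To illustrate the "open each file and scan the text" step, here is a rough sketch in Python rather than Perl. The function name `find_matches`, the extension list and the sample files are assumptions made up for the example, and it does a plain case-insensitive substring test, nothing smarter.

```python
import os
import tempfile

EXTENSIONS = {".htm", ".html", ".txt"}  # assumed list of searchable file types


def find_matches(root, term):
    """Return paths under root whose text contains term (case-insensitive)."""
    needle = term.lower()
    matches = []
    for folder, _subdirs, files in os.walk(root):  # walks sub-folders too
        for name in sorted(files):
            if os.path.splitext(name)[1].lower() not in EXTENSIONS:
                continue
            path = os.path.join(folder, name)
            try:
                with open(path, encoding="utf-8", errors="ignore") as handle:
                    if needle in handle.read().lower():
                        matches.append(path)
            except OSError:
                continue  # unreadable file: skip it rather than abort
    return matches


# demo on a throwaway directory holding three made-up files
with tempfile.TemporaryDirectory() as site:
    samples = {"a.html": "<p>Perl tips</p>",
               "b.html": "<p>ASP tips</p>",
               "c.gif": "perl"}  # wrong extension, so it is never opened
    for name, text in samples.items():
        with open(os.path.join(site, name), "w", encoding="utf-8") as out:
            out.write(text)
    found = [os.path.basename(p) for p in find_matches(site, "perl")]

print(found)
```

The returned list is the array of matching files that a paging step would then slice 10 at a time.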
 
LVL 10

Accepted Solution

by:
makerp earned 200 total points

This is where you will want to scan the file and add it to an array, or print it to the browser. You will also need to convert the filename to a link that works on your server. This is easy: chop the filename at the point where the virtual directory starts and append the rest to the server name.

# we might want to slap this in a database etc
print("$input\n<BR>");
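
The filename-to-link conversion described above could look roughly like this. It is a Python sketch, not Perl, and the function name `path_to_url`, the document root `/var/www/html` and the host `www.example.com` are all placeholders invented for the example.

```python
import os
from urllib.parse import quote


def path_to_url(file_path, document_root, base_url):
    """Map a file under document_root to a link under base_url."""
    # chop off the part of the path that is the document root
    relative = os.path.relpath(file_path, document_root)
    if relative == os.pardir or relative.startswith(os.pardir + os.sep):
        raise ValueError("file is outside the document root")
    # URLs always use forward slashes, whatever the local separator is
    url_path = quote(relative.replace(os.sep, "/"))
    return base_url.rstrip("/") + "/" + url_path


link = path_to_url("/var/www/html/docs/my page.html",
                   "/var/www/html",
                   "http://www.example.com")
print(link)
```

`quote` percent-encodes characters such as spaces, so the link still works when a file name contains them.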
 
LVL 8

Expert Comment

by:bebonham
You think it is too hardcore for Perl?

How come?

I always thought Perl was king of text processing...

I'm trying to learn C++ now.
 
LVL 10

Expert Comment

by:makerp
Here is a script in ASP to scan directories. You will need to add the code to open each file and scan it for the text the user entered in a search box.

<%
' script to make a link for all files under
' virtual dir, you'll have to tidy the output a bit

DIM root, fso, fldr

' set this to the root of your chosen dir
root = Server.MapPath("./")

' *******************************************************************************************
' MAIN BIT

' create a filesystem object
Set fso = CreateObject("Scripting.FileSystemObject")

Set fldr = fso.GetFolder(root)

%><H1>All files under this Virtual Dir....</H1><%

' now search the folder, this will recursively call the search on sub-folders
' (Call makes sure the folder object itself is passed)
Call searchFolder(fldr)

' **********************************************************************************************
' DEFINITION OF SUBS

' recursive (calls itself) search function
SUB searchFolder(folder)
     DIM fil, fol, file, subfolder, filepath
     ' get files collection from our folder object
     Set fil = folder.files
     ' for each one lets get some info and write out a link
     FOR EACH file IN fil
          ' first chop off the root bit
          filepath = RIGHT(file.path,(LEN(file.path) - LEN(root)) - 1)
          ' swap the separators
          filepath = REPLACE(filepath,"\","/")
          ' write out the link (HREF is quoted so paths with spaces still work)
          %><A HREF="<%=filepath%>"><%=filepath%> : <%=file.size%> : <%=file.DateLastModified%></A><BR><%
     NEXT
     ' now do the sub-folders
     Set fol = folder.subfolders
     FOR EACH subfolder IN fol
          ' call ourself to search this folder
          %><HR><%
          Call searchFolder(subfolder)
     NEXT
END SUB
%>

 
LVL 10

Expert Comment

by:makerp
For example, to scan a file:

Set fso = CreateObject("Scripting.FileSystemObject")
Set f = fso.OpenTextFile(file.path)
' ReadAll fails on an empty file, so check first
str = ""
IF NOT f.AtEndOfStream THEN str = UCASE(f.ReadAll())
f.Close
' upper-case both sides so the match ignores case
IF INSTR(str, UCASE(Request("search_val"))) <> 0 THEN
  ' add page to list
END IF

This code goes just after

FOR EACH file IN fil
 
LVL 1

Expert Comment

by:Moondancer
Open today, need more?
Moondancer
Community Support Moderator @ Experts Exchange