  • Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 208

Search Engine

How do I write a search script that will simulate google.com?

It should search through the web pages on my site for pages that contain the search criteria, and display the result set 10 at a time, with pagination (next and previous).

augblay Asked:
 
ykf2000Commented:
I do not think that is possible without a spider gathering the information and storing it in a database first.
 
augblayAuthor Commented:
I don't know Perl, but with ASP or VB it is possible.

Here is pseudocode for it; I hope someone can help.

grab all files that meet the criteria and store them in an array
grab the page-request variable from the requesting page
if the page-request variable is blank, set it to 1
if the page-request variable is 1 then
  starting index = 0
else
  starting index = (page number - 1) * page size
end if

ending index = starting index + page size

if ending index > array length then
   ending index = array length
end if

loop through the array from the starting index to the ending index, printing the content at each step

knowing the array length, the page size and the current page, we can tell whether we are at the beginning or the end of the pages.


I hope this pseudocode helps.

Thanks
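The pagination steps above can be sketched like this. This is a minimal Python sketch (not the ASP/VB or Perl the thread asks for), and the helper name `paginate` is hypothetical; the logic follows the pseudocode directly.

```python
def paginate(items, page, page_size=10):
    """Return the slice of items for a 1-based page number,
    plus flags saying whether a previous/next page exists."""
    if not page:                      # blank page request defaults to page 1
        page = 1
    start = (page - 1) * page_size    # starting index
    end = min(start + page_size, len(items))  # clamp ending index to array length
    has_prev = page > 1
    has_next = end < len(items)
    return items[start:end], has_prev, has_next
```

Knowing `has_prev` and `has_next`, the calling page can decide whether to render the "previous" and "next" links.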

 
 
bebonhamCommented:
You want a search engine for your site only?

Had you thought about how you would rank the pages?

 
augblayAuthor Commented:
No, I don't want page ranking.
 
makerpCommented:
You will not be able to search the pages for each request; it would kill the server, especially if there are a lot of pages. What you need is a process that spiders the web (or just your local server) for files. This would run independently of the web server and put the pages it finds into a database. Then you would need a script that lets you search, like this. Writing a full web crawler is hardcore programming, best done in C or C++.

<%
' simple search engine example that takes a string of space-separated keywords and then
' searches a simple single-table db (a table called table1 with a name column)

Set Con = Server.CreateObject("ADODB.Connection")
Con.Open "test_db","",""
' if there is no search string then display the form
IF(Request("search") = "")THEN
     %>
          <B>Please enter keywords separated by a space and select the AND/OR radio buttons</B>
          <FORM ACTION=<%=Request.ServerVariables("SCRIPT_NAME")%> METHOD=POST>
               <INPUT TYPE=TEXT SIZE=80 NAME=search><BR>
               OR<INPUT TYPE=RADIO NAME=s_opt VALUE=OR CHECKED>AND<INPUT TYPE=RADIO NAME=s_opt VALUE=AND><BR><BR>
               <INPUT TYPE=SUBMIT VALUE=Search>
          </FORM>
     <%
ELSE
' else the form has been submitted so lets query

     ' first split the string on spaces to get our keywords
     keywords = Split(Request("search")," ")
     ' get the array size of keywords
     no = UBOUND(keywords,1)
     ' now loop through building out string
     FOR i = 0 to no
          s = s + " name LIKE '%" & keywords(i) & "%' " & Request("s_opt")
     NEXT
     ' chop off the remaining logic operator
     s = LEFT(s, LEN(s) - LEN(Request("s_opt")))
     ' now execute the stmt. response.write for testing
     Response.write("SELECT * FROM table1 WHERE " & s)
     ' exe it and then display our results
     Set rs = Con.Execute("SELECT * FROM table1 WHERE " & s)
     DO UNTIL rs.EOF
          Response.Write(rs("name") & "<BR>")
          rs.MoveNext
     LOOP
END IF
%>
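The clause-building loop above can be sketched in Python too. This is a hypothetical helper (`build_like_clause` is not from the thread); it mirrors the ASP logic but returns `?` placeholders and a parameter list, which sidesteps the quoting problems you get when keywords contain an apostrophe.

```python
def build_like_clause(search, op="OR", column="name"):
    """Build a WHERE fragment from space-separated keywords,
    like the ASP loop above, but with parameter placeholders."""
    keywords = search.split()
    # one "column LIKE ?" per keyword, joined by the chosen operator
    frag = f" {op} ".join(f"{column} LIKE ?" for _ in keywords)
    params = [f"%{k}%" for k in keywords]
    return frag, params
```

The fragment and parameters would then be passed to the database driver together, e.g. `cursor.execute("SELECT * FROM table1 WHERE " + frag, params)`.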
 
makerpCommented:
Do I assume this is another question that will never get graded?
 
augblayAuthor Commented:
I will grade it, but this is not the answer I want. There is a CGI search script that my ISP provided; it searches through the pages and displays all matching pages in one page, so you have to scroll for a long time to view them all. What I am asking for is an example in Perl that mimics the pseudocode I provided. I know it will kill the server, but I asked for the script.
 
makerpCommented:
For a start you need to recursively scan the directories:

# written by paul maker
#
# recursive search: this recursively scans a path defined in the call find(path).
# files matching the extensions in @extensions are displayed. you could add these files to an
# array/database etc
#

# make use of the cgi package
use CGI;
# new object please
$query = new CGI;
# hard coded array of extensions (we could load this up dynamically from a database etc)
@extensions = (".htm",".html",".txt",".c",".cpp",".h");

# function to check if a filename has an extension we want
sub isValidExtension
{
        for($i=0;$i<=$#extensions;$i++)
        {
                if($_[0] eq $extensions[$i])
                {
                        return 1;
                }
        }
        return 0;
}
# our recursive function
sub find
{
        my($input) = @_;
        if(-f $input)
        {
                # extract the real extension (everything from the last dot on),
                # so ".html", ".c" etc match as well as four-character extensions
                if($input =~ /(\.[^.\/]+)$/ && isValidExtension($1) == 1)
                {
                        # we might want to put this in a database etc
                        print("$input\n<BR>");
                }
        }
        if(-d $input)
        {
                if(substr($input, -1) ne ".")
                {
                        opendir(FD,$input);
                        my @files = readdir(FD);
                        closedir(FD);                
                        foreach $file (@files)
                        {
                                find("$input/$file");        
                        }
                }
        }
}

# make it a cgi to demo the script
print $query->header();
if(!$query->param())
{
 print("
          <FORM ACTION=search.pl METHOD=POST>
               Enter your path on the server to spider :
               <INPUT TYPE=TEXT NAME=path>
               <BR>
               <INPUT TYPE=SUBMIT VALUE=spider>
          </FORM>
     ");
}
else
{
        print("<B>I</B> found the following files :<BR><BR>");
        find($query->param('path'));
        print("<BR><A HREF=search.pl>Play Again</A>");
}
 
makerpCommented:
Now when you find a file you need to open it and scan the text for your match.

In the above example you enter a path; you will want to remove that and hard-code the path, since you know where you want to start.

This script scans files on the local host, so it won't kill your server. If you need to scan files on remote servers you will need to use HTTP, which is another kettle of fish.

Paul
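The overall spider-then-match idea for a local site can be sketched as follows. This is a Python sketch rather than the Perl above; `search_site` is a hypothetical helper, and the extension list and encoding handling are assumptions.

```python
import os

def search_site(root, term, exts=(".htm", ".html", ".txt")):
    """Walk the directory tree under root and return the paths of
    files whose text contains term (case-insensitive)."""
    term = term.lower()
    hits = []
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            if name.lower().endswith(exts):       # extension filter
                path = os.path.join(dirpath, name)
                try:
                    text = open(path, encoding="utf-8", errors="ignore").read()
                except OSError:
                    continue                      # skip unreadable files
                if term in text.lower():
                    hits.append(path)
    return hits
```

The returned list would then be paginated and each path converted to a URL before display.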
 
makerpCommented:

This is where you will want to scan the file and add it to an array, or print it to the browser. You will also need to convert the filename to a link that will work on your server. This is easy: simply chop the filename at the point where the virtual directory starts and append the rest to the server name.

# we might want to put this in a database etc
print("$input\n<BR>");
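The chop-and-append step can be sketched like this (Python; `path_to_url`, the document root and the server name are all hypothetical examples, not values from the thread).

```python
def path_to_url(local_path, doc_root, server_name):
    """Turn a local file path into a link that works on the server:
    chop the document root off the front, normalise separators,
    and prepend the server name."""
    rel = local_path[len(doc_root):].lstrip("/\\").replace("\\", "/")
    return f"http://{server_name}/{rel}"
```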
 
bebonhamCommented:
You think it is too hardcore for Perl?

How come?

I always thought Perl was the king of text processing.

I'm trying to learn C++ now.
 
makerpCommented:
Here is a CGI script in ASP to scan directories. You will need to add the code to open each file and scan it for the text the user has entered in the search box:

<%
' script to make a link for all files under the
' virtual dir; you'll have to tidy the output a bit

DIM root

' set this to the root of your chosen dir
root = Server.MapPath("./")

' *******************************************************************************************
' MAIN BIT

' create a filesystem object
Set fso = CreateObject("Scripting.FileSystemObject")

Set fldr = fso.GetFolder(root)

%><H1>All files under this Virtual Dir....</H1><%

' now search the folder; this will recursively call the search on sub-folders
searchFolder(fldr)

' **********************************************************************************************
' DEFINITION OF SUBS

' recursive (calls itself) search function
SUB searchFolder(folder)
     ' get the files collection from our folder object
     Set fil = folder.files
     ' for each one, get some info and write out a link
     FOR EACH file IN fil
          ' first chop off the root bit
          filepath = RIGHT(file.path,(LEN(file.path) - LEN(root)) - 1)
          ' swap the separators
          filepath = REPLACE(filepath,"\","/")
          ' write out the link
          %><A HREF=<%=filepath%>><%=filepath%> : <%=file.size%> : <%=file.DateLastModified%></A><BR><%
     NEXT
     ' now do the sub-folders
     Set fol = folder.subfolders
     FOR EACH folder IN fol
          ' call our self to search this folder
          %><HR><%
          searchFolder(folder)    
     NEXT
END SUB
%>

 
makerpCommented:
For example, to scan a file:

Set fso = CreateObject("Scripting.FileSystemObject")
Set f = fso.OpenTextFile(file.path)
str = UCASE(f.ReadAll())
f.Close
IF INSTR(str, UCASE(Request("search_val"))) <> 0 THEN
  ' add page to list
END IF

This code goes just after:

FOR EACH file IN fil
 
