Solved

Web Site Search Engine software

Posted on 2004-09-08
Last Modified: 2013-11-19
I am running IIS 5.0 on Windows 2000 Server.  I found a packaged search engine product called 'Zoom Search Engine 3.1'.  This particular product says it can run on one box and use the space on that box to index the site, versus on the actual web server.  I was wondering if anyone had any info about this product or if there is something better out there.  I have a lot of documents on the web site, both Word and .pdf, that folks need to find via the search engine.  Apparently Zoom has a plug-in that can do this for me.  I also have space issues at this time, so I need to run it from another box.  I don't want to write any code myself, as I am the only person who handles all of this and other projects as well and don't have time for coding.  Thanks for any information or references as to what other products can do this.
Question by:a182612
 
LVL 6

Expert Comment

by:Fahdmurtaza
ID: 12006098
OK, what I'd recommend is my favorite, and I think the best free one, i.e. the Web Wiz Site Search. You can search for it on Google and you will instantly get its link. You can download it from there and easily configure it for your site in 5 minutes.

Regards
Fahd Murtaza
 
LVL 3

Expert Comment

by:passmark
ID: 12013760
If you really have no free disk space on your web site to upload the Zoom index files, then you can put the search script and the index files on another site.

For example, the CIA's World Factbook web site was indexed with Zoom, but the search function was put on a different host. See this page:
http://www.wrensoft.com/zoom/worldfactbook/search.php

You can see the files that are generated by Zoom here,
http://www.wrensoft.com/zoom/worldfactbook/

The search function is on this domain,
http://www.wrensoft.com/
but the results point to this domain
http://www.cia.gov/

However, in your case you will probably need to select the ASP option instead of the PHP option in Zoom (because you are using IIS, and by default PHP is not installed on IIS).

If you have specific questions about Zoom, list them here and I'll try to answer them point by point.

----
David
 

Author Comment

by:a182612
ID: 12018358
David, what if I add another drive and then create a virtual directory that the web server recognizes as part of the root? Could I use that drive to run the Zoom application and then store the index that it requires?

 
LVL 3

Expert Comment

by:passmark
ID: 12023544

I don't see any problem with what you are suggesting. It doesn't really matter what directory the index files are in, as long as you can access them via a URL.

For the creation of the index there are two modes in Zoom: offline mode and spider mode. In general, spider mode is a better choice because it indexes all your dynamic content (such as ASP content and database content exposed through web pages).

----
David
 

Author Comment

by:a182612
ID: 12043117
Can I set this up so it will only index file names for .pdf and .doc documents?  
Could I also set it up to search actual .pdf and .doc files for specific keywords?
 
LVL 3

Expert Comment

by:passmark
ID: 12043259
In the Configuration window of Zoom, on the Scan Options tab, you can enter a list of file extensions to search. You can remove all the extensions except .doc and .pdf if you like. The potential problem with this, however, is that you probably have a number of HTML pages that provide the navigation for your web site (i.e. your home page is probably an HTML or ASP document, rather than a PDF).

So for the spider to get to your PDF files, you'll probably need to index and follow the links on a number of HTML pages. Removing the HTML extensions may therefore not give the result you want. There are a couple of ways around this problem, but it would help to know what your site is like. Can you post the URL?
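To illustrate why the HTML pages matter, here is a minimal sketch in Python of a spider that follows links on navigation pages but only adds .pdf and .doc files to the index. It is not how Zoom works internally; the `SITE` link map, the `crawl` function and the extension sets are all made up for the example.

```python
from collections import deque
from posixpath import splitext
from urllib.parse import urlparse

# Hypothetical site: each page URL maps to the links found on that page.
SITE = {
    "/index.html": ["/safety/index.html", "/about.asp"],
    "/safety/index.html": ["/safety/cssd/Hurricane.pdf", "/safety/plan.doc"],
    "/about.asp": ["/index.html"],
}

FOLLOW = frozenset({".html", ".htm", ".asp"})  # scanned for links only
INDEX = frozenset({".pdf", ".doc"})            # added to the search index


def extension(url):
    """Return the lower-cased file extension of the URL path."""
    return splitext(urlparse(url).path)[1].lower()


def crawl(start, site=SITE, follow=FOLLOW, index=INDEX):
    """Breadth-first walk from start; return the URLs that would be indexed."""
    seen, queue, indexed = {start}, deque([start]), []
    while queue:
        url = queue.popleft()
        ext = extension(url)
        if ext in index:
            indexed.append(url)
        if ext in follow:
            for link in site.get(url, []):
                if link not in seen:
                    seen.add(link)
                    queue.append(link)
    return indexed


if __name__ == "__main__":
    # Navigation pages are followed, only documents are indexed.
    print(crawl("/index.html"))
    # With no pages followed, the spider never reaches the documents.
    print(crawl("/index.html", follow=frozenset()))
```

The second call shows the problem described above: if the HTML and ASP pages are not scanned at all, the spider has no way to discover the PDF and Word files they link to.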

I am not sure if I understand your question about specific keywords. By default, Zoom will index the entire content of PDF and Word documents and store all the words it finds in its index (not just specific keywords). Can you give an example of the behaviour you want?

------
David
 

Author Comment

by:a182612
ID: 12055904
Indexing all the documents will probably take up a lot of space, especially since I have so many.  Can the product use a meta tag, or can it just search on the document names in the URL, such as 'safety/cssd/Hurricane.pdf', without actually having to store an entire .pdf file as an index?
 
LVL 3

Accepted Solution

by:
passmark earned 1000 total points
ID: 12059906
The index is coded and compressed, so it will be much smaller than your entire collection of PDFs. Nevertheless, the entire text of each document is indexed. The text of the document has to be stored in the index for the following reasons:

1) For exact phrase matching to occur. e.g. the user searches for "Hurricane safety procedures". The search results will be just the pages that have these three words in the text AND have the words in the same order that the user entered them.

2) So that the search results can display the context of the search. e.g. "Your local government recommends that you follow the <bold>hurricane safety procedures</bold> described below"

You can disable both of these features and save some space in the index.
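As a rough illustration of why these two features need the document text, here is a small Python sketch of a positional index. It is only a sketch of the general technique, not Zoom's actual index format; the function names and sample documents are invented for the example.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lower-case the text and split it into words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Map each word to {doc_id: [positions of the word in that document]}."""
    index = defaultdict(lambda: defaultdict(list))
    for doc_id, text in docs.items():
        for pos, word in enumerate(tokenize(text)):
            index[word][doc_id].append(pos)
    return index


def phrase_search(index, phrase):
    """Return {doc_id: [start positions]} where the words occur in order."""
    words = tokenize(phrase)
    if not words:
        return {}
    hits = {}
    for doc_id, starts in index.get(words[0], {}).items():
        for start in starts:
            if all(start + i in index.get(w, {}).get(doc_id, ())
                   for i, w in enumerate(words)):
                hits.setdefault(doc_id, []).append(start)
    return hits


def context(tokens, start, length, window=2):
    """Return the matched words plus a few words on either side."""
    return " ".join(tokens[max(0, start - window):start + length + window])


# Made-up sample documents.
DOCS = {
    "a.pdf": "Your local government recommends that you follow the "
             "hurricane safety procedures described below",
    "b.pdf": "Safety procedures for a hurricane differ by region",
}

if __name__ == "__main__":
    idx = build_index(DOCS)
    print(phrase_search(idx, "Hurricane safety procedures"))
    print(context(tokenize(DOCS["a.pdf"]), 8, 3))
```

Both documents contain all three words, but only a.pdf has them in the order the user typed, which the index can only tell because it keeps word positions. The context line likewise needs the surrounding words, so turning both features off lets the index drop that information.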

There is also an option in Zoom called "Index meta information only". This option allows you to index only the meta information found on a page (i.e. the title, keywords, description, and zoomwords). Note that the page content will still have to be scanned (in order to find links to other pages), so the scanning process will not be significantly faster, but it does allow the index data files to be smaller. This is useful for sites where the page contents are less meaningful or searchable than the meta information available (e.g. technical papers, charts, etc.). However, we have found that in a lot of cases the meta data is not accurate or not kept up to date by webmasters. There is also a feature for adding meta data to PDF files, using .desc files. See the Zoom user's guide for more details.
http://www.wrensoft.com/ftp/zoom.pdf
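To give an idea of what "meta information only" indexing keeps from a page, here is a minimal Python sketch that pulls out the title and meta tags and ignores the body text. It is an illustration only, not Zoom's code; the class name, the sample page, and the assumption that zoomwords appears as a meta tag named "zoomwords" are mine.

```python
from html.parser import HTMLParser


class MetaExtractor(HTMLParser):
    """Collect the title and selected meta tags; skip the page body."""

    # Assumption: zoomwords is supplied as <meta name="zoomwords" ...>.
    FIELDS = {"keywords", "description", "zoomwords"}

    def __init__(self):
        super().__init__()
        self.meta = {}
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta":
            name = (attrs.get("name") or "").lower()
            if name in self.FIELDS:
                self.meta[name] = attrs.get("content") or ""

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.meta["title"] = self.meta.get("title", "") + data.strip()


def extract_meta(html):
    """Return a dict of the meta information found in the HTML text."""
    parser = MetaExtractor()
    parser.feed(html)
    parser.close()
    return parser.meta


# Made-up sample page.
PAGE = """<html><head><title>Hurricane Safety</title>
<meta name="keywords" content="hurricane, safety, evacuation">
<meta name="description" content="Procedures for hurricane season.">
</head><body><p>Long body text that is not indexed.</p>
<a href="safety/cssd/Hurricane.pdf">PDF</a></body></html>"""

if __name__ == "__main__":
    print(extract_meta(PAGE))
```

Only a handful of short strings per page end up in the index, which is where the space saving comes from. The trade-off is the one mentioned above: a search is only as good as the meta data the webmaster maintained.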

-----
David

Question has a verified solution.
