Solved

Telling a spider to crawl a page not on my index page

Posted on 2004-04-14
Last Modified: 2010-04-27
I would like to tell search engine spiders to crawl pages that are not linked to from my index page or from anywhere else on my site.

My site has a search engine, and its results are all HTML files, but I can't think of a way to let a spider know where they are so it can index them.

Also, do spiders index directory listings? That is, if a spider comes to a directory and there is just a list of the files on the server, will it crawl those files?

thanks, Will
Question by:webcs
13 Comments
 
LVL 24

Accepted Solution

by:
duz earned 50 total points
ID: 10828067
webcs -

>I can't think of a way to let the spider know where they are

Search engine spiders follow links, and if there are no inbound links to a page they will not find it. You could submit the page by hand and the spider may come and take a look, but with zero inbound links it will not be impressed and will rank the page as low as it can.

>do spiders index file directory lists

Spiders index web pages and files of type pdf, asp, shtml, xml, cfm, doc, xls, ppt, rtf etc., but as above there must be a link to the file for it to get spidered.

Search engines do not 'see' lists of files in a directory; they just spider the web by going from link to link.

- duz
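
To illustrate the point about inbound links, here is a minimal sketch of link-following discovery. It is not how any particular search engine works; the `LinkExtractor` and `discover` names, the example.com URLs, and the in-memory three-page site are all assumptions made up for the example. The page with no inbound link is never reached.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collects the absolute URL of every <a href="..."> on a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    # Resolve relative links against the page's own URL.
                    self.links.append(urljoin(self.base_url, value))


def discover(pages, start_url):
    """Breadth-first walk over an in-memory site given as {url: html}."""
    seen = {start_url}
    queue = [start_url]
    while queue:
        url = queue.pop(0)
        parser = LinkExtractor(url)
        parser.feed(pages.get(url, ""))
        for link in parser.links:
            if link in pages and link not in seen:
                seen.add(link)
                queue.append(link)
    return seen


# Hypothetical site: results.html exists but nothing links to it.
SITE = {
    "http://example.com/index.html": '<a href="about.html">About</a>',
    "http://example.com/about.html": '<a href="index.html">Home</a>',
    "http://example.com/results.html": "<p>Search results</p>",
}

found = discover(SITE, "http://example.com/index.html")
print(sorted(found))
```

Starting from index.html, the walk reaches about.html but never results.html, because no page it visits contains a link to it.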

 
LVL 2

Expert Comment

by:sudev_shetty
ID: 10841564
Sorry, there is no way to do that unless there is an inbound link to that file.
 
LVL 29

Expert Comment

by:coreybryant
ID: 10851172
You could consider placing a spacer.gif with the link there - that will at least get the hyperlink on the page

-Corey

 
LVL 24

Expert Comment

by:duz
ID: 10852142
Corey -

You're not suggesting a hidden link are you....?

Number one on the list of Google's "Quality Guidelines - Specific recommendations:" is "Avoid hidden text or hidden links".

- duz

 
 
LVL 29

Expert Comment

by:coreybryant
ID: 10853211
Well, if worse comes to worst, you know. :)  The other option would possibly be to create a sitemap as well.

-Corey
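
One way the sitemap idea could be sketched: generate a plain HTML page that links to every .html file in the results directory, then link to that page visibly from the index page. This is only a sketch under assumptions; the `build_sitemap` function, the directory layout, and the base URL are hypothetical and not taken from the thread.

```python
from html import escape
from pathlib import Path
from urllib.parse import quote


def build_sitemap(directory, base_url, title="Site map"):
    """Return an HTML page that links to every .html file under directory."""
    root = Path(directory)
    items = []
    for path in sorted(root.rglob("*.html")):
        rel = path.relative_to(root).as_posix()
        # Percent-encode spaces and other unsafe characters in the path.
        url = base_url.rstrip("/") + "/" + quote(rel)
        items.append(
            f'<li><a href="{escape(url, quote=True)}">{escape(rel)}</a></li>'
        )
    lines = [
        "<html>",
        f"<head><title>{escape(title)}</title></head>",
        "<body>",
        f"<h1>{escape(title)}</h1>",
        "<ul>",
        *items,
        "</ul>",
        "</body>",
        "</html>",
    ]
    return "\n".join(lines) + "\n"
```

A call such as `Path("sitemap.html").write_text(build_sitemap("results", "http://example.com/results"))` would write the page next to the site root. If the output file is saved inside the scanned directory, it will list itself on the next run.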
 
LVL 33

Expert Comment

by:shalomc
ID: 10890411
webcs,
If your web server allows directory browsing, and you submitted the directory url rather than or in addition to the index.html, then the spider will index the directory.
In such a setup, the web server in effect creates an HTML page that lists the directory contents.

ShalomC
 

Expert Comment

by:jpjanze
ID: 10965797
You could also do a 'paid submit' of the direct URL - it will get independently listed quicker than doing a 'free' submit.

You could (actually, you SHOULD) get a domain name for the page if it is important enough that you want it found! Then optimize the heck out of it for the specific information.
 
LVL 2

Author Comment

by:webcs
ID: 11173692
I did think of that... but will a spider actually index every file in a list like that, or just ignore them? I also thought of putting a character like an asterisk at the bottom of the page and linking that way. That essentially would not be hidden text, correct?

 webcs,
"If your web server allows directory browsing, and you submitted the directory url rather than or in addition to the index.html, then the spider will index the directory.
In such a setup, the web server in effect creates an HTML page that lists the directory contents."

ShalomC
 
LVL 33

Expert Comment

by:shalomc
ID: 11174365
Hey,
A spider doesn't care whether the HTML page was created manually or by the web server. If the browser can display it - a spider can read it and will index it.

ShalomC
 
LVL 2

Author Comment

by:webcs
ID: 11174622
I suppose it wouldn't help the rank or weight to be found that way, though.
 
LVL 33

Expert Comment

by:shalomc
ID: 11175239
The rank is calculated by popularity, or incoming links.
If you have Apache, you can add static headers and footers to the automatic directory index, and include there any descriptive text that will help the search engine calculate your site's relevance.


ShalomC