Solved

Telling a spider to crawl a page not on my index page

Posted on 2004-04-14
Last Modified: 2010-04-27
I would like to tell search engine spiders to crawl a page that is not listed on my HTML index page and is not linked to directly from anywhere on my site.

I have a search engine, and the results are all HTML files; however, I can't think of a way to let the spider know where they are so it can index them.

Also, do spiders index file directory lists? I.e., if a spider comes to a directory and there is just a list of files on the server, will it crawl those files?

thanks, Will
Question by:webcs
13 Comments
 
LVL 24

Accepted Solution

by:
duz earned 50 total points
webcs -

>I can't think of a way to let the spider know where they are

Search engine spiders follow links, and if there are no inbound links to a page they will not find it. You could submit the page by hand, and the spider may come and take a look, but with zero inbound links it will not be impressed and will rank the page as low as it can.

>do spiders index file directory lists

Spiders index web pages and files of type pdf, asp, shtml, xml, cfm, doc, xls, ppt, rtf etc., but as above there must be a link to the file for it to get spidered.

Search engines do not 'see' lists of files in a directory; they just spider the web by going from link to link.
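
To illustrate the link-to-link behaviour described above, here is a minimal sketch in Python. It is not how any particular search engine is implemented; the `site` dictionary and its page names are made up for the example, and each key simply maps a page to the pages it links to. A breadth-first walk from the index page never reaches a page that has no inbound links.

```python
from collections import deque


def crawl(link_graph, start_url):
    """Return the set of pages reachable from start_url by following links."""
    seen = {start_url}
    queue = deque([start_url])
    while queue:
        page = queue.popleft()
        # Follow every outbound link on the current page.
        for target in link_graph.get(page, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen


# Hypothetical site: each page maps to the pages it links to.
site = {
    "/index.html": ["/about.html", "/products.html"],
    "/about.html": ["/index.html"],
    "/products.html": ["/products/widget.html"],
    "/products/widget.html": [],
    # This page links out, but nothing links in to it.
    "/results/page1.html": ["/index.html"],
}

found = crawl(site, "/index.html")
orphans = sorted(set(site) - found)
print(orphans)  # ['/results/page1.html']
```

The orphaned page exists on the server, but the walk has no path to it, which is the situation described in the question.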

- duz

 
LVL 2

Expert Comment

by:sudev_shetty
Sorry, there is no way to do that unless there is an inbound link to that file.
 
LVL 29

Expert Comment

by:coreybryant
You could consider placing a spacer.gif with the link there - that will at least get the hyperlink on the page

-Corey
 
LVL 24

Expert Comment

by:duz
Corey -

You're not suggesting a hidden link are you....?

Number one on the list of Google's "Quality Guidelines - Specific recommendations:" is "Avoid hidden text or hidden links".

- duz

 
 
LVL 29

Expert Comment

by:coreybryant
Well, if worse comes to worst, you know. :)  The other option would possibly be to create a sitemap as well.
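
As a rough sketch of the sitemap idea, the Python below builds a plain HTML page of visible links. The function name `build_sitemap_html`, the URLs, and the labels are all hypothetical. The generated page would still need to be linked from a page spiders already reach, such as the index page, for the links on it to be followed.

```python
from html import escape


def build_sitemap_html(pages, title="Site Map"):
    """Build a simple HTML page with one visible link per (url, label) pair."""
    items = "\n".join(
        f'    <li><a href="{escape(url, quote=True)}">{escape(label)}</a></li>'
        for url, label in pages
    )
    return (
        "<html>\n"
        f"<head><title>{escape(title)}</title></head>\n"
        "<body>\n"
        f"  <h1>{escape(title)}</h1>\n"
        "  <ul>\n"
        f"{items}\n"
        "  </ul>\n"
        "</body>\n"
        "</html>\n"
    )


# Hypothetical result pages that are not linked from anywhere else.
pages = [
    ("/results/page1.html", "Result page 1"),
    ("/results/page2.html", "Result page 2"),
]

sitemap = build_sitemap_html(pages)
print(sitemap)
```

Because the links are ordinary visible list items, this avoids the hidden-link problem raised earlier in the thread.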

-Corey
 
LVL 32

Expert Comment

by:shalomc
webcs,
If your web server allows directory browsing, and you submit the directory URL rather than, or in addition to, the index.html, then the spider will index the directory.
In such a setup, the web server in effect creates an HTML page that lists the directory contents.
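
A small Python sketch of that idea follows. The `render_directory_index` function only imitates what a web server's automatic listing might look like (real servers format theirs differently), and the file names are invented. The point is that the listing is ordinary HTML with ordinary links, so a link extractor reads it like any other page.

```python
import os
import tempfile
from html import escape
from html.parser import HTMLParser
from urllib.parse import quote


def render_directory_index(directory, url_path):
    """Build an HTML listing roughly like the one a web server generates."""
    rows = "\n".join(
        f'<li><a href="{quote(name)}">{escape(name)}</a></li>'
        for name in sorted(os.listdir(directory))
    )
    return (
        f"<html><body><h1>Index of {escape(url_path)}</h1>\n"
        f"<ul>\n{rows}\n</ul></body></html>"
    )


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag, as a spider would."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


# Create a throwaway directory with two hypothetical result files.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("result1.html", "result2.html"):
        with open(os.path.join(tmp, name), "w") as handle:
            handle.write("<html><body>placeholder</body></html>")
    listing = render_directory_index(tmp, "/results/")

collector = LinkCollector()
collector.feed(listing)
print(collector.links)  # ['result1.html', 'result2.html']
```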

ShalomC
 

Expert Comment

by:jpjanze
You could also do a 'paid submit' of the direct URL - it will get independently listed quicker than doing a 'free' submit.

You could (actually, you SHOULD) get a domain name for the page if it is important enough that you want it found! Then optimize the heck out of it for the specific information.
 
LVL 2

Author Comment

by:webcs
I did think of that... but will a spider actually index every file in a list like that, or just ignore them? I also thought of putting a character like an asterisk at the bottom of the page and linking it that way. That essentially would not be hidden text, correct?

 webcs,
"If your web server allows directory browsing, and you submitted the directory url rather than or in addition to the index.html, then the spider will index the directory.
In such a setup, the web server in effect creates an HTML page that lists the directory contents."

ShalomC
 
LVL 32

Expert Comment

by:shalomc
Hey,
A spider doesn't care whether the HTML page was created manually or by the web server. If the browser can display it - a spider can read it and will index it.

ShalomC
 
LVL 2

Author Comment

by:webcs
I suppose it wouldn't help the rank or weight to be found that way, though.
 
LVL 32

Expert Comment

by:shalomc
The rank is calculated by popularity, or incoming links.
If you have Apache, you can add static headers and footers to the automatic directory index, and include there any descriptive text that will help the search engine calculate your site's relevance.
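
For the Apache case, a configuration fragment along these lines is one way to set that up, assuming mod_autoindex is enabled. The directory path is hypothetical, and `HEADER.html` and `README.html` are files you would write yourself and place in that directory.

```apache
# Hypothetical directory path; adjust to wherever the result files live.
<Directory "/var/www/html/results">
    # Let Apache generate a listing when no index file is present.
    Options +Indexes
    # Contents of HEADER.html are inserted above the generated file list.
    HeaderName HEADER.html
    # Contents of README.html are appended below the generated file list.
    ReadmeName README.html
</Directory>
```

Any descriptive text placed in those two files then appears on the generated listing page alongside the file links.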


ShalomC