Solved

Telling a spider to crawl a page not on my index page

Posted on 2004-04-14
Last Modified: 2010-04-27
I would like to tell the search engine spiders to crawl a page that is not linked from my index page or from anywhere else on my site.

I have a search engine, and the results are all HTML files; however, I can't think of a way to let the spider know where they are so it can index them.

Also, do spiders index file directory lists? I.e., if a spider comes to a directory and there is just a list of files on the server, will it crawl those files?

thanks, Will
Question by:webcs
13 Comments
 
LVL 24

Accepted Solution

by:
duz earned 50 total points
ID: 10828067
webcs -

>I can't think of a way to let the spider know where they are

Search engine spiders follow links and if there are no inbound links to a page they will not find it. You could submit the page by hand and the spider may come and take a look but with zero inbound links it will not be impressed and it will rank the page as low as it can.

>do spiders index file directory lists

Spiders index web pages and files of type pdf, asp, shtml, xml, cfm, doc, xls, ppt, rtf etc., but as above there must be a link to the file for it to get spidered.

Search engines do not 'see' lists of files in a directory, they just spider the web by going from link to link.
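To illustrate the point about link following, here is a minimal sketch of a crawl over a made-up three-page site held in memory. The page names, the `SITE` dictionary, and the `crawl` helper are all assumptions for illustration; real spiders are far more elaborate, but the reachability rule is the same: a page nobody links to is never visited.

```python
from collections import deque
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(pages, start):
    """Breadth-first walk over `pages` (path -> HTML), following links only."""
    seen = {start}
    queue = deque([start])
    while queue:
        path = queue.popleft()
        collector = LinkCollector()
        collector.feed(pages.get(path, ""))
        for link in collector.links:
            if link in pages and link not in seen:
                seen.add(link)
                queue.append(link)
    return seen


# Hypothetical site: results.html exists on disk, but nothing links to it.
SITE = {
    "/index.html": '<a href="/about.html">About</a>',
    "/about.html": '<a href="/index.html">Home</a>',
    "/results.html": "<p>Search results nobody links to</p>",
}

found = crawl(SITE, "/index.html")
print(sorted(found))  # ['/about.html', '/index.html']
```

Adding a single link to `/results.html` on either of the other pages is enough for the walk to reach it.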

- duz

 
LVL 2

Expert Comment

by:sudev_shetty
ID: 10841564
Sorry, there is no way to do that unless there is an inbound link to that file.
 
LVL 29

Expert Comment

by:coreybryant
ID: 10851172
You could consider placing a spacer.gif with the link there - that will at least get the hyperlink on the page

-Corey

 
LVL 24

Expert Comment

by:duz
ID: 10852142
Corey -

You're not suggesting a hidden link are you....?

Number one on the list of Google's "Quality Guidelines - Specific recommendations:" is "Avoid hidden text or hidden links".

- duz

 
 
LVL 29

Expert Comment

by:coreybryant
ID: 10853211
Well, if worst comes to worst, you know. :)  The other option would possibly be to create a sitemap as well.
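A sitemap in this sense is just an ordinary page of visible links, linked from the index page. As a rough sketch, assuming the result pages are known as a list of URL and title pairs (the `PAGES` list and `build_sitemap` function are made up for illustration), such a page could be generated like this:

```python
from html import escape


def build_sitemap(urls_and_titles):
    """Return a plain HTML page with one visible link per (url, title) pair."""
    items = "\n".join(
        f'  <li><a href="{escape(url, quote=True)}">{escape(title)}</a></li>'
        for url, title in urls_and_titles
    )
    return (
        "<html>\n<head><title>Site Map</title></head>\n<body>\n"
        "<h1>Site Map</h1>\n<ul>\n" + items + "\n</ul>\n</body>\n</html>\n"
    )


# Hypothetical result pages; replace with the real files.
PAGES = [
    ("/results/widgets.html", "Widgets"),
    ("/results/gadgets.html", "Gadgets & Gizmos"),
]

sitemap_html = build_sitemap(PAGES)
print(sitemap_html)
```

Because the links are visible text rather than a hidden image or character, this approach stays clear of the hidden-link concern raised above.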

-Corey
 
LVL 33

Expert Comment

by:shalomc
ID: 10890411
webcs,
If your web server allows directory browsing, and you submitted the directory url rather than or in addition to the index.html, then the spider will index the directory.
In such a setup, the web server in effect creates an HTML page that lists the directory contents.

ShalomC
 

Expert Comment

by:jpjanze
ID: 10965797
You could also do a 'paid submit' of the direct URL - it will get independently listed quicker than doing a 'free' submit.

You could, actually, you SHOULD get a domain name for the page if it is important enough that you want it found! Then optimize the heck out of it for the specific information.
 
LVL 2

Author Comment

by:webcs
ID: 11173692
I did think of that... but will a spider actually index every file in a list like that, or just ignore them? I also thought of putting a character like an asterisk at the bottom of the page and linking that way. That essentially would not be hidden text, correct?

 webcs,
"If your web server allows directory browsing, and you submitted the directory url rather than or in addition to the index.html, then the spider will index the directory.
In such a setup, the web server in effect creates an HTML page that lists the directory contents."

ShalomC
 
LVL 33

Expert Comment

by:shalomc
ID: 11174365
Hey,
A spider doesn't care whether the HTML page was created manually or by the web server. If the browser can display it - a spider can read it and will index it.
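As a small sketch of why that holds: a server-generated directory listing is ordinary HTML full of `<a>` tags, so the same link extraction works on it. The code below builds a bare-bones listing for a temporary directory (a stand-in for what the web server produces, not its actual output) and pulls the links back out. The file names and helper functions are assumptions for illustration.

```python
import os
import tempfile
from html import escape
from html.parser import HTMLParser


def render_listing(directory):
    """Build a bare-bones index page for `directory`, one link per entry."""
    rows = "\n".join(
        f'<li><a href="{escape(name, quote=True)}">{escape(name)}</a></li>'
        for name in sorted(os.listdir(directory))
    )
    return f"<html><body><h1>Index</h1><ul>\n{rows}\n</ul></body></html>"


class HrefParser(HTMLParser):
    """Records every href found on an <a> tag."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.hrefs.extend(value for name, value in attrs if name == "href" and value)


def extract_links(html_text):
    """Return the link targets a spider would see in `html_text`."""
    parser = HrefParser()
    parser.feed(html_text)
    return parser.hrefs


# Create two placeholder files, then list the directory as a server might.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("result1.html", "result2.html"):
        with open(os.path.join(tmp, name), "w") as fh:
            fh.write("<p>placeholder</p>")
    listing = render_listing(tmp)

links = extract_links(listing)
print(links)  # ['result1.html', 'result2.html']
```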

ShalomC
 
LVL 2

Author Comment

by:webcs
ID: 11174622
I suppose it wouldn't help the rank or weight to be found that way, though.
 
LVL 33

Expert Comment

by:shalomc
ID: 11175239
The rank is calculated by popularity, or incoming links.
If you have Apache, you can add static headers and footers to the automatic directory index, and include there any descriptive text that will help the search engine calculate your site's relevance.
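A hedged sketch of what that might look like with Apache's mod_autoindex. The directory path and the file names `HEADER.html` and `README.html` are assumptions; the descriptive text would go in those two files.

```apache
# Assumed directory path; adjust to the real location of the result pages.
<Directory "/var/www/html/results">
    # Let mod_autoindex generate a listing when no index file exists.
    Options +Indexes
    # Contents of these files are inserted above and below the listing.
    HeaderName HEADER.html
    ReadmeName README.html
</Directory>
```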


ShalomC