

make google friendly

Posted on 2003-11-19
Medium Priority
Last Modified: 2010-05-19

Google does not visit my CGI scripts/dynamic pages.
What is the best way to get the Google search robot to crawl them?
Question by: tilmes

Expert Comment

ID: 9784156
You may be able to entice Google into scanning your site by placing an appropriate entry in your robots.txt file.

Take a look at Experts Exchange's robots.txt file, for example:

User-agent: *

There are also "Allow:" directives and you could list your .cgi script there.
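
Note that robots.txt restricts crawlers rather than inviting them, so at best it can stop blocking the scripts. A minimal sketch, assuming a hypothetical script named /cgi-bin/page.cgi (the name is an illustration, not taken from this thread); "Allow" is an extension that Google honors, and Google applies the most specific matching rule:

```text
# Applies to every crawler.
User-agent: *
# Keep crawlers out of the CGI directory in general...
Disallow: /cgi-bin/
# ...but explicitly permit the one (hypothetical) script that should be indexed.
Allow: /cgi-bin/page.cgi
```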

The main hesitation behind spidering dynamic sites is that they may never have the same thing twice. That leaves doubt about whether a search engine should keep track of what the page once said, but which it won't say again if you visit it.


Expert Comment

ID: 9784562
In Apache, simply use Action/AddHandler in your .htaccess file to translate external *.html URLs seen by users and search engines to internal *.cgi paths on the server:

Your users might never even know that your pages are implemented internally with CGIs.

I've often wondered why ASP/JSP/Perl/etc. traditionally add their own extensions to URLs.  This needlessly exposes implementation, and it makes it difficult to change the implementation (Parnas would not be pleased).  <troll>It's poor design, and apparently MS is slow to understand this.</troll>

Author Comment

ID: 9785097

How can I use Action/AddHandler in my .htaccess file?
Please help me understand this.


Author Comment

ID: 9785226

I read an article that said to change the query part of the dynamic URL.
Example -
How can I do this in my CGI script?

Accepted Solution

ext2 earned 150 total points
ID: 9792205
Create a file named ".htaccess" in either your root web directory or some subdirectory of it.  Any setting you place in the .htaccess file applies to the directory that contains it and to all of its subdirectories (you can even have multiple .htaccess files in your directory hierarchy, where one file overrides settings in another).  So, for testing purposes, you might create a subdirectory named "test" (accessible under the "/test" URL path) and place an .htaccess file in that directory to play with.

In your .htaccess file, add something like this:

    Action my-handler /cgi-bin/
    AddHandler my-handler .html

This will cause all URLs with the extension .html within the "/test" directory or its subdirectories to be sent internally to the CGI named in the Action line.  Since multiple URLs might all be sent to the same CGI, you'll likely need some way for your CGI to determine which URL invoked it.  You can obtain this info from two environment variables:


The former is the absolute file system path.  So, if you requested "" and your root web directory is "/var/htdocs", then this variable will be "/var/htdocs/test/ok.html".

The latter is the path given in the URL (not including the domain name).  In the above example, this would be "/test/ok.html".
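
Apache's documentation for the Action directive says the URL path and file path of the requested document are passed in the standard CGI variables PATH_INFO and PATH_TRANSLATED. A minimal Perl sketch, assuming that behavior, of a handler that chooses its output from the requested URL path; the %pages table and the page_for and respond helpers are hypothetical names used only for illustration:

```perl
use strict;
use warnings;

# Hypothetical lookup table: URL path (as seen in PATH_INFO) => page body.
my %pages = (
    '/test/ok.html' => '<h1>OK page</h1>',
);

# Return the body for a requested URL path, or undef when it is unknown.
sub page_for {
    my ($url_path) = @_;
    return unless defined $url_path;
    return $pages{$url_path};
}

# Build a full CGI response (headers plus body) from an environment hash.
sub respond {
    my ($env) = @_;
    my $body = page_for($env->{PATH_INFO});
    return "Status: 404 Not Found\nContent-type: text/plain\n\nNo such page\n"
        unless defined $body;
    return "Content-type: text/html\n\n$body\n";
}

# Answer the current request only when run directly as a CGI script.
print STDOUT respond(\%ENV) unless caller;
```

Keeping the lookup in a function that takes the environment as an argument makes it easy to try out with a hand-built hash before wiring it to Apache.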

Now, what if you want only -some- HTML files within the "test" directory to be sent to your CGI?  The selection can be specified in your .htaccess file with a "Files" tag.  So, if your .htaccess instead included

  Action my-handler /cgi-bin/
  <Files "ok.html">
      SetHandler my-handler
  </Files>

then only requests for "ok.html" will be sent to the CGI, while other .html files will be served statically as normal (Apache actually uses "default-handler" as the name for the handler that serves static pages, so "SetHandler default-handler" would be a way to specify this explicitly on files).

Concerning your second message (11:35PM), what you do is pretend that is a file (rather than a directory) and use an Action/SetHandler to send all requests on that file to your CGI.  The "A" part I believe can be retrieved via the environment variables as mentioned above.  If in doubt, just use a

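  use strict;
  use warnings;
  use Data::Dumper;

  # Send the header first, then dump every environment variable the CGI sees.
  print "Content-type: text/plain\n\n";
  print Dumper(\%ENV);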

to display what environment variables your CGI is seeing.

Author Comment

ID: 9795559
Thank you for the explanation.
I inserted the two blocks below into my .htaccess file,
but the query string has not changed at all.
Do I also need to change the httpd.conf file in the parent directory?

Options +FollowSymLinks
RewriteEngine on
RewriteBase /
RewriteRule cla/(\.*)/(\.*)/(\.*)/(\.*)/(\.*)/(\.*)/(\.*)/(\.*)/$ /tell/cgi-bin/ADcla/cla\.cgi?$1=$2&$3=$4&$5=$6&$7=$8

Action my-handler /tell/cgi-bin/ADcla/cla.cgi
AddHandler my-handler .html
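
Two things may be worth checking here, offered as a sketch rather than a confirmed diagnosis. First, in the rule above "(\.*)" matches only a run of literal dots, so ordinary path segments never match; "([^/]+)" captures one non-empty segment. Second, mod_rewrite has to be loaded, and httpd.conf has to grant the directory "AllowOverride" with "FileInfo" (plus "Options" for the Options line) before these .htaccess directives take effect. Assuming the same paths as in the post, the rule might become:

```apache
Options +FollowSymLinks
RewriteEngine On
RewriteBase /
# Each ([^/]+) captures one path segment; pairs of segments become key=value.
RewriteRule cla/([^/]+)/([^/]+)/([^/]+)/([^/]+)/([^/]+)/([^/]+)/([^/]+)/([^/]+)/$ /tell/cgi-bin/ADcla/cla.cgi?$1=$2&$3=$4&$5=$6&$7=$8 [L]
```

Assuming the .htaccess file sits in the document root, a request for /cla/a/1/b/2/c/3/d/4/ should then reach cla.cgi with the query string a=1&b=2&c=3&d=4.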

Expert Comment

ID: 10093432
Nothing has happened on this question in more than 7 weeks. It's time for cleanup!

My recommendation, which I will post in the Cleanup topic area, is to
accept answer by ext2 [grade B] (on the road to an answer).


EE Cleanup Volunteer


Question has a verified solution.
