Solved

make google friendly

Posted on 2003-11-19
Last Modified: 2010-05-19
Hi

My CGI scripts and dynamic pages are not being visited by Google.
What is the best way to get the Google search robot to scan them?
Question by:tilmes
LVL 20

Expert Comment

by:jmcg
You may be able to entice Google into scanning your site by placing an appropriate entry in your robots.txt file.

Take a look at Experts-Exchange's robots.txt file, for example:

User-agent: *
Disallow:

There are also "Allow:" directives and you could list your .cgi script there.

The main hesitation about spidering dynamic sites is that they may never serve the same content twice. That leaves doubt about whether a search engine should index what a page once said if the page won't say it again when you visit.



 
LVL 2

Expert Comment

by:ext2
In Apache, simply use Action/AddHandler in your .htaccess file to translate external *.html URLs seen by users and search engines to internal *.cgi paths on the server:

  http://httpd.apache.org/docs/handler.html

Your users might never even know that your pages are implemented internally with CGIs.

I've often wondered why ASP/JSP/Perl/etc. traditionally add their own extensions to URLs.  This needlessly exposes implementation, and it makes it difficult to change the implementation (Parnas would not be pleased).  <troll>It's poor design, and apparently MS is slow to understand this.</troll>
 

Author Comment

by:tilmes
Hi

How do I use Action/AddHandler in my .htaccess file?
Please explain how this works.

 

Author Comment

by:tilmes
Hi

I read an article that says to change the query part of a dynamic URL.
Example - http://www.my-online-store.com/books.asp?id=1190
to http://www.my-online-store.com/books/A
How can I do this in my CGI script?
 
LVL 2

Accepted Solution

by:ext2 (earned 50 total points)
Create a file named ".htaccess" in either your root web directory or some subdirectory of it.  Any setting you place in the .htaccess file will be applied to the directory in which it is contained as well as all subdirectories of that directory (you can even have multiple .htaccess files in your directory hierarchy, where one file overrides settings in another).  So, for testing purposes, you might create a subdirectory named "test" (accessible from the URL http://myserver.com/test/) and place an .htaccess file in that directory to play with.

In your .htaccess file, add something like this:

    Action my-handler /cgi-bin/myfilter.pl
    AddHandler my-handler .html

This will cause all URLs with extension .html within the "/test" directory or subdirectories to be internally sent to the CGI residing at http://myserver.com/cgi-bin/myfilter.pl .  Since multiple URLs might all be sent to the same CGI, you'll likely need some way for your CGI to determine which URL invoked it.  You can obtain this info from two environment variables:

  $ENV{PATH_TRANSLATED}
  $ENV{REQUEST_URI}

The former is the absolute file system path.  So, if you requested "http://myserver.com/test/ok.html" and your root web directory is "/var/htdocs", then this variable will be "/var/htdocs/test/ok.html".

The latter is the path given in the URL (not including the domain name).  In the above example, this would be "/test/ok.html".
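
As a rough illustration of reading these two variables, here is a minimal sketch of what a handler such as myfilter.pl might start with. The helper name requested_path is hypothetical (it is not from this thread), and the sketch assumes the server sets REQUEST_URI and PATH_TRANSLATED as described above; REQUEST_URI may carry a query string, so it is stripped before use.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): return the URL path that
# triggered this CGI, with any query string removed.
sub requested_path {
    my ($env) = @_;
    my $uri = defined $env->{REQUEST_URI} ? $env->{REQUEST_URI} : '';
    $uri =~ s/\?.*\z//s;    # "/test/ok.html?x=1" becomes "/test/ok.html"
    return $uri;
}

my $path = requested_path(\%ENV);
my $file = defined $ENV{PATH_TRANSLATED} ? $ENV{PATH_TRANSLATED} : '(not set)';

# Header first, then a plain-text report of what the handler was invoked for.
print "Content-type: text/plain\n\n";
print "Requested URL path: $path\n";
print "File system path:   $file\n";
```

From here the script could branch on the returned path to decide which page to generate.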

Now, what if you want only -some- HTML files within the "test" directory to be sent to your CGI?  The selection can be specified in your .htaccess file with a "Files" tag.  So, if your .htaccess instead included

  Action my-handler /cgi-bin/myfilter.pl
  <Files "ok.html">
      SetHandler my-handler
  </Files>

then only "http://myserver.com/test/ok.html" will be sent to the CGI, while "http://myserver.com/test/hello.html" will be served statically as normal (Apache actually uses "default-handler" as the name for the handler that serves static pages, so "SetHandler default-handler" would be a way to specify this explicitly on files).

Concerning your second message (11:35PM), what you do is pretend that http://www.my-online-store.com/books is a file (rather than a directory) and use an Action/SetHandler to send all requests on that file to your CGI.  The "A" part I believe can be retrieved via the environment variables as mentioned above.  If in doubt, just use a

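  use Data::Dumper;

  # Send the header first, then dump every CGI environment variable.
  # text/plain keeps the dump's line breaks readable in a browser.
  print "Content-type: text/plain\n\n";
  print Dumper(\%ENV);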

to display what environment variables your CGI is seeing.
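
Once the dump shows which variable carries the original URL, pulling out the "A" part is a small pattern match. The following is only a sketch: it assumes REQUEST_URI holds the path as requested (for example "/books/A"), and the helper name book_key is hypothetical.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: pull the segment after /books/ out of a request URI.
# Returns undef when the URI does not have that shape.
sub book_key {
    my ($uri) = @_;
    return undef unless defined $uri;
    $uri =~ s/\?.*\z//s;    # ignore any query string
    return $uri =~ m{\A/books/([^/]+)/?\z} ? $1 : undef;
}

my $key = book_key($ENV{REQUEST_URI});

print "Content-type: text/plain\n\n";
if (defined $key) {
    print "Book key: $key\n";
} else {
    print "No book key in URL\n";
}
```

The key can then be used wherever the script previously read the id query parameter.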
 

Author Comment

by:tilmes
Thank you for the explanation.
I put the two blocks below in my .htaccess file,
but the query string has not changed at all.
Do I also need to change the httpd.conf file in the parent directory?

Options +FollowSymLinks
RewriteEngine on
RewriteBase /
RewriteRule cla/(\.*)/(\.*)/(\.*)/(\.*)/(\.*)/(\.*)/(\.*)/(\.*)/$ /tell/cgi-bin/ADcla/cla\.cgi?$1=$2&$3=$4&$5=$6&$7=$8


Action my-handler /tell/cgi-bin/ADcla/cla.cgi
AddHandler my-handler .html
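
A likely reason the RewriteRule above never matches: in a regular expression "\." is a literal dot, so each "(\.*)" group captures only a run of dots rather than a path segment. The following is a sketch of a corrected rule, not a tested configuration. It assumes the .htaccess file sits in the document root, that mod_rewrite is loaded, and that the URL carries exactly four name/value pairs.

```apache
# .htaccess in the document root (assumed location)
Options +FollowSymLinks
RewriteEngine On
RewriteBase /
# Four name/value pairs: /cla/n1/v1/n2/v2/n3/v3/n4/v4/
# [^/]+ matches one path segment (anything up to the next slash).
RewriteRule ^cla/([^/]+)/([^/]+)/([^/]+)/([^/]+)/([^/]+)/([^/]+)/([^/]+)/([^/]+)/?$ /tell/cgi-bin/ADcla/cla.cgi?$1=$2&$3=$4&$5=$6&$7=$8 [L]
```

On the httpd.conf question: these directives only take effect if the server configuration permits overrides for that directory, for example "AllowOverride FileInfo Options". With "AllowOverride None" the .htaccess file is ignored entirely.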
 
LVL 20

Expert Comment

by:jmcg
Nothing has happened on this question in more than 7 weeks. It's time for cleanup!

My recommendation, which I will post in the Cleanup topic area, is to
accept answer by ext2 [grade B] (on the road to an answer).

PLEASE DO NOT ACCEPT THIS COMMENT AS AN ANSWER!

jmcg
EE Cleanup Volunteer