jtjli

A web page without extension (e.g. http://php.net/array )

In some URLs, there is no extension on the "file" -- for example:
http://au2.php.net/array
http://en.wikipedia.org/wiki/Prokaryotes

Initially I thought they were directories with an index.xxx inside them, but I have tried index.php/asp/jsp/htm/html and none of them exist (in the two examples above).

So what are they? Is "array" a file without an extension? If so, how can a web server process files without an extension?
jeaton32

Here's a clip I found on the W3C's website about URIs: http://www.w3.org/Provider/Style/URI.html

How can I remove the file extensions...

...from my URIs in a practical file-based web server?

If you are using, for example, Apache, you can set it up to do content negotiation. You keep the file extension (such as .png) on the file (e.g. mydog.png), but refer to the web resource without it. Apache then checks the directory for all files with that name and any extension, and it can also pick the best one out of a set (e.g. GIF and PNG). (You do not have to put different types of file in different directories, in fact the content negotiation won't work if you do.)

    * Set up your server to do content negotiation
    * Make references always to the URI without the extension

References which do have the extension on will still work but will not allow your server to select the best of currently available and future formats.

(In fact, mydog, mydog.png and mydog.gif are each valid web resources. mydog is content-type-generic. mydog.png and mydog.gif are content-type-specific.)

Of course, if you are building your own server, then using a database to relate persistent identifiers to their current form is a very clean idea -- though beware the unbounded growth of your database.
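To make the content negotiation idea above more concrete, here is a minimal Python sketch of how a server might pick a file for an extensionless name such as mydog. It is only an illustration: the function names, the small KNOWN_TYPES table, and the matching rules are assumptions made for this example, and Apache's real negotiation takes more factors into account.

```python
from pathlib import PurePosixPath

# Hypothetical extension-to-type table, just enough for the example.
KNOWN_TYPES = {".png": "image/png", ".gif": "image/gif", ".html": "text/html"}


def parse_accept(header):
    """Parse an Accept header into (media_type, q) pairs, highest q first."""
    prefs = []
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        if not fields[0]:
            continue
        q = 1.0  # q defaults to 1.0 when the client does not give one
        for field in fields[1:]:
            if field.startswith("q="):
                try:
                    q = float(field[2:])
                except ValueError:
                    q = 0.0
        prefs.append((fields[0], q))
    return sorted(prefs, key=lambda pref: pref[1], reverse=True)


def type_matches(pattern, content_type):
    """True if an Accept pattern such as image/* covers content_type."""
    if pattern == "*/*":
        return True
    if pattern.endswith("/*"):
        return content_type.startswith(pattern[:-1])
    return pattern == content_type


def negotiate(resource, filenames, accept):
    """Return the file named resource.<ext> that best fits the Accept header."""
    candidates = {}
    for name in filenames:
        path = PurePosixPath(name)
        if path.stem == resource and path.suffix in KNOWN_TYPES:
            candidates[name] = KNOWN_TYPES[path.suffix]
    for pattern, q in parse_accept(accept):
        if q <= 0:
            continue  # q=0 means the client refuses this type
        for name in sorted(candidates):
            if type_matches(pattern, candidates[name]):
                return name
    return None  # nothing acceptable; a real server would answer 406


if __name__ == "__main__":
    files = ["mydog.png", "mydog.gif", "cat.png"]
    print(negotiate("mydog", files, "image/gif;q=0.5, image/png"))
```

The request names only mydog; the extension is chosen per request, which is why the same URI can keep working if a new format is added to the directory later.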
jtjli,

Here's another link that looks a little more comprehensive:

http://www.websiteoptimization.com/speed/tweak/rewrite/

This article covers Apache and IIS.
ASKER CERTIFIED SOLUTION
OliWarner

This solution is only available to members.
For a URL like http://au2.php.net/array, one common explanation is that "array" is a directory and the server returns the "default" file for that directory.

The server usually checks a list of candidate names in sequence and serves the first one that exists. A typical order is:
1.  array/index.html
2.  array/index.htm
3.  array/index.shtml
4.  array/index.php
5.  array/index.asp

So there are many possibilities, but there is a *default* file for each directory, like array.

On Apache, the default file is set by the DirectoryIndex directive, either in the main server configuration or in a site's .htaccess file. Site owners can make the default file whatever they want it to be. You as an outsider cannot discover this of your own accord, since servers are normally configured not to serve .htaccess files to visitors -- it is up to the site owner to decide that.

Read up on .htaccess files on the internet. They are extremely powerful files for redirecting and rewriting URLs on a site.
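The default-file lookup described above can be sketched in a few lines of Python. This is only an illustration of the idea, not any server's actual code: the function name and the DEFAULT_INDEX_NAMES list are assumptions that mirror the order given in this post, and a real site may configure a different list.

```python
import os

# Hypothetical search order, mirroring the list in the post above.
DEFAULT_INDEX_NAMES = [
    "index.html",
    "index.htm",
    "index.shtml",
    "index.php",
    "index.asp",
]


def find_directory_index(directory, index_names=DEFAULT_INDEX_NAMES):
    """Return the path of the first index file that exists, or None."""
    for name in index_names:
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate):
            return candidate
    return None  # no default file; a server might list the directory or return 403/404


if __name__ == "__main__":
    # Look for a default file in the current directory.
    print(find_directory_index("."))
```

If none of the names exist, as the asker found for the two example URLs, the address is probably not a plain directory and some other mechanism is mapping it to content.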
Basically, web servers have something called wildcard mapping. I will talk about IIS 6.0. You can set up a program on the server that processes every request first, before anything else handles it, whatever the URL looks like. That program knows what to add to the request, or how to answer it directly. Hiding the extensions this way is sometimes treated as a kind of security feature, and the special program set up to process the requests knows how to handle them. It's really normal.
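As a rough illustration of the "one program sees every request first" idea, here is a small Python sketch. It is not how IIS, php.net, or Wikipedia are actually implemented; the route table and handler names are hypothetical and exist only to show how an extensionless path can be mapped to code rather than to a file on disk.

```python
from urllib.parse import urlsplit


def show_manual_page(topic):
    """Hypothetical handler: look up a manual entry by name."""
    return f"manual page for {topic}"


def show_wiki_article(title):
    """Hypothetical handler: load an article by title."""
    return f"wiki article for {title}"


# Hypothetical route table: the first matching prefix wins, so the
# more specific prefix is listed before the catch-all "/".
ROUTES = [
    ("/wiki/", show_wiki_article),
    ("/", show_manual_page),
]


def dispatch(url):
    """Send every request through one place, regardless of extension."""
    path = urlsplit(url).path
    for prefix, handler in ROUTES:
        if path.startswith(prefix):
            remainder = path[len(prefix):]
            if remainder:
                return handler(remainder)
    return "404 Not Found"


if __name__ == "__main__":
    print(dispatch("http://au2.php.net/array"))
    print(dispatch("http://en.wikipedia.org/wiki/Prokaryotes"))
```

Because the dispatcher decides what "array" means, there does not need to be any file or directory called array on the server at all.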