I understand that calling the home page index.html is a convention. Is that right? Does this name have any special meaning (maybe for search engines)?

+2  A: 

It may be a convention on many servers, but you can configure your server to route to a different page if you want. I don't know that there is any special meaning to search engines, but SEO experts might have something else to say...

FrustratedWithFormsDesigner
The name you use as your default usually is somewhat invisible to search engines if nobody links to it explicitly. Most people link to http://www.google.com, not http://www.google.com/index.html .
Brian
+8  A: 

I believe that it's historical rather than anything else. When we were creating pages back in 1996, we noted that if you didn't put an index.html, it listed the directory contents. If you had an index.html, it showed that instead.

Jonathan
Who's "we"? Just curious :)
Psytronic
+17  A: 

It's simply an English word that refers to a list of other places (in this case, on the website).

From thefreedictionary.com:

Index: An alphabetized list of names, places, and subjects treated in a printed work, giving the page or pages on which each item is mentioned.

I suppose the convention stems from the fact that if no index file is available in the directory, the webserver (optionally) defaults to a generated "index" of subdirectories and files in that directory, and that this default behavior could be overridden by providing an explicit "index-file".
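To make the fallback described above concrete, here is a minimal Python sketch of the decision a server might make for a directory request. The function name `resolve_directory` and the single hard-coded index name are my own assumptions for illustration; real servers make both the name and the listing behaviour configurable.

```python
import os


def resolve_directory(dir_path, index_name="index.html"):
    """Decide what to return for a request that points at a directory.

    Hypothetical helper, not taken from any real server.
    Returns ("file", path) if an explicit index file exists,
    otherwise ("listing", names) with a generated index of the directory.
    """
    candidate = os.path.join(dir_path, index_name)
    if os.path.isfile(candidate):
        # An explicit index file overrides the generated listing.
        return ("file", candidate)
    # No index file: fall back to a generated "index" of the contents.
    return ("listing", sorted(os.listdir(dir_path)))
```

Calling it on a directory without an index.html gives back the sorted file names; once an index.html is dropped in, the same call returns that file instead.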

I doubt that any search engine will treat an index.html any differently from any other HTML file.

aioobe
A: 

By default, index.* (it could be html, could be php, any valid web format) is the first page that gets loaded whenever your page is brought up by the browser / served by the server. If you name that file something else and attempt to visit the page you will most likely get the notorious 404 error (Page not found).

Kaa
For clarification, the norm is not to accept index with any file extension, but rather index with any extension the server supports. I believe in IIS7 this would be htm, html, asp, and aspx, which means default.htm, default.asp, index.htm, index.html, default.aspx (and iisstart.htm). This may vary based on IIS7 setup, though.
Brian
A: 

I wouldn't even call it a convention, but merely a default for which file to use as the directory index, i.e. which file to serve when a request URL points at a directory.

My guess is that it was the default in Apache, and everyone else copied that.

Michael Borgwardt
index.html goes back to the earliest HTTP implementations long before Apache.
Christopher Barber
@Christopher: can you name which ones?
Michael Borgwardt
Answered below. It was NCSA. I know because I was using it back in '94.
Christopher Barber
+3  A: 

This is entirely dependent on the implementation of the web server. Originally, this file was meant to be a file that represents the directory's index.

When you access a web site by only specifying the domain name, your browser automatically retrieves the root directory /. The server then hands you the index file.

Apache has a list of default files that it will serve, in order, when the client specifies a directory instead of a file. By default, it's just index.html. index.htm works on IIS.
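The "in order" part can be sketched like this in Python. This is only an illustration of first-match-wins over an ordered list; `pick_index` and the default candidate tuple are made up for the example, and in Apache the actual list comes from the `DirectoryIndex` directive in the server config.

```python
def pick_index(files_in_dir, candidates=("index.html", "index.htm")):
    """Return the first candidate name present in the directory, or None.

    Hypothetical helper: `candidates` stands in for the server's
    configured, ordered list of default file names.
    """
    for name in candidates:
        if name in files_in_dir:
            return name  # earlier entries in the list take priority
    return None  # no index file; the server decides what happens next
```

So if a directory holds both index.htm and index.html, the one listed first in the configuration is served, and reordering the list changes the result without touching the files.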

As far as SEO goes, when building back links, you should avoid creating links like http://www.example.com/index.html. What happens if you move to a PHP framework later and instead of index.html, you now have index.php? All of your links will be dead. Back links should be built in the following formats:

  • http://www.example.com/
  • http://photos.example.com/
  • http://www.example.com/photos/

This includes links on your own site. In the navigation, you shouldn't have a link that points to /index.html, it should just be /. This keeps the home page link from breaking if you change it.

Note that the last example in the list above (http://www.example.com/photos/) is the least preferred since it's one level deeper.
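If you generate your links in code, one way to follow this advice is to strip the index file name before emitting the URL. A rough Python sketch, assuming absolute paths and a hand-picked set of index names (`INDEX_NAMES` and `canonical_link` are names I made up, adjust them to whatever your server actually treats as a default document):

```python
from urllib.parse import urlsplit, urlunsplit

# Assumed set of default document names; not an exhaustive or official list.
INDEX_NAMES = {"index.html", "index.htm", "index.php"}


def canonical_link(url):
    """Rewrite .../index.html style URLs to the bare directory URL."""
    parts = urlsplit(url)
    head, _, last = parts.path.rpartition("/")
    if last in INDEX_NAMES:
        # Keep the directory part and its trailing slash, drop the file name.
        parts = parts._replace(path=head + "/")
    return urlunsplit(parts)
```

With this, http://www.example.com/index.html and a site-relative /index.html both collapse to the directory form, while URLs that already end in / pass through unchanged.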

Marcus Adams
+5  A: 

Yes, it's just a convention. It shouldn't have any special meaning for any UA, although one suspects some search engines might use the filename as part of their algorithms for detecting when two URLs refer to the same resource.

It comes from the NCSA HTTPd, the second-ever web server, after the CERN HTTPd (then “WWWDaemon”). As far as I remember, CERN didn't have index pages as such, but encouraged the presence of a welcome.html file. A later version of CERN HTTPd added index pages with any of the names:

welcome.html
Welcome.html
index.html

but I think that was after NCSA introduced index.html. Apache was originally built on top of NCSA HTTPd (though it has now been fully rewritten) and inherited its default index.html config; most modern web servers inherit this from Apache.

bobince