Hey guys, quick question that has been nagging at me for some time. While browsing, I noticed that YouTube, for example, uses a URL like this for a video page: http://www.youtube.com/watch?v=gwS1tGLB0vc. My site uses a URL like this for a topic page: http://www.site.com/page.php?topic_id=6f3246d0sdf42c2jb67abba60ce33d5cc. The difference, if you haven't already noticed, is that YouTube's watch page has no file extension. Why do some sites leave out file extensions, and what purpose does that serve? Thanks in advance for any insight.

I chose an answer that answers my question directly, but please read the other answers as well, since many of them are very useful for this topic.

+1  A: 

The key is the Content-Type field of the HTTP response header. A response looks something like this:

HTTP/1.1 200 OK
Content-Type: video/x-flv
Content-Length: 102345

DATA-DATA-DATA-DATA-DATA-DATA-....

See also:

Content-Disposition: attachment; filename=genome.jpeg;
     modification-date="Wed, 12 Feb 1997 16:29:51 -0500";

More details: http://en.wikipedia.org/wiki/MIME

Notinlist
when you say the key, you mean the key to how the server recognizes that file?
Scarface
The response contains the MIME type in the "Content-Type" field so the web browser knows what to do with it. It will display `text/html` differently than `image/png`, and so on. The point of not having an extension is that you do not have to expose your server-side technology to the world, e.g. no `.php`, no `.asp`, and so on. `.html` would be misleading because these are not static pages; only the output of "the unknown technology" is HTML.
Notinlist
And also, for non-technical people the `.jsp` (or whatever) is just four more unnecessary and unrecognized characters that lengthen the URL.
Notinlist
+3  A: 

Having or not having the extension is irrelevant. The browser acts on the MIME type returned by the server, not any extension used in the URL.
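
To make that concrete, here is a minimal sketch in Python (not anything YouTube is known to use) of a WSGI application that serves HTML from the extensionless path `/watch`. The `app` and `call` names and the page body are made up for illustration; `call` invokes the app directly, so no network is involved.

```python
def app(environ, start_response):
    """Minimal WSGI app: /watch has no extension, yet it is served as HTML."""
    if environ.get("PATH_INFO") == "/watch":
        body = b"<h1>Video page</h1>"
        # The browser decides how to render the body from this header alone.
        headers = [("Content-Type", "text/html; charset=utf-8"),
                   ("Content-Length", str(len(body)))]
        start_response("200 OK", headers)
        return [body]
    body = b"Not found"
    start_response("404 Not Found", [("Content-Type", "text/plain"),
                                     ("Content-Length", str(len(body)))])
    return [body]


def call(path):
    """Invoke the app directly (no network) and return (status, headers, body)."""
    captured = {}

    def start_response(status, headers):
        captured["status"] = status
        captured["headers"] = dict(headers)

    body = b"".join(app({"PATH_INFO": path, "REQUEST_METHOD": "GET"}, start_response))
    return captured["status"], captured["headers"], body
```

Calling `call("/watch")` returns a `200 OK` status with a `text/html` Content-Type, even though nothing in the path hints at HTML.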

Ignacio Vazquez-Abrams
This does not really explain why some URIs don’t have a file name extension. It is not relevant for the client but it might be relevant for the server.
Gumbo
KeithS
+5  A: 

What you are seeing is an example of URL routing. Instead of pointing to a specific file (e.g. page.php), the server is using a routing table or configuration that directs the request to a handler that actually renders the html (or anything else depending on the mime type returned). If you notice, StackOverflow uses the same mechanism.
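
A rough sketch of what such a routing table can look like, written in Python purely for illustration. The handler names (`watch`, `user_list`) and the `ROUTES` dictionary are hypothetical and do not reflect how any particular site implements it.

```python
from urllib.parse import parse_qsl, urlsplit


def watch(params):
    """Hypothetical handler for the video page."""
    return f"video page for {params.get('v', 'unknown')}"


def user_list(params):
    """Hypothetical handler for a user listing."""
    return "list of users"


# The routing table: URL path -> handler. No file named "watch" has to exist.
ROUTES = {
    "/watch": watch,
    "/users": user_list,
}


def dispatch(url):
    """Look up the handler for the URL's path and call it with the query parameters."""
    parts = urlsplit(url)
    params = dict(parse_qsl(parts.query))
    handler = ROUTES.get(parts.path)
    if handler is None:
        return "404 Not Found"
    return handler(params)
```

With this table, `dispatch("http://www.youtube.com/watch?v=gwS1tGLB0vc")` ends up in the `watch` handler with `v` set to the video id.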

Pete Amundson
what is the practical use to url routing?
Scarface
Also, it could be that 'watch' is a PHP file and the server is simply configured to handle it as such even without the extension. This is how Wikipedia does it, by changing 'index.php' to just 'wiki'.
eds
thanks eds for the comment
Scarface
The practical use of URL routing is to hide the actual implementation behind the website. In the case of Web2.0-ish sites like SO, Wikipedia, Facebook, etc. that implementation can be extremely messy, or even impossible to represent as a true URL because it's a call to a web service and not a served file. Instead of all the gunk that would require, you have a relatively elegant URL to bookmark or link to in other sites.
KeithS
Thanks Keith, but when you say web service, and not a direct file, what exactly do you mean?
Scarface
slebetman
thanks slebetman
Scarface
+1  A: 

There are many possible answers to this. What your web browser ends up receiving depends on how your web application server(s) are configured. You could be using URL rewriting or routing, and, as others have said, it also depends on which handlers you provide for the requested URLs or extensions.

I could have a URL like "http://cory.com/this/really/doesnt/exist" and have it actually be pointing at "http://cory.com/this.does.exist.123" if I wanted to.

Cory Larson
why would one want to use url routing out of curiosity?
Scarface
URL routing lets you group related logic in a single controller file, rather than splitting it up among several stand-alone PHP files.
meagar
Cory Larson
@Cory: Could you provide reference about the SEO point? I am okay with the easier to read, more meaningful, ... But I do not believe in the SEO point ;)
nikic
it can also make URLs look nicer and easier to remember
eds
thanks guys, appreciate it
Scarface
@nikic, just check out the first few results of a search on Google, you should be able to find enough information to convince you that it does help for SEO. http://www.google.com/search?q=url+rewriting+for+seo
Cory Larson
+1  A: 

Well, file extensions aren't of any use on the internet. The browser doesn't care what the file extension is. You could serve a CSS file as .avi. So why not simply leave it out? This allows for shorter URLs.

Furthermore "rewriting" a url allows for more readable urls. You may not understand /categories.php?id=455 but you do /455-some-category.

If you want to do this yourself and are using Apache have a look at mod_rewrite.
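
The idea behind such a rewrite can be sketched in a few lines of Python. This only illustrates the mapping and is not mod_rewrite syntax; the pattern and the `rewrite` function are assumptions made up for this example.

```python
import re

# Hypothetical rule: "/455-some-category" -> "/categories.php?id=455".
# The slug after the number is only there for humans; the id does the work.
PRETTY_CATEGORY = re.compile(r"^/(\d+)-[a-z0-9-]+$")


def rewrite(path):
    """Translate a readable path into the internal script URL."""
    match = PRETTY_CATEGORY.match(path)
    if match:
        return f"/categories.php?id={match.group(1)}"
    return path  # anything else is passed through untouched
```

Here `rewrite("/455-some-category")` yields `/categories.php?id=455`, while a path that does not match the pattern is returned unchanged.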

nikic
thanks a lot, that makes more sense to me now
Scarface
+1  A: 

When you ask 'Why?' are you asking for a technical reason or a design reason? Some people already answered the technical so I'll just comment on the design.

Basically, it boils down to the fact that a URL is an endpoint. It's a place that users and services need to get to, and the extension is irrelevant in most cases. If a user is browsing the web and goes to http://site.com/users, he expects a list of users. He doesn't care that the URL doesn't say .html or .php. And as a designer, using those extensions doesn't really make sense. You want your app to make sense, and those extensions don't provide any insight that the user needs.

One case where you would want to use them is when you are creating a service that other applications will use. Then you could choose to use an extension to denote what kind of data a client can expect to get back (.json, .xml, etc.). There are people working on design guidelines and specs for this stuff, but it's all early.
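
As a small illustration of that service case, the Python sketch below picks the response format from the extension on the path. The `render_users` function and the sample user names are invented for this example; a real service might rely on the `Accept` request header instead.

```python
import json
import posixpath
from xml.etree import ElementTree

USERS = ["alice", "bob"]  # made-up sample data


def render_users(path):
    """Return (content_type, body) for /users, /users.json or /users.xml."""
    extension = posixpath.splitext(path)[1]
    if extension == ".json":
        return "application/json", json.dumps({"users": USERS})
    if extension == ".xml":
        root = ElementTree.Element("users")
        for name in USERS:
            ElementTree.SubElement(root, "user").text = name
        return "application/xml", ElementTree.tostring(root, encoding="unicode")
    # No extension: a person is probably looking at this in a browser.
    items = "".join(f"<li>{name}</li>" for name in USERS)
    return "text/html", f"<ul>{items}</ul>"
```

The same resource is reachable as `/users` for people and as `/users.json` or `/users.xml` for programs, with the extension acting as a format hint rather than a file name.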

Basically those extensions are used because that's how web servers/clients worked by default. As web development has matured we started treating urls more professionally and tried to make them make sense to people reading/using them.

threendib
thanks a lot, that is basically what I wanted to hear
Scarface
+1  A: 

The normal behavior of a web server is to map the requested URI path onto a file somewhere in the document root directory. So http://example.com/foo/bar is simply mapped onto /path/to/document/root/foo/bar. Additionally, the web server needs to know how to handle a file. This is often decided by the file name extension, so files with the extension .php are handled by the PHP interpreter.

Now, apart from this normal behavior, most web servers have features that let you change both the mapping (i.e. URL rewriting) and the way a file without a file name extension is handled.
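
The default behavior described above can be sketched roughly like this in Python. The document root is a placeholder and the `HANDLERS` table is hypothetical; real servers do considerably more (index files, access checks, and so on).

```python
import posixpath

DOCUMENT_ROOT = "/path/to/document/root"  # placeholder, not a real directory

# Hypothetical extension -> handler table, in the spirit of a server config.
HANDLERS = {
    ".php": "php-interpreter",
    ".html": "static-file",
}


def resolve(uri_path):
    """Map a URI path to a file under the document root and pick a handler."""
    # Normalise the path so "../" segments cannot climb out of the root.
    clean = posixpath.normpath("/" + uri_path.lstrip("/"))
    file_path = DOCUMENT_ROOT + clean
    extension = posixpath.splitext(file_path)[1]
    # Without a known extension the server has no hint and uses its default.
    handler = HANDLERS.get(extension, "default-handler")
    return file_path, handler
```

Note how `resolve("/watch")` falls through to the default handler: with no extension, the server needs extra configuration to know that the file should be run as PHP.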

In case of the Apache web server, the former can be done with mod_rewrite:

RewriteEngine on
RewriteRule ^/watch$ /watch.php

And the latter can be done with mod_mime:

<Files watch>
    ForceType application/x-httpd-php
</Files>

(OK, actually this is not a mod_mime feature but a core feature.)

Gumbo
thanks a lot gumbo
Scarface
ok so basically that example tells the server to map watch to watch.php, and tells the server to handle as php file, by entering mime type?
Scarface
@Scarface: Yes, exactly. Both variants can be used so that `/watch` refers to a page whose content is generated by a PHP script.
Gumbo
excellent, thanks gumbo for your time, I will explore using those mods.
Scarface
+3  A: 

File extensions are not used because of the idea that URIs (and therefore URLs) should be independent of implementation. If you want to access George W. Bush's addresses, you should be able to go to http://www.whitehouse.gov/presidents/georgewbush/addresses (for example). Whether they're using PHP or Python or Perl doesn't matter to the end-user, so they shouldn't see it. The end-user doesn't care how the page was generated, because all web languages output the same (X)HTML, CSS, and the like, and they're just viewing the page in their web browser.

Most web frameworks build this functionality in by default, precisely for this reason, and it can be accomplished regardless with URL rewriting in most web servers. This ideal is codified in the W3C Style Guide, which has undoubtedly been a big factor in the idea becoming so widely accepted. It's outlined in their guide, "Cool URIs Don't Change", which should clear things up if you still don't quite understand the reasoning here. That document is the go-to statement on the issue, and the de facto standard for frameworks.

It is worth noting that usually files that end up being downloaded (and sometimes data files used in AJAX) will still have their file extensions intact - http://example.com/song.mp3 or http://example.com/whitepaper.pdf - because they are intended to be saved to the end-user's computer, where file extensions matter. The extensions are not included for pages that are simply displayed - which is most pages.

cincodenada
thanks a lot, great article
Scarface
thanks again for that second portion, appreciate it
Scarface
+1  A: 

While extensions don't matter to the browser, which just uses the headers passed along to it to determine what to display and how to display it, chances are they do matter on the server. For instance, your box could have both a php and a ruby interpreter installed, but your webserver has configuration files to map file extensions to MIME types. For instance, from Apache's php5.conf:

  AddType application/x-httpd-php .php .phtml .php3

which tells Apache that files ending in .php, .phtml and .php3 should be recognized as being PHP files.

However, since the extensions don't mean anything to the client, URLs often look "nicer" without them. In order to do so, technologies such as Apache's mod_rewrite can be used to "rewrite" client-land URLs to have meaning on the server.

For instance, you could set up mod_rewrite rules to rewrite a URL like http://yourblog.com/article/the-article-you-wrote (which looks nicer and is simpler to type and remember) to http://yourblog.com/articles.php?title=the-article-you-wrote, which Apache can use to properly route the request to your PHP script.

Daniel Vandersluis
appreciate it Daniel
Scarface