Hello,

Some time ago I ran into a problem on a particular website. It contains many hyperlinks to other sites. One such URL is:

http://http//example.com/a9noaa.asp

This URL is clearly incorrect (http appears twice), so clicking on it produces a page error like "Address not found".

But when one copies the link location and pastes it into the browser’s location bar, the new page loads correctly. So the problem is the incorrect URL in the hyperlink.

Would it be possible to make the browser check the basic sanity of the URL being accessed, for example that:

  • the word http is present only once,
  • the colon is typed correctly,
  • there is no unusual character at the beginning of the URL,
  • the double slashes are present and correct, etc.

Or could it check the URL being typed in the address bar and automatically correct the errors in it?

Can any client-side code make an internet browser do this? Is it possible?

Or are there any plugins for popular browsers (Firefox, IE) already available to achieve this?

Thank you.

-AD.

+4  A: 

First of all, http://http//example.com/a9noaa.asp is a valid URI with http as the scheme, the second http as the host name and //example.com/a9noaa.asp as the path. So if it’s not invalid, the browser has no need to correct it.
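To see that split for yourself, here is a minimal sketch using Python's standard `urllib.parse` module. A browser's own parser may differ in details, so treat this only as an illustration of how a generic URI parser reads the string:

```python
from urllib.parse import urlsplit

# Parse the URL from the question with a generic URI parser.
parts = urlsplit("http://http//example.com/a9noaa.asp")

print(parts.scheme)    # http
print(parts.hostname)  # http  (the second "http" is read as the host name)
print(parts.path)      # //example.com/a9noaa.asp
```

The parser reports no error, because nothing in the string violates the URI syntax. It simply names a host called "http".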

Now let’s look at the location bar. Most user-friendly browsers do some error correction if the location that has been entered is invalid. One of those correction measures is to prepend http:// to the string if it’s not present. So you just have to type example.com to request http://example.com.
Another correction measure is to complete unknown host names by adding http://www. before and .com after the entered string. So you just have to type example, hit enter and you request http://www.example.com.
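The two measures above could be sketched roughly like this. The function name `fix_up_location_input` and its rules are hypothetical and only illustrate the idea. Real browsers use more elaborate logic, for example trying a DNS lookup before completing a bare word, which this sketch skips:

```python
def fix_up_location_input(text: str) -> str:
    """Hypothetical sketch of location bar fix-up, not any browser's real logic."""
    text = text.strip()
    # Input that already names a scheme is left untouched.
    if "://" in text:
        return text
    # A bare word (no dot, slash or colon) is completed to www.<word>.com.
    # A real browser would first check whether the host name resolves.
    if not any(ch in text for ch in "./:"):
        return "http://www." + text + ".com"
    # Otherwise only the missing scheme is prepended.
    return "http://" + text


print(fix_up_location_input("example.com"))  # http://example.com
print(fix_up_location_input("example"))      # http://www.example.com
```

Note that an input that is already a syntactically valid URI, like the one in the question, passes through unchanged.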

But any error correction outside the location bar, especially in hyperlinks, can be dangerous. Take this for example: A guest enters his/her website URI in a guestbook entry but omits the http://. Now that value is used in a hyperlink but the missing http:// is not prefixed. So the link might look like this:

<a href="example.com">Website</a>

If you click on such a link, the relative URI of that link would be resolved to an absolute URI using the current document’s URI as the base. So the link might be expanded to http://some.example/guestbook/example.com. Who hasn’t experienced that?
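That resolution step can be reproduced with `urljoin` from Python's standard library. The base URI `http://some.example/guestbook/` is an assumption standing in for the guestbook page's address:

```python
from urllib.parse import urljoin

# Assumed address of the page that contains the link.
base = "http://some.example/guestbook/"

# href="example.com" has no scheme, so it is treated as a relative reference.
resolved = urljoin(base, "example.com")
print(resolved)  # http://some.example/guestbook/example.com

# With the scheme present, the reference is already absolute.
absolute = urljoin(base, "http://example.com")
print(absolute)  # http://example.com
```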

But correcting that missing http:// in the browser would be a serious mistake, because the author might have intended to reference http://some.example/guestbook/example.com rather than the http://example.com that the browser would guess.

So to sum up: Correcting the user’s location bar input is suitable when something is missing (e.g. the http://). But doing that on every link is not.

Gumbo
@Gumbo: But http://http//example.com/a9noaa.asp is still an incorrect URL, as it does not take me to the page it is supposed to! So I don't understand what you mean when you say it is not an invalid URL.
goldenmean
@goldenmean I meant the URL is syntactically valid.
Gumbo
@Gumbo: It would be valid if there was anything at "http://http/...", which may be the case in an intranet. The fact that the URL doesn't take you where you wanted it to take you doesn't make it invalid.
deceze
Sorry, I meant @goldenmean
deceze
A: 

It really shouldn't be up to the browser to correct malformed URLs. A URL is supposed to be a unique identifier of some page. The one doing the linking to the page should take care to link to the correct page. There must be no guesswork involved in opening a URL.

That said, some browsers are better than others. Off the top of my head I think IE won't understand "localhost:8888/test" (no protocol given and a port other than the standard 80), but Firefox will at least try to access it via "http://localhost:8888/test". This kind of best-guess filling in the blanks is fine I think; any further auto-correction would be doing too much.

Safari for example will try to auto-guess domain names for you. If "apple/safari" yields a DNS error, it'll automatically try to complete the address to "apple.com/safari". With your URL it might try to complete it to "http://http.com//example.com/a9noaa.asp", which might yield a page if http.com exists. There's just no one way of doing it, therefore it shouldn't be done at all.

deceze
+1  A: 

The URL you posted is not "incorrect", it is valid. Hostnames can take many forms, such as http://localhost/ or http://http/ as well as the more common http://example.com.

If you don't include http:// or another protocol in a web link, then the browser assumes you are using a relative link. For example...

<a href="www.example.com">link</a>

...will link to http://yoursite.com/www.example.com, because this is a perfectly valid URL - you can name a file www.example.com.

I would recommend contacting the website in question to fix their error. No browsers will correct this automatically.

DisgruntledGoat