I have a page where users submit URLs, some of which contain characters such as & and =. For the page to validate with the W3C validator, I need to write them as entities (&amp;amp; and so on). How can I do this automatically? Also, should I even bother?

A: 

I'd say don't even bother. See Jeff's post on the subject: HTML Validation: Does It Matter?

On the other hand, if you're a perfectionist, properly escaping query strings should be pretty trivial in any language. For example, in PHP you can use htmlspecialchars or htmlentities to escape a URL for HTML output, and urlencode or rawurlencode to percent-encode individual query values.
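
To illustrate the HTML-escaping side in another language, here is a minimal Python sketch using the standard library's html.escape. The helper name href_for and the example URL are made up for illustration; the point is only that & becomes &amp;amp; while = is left alone.

```python
from html import escape


def href_for(url: str) -> str:
    """Return an <a> tag with the URL escaped for use in HTML."""
    # quote=True also escapes " and ', which matters inside an attribute value.
    safe_url = escape(url, quote=True)
    return f'<a href="{safe_url}">{safe_url}</a>'


if __name__ == "__main__":
    # The & in the query string is written out as &amp; in the markup.
    print(href_for("http://example.com/search?q=cats&page=2"))
```

The browser decodes the entity again when it reads the attribute, so the link still points at the original URL.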

Can Berk Güder
+1  A: 

Yes, you should bother, and it's quite simple. Saying, "Oh, look how many invalid pages there are" does not excuse your contributions to the problem. Every major language either has this functionality built in (as Can noted for PHP) or lets you implement it trivially.

Matthew Flaschen
A: 

If users are submitting URLs and you want to help them avoid errors, then I'd validate each URL by actually requesting it. Use the HTTP HEAD method to check that it responds.

This will take more programming than statically inspecting the URL string. You'll want to think about using a helper process, returning the result asynchronously to the original submission, etc. But that's the sort of stuff which separates the students from the professionals.
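
A rough sketch of the HEAD check in Python, using only the standard library. The function name url_responds, the 5-second timeout, and the choice to accept only http and https are assumptions for illustration, and the sketch leaves out the asynchronous helper process mentioned above.

```python
import http.client
import urllib.error
import urllib.request
from urllib.parse import urlsplit


def url_responds(url: str, timeout: float = 5.0) -> bool:
    """Return True if a HEAD request to the URL gets a non-error response."""
    try:
        parts = urlsplit(url)
        # Only try web URLs that actually name a host.
        if parts.scheme not in ("http", "https") or not parts.netloc:
            return False
        request = urllib.request.Request(url, method="HEAD")
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return 200 <= response.status < 400
    except (urllib.error.URLError, http.client.HTTPException, OSError, ValueError):
        # Covers HTTP error statuses, DNS failures, refused connections,
        # timeouts, and malformed URLs.
        return False
```

Some servers reject HEAD requests even for pages that exist, so a False result is better treated as a hint to show the user than as proof the URL is wrong.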

Larry K
+9  A: 

You should encode the URLs on the server side, then. Not knowing which backend language you use, here's a list:

* htmlentities() - PHP
* HttpUtility.UrlEncode() - ASP.net
* URI.escape() - Ruby (removed in Ruby 3.0; CGI.escape() is an alternative)
* URLEncodedFormat() - ColdFusion
* urllib.urlencode() - Python 2 (urllib.parse.urlencode() in Python 3)
* java.net.URLEncoder.encode() - Java
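
As one concrete case from the list, a minimal Python 3 sketch with urllib.parse.urlencode. The helper build_url and the example parameters are hypothetical; it assumes the base URL has no query string yet.

```python
from urllib.parse import urlencode


def build_url(base: str, params: dict) -> str:
    """Append params to base as a percent-encoded query string."""
    # urlencode escapes each key and value, then joins the pairs with &.
    return f"{base}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_url("http://example.com/search", {"q": "cats & dogs", "page": 2}))
```

Note that this is URL encoding only. The & separators it produces still need HTML escaping when the result is written into a page.
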
Jin
A: 

If the & is part of a parameter value rather than a separator between parameters, you need to use %26 instead of &.

In the general case though, find a URL encoder function in whatever language you're using.
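
A small Python sketch of why %26 matters, assuming a made-up parameter called title. It compares how the standard library's parse_qs reads the query string with and without percent-encoding the value.

```python
from urllib.parse import parse_qs, quote

value = "Tom & Jerry"

# Unencoded: the & inside the value is read as a parameter separator.
raw = "title=" + value

# Encoded: safe="" makes quote() escape every reserved character, so & becomes %26.
encoded = "title=" + quote(value, safe="")

if __name__ == "__main__":
    print(encoded)
    print(parse_qs(raw))      # the title is cut off at the &
    print(parse_qs(encoded))  # the full title survives the round trip
```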

DisgruntledGoat