I have the following expression to validate a URL, but it gives me a syntax error in the browser. I am no expert in regular expressions, so I am not sure what I am looking for. I would also like it to accept both http:// and https:// URLs.

"url":{
    "regex":"/^http\://[a-zA-Z0-9\-\.]+\.[a-zA-Z]{2,3}(/\S*)?$/",
    "alertText":"URL must start with http://"}

Edit:

To clarify, I am looking for help with both the regex and the syntax issues, please. I have tried about 20 different variations based on all the answers but still no luck. Just to clarify, I do not need to validate the entire URL. I just need to validate that it starts with http:// or https://, but it must not fail validation if left empty. I can get the http part working with this:

/^https?:///

no need to escape the / even. But it fails if the input field is empty, when I try:

/^(https?://)?/

I get an error saying "unterminated parenthetical /^(https?://)/".

Just to confuse matters more, here is one that I added yesterday to validate a date or no entry, and it looks like the same sort of format to me.

/^([0-9]{1,2}\-\[0-9]{1,2}\-\[0-9]{4})?$/
+2  A: 

If you want to test for a URL or empty input, you might want to do two passes.

  1. test for empty string.
  2. test for valid url.

I would do something like the following (assuming urlString is my input).

// strip all whitespace (spaces, tabs, line breaks) in case the user hit spacebar/tab;
// this also removes leading/trailing spaces.
urlString = urlString.replace(/\s+/g, '');

// test if zero length string, if not, test the url.
if( urlString.length > 0 ){  // test the URL
  var re = new RegExp( your_expression_goes_here );
  var result = re.exec(urlString);
  if( result != null ) {
    // we have a hit!!!  this is a URL.
  } else {
    // this is a bad string.
  }
} else {
  // user entered no text, let's move on.
}

So, the preceding should let you test for either an empty string or a URL. As for the regular expression you're using, "/(http|https):\/\//", I believe it's a bit flawed. Yes, it will catch "http://" or "https://", but because it isn't anchored to the start of the string it will also match a string like "htthttp://", which is clearly not what you want.

Your other sample "/^(http|https):\/\//" is better in that it will match from the beginning of the string and will tell you if the string begins like a URL.
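To make the difference concrete, here is a small sketch comparing the two expressions on the same inputs. The variable names and the example.com address are only placeholders for illustration.

```javascript
// Same pattern, with and without the ^ anchor.
const unanchored = /(http|https):\/\//;
const anchored = /^(http|https):\/\//;

// "htthttp://" contains "http://" starting at index 3,
// so only the unanchored version accepts it.
console.log(unanchored.test("htthttp://"));        // true
console.log(anchored.test("htthttp://"));          // false

// A string that really begins with the protocol passes both.
console.log(unanchored.test("https://example.com")); // true
console.log(anchored.test("https://example.com"));   // true
```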

Now, I think jrob above was on the right track with his second string in regards to testing the full URL. I think I found the same sample he used at this page. I've modified the expression as per below and tested it using an online regex tester, can't post the link as I'm a new user :D.

It seems to catch all manner of valid URLs and rejects the input string if it is in any way an invalid URL, at least for the invalid URLs I can think of. It also catches http/https protocols only, which I think is your base requirement.

^(?:http(?:s?)\:\/\/|~/|/)?(?:\w+:\w+@)?(?:(?:[-\w]+\.)+([a-zA-Z]{2,9}))(?::[\d]{1,5})?(?:(?:(?:/(?:[-\w~!$+|.,=]|%[a-f\d]{2})+)+|/)+|\?|#)?(?:(?:\?(?:[-\w~!$+|.,*:]|%[a-f\d]{2})+=(?:[-\w~!$+|.,*:=]|%[a-f\d]{2})*)(?:&(?:[-\w~!$+|.,*:]|%[a-f\d]{2})+=(?:[-\w~!$+|.,*:=]|%[a-f\d]{2})*)*)*(?:#(?:[-\w~!$+|.,*:=]|%[a-f\d]{2})*)?$

Hope this helps.

Updated code (twice).

I still strongly suggest you test for empty string first as per my earlier example, and you only test for the valid values if the string is non zero. I have tried to combine the two tests into one, but have been unable to do so so far (maybe someone else can still figure it out).

The following tests work for me. Here's a URL sample, as you required:

//var re = /^(?:http(?:s?)\:\/\/)/;
// the following expression will test for http(s):// and empty string
var re = /^(?:http(?:s?)\:\/\/)*$/;
// use the precompiled expression above, or the following
// two lines:
//var reTxt = "^(?:http(?:s?)\:\/\/)";
//var re = new RegExp(reTxt);

alert(
  "result:" + re.test("http://") +
"\nresult:" + re.test("https://") +
"\nresult:" + re.test("") +
"\nresult:" + re.test("https:") +
"\nresult:" + re.test("xhttp://") +
"\nresult:" + re.test("ftp://") +
"\nresult:" + re.test("http:/") +
"\nresult:" + re.test("http://somepage.com") +
"\nresult:" + re.test("httphttp://") +
"\nresult:" + re.test(" http://") +
"\nresult:" + re.test("Random text") 
);
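
One thing to watch with /^(?:http(?:s?)\:\/\/)*$/: because of the trailing $, it accepts "http://" and the empty string but rejects a full address such as "http://somepage.com". If the goal is "empty, or starts with http:// or https://", a possible single expression is sketched below. This is only one way to write it, and the name startsWithHttpOrEmpty is made up for the example.

```javascript
// Either the string ends immediately (empty input),
// or it begins with http:// or https://. Nothing after the prefix is checked.
const startsWithHttpOrEmpty = /^(?:$|https?:\/\/)/;

const samples = [
  "",                    // true  (empty is allowed)
  "http://",             // true
  "https://",            // true
  "http://somepage.com", // true
  "https:",              // false (slashes missing)
  "xhttp://",            // false
  "ftp://",              // false
  "http:/",              // false (only one slash)
  " http://",            // false (leading space)
  "Random text"          // false
];

for (const s of samples) {
  console.log(JSON.stringify(s) + " -> " + startsWithHttpOrEmpty.test(s));
}
```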

And here's a test for dates:

var re2 = /^[0-9]{1,2}\-[0-9]{1,2}\-[0-9]{4}$/;
// use the precompiled expression above, or the following
// two lines:
//var reDateTxt = "^[0-9]{1,2}-[0-9]{1,2}-[0-9]{4}$";
//var re2 = new RegExp(reDateTxt);

alert(
"result:" + re2.test("02-02-2009") +
"\nresult:" + re2.test("022-02-2009") + 
"\nresult:" + re2.test("02-032-2009") + 
"\nresult:" + re2.test("02-02-23009") + 
"\nresult:" + re2.test("  02-02-2009") + 
"\nresult:" + re2.test("02-0a2-2009") + 
"\nresult:" + re2.test("02-02-2009") + 
"\nresult:" + re2.test("Random text") 
);
Mike Mytkowski
Hi Mike, thanks for the detailed answer. My problem is that I need to use this plugin to do the validation, and as soon as the user clicks submit it validates automatically. That's why I need to cover both scenarios in one regular expression. I am confused because the date one works with () in the expression but the URL one doesn't. When I try your /^(?:http(?:s?)\:\/\/)/ I still get the unterminated parenthetical error, which makes no sense to me.
Caroline
I just validated Mike's expression, re = /^(?:http(?:s?)\:\/\/)/, in my editor and in Expresso (which I recommend for RegEx work). There may be some code before your regex block that has an unterminated parenthesis, which the interpreter isn't detecting until it reaches the regex.
fatcat1111
I have updated the code segment with the following regular expression: /^(?:http(?:s?)\:\/\/)*$/. It seems to catch "http(s)://" and an empty string. I'm not sure if this is what I would do, but I understand you're constrained in your choices. Not sure why you're getting the unterminated parenthetical error though.
Mike Mytkowski
+2  A: 

Here's the spec on URIs, of which URLs are a subset, or here's the spec on URLs if you're sure that's all you care about. A full implementation of either would be nearly impossible with only a single regular expression.

If you truly want to validate a URL, one that you know will be HTTP or HTTPS, send it an HTTP HEAD request and check the response code.
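
A rough sketch of that idea, assuming an environment with a fetch-style API. The function name checkUrlReachable and the injectable fetchFn parameter are invented for this example. In a browser such a request is subject to cross-origin restrictions, so this kind of check would more likely run on the server.

```javascript
// Sketch only: `checkUrlReachable` and `fetchFn` are hypothetical names.
// Resolves to true when a HEAD request returns a 2xx status, false otherwise.
async function checkUrlReachable(url, fetchFn = fetch) {
  // Cheap syntactic guard first: only http:// or https:// is attempted.
  if (!/^https?:\/\//i.test(url)) {
    return false;
  }
  try {
    const response = await fetchFn(url, { method: "HEAD" });
    return response.ok; // `ok` is true for status codes 200-299
  } catch (err) {
    // Network failure, DNS error, request blocked, and so on.
    return false;
  }
}
```

Note that some servers answer HEAD differently from GET, so a false result here does not prove the URL is invalid.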

Alternatively, if you're going to play loose with the spec, decide how loose you're willing to be with the input, and if it's better to exclude valid URLs or permit false ones.

fatcat1111
A: 

For what it's worth, the syntax error is the unescaped forward slash here: /\S*

Edit: oh wow, I'm tired. All of the forward slashes are unescaped. You can escape them with a backslash: \/
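
One more wrinkle, assuming the plugin takes the rule as a quoted string and evaluates it as a regex literal (I have not checked its source, so treat that as a guess): inside a JavaScript string, "\/" collapses to a plain "/", so the escape never reaches the regex. The first unescaped slash then ends the literal early, which would produce exactly the "unterminated parenthetical" error. Doubling the backslash keeps it. The helper compilesAsLiteral below is made up for the demonstration.

```javascript
// Hypothetical helper: does this text parse as JavaScript when used as a regex literal?
function compilesAsLiteral(source) {
  try {
    new Function("return " + source + ";"); // parses the text, does not run it
    return true;
  } catch (err) {
    return false; // e.g. a SyntaxError for an unterminated group
  }
}

// In a string literal "\/" is just "/", so the escaping is lost.
const singleEscaped = "/^(https?:\/\/)?/";
// "\\/" survives as "\/" in the text the plugin would see.
const doubleEscaped = "/^(https?:\\/\\/)?/";

console.log(singleEscaped);                    // /^(https?://)?/
console.log(compilesAsLiteral(singleEscaped)); // false
console.log(doubleEscaped);                    // /^(https?:\/\/)?/
console.log(compilesAsLiteral(doubleEscaped)); // true
```

The same reasoning would explain why /^https?:/// appears to work without escaping: the literal ends at /^https?:/ and the remaining // is read as a comment, so only "http:" or "https:" is actually being checked.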

eyelidlessness