views:

66

answers:

2

Example links:
http://stackoverflow.com/questions/tags/ruby
valid URL

http://stackoverflow.com/questions/@#dsd/javascript
invalid URL

How can I check the validity of only the /tags/ part, not the whole URL?

Can anyone give me a regular expression for this part of the URL?

How do I validate my URL according to this condition?

Thanks

+1  A: 

Whole URL:

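function isValidURL($url) {
    // preg_match needs pattern delimiters; '~' avoids clashing with '/' and '#'.
    // The trailing $ is an unescaped end anchor, and the 'i' flag ignores case.
    return preg_match('~^(https?|ftp)\:\/\/([a-z0-9+!*(),;?&=\$_.-]+(\:[a-z0-9+!*(),;?&=\$_.-]+)?@)?[a-z0-9+\$_-]+(\.[a-z0-9+\$_-]+)*(\:[0-9]{2,5})?(\/([a-z0-9+\$_-]\.?)+)*\/?(\?[a-z+&\$_.-][a-z0-9;:@/&%=+\$_.-]*)?(#[a-z_.-][a-z0-9+\$_.-]*)?$~i', $url) === 1;
}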

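Unix folder names (basically anything between the /'s):

function isValidPath($url) {
    // Delimiters added, and ^...$ anchors so the whole string has to be a path;
    // without the anchors the pattern matches the empty string inside any input.
    return preg_match('~^(\/([a-z0-9+\$_-]\.?)+)*\/?$~i', $url) === 1;
}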
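For the case in the question, where only the path part (such as /questions/tags/ruby) needs checking, the same idea can be sketched in JavaScript. This is a minimal sketch rather than a definitive implementation: isValidPath mirrors the PHP function name above, pathOf is a hypothetical helper, and treating everything after the host as the path is an assumption (a query string or fragment would make the check fail).

```javascript
// Sketch: validate only the path portion of a URL.
// The character class is the one used in the path pattern above.
function isValidPath(path) {
  // ^ and $ force the whole string to match; the i flag ignores case.
  var pathRegex = /^(\/([a-z0-9+$_-]\.?)+)*\/?$/i;
  return pathRegex.test(path);
}

// Hypothetical helper: drop the scheme and host, keep everything after them.
function pathOf(url) {
  return url.replace(/^[a-z][a-z0-9+.-]*:\/\/[^\/]*/i, "");
}

console.log(isValidPath(pathOf("http://stackoverflow.com/questions/tags/ruby")));        // true
console.log(isValidPath(pathOf("http://stackoverflow.com/questions/@#dsd/javascript"))); // false
```

Note that an empty path also passes, since every part of the pattern is optional.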
danyim
Pretty sure this will accept strings that are not valid http(s) urls, such as hostnames that don't begin with a letter (`[a-zA-Z]`).
eldarerathis
@eld: That's not technically true. I'm building my own answer, so I'll elaborate there.
JGB146
@eldarerathis URLs don't need to start with a-z. For instance, you could have the URL `卐.com` (note, this is not meant to be racist, it is only a sample; the URL doesn't work correctly in every browser, but according to the specification it should).
Alxandr
In what situations are urls like that valid, then? Based on the BNF grammar from the W3C, any url beginning with 'http://' should then need to be followed by a letter: http://www.w3.org/Addressing/URL/url-spec.txt. Unless I'm not reading the grammar correctly, of course.
eldarerathis
@eldarerathis: IP addresses. In your link, when defining the parts of a URL, it states that after the optional username/pw, you have "The internet domain name of the host in RFC1037 format (or, optionally and less advisably, the IP address as a set of four decimal digits)".
JGB146
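The IP-address case raised in these comments can be checked separately from the host-name pattern. A minimal sketch in JavaScript, assuming dotted-decimal IPv4 only; the name isIPv4Host is hypothetical and does not come from either answer:

```javascript
// Sketch: accept a host written as four decimal numbers (0-255) separated by dots.
function isIPv4Host(host) {
  var parts = host.split(".");
  if (parts.length !== 4) {
    return false;
  }
  return parts.every(function (part) {
    // One to three digits only, with a numeric value no larger than 255.
    return /^[0-9]{1,3}$/.test(part) && Number(part) <= 255;
  });
}

console.log(isIPv4Host("192.168.0.1"));       // true
console.log(isIPv4Host("256.1.1.1"));         // false
console.log(isIPv4Host("stackoverflow.com")); // false
```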
+1  A: 

danyim's answer is accurate, though it might not fit your needs exactly, as noted in the comments. Also, his solution was PHP-based. From scanning your tag participation, I'm guessing that you'd actually prefer a JavaScript solution (so I'll provide both!).

First, refactoring his PHP:

function isValidURL($url) {
    $regex = "((https?|ftp)\:\/\/)?"; // Scheme
    $regex .= "([a-z0-9+!*(),;?&=\$_.-]+(\:[a-z0-9+!*(),;?&=\$_.-]+)?@)?"; // User and pass
    // The IP branch is three "octet." groups plus a final octet, so no trailing dot is required
    $regex .= "((([a-z][a-z0-9.-]*)\.([a-z]{2,3}))|(([12]?[0-9]?[0-9]\.){3}[12]?[0-9]?[0-9]))"; // Host or IP
    $regex .= "(\:[0-9]{2,5})?"; // Port
    $regex .= "(\/([a-z0-9+\$_-]\.?)+)*\/?"; // Path
    $regex .= "(\?[a-z+&\$_.-][a-z0-9;:@&%=+\/\$_.-]*)?"; // GET query
    $regex .= "(#[a-z_.-][a-z0-9+\$_.-]*)?"; // Anchor
    // preg_match needs delimiters ('~' here); ^ and $ make the whole string match
    return preg_match('~^' . $regex . '$~', strtolower($url)) === 1;
}

Note that I modified the return to lowercase the URL before checking it. You could also use a case-insensitive flag on the regex instead. As noted, a number of parts of this pattern may or may not be valid for your use cases. Specifically, you may never want to accept a URL that includes a username/password, or one that uses an IP address in place of a host name. You can exclude any part of the match that is never valid for you by removing the related line. Also, here is a second option for the // Host or IP line, to make it host only:

    $regex .= "([a-z][a-z0-9-.]*)\.([a-z]{2,3})"; // Host only 

And now the same thing in JavaScript (combined into a single pattern because JS handles regex literals differently than strings; adjustments will be easier to make in the PHP version first and then mirror here):

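function isValidURL(url) {
   // Anchored with ^ and $ so the whole string must match; the IP branch no longer requires a trailing dot.
   var regex = /^((https?|ftp)\:\/\/)?([a-z0-9+!*(),;?&=\$_.-]+(\:[a-z0-9+!*(),;?&=\$_.-]+)?@)?((([a-z][a-z0-9.-]*)\.([a-z]{2,3}))|(([12]?[0-9]?[0-9]\.){3}[12]?[0-9]?[0-9]))(\:[0-9]{2,5})?(\/([a-z0-9+\$_-]\.?)+)*\/?(\?[a-z+&\$_.-][a-z0-9;:@&%=+\/\$_.-]*)?(#[a-z_.-][a-z0-9+\$_.-]*)?$/i;
   // test() returns a plain boolean instead of a match array or null
   return regex.test(url);
}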
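If you would rather keep the JavaScript version adjustable in the same way as the PHP one, the pattern can be assembled from strings with the RegExp constructor. This is only a sketch under some assumptions: buildUrlRegex and its allowUserInfo and allowIp options are hypothetical names, the pattern is anchored with ^ and $, and the IP part is written as three dotted octets followed by a fourth.

```javascript
// Sketch: build the URL pattern from named parts so each one can be toggled.
// Backslashes are doubled because the parts are plain strings, not regex literals.
function buildUrlRegex(options) {
  var opts = options || {};
  var userInfoChars = "[a-z0-9+!*(),;?&=$_.-]+";
  var hostName = "([a-z][a-z0-9.-]*)\\.([a-z]{2,3})";
  var ipAddress = "([12]?[0-9]?[0-9]\\.){3}[12]?[0-9]?[0-9]";

  var parts = ["^((https?|ftp):\\/\\/)?"];                  // Scheme
  if (opts.allowUserInfo) {
    // User and pass
    parts.push("(" + userInfoChars + "(:" + userInfoChars + ")?@)?");
  }
  if (opts.allowIp) {
    parts.push("((" + hostName + ")|(" + ipAddress + "))"); // Host or IP
  } else {
    parts.push("(" + hostName + ")");                       // Host only
  }
  parts.push("(:[0-9]{2,5})?");                             // Port
  parts.push("(\\/([a-z0-9+$_-]\\.?)+)*\\/?");              // Path
  parts.push("(\\?[a-z+&$_.-][a-z0-9;:@&%=+\\/$_.-]*)?");   // Query
  parts.push("(#[a-z_.-][a-z0-9+$_.-]*)?$");                // Fragment
  return new RegExp(parts.join(""), "i");
}

var hostOnly = buildUrlRegex();
console.log(hostOnly.test("http://stackoverflow.com/questions/tags/ruby"));        // true
console.log(hostOnly.test("http://stackoverflow.com/questions/@#dsd/javascript")); // false
console.log(hostOnly.test("http://127.0.0.1/"));                                   // false
console.log(buildUrlRegex({ allowIp: true }).test("http://127.0.0.1/"));           // true
```

Leaving an option off drops that part from the pattern entirely, which is the same effect as deleting the related line in the PHP version.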
JGB146