In PHP, I need to be able to figure out if a string contains a URL. If there is a URL, I need to isolate it as a separate string.

For example: "SESAC showin the Love! http://twitpic.com/1uk7fi"

I need to be able to isolate the URL in that string into a new string. At the same time the URL needs to be kept intact in the original string. Follow?

I know this is probably really simple but it's killing me.

+2  A: 

Something like

preg_match('/[a-zA-Z]+:\/\/[0-9a-zA-Z;.\/?:@=_#&%~,+$]+/', $string, $matches);

$matches[0] will hold the result.

(Note: this regex is certainly not RFC compliant; it may match URLs that are malformed per the spec. See http://www.faqs.org/rfcs/rfc1738.html.)
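For a slightly fuller picture, here is a minimal sketch that wraps the same pattern in a function and checks the return value of preg_match before reading $matches. The function name extractFirstUrl and the variable names are made up for illustration; the original string is only read, never modified, so the URL stays intact in it.

```php
<?php
// Hypothetical helper (name is illustrative): returns the first URL-like
// substring found in $text, or null when nothing matches.
function extractFirstUrl(string $text): ?string
{
    // Same permissive, non-RFC-compliant pattern as in the answer.
    $pattern = '/[a-zA-Z]+:\/\/[0-9a-zA-Z;.\/?:@=_#&%~,+$]+/';

    // preg_match returns 1 on a match, 0 on no match, false on error.
    if (preg_match($pattern, $text, $matches) === 1) {
        return $matches[0];
    }
    return null;
}

$string = "SESAC showin the Love! http://twitpic.com/1uk7fi";
$url = extractFirstUrl($string);

if ($url !== null) {
    echo $url, "\n";    // the isolated URL
    echo $string, "\n"; // the original string, unchanged
}
```

Returning null rather than an empty string makes the "no URL found" case easy to tell apart from a match.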

Artefacto
+1 Nice. Would you care to add a version that listens to `http` only for the sake of future generations?
Pekka
The scheme is `[a-z]`. There can neither be uppercase letters nor digits in the scheme part of a URL. And frankly, there aren't too many valid schemes to begin with.
Tomalak
@Tomalak Yes, but sometimes people write HTTP://, it was supposed to capture that. I'll remove the digits.
Artefacto
b-e-a-utiful. Thanks
Dylan Taylor
@Pekka Without username, password, or fragments and in the same permissive spirit of the answer, something like `http(?:s)?://[a-zA-Z\-.]{3,}(?::\d+)?(?:/[a-zA-Z0-9$\-_.+!*'(),%/;:@
Artefacto
I cannot edit the comment anymore, but the hostname part is missing numbers.
Artefacto
A: 
$test = "SESAC showin the Love! http://twitpic.com/1uk7fi";
$myURL = strstr($test, "http");
echo $myURL; // prints http://twitpic.com/1uk7fi
kzh
No good: Will not parse URLs sans http, and will catch `http` even if used outside a URL context.
Pekka
A: 

URLs can't contain spaces, so...

\b(?:https?|ftp)://\S+

Should match any URL-like thing in a string.

The above is the pure regex. PHP preg_* and string escaping rules apply before you can use it.
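As a rough sketch of what that could look like once wrapped for PHP: the pattern goes in a single-quoted string so the backslashes reach PCRE untouched, and ~ is used as the delimiter so the slashes in :// need no escaping. The delimiter choice, the function name extractAllUrls, and the second URL in the sample text are assumptions made for illustration.

```php
<?php
// Hypothetical helper (name is illustrative): returns every URL-like
// substring in $text as a list of strings, in order of appearance.
function extractAllUrls(string $text): array
{
    // "~" as delimiter avoids escaping "/"; single quotes keep \b and \S literal.
    $pattern = '~\b(?:https?|ftp)://\S+~';

    // With the default flags, $matches[0] holds all full-pattern matches.
    preg_match_all($pattern, $text, $matches);
    return $matches[0];
}

$text = "SESAC showin the Love! http://twitpic.com/1uk7fi and ftp://example.org/file.txt";
print_r(extractAllUrls($text));
```

Because \S+ consumes everything up to the next whitespace, punctuation that directly follows a URL (a trailing period or closing parenthesis, say) ends up in the match and may need trimming afterwards.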

Tomalak