In a system that I'm developing, I need to recognize YouTube links in the following format:

[youtube]youtube url[/youtube]

So far I have arrived at this regular expression:

#\[youtube\]http://www.youtube\.(.*)/watch\?v=([a-zA-Z0-9_-]*)\[\/youtube\]#s

But this pattern can't recognize URLs like

[youtube]http://www.youtube.com/watch?v=s3wXkv1VW54&feature=fvst[/youtube]

Note the feature=fvst.

Can someone help me recognize all possible YouTube URLs?

+2  A: 

How about

|\[youtube\]http://www.youtube\.(.*?)/watch\?v=([a-zA-Z0-9_-]*)[%&=#a-zA-Z0-9_-]*\[\/youtube\]|s

EDIT: Changed the regex to keep only the video id inside parentheses.
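
As a quick check of the pattern above, here is a minimal Python sketch. The thread does not say which language is in use (the delimiters suggest a PCRE-style engine), so Python's `re` module is an assumption, and the helper name `extract_video_ids` and the second video ID in the sample are made up for illustration. The dot after "www" is escaped here, which the pattern above leaves unescaped.

```python
import re

# Python translation of the pattern above; the trailing `s` modifier
# corresponds to re.DOTALL. Group 1 is the domain suffix, group 2 the video ID.
YOUTUBE_TAG = re.compile(
    r"\[youtube\]http://www\.youtube\.(.*?)/watch\?v=([a-zA-Z0-9_-]*)"
    r"[%&=#a-zA-Z0-9_-]*\[/youtube\]",
    re.DOTALL,
)


def extract_video_ids(text):
    """Return the video ID of every [youtube]...[/youtube] tag found in text."""
    return [match.group(2) for match in YOUTUBE_TAG.finditer(text)]


# Two tags on one line; the second ID is a hypothetical example value.
sample = (
    "[youtube]http://www.youtube.com/watch?v=s3wXkv1VW54&feature=fvst[/youtube] "
    "[youtube]http://www.youtube.com/watch?v=abc_DEF-123[/youtube]"
)
print(extract_video_ids(sample))
```

Because `(.*?)` is reluctant, the two tags on the same line are matched separately rather than swallowed into one match.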

Tim Pietzcker
If two "youtube-links" are on the same line, the greedy `.*` will mess things up. I'd put a reluctant `?` after it, as Andy suggested (**edit:** who has now deleted his answer...).
Bart Kiers
Oh yeah, right. I saw it in time, though :)
Tim Pietzcker
No Tim, you failed: I saw you had a `.*` without a `?`! Guilty as charged! :) +1
Bart Kiers
Your pattern works, but I'd like to match the video ID directly; with your pattern I get the entire query string.
Luca Bernardi
@Bart: Misunderstanding there :) I meant that I saw Andy's answer in time, before he removed it. I only changed my regex after your comment.
Tim Pietzcker
Ah, I see... :)
Bart Kiers
I deleted my answer because Tim pointed out that it wasn't quite the problem, and I didn't want to be hit by the downvote brigade. It's bad enough getting downvoted when you give the correct answer ;-)
Andy E
@Andy: True, it was not the answer to the question, but you had a point, hence my comment.
Bart Kiers
+1  A: 

Notes:
I'd perform a case-insensitive match on URLs (/i).
. matches any character; use \. to match a literal dot in the URL.
Also, "www." in YouTube URLs is optional.
Use (\.[a-z]+){1,2} instead of ".*" to match ".com", ".co.uk", etc. after "youtube".
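
The notes above can be tried out on the host part alone. This is only a sketch in Python (an assumed language, since the thread does not name one); `is_youtube_host` is a hypothetical helper, and the `/i` modifier is expressed as `re.IGNORECASE`.

```python
import re

# Host part only: optional "www.", escaped dots, and one or two
# suffix labels such as ".com" or ".co.uk".
HOST = re.compile(r"(www\.)?youtube(\.[a-z]+){1,2}", re.IGNORECASE)


def is_youtube_host(host):
    """Return True if the whole string looks like a YouTube host name."""
    return HOST.fullmatch(host) is not None


for host in ["www.youtube.com", "youtube.co.uk", "WWW.YouTube.com",
             "wwwXyoutube.com", "youtube"]:
    print(host, is_youtube_host(host))
```

The last two hosts are rejected: the escaped dot no longer accepts an arbitrary character after "www", and at least one suffix label is required.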

Assuming the "&feature" part is optional, the regex would be:

/\[youtube\]http:\/\/(www\.)?youtube(\.[a-z]+){1,2}\/watch\?v=([a-z0-9_-]*)(&feature=([a-z]+))?\[\/youtube\]/is

Assuming the "&feature" part is optional and parameters other than "feature" may also appear:

/\[youtube\]http:\/\/(www\.)?youtube(\.[a-z]+){1,2}\/watch\?v=([a-z0-9_-]*)(&([a-z]+)=([a-z0-9_-]+))*?\[\/youtube\]/is
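
For illustration, the more general pattern could be exercised like this in Python. This is a sketch under assumptions: the language is a guess, `video_id` is a hypothetical helper, and the parameter value is written as `([a-z0-9_-]+)` so that the group holds the whole value. The `i` and `s` modifiers become `re.IGNORECASE` and `re.DOTALL`.

```python
import re

# Group 3 is the video ID; groups 1 and 2 are the optional "www."
# and the last domain suffix label.
TAG = re.compile(
    r"\[youtube\]http://(www\.)?youtube(\.[a-z]+){1,2}/watch\?v=([a-z0-9_-]*)"
    r"(&([a-z]+)=([a-z0-9_-]+))*?\[/youtube\]",
    re.IGNORECASE | re.DOTALL,
)


def video_id(tag):
    """Return the video ID if tag is a complete [youtube] tag, else None."""
    match = TAG.fullmatch(tag)
    return match.group(3) if match else None


print(video_id("[youtube]http://www.youtube.com/watch?v=s3wXkv1VW54&feature=fvst[/youtube]"))
print(video_id("[youtube]http://youtube.co.uk/watch?v=s3wXkv1VW54&feature=fvst&t=42[/youtube]"))
print(video_id("[youtube]http://example.com/watch?v=s3wXkv1VW54[/youtube]"))
```

Note that this still only accepts parameter values made of letters, digits, "_" and "-"; URLs with other characters in the query string would need a wider character class.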
We Know