I'm trying to create two regular expressions to extract the video ID from YouTube and Vimeo URLs. So far I have the following expressions:

YouTube: (youtube\.com/)(.*)v=([a-zA-Z0-9-_]+)
Vimeo: vimeo\.com/([0-9]+)$

As shown below, each of the expressions I have so far matches only one type of URL. Several other types of Vimeo and YouTube URLs aren't covered. Ideally everything would be covered by two expressions: one for all Vimeo videos and one for all YouTube videos. I've been experimenting with different expressions, but with no success so far. I'm still learning regular expressions, so I hope I'm on the right track and that somebody can help me out. If more information is required, please let me know!

VIMEO URLs NOT MATCHED:

http://vimeo.com/channels/hd#11384488
http://vimeo.com/groups/brooklynbands/videos/7906210
http://vimeo.com/staffpicks#13561592

YOUTUBE URLs NOT MATCHED:

http://www.youtube.com/user/username#p/a/u/1/bpJQZm_hkTE
http://www.youtube.com/v/bpJQZm_hkTE
http://youtu.be/bpJQZm_hkTE

URLs MATCHED:

http://www.youtube.com/watch?v=bWTyFIYPtYU&feature=popular
http://vimeo.com/834881

The idea is to match all the URLs listed above with two regular expressions: one for Vimeo and one for YouTube.
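For anyone who wants to reproduce the lists above, here is a minimal JavaScript sketch (the variable names are made up for illustration). It runs every URL from the question through the two original expressions and keeps the ones that match. The YouTube expression only accepts URLs containing `youtube.com/` followed by `v=`, and the Vimeo expression only accepts digits directly after `vimeo.com/` at the end of the string, which is why the other six fall through.

```javascript
// The two expressions from the top of the question.
const originalYoutube = /(youtube\.com\/)(.*)v=([a-zA-Z0-9-_]+)/;
const originalVimeo = /vimeo\.com\/([0-9]+)$/;

// Every URL listed in the question, in the order they appear.
const listedUrls = [
  "http://vimeo.com/channels/hd#11384488",
  "http://vimeo.com/groups/brooklynbands/videos/7906210",
  "http://vimeo.com/staffpicks#13561592",
  "http://www.youtube.com/user/username#p/a/u/1/bpJQZm_hkTE",
  "http://www.youtube.com/v/bpJQZm_hkTE",
  "http://youtu.be/bpJQZm_hkTE",
  "http://www.youtube.com/watch?v=bWTyFIYPtYU&feature=popular",
  "http://vimeo.com/834881",
];

// Keep only the URLs that at least one of the two expressions matches.
const matchedUrls = listedUrls.filter(
  (url) => originalYoutube.test(url) || originalVimeo.test(url)
);

console.log(matchedUrls);
```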

UPDATE AFTER ANSWER FROM Septih:

This is how my expressions look now:

public static readonly Regex VimeoVideoRegex = new Regex(@"vimeo\.com/(?:.*#|.*/videos/)?([0-9]+)", RegexOptions.IgnoreCase | RegexOptions.Multiline);
public static readonly Regex YoutubeVideoRegex = new Regex(@"youtu(?:\.be|be\.com)/(?:(.*)v(/|=)|(.*/)?)([a-zA-Z0-9-_]+)", RegexOptions.IgnoreCase);

And in code I have:

var youtubeMatch = url.match(YoutubeVideoRegex);
var vimeoMatch = url.match(VimeoVideoRegex);

var youtubeIndex = youtubeMatch.length - 1;
var youtubeId = youtubeMatch[youtubeIndex];

As you can see, I now need to find the index where the video ID sits in the array of matches returned by the regex. But I want it to return only the ID itself, so that I don't need to modify the code if YouTube or Vimeo ever decide to change their URLs. Any tips on this?

+3  A: 

I had a play around with the examples and came up with these:

Youtube: youtu(?:\.be|be\.com)/(?:.*v(?:/|=)|(?:.*/)?)([a-zA-Z0-9-_]+)
Vimeo: vimeo\.com/(?:.*#|.*/videos/)?([0-9]+)

They should match all of the URLs given. The (?: ...) syntax marks a non-capturing group: whatever it matches is still part of the overall match, but it doesn't get a group number of its own. That leaves the ID as the only captured group.
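To see what the (?: ...) groups change in practice, here is a small sketch in JavaScript (chosen because the calling code in the question uses `url.match`; the constant names are made up for illustration). It runs one URL from the question through the pattern from the question's update, which uses plain capturing groups, and through the YouTube pattern above.

```javascript
// Pattern from the question's update: plain (...) groups, so each one gets a number.
const withCapturingGroups = /youtu(?:\.be|be\.com)\/(?:(.*)v(\/|=)|(.*\/)?)([a-zA-Z0-9-_]+)/i;
// Pattern from this answer: the helper groups are (?: ...), so only the ID gets a number.
const withNonCapturingGroups = /youtu(?:\.be|be\.com)\/(?:.*v(?:\/|=)|(?:.*\/)?)([a-zA-Z0-9-_]+)/i;

const sampleUrl = "http://www.youtube.com/watch?v=bWTyFIYPtYU&feature=popular";

const capturingResult = sampleUrl.match(withCapturingGroups);
const nonCapturingResult = sampleUrl.match(withNonCapturingGroups);

// Index 0 is always the whole match; the numbered groups follow it.
console.log(capturingResult.length);    // 5: whole match plus four groups, ID is the last one
console.log(nonCapturingResult.length); // 2: whole match plus the ID
console.log(nonCapturingResult[1]);     // bWTyFIYPtYU
```

With the non-capturing version the ID is always at index 1, so the calling code no longer has to work out where it ended up.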

I'm a bit of a regex novice myself, so don't be surprised if someone else comes in here screaming not to listen to me, but hopefully these will be of help.

I find this website extremely useful in working out the patterns: http://www.regexpal.com/

Edit:

Get the ID like so:

// Needs: using System.Text.RegularExpressions;
string url = ""; // URL goes here

Match youtubeMatch = YoutubeVideoRegex.Match(url);
Match vimeoMatch = VimeoVideoRegex.Match(url);

string id = string.Empty;

// Groups[0] is the whole match; Groups[1] is the only capturing group, i.e. the ID.
if (youtubeMatch.Success)
    id = youtubeMatch.Groups[1].Value;

if (vimeoMatch.Success)
    id = vimeoMatch.Groups[1].Value;

That works in plain old C#/.NET; I can't vouch for ASP.NET.
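Since the calling code in the question is JavaScript-style, the same lookup can be sketched there too. This is only an illustration under a few assumptions: `extractVideoId` and the `{ site, id }` result shape are hypothetical names, and the patterns are the two from this answer written as regex literals. It runs every URL from the question through the helper.

```javascript
// Patterns copied from the answer above, written as JavaScript regex literals.
const youtubePattern = /youtu(?:\.be|be\.com)\/(?:.*v(?:\/|=)|(?:.*\/)?)([a-zA-Z0-9-_]+)/i;
const vimeoPattern = /vimeo\.com\/(?:.*#|.*\/videos\/)?([0-9]+)/i;

// Hypothetical helper: returns { site, id }, or null when neither pattern matches.
function extractVideoId(url) {
  const youtubeMatch = url.match(youtubePattern);
  if (youtubeMatch) return { site: "youtube", id: youtubeMatch[1] };

  const vimeoMatch = url.match(vimeoPattern);
  if (vimeoMatch) return { site: "vimeo", id: vimeoMatch[1] };

  return null;
}

// Every URL from the question.
const questionUrls = [
  "http://vimeo.com/channels/hd#11384488",
  "http://vimeo.com/groups/brooklynbands/videos/7906210",
  "http://vimeo.com/staffpicks#13561592",
  "http://vimeo.com/834881",
  "http://www.youtube.com/user/username#p/a/u/1/bpJQZm_hkTE",
  "http://www.youtube.com/v/bpJQZm_hkTE",
  "http://youtu.be/bpJQZm_hkTE",
  "http://www.youtube.com/watch?v=bWTyFIYPtYU&feature=popular",
];

for (const url of questionUrls) {
  const result = extractVideoId(url);
  console.log(url, "->", result ? `${result.site} ${result.id}` : "no match");
}
```

Note that `match` returns null when nothing matches, so the helper checks the result before indexing into it; the ID is read from index 1 in both cases.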

Septih
Thanks! I've tried the expressions and they work perfectly, but while testing I found it would be more convenient to get only the part of the URL containing the video ID. Is this possible?
Rob
See the changes above. The YouTube pattern was slightly wrong, but now both of them should have the ID captured in match.Groups[1].Value.
Septih
@Septih Thanks for the good response! I'll try this out first thing tomorrow morning, and I'll let you know!
Rob
@Septih Thanks, it's all working like a charm now! Definitely learned something about regexes.
Rob