I need to grab the video ID from a Google Video URL. There are two different types of URLs that I need to be able to match:

http://video.google.com/videoplay?docid=-3498228245415745977#

where I need to be able to match -3498228245415745977 (note the dash; -), and

video.google.com/videoplay?docid=-3498228245415745977#docid=2728972720932273543

where I need to match 2728972720932273543. Is there any good regular expression that can match this?

This is what I've got so far: @"docid=(-?\d{19}+)" since the video ID seems to be 19 digits long, optionally prefixed with a dash.

I'm using C# (of which I have very little experience) if that changes anything.

P.S. I would also appreciate a review of my regular expressions for YouTube (@"[\?&]v=([^&#])";), RedTube (@"/(\d{1,6})") and Vimeo (@"/(\d*)").

I do not expect users to enter the full URL and thus do not match the ^http://\\.?sitename+\\.\\w{2,3}.

+2  A: 

The following regex uses what is called a negative lookahead to make sure that the matched ID is not immediately followed by #docid=:

docid=(-?\d{19}(?!\#docid=))

The (?!\#docid=) is the negative lookahead part of the regex. If you want to know more about it, have a look at http://www.regular-expressions.info/lookaround.html
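
As a rough sketch of how the pattern could be used from C# (the class name GoogleVideoId and the helper Extract are made up for illustration, and returning null on no match is just one possible choice):

```csharp
using System;
using System.Text.RegularExpressions;

static class GoogleVideoId
{
    // Pattern from this answer: an optional dash, 19 digits, and a
    // negative lookahead that rejects a match followed by "#docid=".
    private static readonly Regex DocId =
        new Regex(@"docid=(-?\d{19}(?!\#docid=))");

    // Hypothetical helper: returns the captured ID, or null if nothing matches.
    public static string Extract(string url)
    {
        Match m = DocId.Match(url);
        return m.Success ? m.Groups[1].Value : null;
    }

    static void Main()
    {
        // Single docid: the ID after "docid=" is captured, dash included.
        Console.WriteLine(Extract(
            "http://video.google.com/videoplay?docid=-3498228245415745977#"));

        // Two docids: the first is followed by "#docid=", so the lookahead
        // rejects it and the second one is captured instead.
        Console.WriteLine(Extract(
            "video.google.com/videoplay?docid=-3498228245415745977#docid=2728972720932273543"));
    }
}
```

The ID ends up in capture group 1, so m.Groups[1].Value is the part to read; m.Value would also include the "docid=" prefix.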

Hope this helps you

EDIT: If you haven't already got it, you should get "The Regulator 2.0" from SourceForge. It's a design and testing tool for regular expressions. I find it very helpful when I develop regular expressions.

Falle1234
I'm not familiar with negative lookaheads, but would that expression match the "2728972720932273543" (latter) part of "video.google.com/videoplay?docid=-3498228245415745977#docid=2728972720932273543"?
Baldur
Yes, it would. It wouldn't match the -3498228245415745977 part because there is a "#docid=" coming right after it, so it skips that part and then finds the latter part, which your regex also matched. So the "only" thing a negative lookahead does is make sure the match is not followed by a given expression.
Falle1234
Thank you very much, this solution is amazing!
Baldur
A: 

Use this RE:

docid=-([0-9]*)

Result

Array
(
    [0] => docid=-3498228245415745977
    [1] => 3498228245415745977
)

I have tested it in Java, PHP, awk, and Perl.

articlestack