Embedly has a great regex generator that you can use to verify the correctness of service URLs (http://api.embed.ly/tools/generator). It generates JavaScript regexes, but unfortunately it does not generate C# regular expressions. As far as I know, though, C# uses the same ECMA regex definition, so I should be able to use them in C#.

So what I would like to do is take the generated regex from the Embedly site and paste it straight into my C# code.

The JavaScript regex looks like this:

/http:\/\/(.*youtube\.com\/watch.*|.*\.youtube\.com\/v\/.*|youtu\.be\/.*|.*\.youtube\.com\/user\/.*#.*|.*\.youtube\.com\/.*#.*\/.*|picasaweb\.google\.com.*\/.*\/.*#.*|picasaweb\.google\.com.*\/lh\/photo\/.*|picasaweb\.google\.com.*\/.*\/.*)/i

and should match URLs like:

http://picasaweb.google.com/westerek/LadakhDolinaMarkha?feat=featured#5497194022344000402
http://www.youtube.com/watch?v=GVDc1uXda6Y&feature=related

What I have so far is the following:

Regex regex = new Regex(
      "[/http:\\/\\/(.*youtube\\.com\\/watch.*|.*\\.youtube\\.com\\/"+
      "v\\/.*|youtu\\.be\\/.*|.*\\.youtube\\.com\\/user\\/.*#.*|.*\\."+
      "youtube\\.com\\/.*#.*\\/.*|picasaweb\\.google\\.com.*\\/.*\\/"+
      ".*#.*|picasaweb\\.google\\.com.*\\/lh\\/photo\\/.*|picasaweb"+
      "\\.google\\.com.*\\/.*\\/.*)/i]",
    RegexOptions.IgnoreCase
    | RegexOptions.CultureInvariant
    | RegexOptions.IgnorePatternWhitespace
    | RegexOptions.Compiled
    );

...but this only gives me partial matches.

EDIT: Solution: Just paste the Embedly JavaScript regex into the strEmbdlyRegex string in the following snippet.

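// Paste the JavaScript regex literal generated by Embedly here.
string strEmbdlyRegex = @"/http:\/\/(.*youtube\.com\/watch.*|.*\.youtube\.com\/v\/.*|youtu\.be\/.*|.*\.youtube\.com\/user\/.*#.*|.*\.youtube\.com\/.*#.*\/.*)/i";

// Strip the leading '/' delimiter.
string strRegx = strEmbdlyRegex.Remove(0, 1);
// Remove the '(' that opens the alternation group. Without the group,
// the http:\/\/ prefix applies only to the first alternative.
strRegx = strRegx.Remove(strRegx.IndexOf("("), 1);
// Remove the closing ")/i"; RegexOptions.IgnoreCase replaces the i flag.
strRegx = strRegx.Remove(strRegx.LastIndexOf(")/i"), 3);

Regex regex = new Regex(
    strRegx,
    RegexOptions.IgnoreCase
    | RegexOptions.CultureInvariant
    | RegexOptions.ECMAScript
    | RegexOptions.Compiled
    );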
+2  A: 

Being a bit more specific with your problem would help, but I appear to have it working (at least with your two test strings). You just need to clean up a few extraneous characters:

  • Convert it to a literal string using the @"" syntax (no escaping backslashes)
  • Remove the [/ from the beginning of the string
  • Remove the /i] from the end of the string
  • Remove the ( and ) near the beginning and end of the string
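
Applied to the pattern from the question, those steps might look like the sketch below. The class and method names (EmbedlyUrlCheck, IsSupported) are made up for illustration, and I have only checked it against the two URLs from the question.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical wrapper around the cleaned-up pattern.
public static class EmbedlyUrlCheck
{
    // Verbatim strings (@"") so the backslashes need no doubling. The leading "[/",
    // the trailing "/i]" and the outer parentheses have been removed.
    private static readonly Regex Pattern = new Regex(
        @"http:\/\/.*youtube\.com\/watch.*|.*\.youtube\.com\/v\/.*|youtu\.be\/.*" +
        @"|.*\.youtube\.com\/user\/.*#.*|.*\.youtube\.com\/.*#.*\/.*" +
        @"|picasaweb\.google\.com.*\/.*\/.*#.*|picasaweb\.google\.com.*\/lh\/photo\/.*" +
        @"|picasaweb\.google\.com.*\/.*\/.*",
        RegexOptions.IgnoreCase); // stands in for the JavaScript /i flag

    // Returns true when the URL matches any of the alternatives.
    public static bool IsSupported(string url)
    {
        return Pattern.IsMatch(url);
    }
}
```

Keeping the parentheses would also work, and would make the http:\/\/ prefix apply to every alternative rather than only the first.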

Also, you probably don't need the IgnorePatternWhitespace option, and for a simple URL you probably don't need the CultureInvariant option either.
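
There is a concrete reason to drop IgnorePatternWhitespace here: with that option, an unescaped # starts a comment that runs to the end of the line, and the generated pattern contains several # characters. Here is a small sketch of the effect, using a shortened stand-in for one alternative of the pattern (the class name is made up):

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical demo class; only the Regex calls are real .NET API.
public static class PatternWhitespaceDemo
{
    // Shortened stand-in for one alternative of the generated pattern.
    private const string Pattern = @"youtube\.com\/user\/.*#.*";

    // Runs the same pattern under the given options.
    public static bool Matches(string input, RegexOptions options)
    {
        return Regex.IsMatch(input, Pattern, options);
    }
}
```

With RegexOptions.None the # is an ordinary character, so an input without a # does not match. With RegexOptions.IgnorePatternWhitespace everything from the # onwards is treated as a comment, so the effective pattern ends just before the # and the same input does match.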

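Lastly, there is a RegexOptions.ECMAScript option that enables ECMAScript-compliant matching behavior, so the pattern is interpreted much as JavaScript would interpret it. It does not parse the /regex/i literal syntax, though: the / delimiters and the i flag still have to be removed, with RegexOptions.IgnoreCase standing in for the flag.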
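
As far as I can tell, the option itself does not strip the / delimiters or read the trailing flags, so a small helper can do that part. The following is only a sketch: JsRegexLiteral and ToRegex are hypothetical names, not part of .NET, and it assumes the input has the form /pattern/flags with only the i, m and g flags.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the .NET framework.
public static class JsRegexLiteral
{
    // Converts a JavaScript regex literal such as /abc/i into a .NET Regex.
    public static Regex ToRegex(string literal)
    {
        int lastSlash = literal.LastIndexOf('/');
        if (literal.Length < 2 || literal[0] != '/' || lastSlash == 0)
            throw new FormatException("Expected a literal of the form /pattern/flags.");

        // Everything between the first and the last slash is the pattern.
        string pattern = literal.Substring(1, lastSlash - 1);
        string flags = literal.Substring(lastSlash + 1);

        RegexOptions options = RegexOptions.ECMAScript;
        foreach (char flag in flags)
        {
            switch (flag)
            {
                case 'i': options |= RegexOptions.IgnoreCase; break;
                case 'm': options |= RegexOptions.Multiline; break;
                case 'g': break; // no RegexOptions equivalent; use Regex.Matches instead
                default: throw new NotSupportedException("Unsupported flag: " + flag);
            }
        }
        return new Regex(pattern, options);
    }
}
```

The outer parentheses can stay in the pattern this way, so the generated literal can be pasted in unchanged, for example JsRegexLiteral.ToRegex(@"/http:\/\/(youtu\.be\/.*)/i").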

Goyuix
Thanks! You understood me perfectly well, but I'll still try to be more specific next time. The @ is what got me!
AyKarsi