What C#/.NET regex pattern do I need to get "bar" out of this URL?

http://www.foo.com/bar/123/abc

In ATL regular expressions, the pattern would have been

http://www\.foo\.com/{[a-z]+}/123/abc
+3  A: 

Simply: #http://www\.foo\.com/([a-z]+)/123/abc#

Use parentheses instead of braces.

You will also need a delimiter character at the start and the end of the regular expression to make it work.

Erick
Actually, this is wrong, because an unescaped "." matches any character. So yours also matches http://wwwxfooxcom/bar/123/abc
Nick Berardi
The dot is escaped with a backslash - so it works.
Daniel Brückner
# is not a valid character in ASP.NET, I think you mean ^ and $
Nick Berardi
Also it wasn't escaped when I posted the comment.
Nick Berardi
@Nick Actually it was escaped, but for some reason you have to double-escape your text on SO ^_^. As for the #s, don't you usually need a delimiter for your regex?
Erick
+2  A: 

Pretty much the same thing

    http://www\.foo\.com/([a-z]+)/123/abc
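
As a minimal sketch of how this pattern might be used from C#: .NET takes the pattern as a plain string, so no delimiter characters are needed, and the first parenthesized group is read through `Groups[1]`. The class name `FirstSegmentDemo` and the helper `ExtractSegment` are made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class FirstSegmentDemo
{
    // Hypothetical helper: returns the captured path segment, or null if the URL does not match.
    public static string ExtractSegment(string url)
    {
        // A verbatim string (@"...") keeps the backslashes that escape the dots.
        Match m = Regex.Match(url, @"http://www\.foo\.com/([a-z]+)/123/abc");
        return m.Success ? m.Groups[1].Value : null;
    }

    public static void Main()
    {
        // Prints the captured segment for the URL from the question.
        Console.WriteLine(ExtractSegment("http://www.foo.com/bar/123/abc"));

        // The escaped dots only match literal dots, so this look-alike host is rejected.
        Console.WriteLine(ExtractSegment("http://wwwxfooxcom/bar/123/abc") == null);
    }
}
```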
Nick Berardi
+2  A: 

This will almost work with just a tiny modification: change the braces to parentheses.

http://www\.foo\.com/([a-z]+)/123/abc

But I do not consider this regex very useful, because it hard-codes almost the whole string. Would it not be better to match the first path element independently of the rest of the URL?

^http://[^/]*/([^/]*).*$
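
A rough sketch of the more general pattern in C#, assuming the URL always starts with http:// and has at least one slash after the host. The names `GenericSegmentDemo` and `FirstPathSegment`, and the second test URL, are only illustrative.

```csharp
using System;
using System.Text.RegularExpressions;

public static class GenericSegmentDemo
{
    // Skips the host, then captures everything up to the next slash.
    private static readonly Regex FirstSegment =
        new Regex(@"^http://[^/]*/([^/]*).*$");

    // Hypothetical helper: returns the first path element, or null if there is no match.
    public static string FirstPathSegment(string url)
    {
        Match m = FirstSegment.Match(url);
        return m.Success ? m.Groups[1].Value : null;
    }

    public static void Main()
    {
        Console.WriteLine(FirstPathSegment("http://www.foo.com/bar/123/abc"));
        // The host is no longer hard-coded, so other sites work as well.
        Console.WriteLine(FirstPathSegment("http://example.org/docs/index.html"));
    }
}
```

Note that this pattern, as written, does not match https:// URLs or a URL with no slash after the host.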
Daniel Brückner
Maybe ^http://(?<domain>[^/]*)/(?<barThing>[^/]*).*$
Jonathan C Dickinson
Ignore the semicolon, SO added it for some reason.
Jonathan C Dickinson
+1  A: 

Here is a solution that breaks the URL up into its component parts: protocol, site and part. The protocol group is optional, so you could also give the expression 'www.foo.com/bar/123/abc'. The part group can hold multiple captures, one for each folder or file under the site.

^(?<protocol>.+://)?(?<site>[^/]+)/(?:(?<part>[^/]+)/?)*$

You would use the expression as follows to get 'bar'

string s = Regex.Match(@"http://www.foo.com/bar/123/abc", @"^(?<protocol>.+://)?(?<site>[^/]+)/(?:(?<part>[^/]+)/?)*$").Groups["part"].Captures[0].Value;

The breakdown of the expression results are as follows

protocol: http://
site: www.foo.com
part[0]: bar
part[1]: 123
part[2]: abc
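
The breakdown above could be reproduced with something like the following sketch, which walks the `Captures` collection of the repeated `part` group. The class `UrlPartsDemo` and the method `GetParts` are hypothetical names used only for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class UrlPartsDemo
{
    private const string Pattern =
        @"^(?<protocol>.+://)?(?<site>[^/]+)/(?:(?<part>[^/]+)/?)*$";

    // Hypothetical helper: returns every value captured by the repeated "part" group.
    public static List<string> GetParts(string url)
    {
        var parts = new List<string>();
        Match m = Regex.Match(url, Pattern);
        if (!m.Success)
        {
            return parts; // empty list when the URL does not match
        }

        // A group inside a repeated construct keeps one Capture per iteration.
        foreach (Capture c in m.Groups["part"].Captures)
        {
            parts.Add(c.Value);
        }
        return parts;
    }

    public static void Main()
    {
        Match m = Regex.Match("http://www.foo.com/bar/123/abc", Pattern);
        Console.WriteLine("protocol: " + m.Groups["protocol"].Value);
        Console.WriteLine("site: " + m.Groups["site"].Value);

        List<string> parts = GetParts("http://www.foo.com/bar/123/abc");
        for (int i = 0; i < parts.Count; i++)
        {
            Console.WriteLine("part[" + i + "]: " + parts[i]);
        }
    }
}
```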

Stevo3000