I need to parse a string, such as /a/b/c/d=uno/c=duo.html

into three groups, such as

  • /a/b/c/
  • d=uno/
  • c=duo.html

The parsing rules are:

  1. The first group is everything until "d=" or the whole string if "d=" is not a part of the string.
  2. The second group is everything until "c=" or the rest of the string following the first group if "c=" is not a part of the string.
  3. The third group matches the rest of the string following the second group.

My problem with the following regex (?.+/)?(?d=([^/]+)/)?(?c=(?.*)) is that I don't know how to make the first group stop when it encounters "d=".

Any help will be appreciated.

Thanks.

+2  A: 

Is the string you need to parse in the form you supplied, or is it an actual URL with parameters? If it's a URL, you can use System.Web.HttpUtility.ParseQueryString to extract a NameValueCollection containing each parameter and its value.

I've found this useful even in Windows Forms (eg parsing query parameters in ClickOnce deployed applications).

Matt Hamilton
A: 

The string is in the format I supplied, not a URL. It's a custom repository location format.

A: 
(.*?)(?>d=|$)(.*?)(?>c=|$)(.*?)$

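group1: Everything up to d= or the end of the string (the d= itself is consumed but not captured)

group2: Everything up to c= or the end of the string (the c= itself is consumed but not captured)

group3: Everything up to the end of the string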
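
To see what the three groups actually capture, here is a minimal sketch in Python rather than .NET. It makes two assumptions: the atomic groups `(?>...)` are written as plain non-capturing groups `(?:...)`, because Python's `re` only supports atomic groups from 3.11 on, and for this pattern the two forms should match the same strings. The names `PATTERN` and `split_location` are hypothetical, chosen only for the illustration.

```python
import re

# Python port of the pattern above. (?:...) stands in for the atomic (?>...),
# which older Python versions do not support.
PATTERN = re.compile(r"(.*?)(?:d=|$)(.*?)(?:c=|$)(.*?)$")


def split_location(text):
    """Return the three captured groups as a tuple, or None if no match."""
    m = PATTERN.match(text)
    return m.groups() if m else None


print(split_location("/a/b/c/d=uno/c=duo.html"))
# ('/a/b/c/', 'uno/', 'duo.html')

print(split_location("/a/b/c/"))
# ('/a/b/c/', '', '')
```

Note that the literal "d=" and "c=" markers sit outside the capturing groups, so the second and third groups come back as "uno/" and "duo.html" rather than "d=uno/" and "c=duo.html".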

Hasan Khan
A: 

The regex suggested by hasankhan works great.

How can I extend this regex to require group1 and group2 to end with a slash?

For example, matching should succeed on the following valid strings:

  • /a/b/c/
  • /a/b/c/d=uno/

but should fail when matching either of these:

  • /a/b/c
  • /a/b/c/d=uno

Thank you for your help.

+1  A: 
(.*?/)(?>d=|$)((?:.*?/)?)(?>c=|$)(.*?)$
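
A slash-requiring pattern can be checked against the four strings from the question with a short Python sketch. This is a port under stated assumptions, not .NET code: `(?:...)` replaces the atomic `(?>...)` (Python's `re` only gained atomic groups in 3.11), and the names `STRICT` and `parse_strict` are hypothetical. The second group is written as optional, `((?:.*?/)?)`, because a mandatory `(.*?/)` in that position has nothing left to match once the string has already ended, which would reject `/a/b/c/`.

```python
import re

# Group 1 must end with a slash; group 2 is either empty or ends with a slash.
STRICT = re.compile(r"(.*?/)(?:d=|$)((?:.*?/)?)(?:c=|$)(.*?)$")


def parse_strict(text):
    """Return the three captured groups as a tuple, or None if no match."""
    m = STRICT.match(text)
    return m.groups() if m else None


# The first two strings are the valid examples, the last two the invalid ones.
for s in ["/a/b/c/", "/a/b/c/d=uno/", "/a/b/c", "/a/b/c/d=uno"]:
    print(s, "->", parse_strict(s))
# /a/b/c/ -> ('/a/b/c/', '', '')
# /a/b/c/d=uno/ -> ('/a/b/c/', 'uno/', '')
# /a/b/c -> None
# /a/b/c/d=uno -> None
```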
Hasan Khan