Hey there,

I have been working on this regex:
{link=([^|{}]+)\||([^|{}]+)\||([^|{}]+)}

I want to capture each run of characters that are not pipes or braces and place it in the appropriate backreference (group).

How can I return the following:

  1. If the test string is {link=a}, return a in group 3.
  2. If the test string is {link=a|b}, return a in group 2 and b in group 3.
  3. If the test string is {link=a|b|c}, return a in group 1, b in group 2 and c in group 3.

With the regex above, the strings are not captured into the correct groups for all of these permutations. I think I have bracket grouping issues and/or issues with the OR (|) operator.

Thanks in advance.

+1  A: 

In Python, but the regex syntax should be the same:

#!/usr/bin/python

import re
ptn = re.compile(r"""
    {link=
    (?:
      (?:([^|}]+)\|)?
      (?:([^|}]+)\|)
    )?
    ([^|}]+)
    }
    """, re.VERBOSE)

l = [
    "{link=a}",
    "{link=a|b}",
    "{link=a|b|c}",
    "{link=a} {link=a|b} {link=a|b|c}",
]
for s in l:
    for m in ptn.finditer(s):
        print("%s => matches: %s => m.group(3): %s" % (
                s, m.group(0), m.group(3)))

And the result:

{link=a} => matches: {link=a} => m.group(3): a
{link=a|b} => matches: {link=a|b} => m.group(3): b
{link=a|b|c} => matches: {link=a|b|c} => m.group(3): c
{link=a} {link=a|b} {link=a|b|c} => matches: {link=a} => m.group(3): a
{link=a} {link=a|b} {link=a|b|c} => matches: {link=a|b} => m.group(3): b
{link=a} {link=a|b} {link=a|b|c} => matches: {link=a|b|c} => m.group(3): c
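
The output above only shows group 3, so here is a rough sketch for checking all three groups, plus a look at what the pattern from the question does. In that pattern, each bare | after an escaped \| is alternation, so it behaves like three independent alternatives rather than one sequence. The sketch assumes Python 3, escapes the braces, and excludes { from the character class as the question does. The names ORIGINAL, FIXED, original_pieces and link_groups are made up for illustration.

```python
import re

# The pattern from the question, unchanged. The bare "|" after each escaped
# "\|" is alternation, so this is really three independent alternatives.
ORIGINAL = re.compile(r"{link=([^|{}]+)\||([^|{}]+)\||([^|{}]+)}")

# Same structure as the pattern above, with the braces escaped and "{" also
# excluded from the character class (an assumption, following the question).
FIXED = re.compile(r"""
    \{link=
    (?:
      (?:([^|{}]+)\|)?   # group 1: set only when there are three parts
      (?:([^|{}]+)\|)    # group 2: set when there are two or three parts
    )?
    ([^|{}]+)            # group 3: always the last part
    \}
    """, re.VERBOSE)


def original_pieces(text):
    """Return the separate matches the question's pattern produces."""
    return [m.group(0) for m in ORIGINAL.finditer(text)]


def link_groups(text):
    """Return a (group1, group2, group3) tuple for each link tag in text."""
    return [m.groups() for m in FIXED.finditer(text)]


print(original_pieces("{link=a|b|c}"))
for s in ["{link=a}", "{link=a|b}", "{link=a|b|c}"]:
    print(s, "=>", link_groups(s))
```

Groups that do not take part in a match come back as None rather than as empty strings, so {link=a|b} leaves group 1 unset and fills groups 2 and 3.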
Dyno Fu
Nice. But why do you use {0,1} instead of good old '?'?
PhiLho
No good reason. I might have just forgotten that ;)
Dyno Fu
I took the liberty to fix that (cosmetic), but more importantly to exclude the pipe from the first captures.
PhiLho
Thanks for your efforts, Dyno. Can you try running this again with l = ["{link=a} {link=a|b} {link=a|b|c}"]? I think you will find we run into issues: blocks start to merge into one group, etc. I am using this for a custom-syntax -> hyperlink tag generator, and it is a valid use case for the string my regex is tested against to contain many links.
GONeale
Modified upon request. Is this what you want?
Dyno Fu
Thanks, that works great. I can see each capture set within groups 1, 2 and 3 perfectly now. Cheers.
GONeale
+1  A: 

How about capturing all the matches in the same group?

using System;
using System.Text.RegularExpressions;

string[] tests = {
    "{link=a}",
    "{link=a|b}",
    "{link=a|b|c}",
};

// Exclude braces as well as the pipe so a capture cannot run past the closing brace.
var link = @"(?<link>[^|{}]+)";
var pattern = new Regex(String.Format(@"^\{{link={0}(\|{0})*\}}$", link));

foreach (var s in tests) {
    Match m = pattern.Match(s);

    if (!m.Success) {
        Console.WriteLine("{0}: FAIL", s);
        continue;
    }

    Console.Write("{0}: PASS ", s);
    foreach (var l in m.Groups["link"].Captures)
        Console.Write("[{0}]", l);
    Console.WriteLine();
}

Output:

{link=a}: PASS [a]
{link=a|b}: PASS [a][b]
{link=a|b|c}: PASS [a][b][c]
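
For anyone doing this in Python rather than .NET: Python's re module keeps only the last capture of a repeated group, so there is no direct equivalent of the Captures collection used above. One possible workaround, sketched here as a different technique and not a port of the C# code, is to capture the whole a|b|c body in a single group and split it on the pipe afterwards. The names LINK and link_parts are made up for illustration, and the pattern is left unanchored on the assumption that several tags may appear in one string.

```python
import re

# One group for the whole "a|b|c" body; the parts are split afterwards because
# Python's re module remembers only the last capture of a repeated group.
LINK = re.compile(r"\{link=([^|{}]+(?:\|[^|{}]+)*)\}")


def link_parts(text):
    """Return one list of parts for each link tag found in text."""
    return [m.group(1).split("|") for m in LINK.finditer(text)]


for s in ["{link=a}", "{link=a|b}", "{link=a|b|c}"]:
    print(s, "=>", link_parts(s))
```

Unlike the three-group pattern, this sketch does not cap the number of parts at three, so the caller would have to check the length of each list if that limit matters.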
Greg Bacon