What's the easiest way to do an "instring"-type check with a regex? That is, how could I reject a whole string because of the presence of a single character such as ":"? For example:

"this" - okay
"there:is" - not okay because of ":"
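A minimal sketch of the reject-on-one-character case, in C# to match the other code in this thread. The class and method names (ColonCheckDemo, IsAcceptable) are made up for illustration. The idea is that a negated character class anchored at both ends only matches when no ":" appears anywhere in the string.

```csharp
using System;
using System.Text.RegularExpressions;

static class ColonCheckDemo
{
    // Hypothetical helper: true when the input contains no ':' at all.
    // ^[^:]*$ reads as "from start to end, only characters that are not ':'".
    public static bool IsAcceptable(string input)
    {
        return Regex.IsMatch(input, @"^[^:]*$");
    }

    static void Main()
    {
        Console.WriteLine(IsAcceptable("this"));      // True
        Console.WriteLine(IsAcceptable("there:is"));  // False
    }
}
```

For a single literal character, input.Contains(":") gives the same answer without a regex; the pattern form is mainly useful when the rejected set grows beyond one character.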

More practically, take the following XPath:

//foo/bar/baz[1]/ns:foo2/@attr/text()

How can I match every node test in it that doesn't include a namespace prefix?

(/)?(/)([^:/]+)

This matches the node tests, but it also matches the namespace prefix (the "ns" in "ns:foo2"), which makes it faulty.
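To make the problem concrete, here is a small C# sketch that lists what group 3 of the pattern above captures, next to one possible alternative. The alternative pattern and all names (NodeTestDemo, CaptureNodeTests, NodeTestsWithoutPrefix) are assumptions for illustration, not something from the original post: it uses a lookbehind and a lookahead so that a segment only matches when it runs from one "/" to the next "/" (or the end) without containing ":". It would likely misbehave on predicates that themselves contain "/" or ":".

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class NodeTestDemo
{
    // Collects what group 3 of the question's pattern captures for each match.
    public static List<string> CaptureNodeTests(string xpath)
    {
        var captured = new List<string>();
        foreach (Match m in Regex.Matches(xpath, @"(/)?(/)([^:/]+)"))
        {
            captured.Add(m.Groups[3].Value);
        }
        return captured;
    }

    // Hypothetical alternative: whole segments between slashes that contain no ':'.
    public static List<string> NodeTestsWithoutPrefix(string xpath)
    {
        var captured = new List<string>();
        foreach (Match m in Regex.Matches(xpath, @"(?<=/)[^:/]+(?=/|$)"))
        {
            captured.Add(m.Value);
        }
        return captured;
    }

    static void Main()
    {
        string xpath = "//foo/bar/baz[1]/ns:foo2/@attr/text()";
        // Prints: foo, bar, baz[1], ns, @attr, text()
        Console.WriteLine(string.Join(", ", CaptureNodeTests(xpath)));
        // Prints: foo, bar, baz[1], @attr, text()
        Console.WriteLine(string.Join(", ", NodeTestsWithoutPrefix(xpath)));
    }
}
```

The first line of output shows the stray "ns"; in the second, the whole "ns:foo2" step is skipped because the lookahead refuses to stop at the colon.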

A: 

I don't know regex syntax very well, but could you not do:

[a-zA-Z0-9]*:[a-zA-Z0-9]*

I think something like that should work, no?

Adam Lerman
+1  A: 

Match on :? I think the question isn't clear enough, because the answer is so obvious:

if (Regex.IsMatch(input, ":")) // reject
Will
A: 

You might want \w, which is a "word" character. The Javadocs define it as [a-zA-Z_0-9], so if you don't want underscores either, that may not work.

Mike Stone
A: 

Yeah, my question was not very well put. Here's a solution, but rather than a single pass with a regex, it splits the string and iterates over the pieces. It works, but it isn't as elegant:

string xpath = "//foo/bar/baz[1]/ns:foo2/@attr/text()";

string[] nodetests = xpath.Split( new char[] { '/' } );

for (int i = 0; i < nodetests.Length; i++)
{
    if (nodetests[i].Length > 0 && Regex.IsMatch( nodetests[i], @"^(\w|\[|\])+$" ))
    {
        // does not have a ":", we can manipulate it.
    }
}

xpath = String.Join( "/", nodetests );
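For reference, a small sketch of which segments that check actually lets through for the sample XPath. The class and method names (SegmentFilterDemo, EditableSegments) are hypothetical; the split and the pattern are the ones used in the loop. Because the pattern only allows word characters and square brackets, it skips "@attr" and "text()" as well as the namespaced step.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class SegmentFilterDemo
{
    // Hypothetical helper: returns the segments that pass the loop's check.
    public static List<string> EditableSegments(string xpath)
    {
        var result = new List<string>();
        foreach (string segment in xpath.Split('/'))
        {
            // Same test as in the loop: non-empty, and only word characters or [ ].
            if (segment.Length > 0 && Regex.IsMatch(segment, @"^(\w|\[|\])+$"))
            {
                result.Add(segment);
            }
        }
        return result;
    }

    static void Main()
    {
        // Prints: foo, bar, baz[1]
        Console.WriteLine(string.Join(", ",
            EditableSegments("//foo/bar/baz[1]/ns:foo2/@attr/text()")));
    }
}
```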

David in Dakota
+1  A: 

I'm still not sure whether you just want to detect if the XPath contains a namespace, or whether you want to remove the references to the namespace. So here's some sample code (in C#) that does both.

using System;
using System.Text.RegularExpressions;

class Program
{

    static void Main(string[] args)
    {
        string withNamespace = @"//foo/ns2:bar/baz[1]/ns:foo2/@attr/text()";
        string withoutNamespace = @"//foo/bar/baz[1]/foo2/@attr/text()";

        ShowStuff(withNamespace);
        ShowStuff(withoutNamespace);
    }

    static void ShowStuff(string input)
    {
        Console.WriteLine("'{0}' does {1}contain namespaces", input, ContainsNamespace(input) ? "" : "not ");
        Console.WriteLine("'{0}' without namespaces is '{1}'", input, StripNamespaces(input));
    }

    static bool ContainsNamespace(string input)
    {
        // a namespace prefix must start with a letter or underscore, but can have
        // letters, digits and underscores from that point on.
        return Regex.IsMatch(input, @"/?[A-Za-z_]\w*:[A-Za-z_]\w*/?");
    }

    static string StripNamespaces(string input)
    {
        // keep the surrounding slashes and the local name; drop the "prefix:" part.
        return Regex.Replace(input, @"(/?)[A-Za-z_]\w*:([A-Za-z_]\w*)(/?)", "$1$2$3");
    }
}

Hope that helps! Good luck.

OJ