I want to extract 'James\, Brown' from the string below, but I don't always know what the name will be. The comma is causing me some difficulty, so what would you suggest to extract James\, Brown?

OU=James\, Brown,OU=Test,DC=Internal,DC=Net

Thanks

A: 

If the format is always the same:

string line = GetStringFromWherever();

int start = line.IndexOf("=") + 1;//+1 to get start of name
int end = line.IndexOf("OU=",start) -1; //-1 to remove comma

string name = line.Substring(start, end - start);

Forgive if syntax is not quite right - from memory. Obviously this is not very robust and fails if the format ever changes.

Cheers.

xan
Actually, the second parameter of Substring is length, not endIndex. In your example it SHOULD be name = line.Substring(start, end - start). I've always hated that about Substring, which is why I've created extension methods that DO allow startIndex and endIndex.
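For anyone curious what such an extension method might look like, here is a minimal sketch. The class name StringExtensions and the method name SubstringBetween are made up for illustration; the only framework call it relies on is string.Substring(startIndex, length).

```csharp
using System;

static class StringExtensions
{
    // Hypothetical helper: returns the characters from startIndex up to,
    // but not including, endIndex.
    public static string SubstringBetween(this string s, int startIndex, int endIndex)
    {
        if (s == null) throw new ArgumentNullException("s");
        if (startIndex < 0 || endIndex < startIndex || endIndex > s.Length)
            throw new ArgumentOutOfRangeException();

        // Substring takes a length, so convert the end index into one.
        return s.Substring(startIndex, endIndex - startIndex);
    }
}
```

With that in place, the last line of the answer could be written as line.SubstringBetween(start, end).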
BFree
xan - I edited to correct syntax, since I am in front of a machine with Snippetcompiler installed. :)
ZombieSheep
+2  A: 

A quite brittle way to do this might be...

string name = @"OU=James\, Brown,OU=Test,DC=Internal,DC=Net";
string[] splitUp = name.Split("=".ToCharArray(),3);
string namePart = splitUp[1].Replace(",OU","");
Console.WriteLine(namePart);

I wouldn't necessarily advocate this method, but I've just come back from a departmental Christmas lunch and my brain is not fully engaged yet.

ZombieSheep
Hi, my name is "Foo,OUBar" but you can call me "FooBar" ;-)
VVS
+6  A: 

A regex is likely your best approach

static string ParseName(string arg) {
    var regex = new Regex(@"^OU=([a-zA-Z\\]+\,\s+[a-zA-Z\\]+)\,.*$");
    var match = regex.Match(arg);
    return match.Groups[1].Value;
}
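
The pattern above expects a two-part name made of letters. If that turns out to be too strict, a looser sketch is shown here; it assumes only that the string starts with OU= and that any comma inside the value is escaped with a backslash. NameParser and ParseNameLoose are hypothetical names, and the method returns null when nothing matches.

```csharp
using System;
using System.Text.RegularExpressions;

static class NameParser
{
    // Group 1 is a run of either an escaped character (backslash plus any
    // one character) or a character that is neither a comma nor a backslash.
    private static readonly Regex FirstValue =
        new Regex(@"^OU=((?:\\.|[^,\\])*)");

    public static string ParseNameLoose(string arg)
    {
        Match match = FirstValue.Match(arg);
        // Return null rather than an empty string when the input does not start with OU=.
        return match.Success ? match.Groups[1].Value : null;
    }
}
```

This keeps the backslash in the result, as the question asks, and also accepts names with no comma at all.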
JaredPar
A good approach, but one of which I have an irrational fear. :)
ZombieSheep
Overcome your fear :)
samjudson
But in order to do that, I must admit that my fear is wrong, and as a Yorkshireman, I am *never* wrong. ;-)
ZombieSheep
You're assuming every name has a comma in it which might be wrong (and probably is).
VVS
@David, questioner didn't mention it one way or the other so all I can go on is what they put in the question. I could also wonder if @'s are allowed in the name. Or perhaps 3 name vs. 2. But once again unless the asker puts it in their question assumptions are necessary.
JaredPar
@Jared: I'm just pointing at a potential error that might appear in production code 2 years from now ;-). Still the sample provided looks really like an LDAP DN which is why I prefer Mark Brackett's answer.
VVS
@David, your comment though is also based on an assumption. You assume that Mark is correct (almost certainly is). 2 years from now you might find out they invented their own naming standard. Without clarification from the user there is no way to give 100% correct answers.
JaredPar
A: 

If the slash is always there, I would look at using a RegEx to do the match; you can use match groups for the last and first names.

^OU=([a-zA-Z]+)\\,\s([a-zA-Z]+)

That RegEx will only match names made up of letters, so you will need to refine it a bit for non-standard names. Here is a RegEx tester to help you along the way if you go this route.

Mitchel Sellers
+3  A: 

You can use a regex:

string input = @"OU=James\, Brown,OU=Test,DC=Internal,DC=Net";
Match m = Regex.Match(input, "^OU=(.*?),OU=.*$");
Console.WriteLine(m.Groups[1].Value);
bruno conde
A: 

I'd start off with a regex to split up the groups:

    Regex rx = new Regex(@"(?<!\\),");
    String test = "OU=James\\, Brown,OU=Test,DC=Internal,DC=Net";
    String[] segments = rx.Split(test);

But from there I would split up the parameters in the array manually, so that you don't have to use a regex that depends on more than the separator character used. Since this looks like an LDAP query, it might not matter if you always look at segments[0], but there is a chance that the name might be set as "CN=". You can cover both cases by just reading the query like this:

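    // Split on the first '=' only; the char[] overload also compiles on older .NET Framework versions.
    String name = segments[0].Split(new[] { '=' }, 2)[1];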
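
Putting the two steps together, a rough sketch could look like the following. DnSplitter and its Split method are hypothetical names, and the lookbehind is the same one used above, so it shares its limitation: an escaped backslash directly before a real separator comma would be misread as an escaped comma.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class DnSplitter
{
    // A comma that is not immediately preceded by a backslash.
    private static readonly Regex UnescapedComma = new Regex(@"(?<!\\),");

    public static List<KeyValuePair<string, string>> Split(string dn)
    {
        var pairs = new List<KeyValuePair<string, string>>();
        foreach (string segment in UnescapedComma.Split(dn))
        {
            // Split on the first '=' only, so the value may contain '=' itself.
            string[] parts = segment.Split(new[] { '=' }, 2);
            if (parts.Length == 2)
                pairs.Add(new KeyValuePair<string, string>(parts[0], parts[1]));
        }
        return pairs;
    }
}
```

DnSplitter.Split(test)[0].Value then holds the name regardless of whether the first key is OU or CN.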
Dan Monego
+1  A: 

That looks suspiciously like an LDAP or Active Directory distinguished name formatted according to RFC 2253/4514.

Unless you're working with well known names and/or are okay with a fragile hackaround (like the regex solutions) - then you should start by reading the spec.

If you, like me, generally hate implementing code according to RFCs - then hope this guy did a better job following the spec than you would. At least he claims to be 2253 compliant.
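
To give a feel for what the spec asks of you, here is a partial sketch of just one piece: undoing the backslash escaping in a single attribute value, as I understand RFC 4514 (a backslash followed by two hex digits is one UTF-8 byte, a backslash followed by anything else is that character). DnValue and Unescape are hypothetical names, and this deliberately ignores the rest of the grammar, such as values that start with # or multi-valued RDNs.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

static class DnValue
{
    public static string Unescape(string value)
    {
        var bytes = new List<byte>();        // decoded output as UTF-8 bytes
        var literal = new StringBuilder();   // pending ordinary characters
        for (int i = 0; i < value.Length; i++)
        {
            char c = value[i];
            bool escaped = c == '\\' && i + 1 < value.Length;
            if (escaped && i + 2 < value.Length
                && Uri.IsHexDigit(value[i + 1]) && Uri.IsHexDigit(value[i + 2]))
            {
                // Backslash + hex pair: one raw byte of a UTF-8 sequence.
                Flush(literal, bytes);
                bytes.Add(Convert.ToByte(value.Substring(i + 1, 2), 16));
                i += 2;
            }
            else if (escaped)
            {
                // Backslash + any other character: keep the character, drop the backslash.
                literal.Append(value[i + 1]);
                i += 1;
            }
            else
            {
                // Ordinary character (or a trailing lone backslash, kept as-is).
                literal.Append(c);
            }
        }
        Flush(literal, bytes);
        return Encoding.UTF8.GetString(bytes.ToArray());
    }

    private static void Flush(StringBuilder literal, List<byte> bytes)
    {
        bytes.AddRange(Encoding.UTF8.GetBytes(literal.ToString()));
        literal.Length = 0;
    }
}
```

Note that this returns the real value (James, Brown) rather than the escaped form the question asks for, so use it only if you want the name as a person would read it.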

Mark Brackett
A: 

Replace \, with your own preferred magic string (perhaps &#44;), split on remaining commas or search til the first comma, then replace your magic string with a single comma.

i.e. Something like:

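string originalStr = @"OU=James\, Brown,OU=Test,DC=Internal,DC=Net";
// Verbatim string needed here: "\," is not a valid escape in a regular literal.
string replacedStr = originalStr.Replace(@"\,", "&#44;");

// Take everything between the first '=' and the first remaining comma.
int start = replacedStr.IndexOf('=') + 1;
string name = replacedStr.Substring(start, replacedStr.IndexOf(',') - start);
Console.WriteLine(name.Replace("&#44;", ","));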
Jonathan