I am a beginner and have some problems with regular expressions.

Input text is: something idUser=123654; nick="Tom" something

I need to extract the value of idUser -> 123654

I tried this:

        // idUser is always an 8-digit number
        MatchCollection matchsID = Regex.Matches(pk.html, @"\bidUser=(\w{8})\b");
        Text = matchsID[1].Value;

but the output is idUser=123654; I need only the number.

The second problem is with nick="Tom": how can I get only the text Tom from this expression?

A: 
.*?idUser=([0-9]+).*?

That regex should work for you :o)

Chief17
+1  A: 

You don't show your output code, where you get the group from your match collection.

Hint: you will need group 1 and not group 0 if you want to have only what is in the parentheses.

mihi
But it doesn't find a match.
Tom
Try `Text = matchsID[0].Groups[1].Value`
mihi
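
To make the group 0 versus group 1 point concrete, here is a minimal sketch. The class and method names (`IdUserDemo`, `ExtractIdUser`) are made up for illustration, and it assumes the id is a plain run of digits, so it uses `\d+` rather than the `\w{8}` from the question:

```csharp
using System;
using System.Text.RegularExpressions;

public static class IdUserDemo
{
    // Hypothetical helper: returns the digits after "idUser=", or null when nothing matches.
    public static string ExtractIdUser(string input)
    {
        // \d+ instead of \w{8}: the sample id has 6 digits, so {8} would never match it.
        Match m = Regex.Match(input, @"\bidUser=(\d+)\b");

        // Groups[0] is the whole match ("idUser=123654");
        // Groups[1] is only what the first pair of parentheses captured.
        return m.Success ? m.Groups[1].Value : null;
    }

    public static void Main()
    {
        string html = "something idUser=123654; nick=\"Tom\" something";
        Console.WriteLine(ExtractIdUser(html));
    }
}
```

Checking `m.Success` first avoids indexing into an empty result, which is probably what went wrong with `matchsID[1]` when the pattern found no match.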
A: 

Here's a pattern that should work:

\bidUser=(\d{3,8})\b|\bnick="(\w+)"

Given the input string:

something idUser=123654; nick="Tom" something

This yields 2 matches (as seen on rubular.com):

  • First match is idUser=123654, group 1 captures 123654
  • Second match is nick="Tom", group 2 captures Tom

Some variations:

  • In .NET regex, you can also use named groups for better readability.
  • If nick always appears after idUser, you can match the two at once instead of using alternation as above.
  • I've used {3,8} repetition to show how to match at least 3 and at most 8 digits.
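
The first two variations could look roughly like this in C#. This is a sketch, not the only way to do it: the group names `id` and `nick` and the class `NamedGroupsDemo` are my own choices, and the second pattern assumes nick always follows idUser on the same line:

```csharp
using System;
using System.Text.RegularExpressions;

public static class NamedGroupsDemo
{
    // Alternation with named groups: each match fills either "id" or "nick".
    static readonly Regex Either = new Regex(
        @"\bidUser=(?<id>\d{3,8})\b|\bnick=""(?<nick>\w+)""");

    // Both fields in one match; assumes nick appears after idUser.
    static readonly Regex Together = new Regex(
        @"\bidUser=(?<id>\d{3,8})\b.*?\bnick=""(?<nick>\w+)""");

    public static (string Id, string Nick) ParseEither(string input)
    {
        string id = null, nick = null;
        foreach (Match m in Either.Matches(input))
        {
            // Only the group belonging to the alternative that matched reports Success.
            if (m.Groups["id"].Success) id = m.Groups["id"].Value;
            if (m.Groups["nick"].Success) nick = m.Groups["nick"].Value;
        }
        return (id, nick);
    }

    public static (string Id, string Nick) ParseTogether(string input)
    {
        Match m = Together.Match(input);
        return m.Success
            ? (m.Groups["id"].Value, m.Groups["nick"].Value)
            : ((string)null, (string)null);
    }

    public static void Main()
    {
        string html = "something idUser=123654; nick=\"Tom\" something";
        Console.WriteLine(ParseEither(html));
        Console.WriteLine(ParseTogether(html));
    }
}
```

With named groups you read `m.Groups["nick"]` instead of remembering that nick is group 2, which helps once the pattern grows.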

polygenelubricants
I have a problem with quotes in the regexp: `MatchCollection matchsUsers = Regex.Matches(pk.html, @"\bidUser=(\d{8})\b|\bnick="(\w+)"");` How do I use `"` in `\bnick="(\w+)"`?
Tom
@Tom: in an `@`-quoted string, you double `"` to escape it. So it becomes `@"\bidUser=(\d{8})\b|\bnick=""(\w+)"""`. Also, note that `\d{8}` only matches exactly 8 digits. Your example has `123654`, which is only 6 digits.
polygenelubricants
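
A small sketch of the quoting rule, in case it helps: the same pattern written as an `@`-quoted (verbatim) string and as a regular string. The names `QuoteEscapingDemo` and `ExtractNick` are hypothetical, chosen just for this example:

```csharp
using System;
using System.Text.RegularExpressions;

public static class QuoteEscapingDemo
{
    // Verbatim string: backslashes are literal, and "" stands for one quote character.
    public const string Verbatim = @"\bnick=""(\w+)""";

    // Regular string: every backslash and every quote needs a backslash escape.
    public const string Regular = "\\bnick=\"(\\w+)\"";

    // Hypothetical helper: returns the text between the quotes after nick=, or null.
    public static string ExtractNick(string input)
    {
        Match m = Regex.Match(input, Verbatim);
        return m.Success ? m.Groups[1].Value : null;
    }

    public static void Main()
    {
        // Both literals produce the same pattern text: \bnick="(\w+)"
        Console.WriteLine(Verbatim == Regular);
        Console.WriteLine(ExtractNick("something idUser=123654; nick=\"Tom\" something"));
    }
}
```

The verbatim form is usually easier to read for regexes because the backslashes stay single; only the quotes need doubling.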
A: 

Use look-around

(?<=idUser=)\d{1,8}(?=(;|$))

To fix the number of digits at exactly 6, use (?<=idUser=)\d{6}(?=($|;))
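
For illustration, a minimal C# sketch of the look-around version (the names `LookAroundDemo` and `ExtractIdUser` are hypothetical). Because the look-behind and look-ahead do not consume any text, the match itself is just the digits and no capture group is needed:

```csharp
using System;
using System.Text.RegularExpressions;

public static class LookAroundDemo
{
    // Hypothetical helper: returns 1 to 8 digits that follow "idUser=" and are
    // followed by ';' or the end of the input; null otherwise.
    public static string ExtractIdUser(string input)
    {
        Match m = Regex.Match(input, @"(?<=idUser=)\d{1,8}(?=;|$)");

        // m.Value is the whole match, which here is only the number.
        return m.Success ? m.Value : null;
    }

    public static void Main()
    {
        Console.WriteLine(ExtractIdUser("something idUser=123654; nick=\"Tom\" something"));
    }
}
```

Note that an id longer than 8 digits gives no match at all here, since the look-ahead requires `;` or the end of the input right after the digits.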

Amarghosh