I need help putting together a regex that matches a word ending with "Id", using a case-sensitive match.

+1  A: 

How about "\A[a-z]*Id\z"? [This makes the lowercase characters before "Id" optional. Use "\A[a-z]+Id\z" if one or more characters must precede "Id".]

tk-421
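
A small sketch of how the anchored pattern from the answer above behaves in C#. The sample strings are made up for illustration. Because `\A` and `\z` tie the match to the whole input, this form seems better suited to validating a single value than to searching longer text.

```csharp
using System;
using System.Text.RegularExpressions;

// \A and \z anchor the pattern to the very start and end of the input,
// so the whole string must be lowercase ASCII letters followed by "Id".
var whole = new Regex(@"\A[a-z]*Id\z");
var atLeastOne = new Regex(@"\A[a-z]+Id\z");

Console.WriteLine(whole.IsMatch("userId"));         // True
Console.WriteLine(whole.IsMatch("Id"));             // True: * allows an empty prefix
Console.WriteLine(atLeastOne.IsMatch("Id"));        // False: + needs at least one letter
Console.WriteLine(whole.IsMatch("userid"));         // False: the match is case-sensitive
Console.WriteLine(whole.IsMatch("UserId"));         // False: [a-z] excludes the uppercase 'U'
Console.WriteLine(whole.IsMatch("the userId key")); // False: not the whole string
```
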
A: 
Regex ids = new Regex(@"\w*Id\b", RegexOptions.None);

"\b" means "word boundary" and \w means any word character, so \w*Id\b means "{stuff}Id". Because RegexOptions.IgnoreCase is not set, the match is case-sensitive.

James Curran
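
To illustrate the point about RegexOptions.IgnoreCase in the answer above, here is a minimal sketch comparing the two option settings. The variable names and test strings are my own, not from the answer.

```csharp
using System;
using System.Text.RegularExpressions;

// Same pattern, with and without RegexOptions.IgnoreCase.
var caseSensitive = new Regex(@"\w*Id\b", RegexOptions.None);
var ignoreCase = new Regex(@"\w*Id\b", RegexOptions.IgnoreCase);

Console.WriteLine(caseSensitive.IsMatch("customerId")); // True
Console.WriteLine(caseSensitive.IsMatch("customerid")); // False
Console.WriteLine(caseSensitive.IsMatch("customerID")); // False
Console.WriteLine(ignoreCase.IsMatch("customerID"));    // True
```
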
+4  A: 

Try this regular expression:

\w*Id\b

\w* allows word characters in front of Id, and \b ensures that Id is at the end of the word (\b is a word-boundary assertion).

Gumbo
@epitka, note that `\w` also matches digits and the underscore. In short, the strings `___Id` and `12345Id` will also be matched.
Bart Kiers
I gave you an upvote, but epitka doesn't specify if just "Id" is allowable, so I'd be tempted to change the * for a +
BenAlabaster
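
The comment about `\w` also accepting digits and underscores can be checked with a short sketch like the one below. The input sentence is invented for the example, and collecting the matches with LINQ is just one way to do it.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

var pattern = new Regex(@"\w*Id\b");

// Hypothetical input mixing letters, digits, underscores and near misses.
string input = "userId orderID ___Id 12345Id Identity";

string[] found = pattern.Matches(input)
                        .Cast<Match>()
                        .Select(m => m.Value)
                        .ToArray();

// "orderID" fails on case; "Identity" fails because "Id" is not followed by a word boundary.
Console.WriteLine(string.Join(", ", found)); // userId, ___Id, 12345Id
```
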
+1  A: 

I would use
\b[A-Za-z]*Id\b
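The \b matches at the beginning or end of a word, i.e. at a position between a word character and a non-word character (a space, tab, newline or punctuation), or at the beginning or end of the string when it starts or ends with a word character.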

The [A-Za-z] will match any ASCII letter, and the * means that zero or more of them are matched. Finally there is the Id.

Note that this will match words that have capital letters in the middle such as 'teStId'.

I use http://www.regular-expressions.info/ for regex reference

MrBones
The set `a-z` excludes `é` and other similar characters. Perhaps not an issue, but something epitka may want to know.
Bart Kiers
[A-Za-z] doesn't match non-English alphabetic characters, so should be avoided in favour of \w unless a guarantee can be made that only English letters will appear.
BenAlabaster
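
Here is a rough sketch of where `\b` does and does not fire for the `\b[A-Za-z]*Id\b` pattern above. The inputs are made up. Note that a hyphen counts as a non-word character, so the "Id" after it is matched on its own.

```csharp
using System;
using System.Text.RegularExpressions;

var asciiWord = new Regex(@"\b[A-Za-z]*Id\b");

// Punctuation is a non-word character, so \b matches next to it.
Console.WriteLine(asciiWord.Match("key=(teStId);").Value); // teStId

// A hyphen also creates a boundary, so only the trailing "Id" is matched.
Console.WriteLine(asciiWord.Match("user-Id").Value);       // Id

// Digits and underscores are word characters: there is no boundary
// between them and "Id", and [A-Za-z] cannot consume them.
Console.WriteLine(asciiWord.IsMatch("12345Id"));           // False
Console.WriteLine(asciiWord.IsMatch("___Id"));             // False
```
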
+1  A: 

This may do the trick:

\b\p{L}*Id\b

Where \p{L} matches any (Unicode) letter and \b matches a word boundary.

Bart Kiers
Does \p{L} work in C# regex? I've never seen that one before and usually opt for \w
BenAlabaster
@BenAlabaster, yes: http://msdn.microsoft.com/en-us/library/20bw873z.aspx#SupportedUnicodeGeneralCategories And yes, perhaps `\w` is sufficient for the OP, but it matches more than letters (see my comment under Gumbo's post).
Bart Kiers
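
For anyone wondering how `\p{L}` compares in practice, here is a minimal sketch in C# setting it side by side with `[A-Za-z]` and `\w`. The word "caféId" is a made-up example, written with a `\u00e9` escape to sidestep source-encoding issues.

```csharp
using System;
using System.Text.RegularExpressions;

var ascii = new Regex(@"\b[A-Za-z]*Id\b");
var unicodeLetters = new Regex(@"\b\p{L}*Id\b");
var wordChars = new Regex(@"\w*Id\b");

string word = "caf\u00e9Id"; // "caféId"

Console.WriteLine(ascii.IsMatch(word));                  // False: é is outside A-Z and a-z
Console.WriteLine(unicodeLetters.IsMatch(word));         // True: \p{L} covers any Unicode letter
Console.WriteLine(unicodeLetters.Match(word).Value == word); // True: the whole word is matched

// \p{L} still rejects digits, which \w accepts.
Console.WriteLine(unicodeLetters.IsMatch("12345Id"));    // False
Console.WriteLine(wordChars.IsMatch("12345Id"));         // True
```
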
A: 

Gumbo gets my vote; however, the OP doesn't specify whether just "Id" is an allowable word, so I'd make a minor modification:

\w+Id\b

One or more word characters followed by "Id" and a word boundary. The [a-zA-Z] variants don't take into account non-English alphabetic characters. I might also use \s instead of \b to require actual whitespace after the word rather than just a word boundary, though \s consumes a character and so will not match when the word ends the string. It would depend on whether you need to wrap over multiple lines.

BenAlabaster
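
A quick sketch of the trade-offs in the last answer: `*` versus `+`, and `\b` versus `\s`. The test strings are invented, and the `\s` variant is my reading of the suggestion rather than a pattern given verbatim in the answer.

```csharp
using System;
using System.Text.RegularExpressions;

var star = new Regex(@"\w*Id\b");
var plus = new Regex(@"\w+Id\b");
var trailingSpace = new Regex(@"\w+Id\s"); // assumed form of the \s variant

// * accepts a bare "Id"; + requires at least one character before it.
Console.WriteLine(star.IsMatch("Id"));     // True
Console.WriteLine(plus.IsMatch("Id"));     // False
Console.WriteLine(plus.IsMatch("userId")); // True

// \s needs a real whitespace character after "Id".
Console.WriteLine(trailingSpace.IsMatch("userId next"));  // True
Console.WriteLine(trailingSpace.IsMatch("userId"));       // False: end of string is not whitespace
Console.WriteLine(trailingSpace.IsMatch("userId, next")); // False: a comma follows
Console.WriteLine(plus.IsMatch("userId, next"));          // True: \b accepts the comma as a boundary
```
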