Hello, I need a regular expression that behaves like \b (a word boundary) but does not treat the dot (.) as a boundary character.

Something like [\b&&[^.]]

For example, in the following test string:

"somewhere deep down in some org.argouml.swingext classes and"

I want the string org.argouml.swingext to match, but the string org.argouml not to match (using the Matcher.find() method).

If I use: \b(package_name)\b they both match, which is not what I want.

If I use: \b(package_name)[\b&&[^\.]] I get a PatternSyntaxException

If I use: \b(package_name)(\b&&[^\.]) nothing matches.

I use this link to test my regexes.

Context: I have a list of package names from a project and I have to search them inside some texts. Obviously if a nested package is found, I don't want the outer package to match as well, as seen from the above example.

I am not using the \s character class at the end because the package name may be at the end of a line, or it may be followed by other non-word characters such as : , ) etc., which \b treats as boundaries. I just want to exclude the . from what \b accepts.

If anybody knows how to do this, I would be very grateful :) Thanks

A: 

A negative lookahead would work here:

\borg\.argouml(?!\.)\b

Remember that in Java string literals the backslashes in regular expressions must be escaped:

"\\borg\\.argouml(?!\\.)\\b"
Mark Byers
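
Since the question mentions a whole list of package names, the negative-lookahead idea above could be wrapped in a small helper. This is only a sketch: the class name PackageFinder and the methods exactPackage and occursIn are made up for illustration. Pattern.quote is used so that the dots inside the package name are treated literally.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of the original answer.
class PackageFinder {

    // Builds \b<literal package name>(?!\.)\b for an arbitrary package name.
    static Pattern exactPackage(String packageName) {
        // Pattern.quote wraps the name in \Q...\E, so its dots match only dots.
        return Pattern.compile("\\b" + Pattern.quote(packageName) + "(?!\\.)\\b");
    }

    // True if packageName occurs in text and is not followed by a dot.
    static boolean occursIn(String packageName, String text) {
        return exactPackage(packageName).matcher(text).find();
    }

    public static void main(String[] args) {
        String text = "somewhere deep down in some org.argouml.swingext classes and";
        System.out.println(occursIn("org.argouml.swingext", text)); // true
        System.out.println(occursIn("org.argouml", text));          // false
    }
}
```

One limitation to keep in mind: because the lookahead rejects any following dot, a package name that ends a sentence ("... in org.argouml.") would not be found with this pattern.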
A: 

Why not simply use:

\b\w+(\.\w+)+\b

FYI, the PatternSyntaxException is thrown because \b matches a position, not a character. A character class always matches exactly one character, so putting \b (a word boundary) inside a character class causes the exception to be thrown.

Bart Kiers
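
One way the \b\w+(\.\w+)+\b suggestion above might be applied to the list-of-packages scenario: pull out every complete dotted name first, then keep only those that appear in the known package list. This is a sketch under assumptions; DottedNameScanner and knownPackagesIn are hypothetical names, and the package set is example data taken from the question.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the original answer.
class DottedNameScanner {

    // One or more word characters, then one or more ".word" groups.
    private static final Pattern DOTTED_NAME = Pattern.compile("\\b\\w+(\\.\\w+)+\\b");

    // Returns the dotted names in text that are exactly equal to a known package.
    static List<String> knownPackagesIn(String text, Set<String> knownPackages) {
        List<String> found = new ArrayList<>();
        Matcher m = DOTTED_NAME.matcher(text);
        while (m.find()) {
            // The greedy pattern consumes the whole dotted name, so an outer
            // package is never reported when it is only a prefix of the match.
            if (knownPackages.contains(m.group())) {
                found.add(m.group());
            }
        }
        return found;
    }

    public static void main(String[] args) {
        Set<String> packages = new HashSet<>(Arrays.asList("org.argouml", "org.argouml.swingext"));
        String text = "somewhere deep down in some org.argouml.swingext classes and";
        System.out.println(knownPackagesIn(text, packages)); // [org.argouml.swingext]
    }
}
```

Note that a fully qualified class name such as org.argouml.swingext.Foo would be extracted as one dotted name and would not equal any package in the set, so this variant only reports exact package mentions.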