+1  Q: 

Java Regex Problem

I am working on a bit of code that checks a line of ActionScript 3 code for the existence of a type (MovieClip, Sprite, or one of the custom classes defined in the classpath) taken from a collection that is being iterated.

for (String type: typeList) {
    if (input.contains(type)) {
        // do something here
    }
}

The problem is, some of the custom type names also contain the name of another type:

Custom type: fSceneController
Contains flash type: Scene

So the .contains method reports false matches. I was thinking of using a regex inside the loop, where the pattern checks for the type and makes sure that no character in a-zA-Z0-9 appears immediately before or after it.

Pattern p = Pattern.compile("<stuff here>"+ type + "<more stuff here>");

Can anyone help me determine what I should put before and after the type so that the type itself can be detected distinctly from other types that may contain part of the text?

Or perhaps suggest a different method that I can use to accomplish the same goal?

+6  A: 

Not sure I'm clear on what you're trying to do, but I think this is what you're missing.

If you want a regex to match a word and only that word, put \b before and after it. For instance,

\bhe\b will only match the first of...

he
she
the
they
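
As a rough illustration of the \b behaviour described above, here is a minimal Java sketch that runs the pattern \bhe\b against the four sample words. The class name WordBoundaryDemo and the helper wholeWordMatches are made up for this example and are not part of the original answer.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

class WordBoundaryDemo {
    // \b is a word boundary; the backslash is doubled inside a Java string literal.
    private static final Pattern WHOLE_HE = Pattern.compile("\\bhe\\b");

    // Hypothetical helper: returns the candidates that contain "he" as a whole word.
    static List<String> wholeWordMatches(List<String> candidates) {
        List<String> hits = new ArrayList<>();
        for (String candidate : candidates) {
            if (WHOLE_HE.matcher(candidate).find()) {
                hits.add(candidate);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("he", "she", "the", "they");
        System.out.println(wholeWordMatches(words)); // prints [he]
    }
}
```

In "she", "the" and "they" the letters "he" are directly adjacent to another word character, so there is no boundary at that position and the pattern does not match.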
Tim Hoolihan
Wow, I can't believe it was that simple. The power of regex knows no bounds.
Jason Miesionczek
Cool, glad to hear that was the issue. The \b stands for word boundary. It handles punctuation correctly too.
Tim Hoolihan
Just to give some context, I am writing an ActionScript 3 to haXe translator. Due to the different ways the languages handle types and imports, all the types used in a class must be detected so that the import statements can be expanded (haXe does not support wildcard imports). That is where the original problem came from: other types were being detected and imported that weren't actually being used.
Jason Miesionczek
You might also want to quote the type name using \Q and \E, which prevents any of its characters from being misinterpreted as regex control characters (though that's probably only likely with '.' in this case). Your regex would then look like: "\\b\\Q" + type + "\\E\\b"
paulcm
@paulcm thanks for the extra info.
Jason Miesionczek
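
Putting the \b suggestion and the quoting suggestion together, the loop from the question could look roughly like the sketch below. It uses Pattern.quote, the standard-library way of quoting a literal (the same idea as wrapping it in \Q and \E by hand). The class name TypeDetector, the method findTypes and the sample input line are assumptions made for illustration only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

class TypeDetector {
    // Hypothetical helper: returns every type from typeList that occurs in
    // input as a whole word, not as part of a longer identifier.
    static List<String> findTypes(String input, List<String> typeList) {
        List<String> found = new ArrayList<>();
        for (String type : typeList) {
            // Pattern.quote makes the type name match literally, so a '.'
            // in a qualified name is not treated as "any character".
            Pattern p = Pattern.compile("\\b" + Pattern.quote(type) + "\\b");
            if (p.matcher(input).find()) {
                found.add(type);
            }
        }
        return found;
    }

    public static void main(String[] args) {
        // Made-up ActionScript 3 line for the demo.
        String input = "var c:fSceneController = new fSceneController();";
        List<String> typeList = Arrays.asList("Scene", "fSceneController", "MovieClip");
        System.out.println(findTypes(input, typeList)); // prints [fSceneController]
    }
}
```

One caveat: \b treats letters, digits and the underscore as word characters, which is slightly broader than the a-zA-Z0-9 mentioned in the question. For identifiers that is usually what you want, but a name containing '$' would need a custom lookaround instead. If the type list is large, compiling each pattern once and reusing it across input lines avoids recompiling inside the loop.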
A: 

You might be able to check the length as well, like this:

for(String type : typeList)
    if(input.contains(type) && input.length() == type.length())
        System.out.println("Found " + type);
Matt McMinn
That won't work, as the input can be much longer than the type being searched for. The whole problem is that the type is mixed in with a bunch of other text and has to be detected distinctly from that text and from any other known types.
Jason Miesionczek
Oh yeah - just completely read over that bit of the problem.
Matt McMinn