Hi,

I am trying to write a pattern in Java to match against Java import declarations.

Example:

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.InputFormat;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
// import org.apache.hadoop.mapreduce.Something;
/* import org.apache.hadoop.something.else; */

Should match only:

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.InputFormat;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;

So far I have the following regex:

"[^A-Za-z0-9\\n]? *import(static|\\s)+[\\w.]*(\\*)?(\\s)*;"

But it's not working. For example:

import org.junit.Test; 
import java.util.ArrayList;
/* The import name; lazily initialized; defaults to a unspecified,...

returns:

import org.junit.Test; 
import java.util.ArrayList; 
import name;

which is wrong: `import name;` comes from inside a comment.

A: 

How about this:

^import
unbeli
This would also match the second line of `/* import blah; \nimport foo;*/`, which it shouldn't.
glowcoder
@glowcoder Nope, it will not match that; note the ^ anchor.
unbeli
@glowcoder Ah, OK, with \n. Yes, but no one asked for anything else ;)
unbeli
It only matches the first import line.
Fork
@Fork Nice! How exactly do you run it? Please post your matching code.
unbeli
You can try it with the examples I posted above. I test the regexes here: http://www.regexplanet.com/simple/index.html before testing in production code.
Fork
I am doing this in Java, using java.util.regex.Pattern.
Fork
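A likely reason `^import` found only the first line: by default, `^` in java.util.regex matches only at the very start of the input, and it matches at the start of each line only when Pattern.MULTILINE is set. A minimal sketch of the difference follows. The class name `AnchorDemo` and the helper `findAll` are made up for illustration and are not from the thread.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original thread.
class AnchorDemo {
    // Collects every match of the given pattern in the input.
    static List<String> findAll(Pattern p, String input) {
        List<String> out = new ArrayList<>();
        Matcher m = p.matcher(input);
        while (m.find()) {
            out.add(m.group());
        }
        return out;
    }

    public static void main(String[] args) {
        String src = "import org.junit.Test;\nimport java.util.ArrayList;\n";

        // Default mode: ^ matches only at the start of the whole input.
        System.out.println(findAll(Pattern.compile("^import"), src).size()); // prints 1

        // MULTILINE: ^ also matches right after each line terminator.
        System.out.println(
            findAll(Pattern.compile("^import", Pattern.MULTILINE), src).size()); // prints 2
    }
}
```

As glowcoder points out, anchoring to the line start still accepts an import that sits on its own line inside a multi-line block comment.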
A: 

I got it working by using the Pattern.MULTILINE flag.

Now it looks like the following:

Pattern.compile("(;|^ *)import(static|\\s)+[\\w.]*(\\*)?(\\s)*;",Pattern.MULTILINE);
Fork
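For reference, here is a small sketch of how that pattern could be run against the failing example from the question. The pattern string is copied unchanged from the answer. The class name `ImportFinder` and the method `findImports` are assumptions added for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical wrapper around the pattern from the answer.
class ImportFinder {
    // Pattern copied as-is from the answer; MULTILINE lets ^ match at each line start.
    static final Pattern IMPORT = Pattern.compile(
        "(;|^ *)import(static|\\s)+[\\w.]*(\\*)?(\\s)*;", Pattern.MULTILINE);

    // Returns every matched declaration, with surrounding whitespace trimmed.
    static List<String> findImports(String source) {
        List<String> found = new ArrayList<>();
        Matcher m = IMPORT.matcher(source);
        while (m.find()) {
            found.add(m.group().trim());
        }
        return found;
    }

    public static void main(String[] args) {
        String source = String.join("\n",
            "import org.junit.Test; ",
            "import java.util.ArrayList;",
            "/* The import name; lazily initialized; defaults to a unspecified,...");
        // Prints the two real imports; "import name;" inside the comment is skipped
        // because it is neither at a line start nor directly preceded by ';'.
        for (String s : findImports(source)) {
            System.out.println(s);
        }
    }
}
```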
Is it OK if I mark my answer as the correct one?
Fork
This doesn't work if the import declaration uses Unicode escapes, e.g. `\u0069mport\u0020java\u002Eutil\u002E\u002A\u003B`. Perhaps not very likely, but that is a valid Java import declaration that the regex will miss. Also, you are matching things like `importstaticstaticstatic....;`
polygenelubricants
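One possible way to address the `importstaticstaticstatic` issue is to require whitespace after `import` and to allow at most one `static` keyword. The sketch below is an assumption, not a pattern from the thread: the class name `StrictImport` and the method `hasImport` are made up. It does not handle Unicode escapes, it still accepts an import on its own line inside a multi-line block comment, and `\w` is only an approximation of what Java allows in identifiers.

```java
import java.util.regex.Pattern;

// Hypothetical stricter variant; a sketch, not a full Java parser.
class StrictImport {
    static final Pattern IMPORT = Pattern.compile(
        "^[ \\t]*import\\s+"      // keyword at line start, followed by whitespace
        + "(?:static\\s+)?"       // at most one optional "static"
        + "\\w+(?:\\.\\w+)*"      // dotted name: at least one identifier
        + "(?:\\.\\*)?"           // optional trailing ".*"
        + "\\s*;",
        Pattern.MULTILINE);

    // True if the source contains at least one line that looks like an import.
    static boolean hasImport(String source) {
        return IMPORT.matcher(source).find();
    }

    public static void main(String[] args) {
        System.out.println(hasImport("import static org.junit.Assert.*;")); // true
        System.out.println(hasImport("importstaticstaticstatic;"));          // false
        System.out.println(hasImport("// import java.util.List;"));          // false
    }
}
```

Handling comments and Unicode escapes reliably would probably need a tokenizer or a real Java parser rather than a single regular expression.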