How can I implement this requirement with a regexp?

I have a list of filenames as Strings:
LOAD_filesourceA-01012008-00001.dat
LOAD_filesourceB-01012008-00001.dat
LOAD_filesourceB-01012008-00003.dat
LOAD_filesourceA-01012008-00004.dat
LOAD_filesourceA-01012008-000055.dat
LOAD_filesourceB-01012008_000055.dat
...
LOAD_filesourceB-01012008_000058.dat
etc

After loading each file, that file gets moved into an archive directory, and I log the file type and load number (the last 6 characters of the filename before the extension).
I have two pieces of information: (1) whether the file I wish to load is of type A or B, and (2) the last loaded file number, as an integer. Based on these, I would like to get the name of the next file, i.e. the file of the same type whose load number (the last 6 digits before the ".dat" extension) is the next available number. Say the last file loaded was 12: I will then search for 13, and if that is not available, 14, 15, and so on, until I have processed all files in that directory.

Given just a string like "LOAD_filesourceB-01012008_000058.dat", can I check that it is of file type B and, assuming the last loaded file number was 57, that it satisfies the requirement of being number 58 (that is, greater than 57)?

+1  A: 

LOAD_filesource(A|B)-[0-9]+-([0-9]+)\.dat

A or B will end up in group 1, the number of the file in group 2. Then parse group 2 as a decimal integer.
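As a rough illustration of that last step, here is a minimal Java sketch. The class name `LoadFileCheck` and the method `isCandidate` are made up for this example, and the pattern has the same shape as the expression above, written with the dot escaped and the whole digit run inside group 2. It assumes the load number fits in an `int`.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LoadFileCheck {

    // Group 1: file type (A or B). Group 2: the load number as a digit string.
    // Only the hyphen-separated format is accepted by this pattern.
    private static final Pattern NAME =
            Pattern.compile("LOAD_filesource(A|B)-[0-9]+-([0-9]+)\\.dat");

    /**
     * Hypothetical helper: true if fileName is of the wanted type and its
     * load number is greater than lastLoaded.
     */
    public static boolean isCandidate(String fileName, String wantedType, int lastLoaded) {
        Matcher m = NAME.matcher(fileName);
        if (!m.matches()) {
            return false; // not a load file in the expected format
        }
        String type = m.group(1);
        // parseInt reads decimal and ignores leading zeros: "00004" -> 4.
        // Assumption: the digit run is short enough to fit in an int.
        int number = Integer.parseInt(m.group(2));
        return type.equals(wantedType) && number > lastLoaded;
    }

    public static void main(String[] args) {
        System.out.println(isCandidate("LOAD_filesourceA-01012008-00004.dat", "A", 3));
        System.out.println(isCandidate("LOAD_filesourceA-01012008-00004.dat", "B", 3));
    }
}
```

With the sample name above, the first call prints true (type A, 4 > 3) and the second prints false (wrong type).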

Nat
A: 

I don't know if it's intentional or not, but you have listed two different formats, one that uses a hyphen as the final separator and one that uses an underscore. If both are really supported, you would want:

LOAD_filesource(A|B)-[0-9]+[_-]([0-9]+)\.dat

Also, your six digit number is sometimes five digits (e.g. the 00001 in LOAD_filesourceA-...-00001.dat), but the above regular expression only requires at least one digit be present.

Depending on how many files you're going to attempt to examine, you might be better off loading up a directory listing rather than randomly checking to see if a file exists. With an appropriate compare method, sorting your list could give you your files in an easy-to-work-with order.
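One possible way to sketch the listing-and-sorting idea in Java is shown below. The class `NextLoadFile` and its methods are hypothetical names chosen for this example. The names are passed in as a `List<String>` so the sketch stays independent of the file system; in practice they might come from something like `new File(dir).list()`. It assumes `lastLoaded` is never negative and that load numbers fit in an `int`.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

public class NextLoadFile {

    // Accepts either '-' or '_' before the load number.
    private static final Pattern NAME =
            Pattern.compile("LOAD_filesource(A|B)-[0-9]+[_-]([0-9]+)\\.dat");

    /** Load number of fileName, or -1 if it does not match or is of another type. */
    public static int loadNumber(String fileName, String wantedType) {
        Matcher m = NAME.matcher(fileName);
        if (m.matches() && m.group(1).equals(wantedType)) {
            return Integer.parseInt(m.group(2));
        }
        return -1;
    }

    /**
     * Names of the wanted type with a load number above lastLoaded,
     * sorted by load number. Assumes lastLoaded >= 0, so the -1 marker
     * for non-matching names is always filtered out.
     */
    public static List<String> candidates(List<String> names, String wantedType, int lastLoaded) {
        return names.stream()
                .filter(n -> loadNumber(n, wantedType) > lastLoaded)
                .sorted(Comparator.comparingInt(n -> loadNumber(n, wantedType)))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList(
                "LOAD_filesourceB-01012008_000058.dat",
                "LOAD_filesourceA-01012008-00004.dat",
                "LOAD_filesourceB-01012008-00003.dat",
                "LOAD_filesourceB-01012008_000055.dat");
        // The first element is the next file to load for type B after number 3.
        System.out.println(candidates(names, "B", 3));
    }
}
```

The first element of the returned list is the next file to process, and the rest are already in load order, so there is no need to probe for 13, 14, 15 one at a time.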

Kaleb Pederson
+1  A: 

See this:

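import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Match {

    // Group 1: file type (A or B). Group 2: the 5- or 6-digit load number.
    // The separator before the load number may be '-' or '_'.
    private final Pattern pattern = Pattern.compile("LOAD_filesource(A|B)-[0-9]{8}[_-]([0-9]{5,6})\\.dat");

    private final String[] files = {
        "LOAD_filesourceA-01012008-00001.dat",
        "LOAD_filesourceB-01012008-00001.dat",
        "LOAD_filesourceB-01012008-00003.dat",
        "LOAD_filesourceA-01012008-00004.dat",
        "LOAD_filesourceA-01012008-000055.dat",
        "LOAD_filesourceB-01012008_000055.dat",
        "LOAD_filesourceB-01012008_000058.dat"
    };

    public static void main(String[] args) {
        new Match().run();
    }

    private void run() {
        for (String file : files) {
            Matcher matcher = pattern.matcher(file);

            // group() throws IllegalStateException unless the match succeeded, so check first.
            if (matcher.matches()) {
                System.out.printf("%s %b %s %s%n", file, true, matcher.group(1), matcher.group(2));
            } else {
                System.out.printf("%s %b%n", file, false);
            }
        }
    }
}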

with this output:

LOAD_filesourceA-01012008-00001.dat true A 00001
LOAD_filesourceB-01012008-00001.dat true B 00001
LOAD_filesourceB-01012008-00003.dat true B 00003
LOAD_filesourceA-01012008-00004.dat true A 00004
LOAD_filesourceA-01012008-000055.dat true A 000055
LOAD_filesourceB-01012008_000055.dat true B 000055
LOAD_filesourceB-01012008_000058.dat true B 000058
Toader Mihai Claudiu