I was trying to match files in a directory that have two dots in their name, for example theme.default.properties.
I thought .\\..\\.. should be the right pattern (. matches any character and \. matches a dot), but it matches both oneTwo.txt and theme.default.properties.

I tried the following (resources/themes contains two files, oneTwo.txt and theme.default.properties):

public static void loadThemes()
{
    File themeDirectory = new File("resources/themes");
    if(themeDirectory.exists())
    {
        File[] themeFiles = themeDirectory.listFiles();
        for(File themeFile : themeFiles)
        {
            if(themeFile.getName().matches(".\\..\\.."))
            {
                System.out.println(themeFile.getName());
            }
        }
    }
}

This prints nothing

and the following

File[] themeFiles = themeDirectory.listFiles(new FilenameFilter()
{
    public boolean accept(File dir, String name)
    {
        return name.matches(".\\..\\..");
    }
});

for (File file : themeFiles)
{
    System.out.println(file.getName());
}

prints both

oneTwo.txt
theme.default.properties

I am unable to find why these two give different results and which pattern I should be using to match two dots...

Can someone help?

+5  A: 

This returns true if the filename consists of three runs of word characters separated by two dots:

matches("\\w+\\.\\w+\\.\\w+")

Matches the following:

aaa.bbb.ccc
111.aaa.bbb
aaa.b_b.ccc
a.b.c

Does not match the following:

aaa.bbb
..
.
---.aaa.bbb
aaa.bbb.ccc.ddd
a-a.bbb.ccc
Marcus Adams
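To check the lists above, here is a minimal sketch that runs the same pattern against a few of the sample names. The class and method names are made up for illustration and are not part of the answer.

```java
// Hypothetical demo class; only the regex itself comes from the answer above.
class WordCharTwoDotDemo {
    // Three runs of word characters separated by two literal dots.
    static final String PATTERN = "\\w+\\.\\w+\\.\\w+";

    static boolean matchesWordPattern(String name) {
        // String.matches() tests the whole string, so no ^ or $ is needed.
        return name.matches(PATTERN);
    }

    public static void main(String[] args) {
        String[] samples = {
            "aaa.bbb.ccc", "111.aaa.bbb", "aaa.b_b.ccc", "a.b.c",
            "aaa.bbb", "..", "---.aaa.bbb", "aaa.bbb.ccc.ddd", "a-a.bbb.ccc"
        };
        for (String name : samples) {
            System.out.println(name + " -> " + matchesWordPattern(name));
        }
    }
}
```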
It's always nice to explain the answer - especially in cases like this. +1
Amir Afghani
doesn't match "aaa-aaa.bbb.ccc" either
streetpc
BTW: For a `matches()` test, you do not need the start and end anchors ^ and $ (maybe you need them for more complex patterns with lookahead and such stuff). You would need them for a `find()` test to match the entire string.
Christian Semrau
No, the start and end anchors are implied when you use the `matches()` method. Your regex matches a name with exactly two dots in it whether you add the anchors or not.
Alan Moore
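The difference between `matches()` and `find()` discussed in these comments can be seen with a small sketch. The class and method names are hypothetical; `Pattern`, `Matcher.matches()` and `Matcher.find()` are the standard java.util.regex API.

```java
import java.util.regex.Pattern;

// Hypothetical demo class comparing whole-string matching with searching.
class MatchesVersusFindDemo {
    static final Pattern UNANCHORED = Pattern.compile("\\w+\\.\\w+\\.\\w+");
    static final Pattern ANCHORED = Pattern.compile("^\\w+\\.\\w+\\.\\w+$");

    // matches() succeeds only if the entire input fits the pattern.
    static boolean wholeMatch(String s) {
        return UNANCHORED.matcher(s).matches();
    }

    // find() succeeds if any substring fits the pattern.
    static boolean findUnanchored(String s) {
        return UNANCHORED.matcher(s).find();
    }

    // With explicit anchors, find() is restricted to the whole input again.
    static boolean findAnchored(String s) {
        return ANCHORED.matcher(s).find();
    }

    public static void main(String[] args) {
        String name = "aaa.bbb.ccc.ddd"; // three dots
        System.out.println("matches():         " + wholeMatch(name));
        System.out.println("find() unanchored: " + findUnanchored(name));
        System.out.println("find() anchored:   " + findAnchored(name));
    }
}
```

For a name with three dots, only the unanchored `find()` reports a hit, because it is satisfied by the substring aaa.bbb.ccc.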
Also, you forgot to double up the backslashes for Java's string-literal handling. And the question marks (making the quantifiers reluctant) aren't doing anything useful.
Alan Moore
@Alan, Christian, doh! Thanks.
Marcus Adams
It should be mentioned that `\w` only matches letters of the standard Latin alphabet, digits and the underscore, while filenames allow more characters than that, such as parentheses, braces, commas, characters with diacritics and even "special" Unicode characters. Your regex would fail for any of them.
BalusC
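A short sketch of that limitation, assuming the default (ASCII-only) meaning of `\w` in Java. The class and method names are hypothetical; `Pattern.UNICODE_CHARACTER_CLASS` is a standard flag (Java 7 and later) that widens `\w` to Unicode letters, but it still does not cover punctuation such as parentheses.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; the regex is the one from the answer above.
class WordClassLimitsDemo {
    static final String REGEX = "\\w+\\.\\w+\\.\\w+";
    static final Pattern ASCII_WORD = Pattern.compile(REGEX);
    static final Pattern UNICODE_WORD =
            Pattern.compile(REGEX, Pattern.UNICODE_CHARACTER_CLASS);

    // Default behaviour: \w is [a-zA-Z_0-9].
    static boolean wordOnlyAscii(String name) {
        return ASCII_WORD.matcher(name).matches();
    }

    // With the flag, \w also accepts letters such as e-grave.
    static boolean wordOnlyUnicode(String name) {
        return UNICODE_WORD.matcher(name).matches();
    }

    public static void main(String[] args) {
        String accented = "th\u00e8me.default.properties"; // e-grave in the name
        String parens = "theme(1).default.properties";
        System.out.println(wordOnlyAscii(accented));   // rejected by ASCII \w
        System.out.println(wordOnlyUnicode(accented)); // accepted with the flag
        System.out.println(wordOnlyUnicode(parens));   // parentheses still rejected
    }
}
```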
I would want to match any file names, so this would not (completely) work for me...Anyway thanks!
Nivas
+2  A: 

I cannot reproduce your findings.

After removing the semicolon after the if in your first snippet, both versions print nothing for me. Both versions should print the same filenames, namely those that consist of

a single character, a dot, a single character, a dot, a single character

A test with an additional file named "a.b.c" prints that file.

If you want to match files containing exactly two dots, use the pattern

"[^.]*\\.[^.]*\\.[^.]*"
Christian Semrau
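For comparison, a minimal sketch that applies both the question's pattern and the `[^.]*` pattern above to the file names mentioned in this thread. The class and method names are made up for illustration.

```java
// Hypothetical demo class; both regexes are taken verbatim from the thread.
class ExactlyTwoDotsDemo {
    // Question's pattern: any char, dot, any char, dot, any char (5 chars total).
    static final String SINGLE_CHARS = ".\\..\\..";
    // Answer's pattern: three runs of non-dot characters around two dots.
    static final String TWO_DOTS = "[^.]*\\.[^.]*\\.[^.]*";

    static boolean matchesOriginal(String name) {
        return name.matches(SINGLE_CHARS);
    }

    static boolean hasExactlyTwoDots(String name) {
        return name.matches(TWO_DOTS);
    }

    public static void main(String[] args) {
        String[] names = {"oneTwo.txt", "theme.default.properties", "a.b.c"};
        for (String name : names) {
            System.out.println(name
                    + " original=" + matchesOriginal(name)
                    + " twoDots=" + hasExactlyTwoDots(name));
        }
    }
}
```

Because `[^.]*` also accepts an empty run, this pattern accepts names such as `..` as well; replace `*` with `+` if each part must be non-empty.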
The semicolon was a typo; I have edited the question. Thanks, it worked. While I understand what is happening with the pattern in this answer, I would like to know what is wrong with .\\..\\.. since, at least as per the Javadoc, it looks like this should work. (Note: I am a complete regex novice.) Perhaps one day I will master regular expressions. Thanks!
Nivas
That was already explained. Your pattern matches `any single char - dot - any single char - dot - any single char`, while you actually want `a run of chars that does NOT contain a dot - dot - a run of chars that does NOT contain a dot - dot - a run of chars that does NOT contain a dot`
BalusC
The dot `.` in a RegEx matches any single character (with a configurable exception of newline characters). Appending an asterisk `*` makes it match multiple characters (or none). The character class `[^.]` matches any single character except a dot (the dot has no special meaning within the brackets that define a character class). Mastering RegExes is a nontrivial task, which requires practice. Asking others for their solutions, after looking for a solution of your own, helps greatly in expanding one's understanding of RegExes.
Christian Semrau
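The building blocks described in that comment can be tried one at a time. This is only a sketch with a made-up class name; `Pattern.DOTALL` is the standard flag behind the "configurable exception" for newlines.

```java
import java.util.regex.Pattern;

// Hypothetical demo class exercising ".", ".*" and "[^.]" in isolation.
class RegexBuildingBlocksDemo {
    // "." alone: exactly one character, but by default not a newline.
    static boolean singleAnyChar(String s) {
        return s.matches(".");
    }

    // ".*": any number of characters, including none.
    static boolean anyRun(String s) {
        return s.matches(".*");
    }

    // "[^.]": exactly one character that is not a literal dot.
    static boolean singleNonDot(String s) {
        return s.matches("[^.]");
    }

    // With DOTALL, "." also accepts a newline.
    static boolean singleAnyCharDotAll(String s) {
        return Pattern.compile(".", Pattern.DOTALL).matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(singleAnyChar("x"));          // one character
        System.out.println(singleAnyChar("xy"));         // two characters
        System.out.println(anyRun(""));                  // empty input
        System.out.println(singleNonDot("."));           // a dot
        System.out.println(singleAnyChar("\n"));         // newline, default mode
        System.out.println(singleAnyCharDotAll("\n"));   // newline, DOTALL
    }
}
```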
+1  A: 

Other possibilities with less headache:

Remove everything that is not a dot and count the characters that remain:

public boolean accept(File dir, String name) {
    return name.replaceAll("[^.]", "").length() == 2;
}

or split on any inner dot and count the parts:

public boolean accept(File dir, String name) {
    return name.split("\\.", -1).length - 1 == 2;
}
BalusC
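A minimal sketch that puts both counting approaches side by side, plus a plain loop as a regex-free third variant that is not from the answer. The class and method names are hypothetical.

```java
// Hypothetical demo class; the first two methods mirror the answer's snippets.
class DotCountDemo {
    // Strip every non-dot character; the remaining length is the dot count.
    static int countByRemoval(String name) {
        return name.replaceAll("[^.]", "").length();
    }

    // Split on literal dots; limit -1 keeps trailing empty parts,
    // so n dots always produce n + 1 parts.
    static int countBySplit(String name) {
        return name.split("\\.", -1).length - 1;
    }

    // Assumption: a plain loop as an alternative without any regex.
    static int countByLoop(String name) {
        int dots = 0;
        for (int i = 0; i < name.length(); i++) {
            if (name.charAt(i) == '.') {
                dots++;
            }
        }
        return dots;
    }

    public static void main(String[] args) {
        String[] names = {"oneTwo.txt", "theme.default.properties", "a.b."};
        for (String name : names) {
            System.out.println(name + ": " + countByRemoval(name)
                    + " " + countBySplit(name) + " " + countByLoop(name));
        }
    }
}
```

The `-1` limit matters for names that end in a dot: without it, `split` drops trailing empty strings and the count comes out too low.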