views:

316

answers:

9

What's the most efficient way to remove the extension of a filename in Java, without assuming anything about the filename?

Some examples and expected results:

  • folder > folder
  • hello.txt > hello
  • read.me > read
  • hello.bkp.txt > hello.bkp
  • weird..name > weird.
  • .hidden > .hidden

(or should the last one be just hidden?)

Edit: The original question assumed that the input is a filename (not a file path). Since some answers are talking about file paths, such functions should also work in cases like:

  • rare.folder/hello > rare.folder/hello

This particular case is handled very well by Sylvain M's answer.

+7  A: 

This will take a file path and then return the new name of the file without the extension.

import java.io.File;

public static String removeExtention(String filePath) {
    File f = new File(filePath);
    // if it's a directory, don't remove the extension
    if (f.isDirectory()) return f.getName();
    String name = f.getName();
    // if it is a hidden file
    if (name.startsWith(".")) {
        // if there is no extension, do not remove one...
        if (name.lastIndexOf('.') == name.indexOf('.')) return name;
    }
    // if there is no extension, don't do anything
    if (!name.contains(".")) return name;
    // Otherwise, remove the last 'extension type thing'
    return name.substring(0, name.lastIndexOf('.'));
}

People should note that this was written on my netbook, in the tiny SO editor box. This code is not meant for production. It is only meant to serve as a good first-attempt example of how I would go about removing the extension from a filename.

jjnguy
You'll want to check that `name.lastIndexOf('.') != -1`, else that last `substring` call will throw an exception. As hgpc said not to assume anything about the filenames, this seems like a case that should be handled correctly.
Andrzej Doyle
This also fails on the `weird..name` example.
Andrzej Doyle
I fixed that issue. We all probably saw it at the exact same time. Thanks for pointing that out.
jjnguy
@And in that case, I would assume the extn is `.name` and `weird.` is the name of the file.
jjnguy
@sylvain is correct. lastIndexOf returns -1 if no dot is found, so the substring call will throw an exception.
Ash Burlaczenko
@Ash, that has been fixed.
jjnguy
+1 for being a good starting point (and for including the disclaimer, which IMO is implicit on most answers of this type anyway!).
Andrzej Doyle
The name of the filePath variable is misleading, as this solution might not work if filePath is indeed a path. Example: "rare.folder/hello"
hgpc
@hgpc, it should work because I create a `File` object, and then get the File Name out of that object. So it will only deal with the last part of the path.
jjnguy
Ah, I didn't see that. Clever. But you shouldn't return "name" in some cases, then.
hgpc
@And Thanks. I usually feel it is implicit, but when people start bringing up obscure test cases, I feel it is necessary to post the disclaimer.
jjnguy
@hgpc, thanks.
jjnguy
@hgpc, I return only the name in all cases because you only want the Name of the file/directory without the extension.
jjnguy
+6  A: 

Using Commons IO from Apache: http://commons.apache.org/io/

public static String removeExtension(String filename)

FYI, the source code is here:

http://svn.apache.org/viewvc/commons/proper/io/trunk/src/java/org/apache/commons/io/FilenameUtils.java?view=markup

Arg, I've just tried something...

System.out.println(FilenameUtils.getExtension(".polop")); // polop
System.out.println(FilenameUtils.removeExtension(".polop")); // empty string

So, this solution does not seem very good... Even with Commons IO, you'll have to play with removeExtension(), getExtension(), indexOfExtension()...

Sylvain M
I should have mentioned that I can't include external libraries, but I will look at common's io source code.
hgpc
Oh, ok. In this case, the answer of Justin is more useful...
Sylvain M
+1  A: 

Regexes are "fast" enough for this, but they are less efficient than the simplest method imaginable: scan the string from the end and truncate it at the first dot found (dropping the dot as well). In Java you could use lastIndexOf and substring to keep only the part you are interested in. A leading dot should be treated as a special case: if the last occurrence of "." is at the beginning, the whole string should be returned.
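
A minimal sketch of that backward scan, written as an explicit loop so the special case is visible. The class and method names (BackwardScan, stripExtension) are made up for illustration, and the input is assumed to be a non-null bare filename, not a path. It should behave the same as checking that lastIndexOf('.') is greater than 0 before calling substring.

```java
final class BackwardScan {

    // Hypothetical helper: returns the name without its last extension.
    static String stripExtension(String name) {
        // Walk backwards and stop before index 0, so a leading dot
        // (as in ".hidden") is never treated as an extension separator.
        for (int i = name.length() - 1; i > 0; i--) {
            if (name.charAt(i) == '.') {
                return name.substring(0, i);
            }
        }
        // No dot after the first character: nothing to remove.
        return name;
    }

    public static void main(String[] args) {
        String[] samples = {"folder", "hello.txt", "hello.bkp.txt", "weird..name", ".hidden"};
        for (String s : samples) {
            System.out.println(s + " > " + stripExtension(s));
        }
    }
}
```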

ShinTakezou
A: 

I know a regex to do it, but in Java do I have to write like 10 lines of code to do a simple regex substitution?

With and without killing hidden files:

^(.*)\..*$
^(..*)\..*$
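
For what it's worth, the substitution itself does not need ten lines in Java: String.replaceFirst (or replaceAll) with "$1" as the replacement does it in one call. The sketch below precompiles the two patterns above; the class, constant, and method names are invented for illustration, and the input is assumed to be a bare filename.

```java
import java.util.regex.Pattern;

final class RegexStrip {

    // Cuts at the last dot wherever it is, so ".hidden" becomes "".
    static final Pattern ANY_DOT = Pattern.compile("^(.*)\\..*$");

    // Requires at least one character before the dot, so ".hidden" is kept.
    static final Pattern KEEP_HIDDEN = Pattern.compile("^(..*)\\..*$");

    // Hypothetical helper: replaces the whole match with capture group 1.
    // A name that does not match the pattern is returned unchanged.
    static String strip(Pattern pattern, String name) {
        return pattern.matcher(name).replaceFirst("$1");
    }

    public static void main(String[] args) {
        String[] samples = {"folder", "hello.txt", "weird..name", ".hidden"};
        for (String s : samples) {
            System.out.println(s + " > " + strip(KEEP_HIDDEN, s));
        }
    }
}
```

Without precompiling, the one-liner would be name.replaceFirst("^(..*)\\..*$", "$1").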
LatinSuD
A: 

filename = filename.replaceAll("^(.+)\\.\\w+$", "$1");

William
+1  A: 

It's actually very easy, assuming that you have a valid filename.

In Windows filenames the dot character is only used to designate an extension. So strip off the dot and anything after it.

In unix-like filenames the dot indicates an extension if it's after the last separator ('/') and has at least one character between it and the last separator (and is not the first character, if there are no separators). Find the last dot, see if it satisfies the conditions, and strip it and any trailing characters if it does.

It's important that you validate the filename before you do this, as this algorithm on an invalid filename might do something unexpected and generate a valid filename. So in Windows you may need to check that there isn't a backslash, or a colon, after the dot.

If you don't know what kind of filename you are dealing with, treating them all like Unix will get you most of the way.
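
The Unix-style rule above can be sketched roughly as follows. This is only an illustration under stated assumptions: the names PathAwareStrip and stripExtension are hypothetical, the input is assumed non-null, and a backslash is treated as a separator too (an assumption added here to cover Windows-style paths, even though a backslash is a legal character in Unix filenames).

```java
final class PathAwareStrip {

    // Hypothetical helper: strips the extension from the last path segment only.
    static String stripExtension(String path) {
        // Index of the last separator of either style; -1 for a bare filename.
        int sep = Math.max(path.lastIndexOf('/'), path.lastIndexOf('\\'));
        int dot = path.lastIndexOf('.');
        // The dot marks an extension only if it comes after the last separator
        // and at least one character of the name precedes it.
        if (dot > sep + 1) {
            return path.substring(0, dot);
        }
        return path;
    }

    public static void main(String[] args) {
        String[] samples = {"hello.txt", ".hidden", "rare.folder/hello", "dir/.hidden", "dir/hello.txt"};
        for (String s : samples) {
            System.out.println(s + " > " + stripExtension(s));
        }
    }
}
```

Because the dot is compared against the separator position, "rare.folder/hello" and "dir/.hidden" come back unchanged.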

DJClayworth
A: 

Use new Remover().remove(String),

jdb@Vigor14:/tmp/stackoverflow> javac Remover.java && java Remover
folder > folder
hello.txt > hello
read.me > read
hello.bkp.txt > hello.bkp
weird..name > weird.
.hidden > .hidden

Remover.java,

import java.util.*;

public class Remover {

    public static void main(String [] args){
        Map<String, String> tests = new LinkedHashMap<String, String>();
        tests.put("folder", "folder");
        tests.put("hello.txt", "hello");
        tests.put("read.me", "read");
        tests.put("hello.bkp.txt", "hello.bkp");
        tests.put("weird..name", "weird.");
        tests.put(".hidden", ".hidden");

        Remover r = new Remover();
        for(String in: tests.keySet()){
            String actual = r.remove(in);
            log(in+" > " +actual);
            String expected = tests.get(in);
            if(!expected.equals(actual)){
                throw new RuntimeException();
            }
        }
    }

    private static void log(String s){
        System.out.println(s);
    }

    public String remove(String in){
        if(in == null) {
            return null;
        }
        int p = in.lastIndexOf(".");
        if(p <= 0){
            return in;
        }
        return in.substring(0, p);
    }
}
Janek Bogucki
Fails on this new case: rare.folder/hello > rare.folder/hello
Janek Bogucki
+1  A: 
int p=name.lastIndexOf('.');
if (p>0)
  name=name.substring(0,p);

I said "p>0" instead of "p>=0" because if the first character is a period we presumably do not want to wipe out the entire name, as in your ".hidden" example.

Do you want to actually update the file name on the disk or are you talking about just manipulating it internally?

Jay
Just internally.
hgpc
I'm assuming you only have the filename here, i.e. no path. I see someone else brought up that question. To handle a full path would require another couple of lines of code. Still not a big deal.
Jay
+1  A: 

I'm going to have a stab at this that uses a single lastIndexOf check in order to remove some special-case checking code, and hopefully make the intention more readable. (The two-arg version, lastIndexOf(ch, fromIndex), searches backwards starting at fromIndex, so it cannot be used to skip the first character; comparing the result against 0 does that job instead.) Credit goes to Justin 'jjnguy' Nelson for providing the basis of this method:

public static String removeExtention(String filePath) {
    // These first few lines are the same as Justin's
    File f = new File(filePath);

    // if it's a directory, don't remove the extension
    if (f.isDirectory()) return filePath;

    String name = f.getName();

    // Now we know it's a file - don't need to do any special hidden
    // checking or contains() checking because of:
    final int lastPeriodPos = name.lastIndexOf('.');
    if (lastPeriodPos <= 0)
    {
        // No period after first character - return name as it was passed in
        return filePath;
    }
    else
    {
        // Remove the last period and everything after it
        File renamed = new File(f.getParent(), name.substring(0, lastPeriodPos));
        return renamed.getPath();
    }
}

To me this is clearer than special-casing hidden files and files that don't contain a dot. It also reads closer to what I understand your specification to be; something like "remove the last dot and everything following it, assuming it exists and is not the first character of the filename".

Note that this example takes and returns Strings. Since most of the work is done through File objects, it would be marginally clearer if those were the inputs and outputs as well.
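
A rough sketch of what a File-in, File-out variant might look like. The class name FileStrip is invented for illustration, the input is assumed non-null, and, as in the method above, isDirectory() consults the real file system, so the result can depend on what exists on disk.

```java
import java.io.File;

final class FileStrip {

    // Hypothetical File-based variant: returns the same File when nothing changes.
    static File removeExtension(File file) {
        // Directories are returned untouched.
        if (file.isDirectory()) {
            return file;
        }
        String name = file.getName();
        int dot = name.lastIndexOf('.');
        // No dot, or only a leading dot (hidden file): nothing to remove.
        if (dot <= 0) {
            return file;
        }
        // getParentFile() is null for a bare name; the File(File, String)
        // constructor then behaves like the single-argument constructor.
        return new File(file.getParentFile(), name.substring(0, dot));
    }

    public static void main(String[] args) {
        System.out.println(removeExtension(new File("rare.folder", "hello.txt")).getPath());
        System.out.println(removeExtension(new File(".hidden")).getPath());
    }
}
```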

Andrzej Doyle
A nitpick: if your input is a file path (as the variable name states), this solution fails in some cases. See the revised question.
hgpc
Ah, good point. I've revised the answer so that the input string is returned in all cases where we detect no change is required; and if we do return a modified string this is prefixed with the directories (if any) in the input path.
Andrzej Doyle
This is indeed a better looking version of mine I think.
jjnguy