ansaurus

Question

What's the best way to replace the first letter of a string in Java?

Answer 1

+6 A:

I would suggest taking a look at the Commons Lang library from Apache. It has a class, StringUtils, which lets you do a lot of tasks with Strings. In your case, just use StringUtils.uncapitalize(value).

See the StringUtils documentation for uncapitalize and for the other functionality of the class.

Added: in my experience Commons Lang is quite well optimized, so if you want to know which approach is better from an algorithmic point of view, you could take a look at its source from Apache.

Maxym 2010-03-15 13:37:24

Except that it uses `Character.toLowerCase` (http://svn.apache.org/viewvc/commons/proper/lang/trunk/src/main/java/org/apache/commons/lang3/StringUtils.java?view=markup), which is not a good idea (specifically anti-recommended in the Java docs) because it doesn't handle locales or Unicode correctly. See my comment (jambjo originally pointed this out, but on a deleted answer) on Marcus' answer for details: http://stackoverflow.com/questions/2447427/whats-the-best-way-to-replace-the-first-letter-of-a-string-in-java/2447655#2447655

T.J. Crowder 2010-03-15 14:18:33

Then I would say that choosing between Character.toLowerCase and String#toLowerCase depends on the situation. If you are sure that Character.toLowerCase will not cause problems in your case -> use it, because its performance should be better. It depends ;-)

Maxym 2010-03-15 14:31:33

@Maxym: Fair enough, but I doubt there's going to be sufficient performance gain to make it worth setting up the maintenance issue down-the-line (as today's apps deal with a *lot* of locale and Unicode stuff). :-)

T.J. Crowder 2010-03-15 14:53:00

@Crowder: Totally agree. As I see it, there is enough information on this page for the author of the question (froadie) to decide what to do. In any case, he/she knows the requirements for the task and can discuss the future of the project with customers (if needed, based on the solution) ;-)

Maxym 2010-03-15 15:24:01

Answer 2

+1 A:

The downside of the code you used (and I've used in similar situations) is that it seems a bit clunky and in theory generates at least two temporary strings that are immediately thrown away. There's also the issue of what happens if your string is fewer than two characters long.

The upside is that you don't reference those temporary strings outside the expression (leaving it open to optimization by the bytecode compiler or the JIT optimizer) and your intent is clear to any future code maintainer.

Barring your needing to do several million of these any given second and detecting a noticeable performance issue doing so, I wouldn't worry about performance and would prefer clarity. I'd also bury it off in a utility class somewhere. :-) See also jambjo's response to another answer pointing out that there's an important difference between String#toLowerCase and Character.toLowerCase. (Edit: The answer and therefore comment have been removed. Basically, there's a big difference related to locales and Unicode and the docs recommend using String#toLowerCase, not Character.toLowerCase; more here.)

Edit: Because I'm in a weird mood, I thought I'd see if there was a measurable difference in performance in a simple test. There is. It could be because of the locale difference (i.e., comparing apples and oranges):

public class Uncap
{
    public static final void main(String[] params)
    {
        String  s;
        String  s2;
        long    start;
        long    end;
        int     counter;

        // Warm up
        s = "Testing";
        start = System.currentTimeMillis();
        for (counter = 1000000; counter > 0; --counter)
        {
            s2 = uncap1(s);
            s2 = uncap2(s);
            s2 = uncap3(s);
        }

        // Test v2
        start = System.currentTimeMillis();
        for (counter = 1000000; counter > 0; --counter)
        {
            s2 = uncap2(s);
        }
        end = System.currentTimeMillis();
        System.out.println("2: " + (end - start));

        // Test v1
        start = System.currentTimeMillis();
        for (counter = 1000000; counter > 0; --counter)
        {
            s2 = uncap1(s);
        }
        end = System.currentTimeMillis();
        System.out.println("1: " + (end - start));

        // Test v3
        start = System.currentTimeMillis();
        for (counter = 1000000; counter > 0; --counter)
        {
            s2 = uncap3(s);
        }
        end = System.currentTimeMillis();
        System.out.println("3: " + (end - start));

        System.exit(0);
    }

    // The simple, direct version; also allows the library to handle
    // locales and Unicode correctly
    private static final String uncap1(String s)
    {
        return s.substring(0,1).toLowerCase() + s.substring(1);
    }

    // This will *not* handle locales and unicode correctly
    private static final String uncap2(String s)
    {
        return Character.toLowerCase(s.charAt(0)) + s.substring(1);
    }

    // This will *not* handle locales and unicode correctly
    private static final String uncap3(String s)
    {
        StringBuffer sb;

        sb = new StringBuffer(s);
        sb.setCharAt(0, Character.toLowerCase(sb.charAt(0)));
        return sb.toString();
    }
}

I mixed up the order in various tests (moving them around and recompiling) to avoid issues of ramp-up time (and tried to force some warm-up initially anyway). Very unscientific, but uncap1 was consistently slower than uncap2 and uncap3 by about 40%. Not that it matters: we're talking about a difference of 400ms across a million iterations on an Intel Atom processor. :-)

So: I'd go with your simple, straightforward code, wrapped up in a utility function.

T.J. Crowder 2010-03-15 13:41:13
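
One way to wrap the simple version in a utility function, as suggested above, is sketched here. The class name StringCase, the explicit Locale parameter, the choice of Locale.ROOT as a default, and the null/empty guard are all assumptions made for illustration; none of them come from the original code. The guard covers the short-string case mentioned earlier in this answer.

```java
import java.util.Locale;

// Hypothetical utility class; the name and method signatures are illustrative only.
final class StringCase {
    private StringCase() {}

    // Lowercases the first code point with String#toLowerCase, so the library
    // can apply locale-sensitive and one-to-many mappings.
    static String uncapitalize(String s, Locale locale) {
        if (s == null || s.isEmpty()) {
            return s; // nothing to change; avoids StringIndexOutOfBoundsException
        }
        // End index of the first code point: 2 for a surrogate pair, otherwise 1.
        int end = s.offsetByCodePoints(0, 1);
        return s.substring(0, end).toLowerCase(locale) + s.substring(end);
    }

    // Convenience overload. Locale.ROOT is an assumption, chosen so the result
    // does not depend on the machine's default locale.
    static String uncapitalize(String s) {
        return uncapitalize(s, Locale.ROOT);
    }
}
```

A call such as StringCase.uncapitalize("Testing") then replaces the inline expression wherever it is needed, and callers that care about a specific language can pass its Locale explicitly.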

This sort of microbenchmark is not appropriate for the JVM in many cases. See http://www.ibm.com/developerworks/java/library/j-benchmark1.html

deamon 2010-03-15 14:06:43

@deamon: I wouldn't be at all surprised. :-) (And thanks for the link, have to give it a read sometime.) I did say "unscientific" and that he really shouldn't worry about it regardless until or unless there were a specific performance problem to fix.

T.J. Crowder 2010-03-15 14:09:57
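
For anyone who wants to repeat the measurement, a slightly more defensive hand-rolled timing loop might look like the sketch below. It uses System.nanoTime, repeats each measurement over several rounds so that early rounds act as warm-up, and feeds every result into a sink so the JIT cannot discard the calls as dead code. The names UncapTiming, timeNanos, viaString, viaChar and sink are hypothetical. This is still not a rigorous benchmark; a dedicated harness such as JMH is the usual tool for that.

```java
import java.util.function.UnaryOperator;

// Hypothetical timing sketch; not a substitute for a real benchmark harness.
final class UncapTiming {
    // Accumulates a value from every call so the work cannot be optimized away.
    static long sink;

    // Runs fn on input the given number of times and returns elapsed nanoseconds.
    static long timeNanos(UnaryOperator<String> fn, String input, int iterations) {
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            sink += fn.apply(input).charAt(0);
        }
        return System.nanoTime() - start;
    }

    // Same expression as uncap1 in the answer above.
    static String viaString(String s) {
        return s.substring(0, 1).toLowerCase() + s.substring(1);
    }

    // Same expression as uncap2 in the answer above.
    static String viaChar(String s) {
        return Character.toLowerCase(s.charAt(0)) + s.substring(1);
    }

    public static void main(String[] args) {
        String s = "Testing";
        // Several rounds; treat the first ones as warm-up and compare the later ones.
        for (int round = 1; round <= 5; round++) {
            long a = timeNanos(UncapTiming::viaString, s, 1_000_000);
            long b = timeNanos(UncapTiming::viaChar, s, 1_000_000);
            System.out.println("round " + round
                    + ": String=" + (a / 1_000_000) + "ms"
                    + ", char=" + (b / 1_000_000) + "ms");
        }
        System.out.println("sink=" + sink);
    }
}
```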

Nice job! Just take a look at how the uncapitalize method will be implemented in Commons Lang 3.0 (it is probably already implemented the same way in the current version, 2.3): new StringBuilder(strLen).append(Character.toLowerCase(str.charAt(0))).append(str.substring(1)).toString();

Maxym 2010-03-15 14:10:56

Just tried your classes and added an uncap4 method with Apache's approach. Results from my console: 2: 328, 1: 469, 3: 563, 4: 265. Apache is the winner :)

Maxym 2010-03-15 14:11:51

Or it isn't (reading @deamon's comment). By the way, strLen is the length of the string str (just to make it clear).

Maxym 2010-03-15 14:13:33
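
The uncap4 method itself was not posted. A reconstruction based only on the expression quoted in the comment above could look like this; the class name Uncap4 and the null/empty guard are assumptions. Like uncap2 and uncap3, it relies on Character.toLowerCase and so does not handle locales or one-to-many mappings.

```java
// Hypothetical reconstruction of the uncap4 variant described in the comments.
final class Uncap4 {
    static String uncap4(String str) {
        // Guard added for this sketch; without it charAt(0) throws on "".
        if (str == null || str.isEmpty()) {
            return str;
        }
        int strLen = str.length();
        // Pre-size the builder to the final length, then append the
        // lowercased first char followed by the rest of the string.
        return new StringBuilder(strLen)
                .append(Character.toLowerCase(str.charAt(0)))
                .append(str.substring(1))
                .toString();
    }
}
```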

@Maxym: Aside from the fact that `Character.toLowerCase` isn't a good way to do this, the measurement mechanism is suspect, and the improvement too small to care about. **;-)**

T.J. Crowder 2010-03-15 14:13:53

Answer 3

+1 A:

Watch out for any of the character functions when working with strings. Because of Unicode, case conversion is not always a one-to-one mapping. Stick to String-based methods unless a char is really what you want. As others have suggested, there are string utility libraries out there, but even if you don't want to use them for your project, just build one yourself as you work. The worst thing you can do is to make a special function for lowercasing, hide it in a class, and then use the same code slightly differently in 12 different places. Put it somewhere it can easily be shared.

Russell Leggett 2010-03-15 13:42:04
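
To make the point about one-to-one mappings concrete, here is a small sketch. It uses uppercasing of the German sharp s, because that is the best-known case where one character becomes two; the thread's lowercasing question is subject to the same kind of limitation. The class and method names are hypothetical.

```java
import java.util.Locale;

// Hypothetical demo: char-based case mapping cannot change a string's length.
final class OneToManyDemo {
    // Maps each char on its own; every char must map to exactly one char.
    static String upperViaChars(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            sb.append(Character.toUpperCase(s.charAt(i)));
        }
        return sb.toString();
    }

    // Lets the String method apply full case mappings, including 1:M ones.
    static String upperViaString(String s) {
        return s.toUpperCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        String word = "stra\u00DFe"; // "strasse" written with a sharp s (U+00DF)
        // The char-based result keeps the sharp s and the original length.
        System.out.println(upperViaChars(word).length());  // 6
        // The String-based result expands the sharp s to "SS".
        System.out.println(upperViaString(word));          // STRASSE
    }
}
```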

Answer 4

A:

Use StringBuffer:

buffer.setCharAt(0, Character.toLowerCase(buffer.charAt(0)));

Marcus 2010-03-15 14:01:46

You want to avoid the character versions of these things: *"In general, String.toLowerCase() should be used to map characters to lowercase. String case mapping methods have several benefits over Character case mapping methods. String case mapping methods can perform locale-sensitive mappings, context-sensitive mappings, and 1:M character mappings, whereas the Character case mapping methods cannot."* - http://java.sun.com/javase/6/docs/api/java/lang/Character.html#toLowerCase%28char%29 Also, explicitly using a `StringBuffer` for one `String` concat won't actually help perf (and can hinder).

T.J. Crowder 2010-03-15 14:04:03
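
The locale-sensitive part of that quote can be seen with the Turkish dotless i. The sketch below is only an illustration; the class name, method names and sample word are made up for it. Character.toLowerCase has no locale parameter, while String#toLowerCase(Locale) applies the Turkish rule that maps capital I to dotless i (U+0131).

```java
import java.util.Locale;

// Hypothetical demo of locale-sensitive lowercasing of the first letter.
final class TurkishIDemo {
    static final Locale TURKISH = Locale.forLanguageTag("tr");

    // Character-based: always maps 'I' to 'i', whatever the language.
    static String firstLowerViaChar(String s) {
        return Character.toLowerCase(s.charAt(0)) + s.substring(1);
    }

    // String-based: the mapping of the first letter depends on the locale.
    static String firstLowerViaString(String s, Locale locale) {
        return s.substring(0, 1).toLowerCase(locale) + s.substring(1);
    }

    public static void main(String[] args) {
        String word = "Irmak";
        // true: the char-based version produces a dotted 'i'
        System.out.println(firstLowerViaChar(word).equals("irmak"));
        // true: with the Turkish locale the first letter becomes U+0131
        System.out.println(firstLowerViaString(word, TURKISH).equals("\u0131rmak"));
    }
}
```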
