ansaurus

Question

How do I convert CamelCase into human-readable names in Java?

Answer 1

+2 A:

The following regex can be used to identify the capitals inside words:

"((?<=[a-z0-9])[A-Z]|(?<=[a-zA-Z])[0-9]|(?<=[A-Z])[A-Z](?=[a-z]))"

It matches every capital letter that either follows a lowercase letter or digit, or follows another capital and is itself followed by a lowercase letter. It also matches every digit that follows a letter.

How to insert a space before them is beyond my Java skills =)

Edited to include the digit case and the PDF Loader case.
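
One possible way to insert the space, sketched here as an assumption rather than as part of the original answer: hand the expression to `replaceAll` and refer to the matched character as `$1` in the replacement. The class and method names are illustrative only, and the digit class is written with a single closing `]`.

```java
import java.util.regex.Pattern;

// Illustrative sketch; CapitalSpacer and insertSpaces are hypothetical names.
class CapitalSpacer {
    // Same three alternatives as the expression in the answer.
    private static final Pattern BOUNDARY = Pattern.compile(
        "((?<=[a-z0-9])[A-Z]|(?<=[a-zA-Z])[0-9]|(?<=[A-Z])[A-Z](?=[a-z]))");

    static String insertSpaces(String s) {
        // "$1" is the matched capital or digit; a space is placed in front of it.
        return BOUNDARY.matcher(s).replaceAll(" $1");
    }

    public static void main(String[] args) {
        System.out.println(insertSpaces("PDFLoader"));   // PDF Loader
        System.out.println(insertSpaces("GL11Version")); // GL 11 Version
    }
}
```

Because the lookbehinds inspect the original input rather than the text already replaced, the second `1` in `GL11Version` is left alone.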

Jens 2010-04-01 10:47:58

what about digits?

Yaneeve 2010-04-01 10:48:37

@Yaneeve: I just saw the digits... this might make things more complicated. Probably another Regex to catch those would be the easy way.

Jens 2010-04-01 10:50:54

@Jens: Will it match the `L` in `PDFLoader`?

Jørn Schou-Rode 2010-04-01 10:52:21

How about `(?<=[a-z0-9])[A-Z0-9]`?

Yaneeve 2010-04-01 10:52:55

@Jørn: Good point! Need to think about that. =) .... ok, edited something in to catch those.

Jens 2010-04-01 10:54:18

@Yaneeve: That will unfortunately match the second 1 in 11.

Jens 2010-04-01 10:59:43

Now, I vastly admire your Regex skill, but I'd hate to have to maintain that.

Chris Knight 2010-04-01 11:07:39

@Chris: Yep, that's true. Regex is more of a write-only language. =) Although this particular expression is not very hard to read, if you read `|` as "or". Well... maybe it is... I've seen worse =/

Jens 2010-04-01 11:18:36

Answer 2

A:

RegEx should work, something like ([A-Z]{1}). This will capture all capital letters; after that you could replace each one with a space followed by \1, or however you refer to RegEx groups in Java.
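
For reference, a minimal sketch of this idea in Java, where the first group is written `$1` in the replacement string. The class and method names are made up for illustration, and the `trim()` call is an added assumption to drop the space that lands before a leading capital.

```java
// Illustrative sketch; NaiveCapitalSplit and split are hypothetical names.
class NaiveCapitalSplit {
    static String split(String s) {
        // Put a space before every capital letter, then trim the leading one.
        return s.replaceAll("([A-Z])", " $1").trim();
    }

    public static void main(String[] args) {
        System.out.println(split("MyClass"));   // My Class
        System.out.println(split("PDFLoader")); // P D F Loader
    }
}
```

As the second call shows, this simple form breaks acronyms into single letters and does nothing for digits.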

Bobby 2010-04-01 10:54:42

`{1}` is redundant.

Jonathan Feinberg 2010-04-01 11:34:09

Answer 3

+1 A:

I think you will have to iterate over the string and detect changes from lowercase to uppercase, uppercase to lowercase, alphabetic to numeric, and numeric to alphabetic. On every change you detect, insert a space, with one exception: on a change from uppercase to lowercase, insert the space one character earlier.
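
A rough sketch of that loop, under one interpretation: the uppercase-to-lowercase rule is applied only when the capital is itself preceded by another capital, since otherwise `MyClass` would get a leading space. That restriction and all names below are assumptions, not part of the answer.

```java
// Illustrative sketch; TransitionSplitter and its methods are hypothetical names.
class TransitionSplitter {
    static String split(String s) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            if (i > 0 && isBoundary(s, i)) {
                out.append(' ');
            }
            out.append(s.charAt(i));
        }
        return out.toString();
    }

    // True when a space belongs immediately before s.charAt(i).
    private static boolean isBoundary(String s, int i) {
        char prev = s.charAt(i - 1);
        char cur = s.charAt(i);
        // lowercase -> uppercase, e.g. My|Class
        if (Character.isLowerCase(prev) && Character.isUpperCase(cur)) {
            return true;
        }
        // letter -> digit or digit -> letter, e.g. GL|11|Version
        if ((Character.isLetter(prev) && Character.isDigit(cur))
                || (Character.isDigit(prev) && Character.isLetter(cur))) {
            return true;
        }
        // uppercase -> lowercase, with the space moved one character back, e.g. PDF|Loader
        return Character.isUpperCase(prev)
                && Character.isUpperCase(cur)
                && i + 1 < s.length()
                && Character.isLowerCase(s.charAt(i + 1));
    }

    public static void main(String[] args) {
        System.out.println(split("SimpleXMLParser")); // Simple XML Parser
        System.out.println(split("GL11Version"));     // GL 11 Version
    }
}
```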

Felix 2010-04-01 11:06:05

Answer 4

+18 A:

This works with your test cases:

static String splitCamelCase(String s) {
   return s.replaceAll(
      String.format("%s|%s|%s",
         "(?<=[A-Z])(?=[A-Z][a-z])",
         "(?<=[^A-Z])(?=[A-Z])",
         "(?<=[A-Za-z])(?=[^A-Za-z])"
      ),
      " "
   );
}

Here's a test harness:

    String[] tests = {
        "lowercase",        // [lowercase]
        "Class",            // [Class]
        "MyClass",          // [My Class]
        "HTML",             // [HTML]
        "PDFLoader",        // [PDF Loader]
        "AString",          // [A String]
        "SimpleXMLParser",  // [Simple XML Parser]
        "GL11Version",      // [GL 11 Version]
        "99Bottles",        // [99 Bottles]
        "May5",             // [May 5]
        "BFG9000",          // [BFG 9000]
    };
    for (String test : tests) {
        System.out.println("[" + splitCamelCase(test) + "]");
    }

It uses zero-length matching regex with lookbehind and lookahead to find where to insert spaces. Basically there are 3 patterns, and I use String.format to put them together to make it more readable.

The three patterns are:

UC behind me, UC followed by LC in front of me

  XMLParser   AString    PDFLoader
    /\        /\           /\

non-UC behind me, UC in front of me

 MyClass   99Bottles
  /\        /\

Letter behind me, non-letter in front of me

 GL11    May5    BFG9000
  /\       /\      /\
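
Since all three patterns match positions rather than characters, the same expression can presumably also be used to split a name into its words. The sketch below assumes that use; the class name, method name, and the choice to precompile the pattern are illustrative and not part of the answer.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

// Illustrative sketch; CamelCaseWords and words are hypothetical names.
class CamelCaseWords {
    // The three zero-length patterns from the answer, joined with "|".
    private static final Pattern BOUNDARY = Pattern.compile(
        "(?<=[A-Z])(?=[A-Z][a-z])"
        + "|(?<=[^A-Z])(?=[A-Z])"
        + "|(?<=[A-Za-z])(?=[^A-Za-z])");

    static String[] words(String s) {
        // Each match is an empty position, so no characters are consumed by the split.
        return BOUNDARY.split(s);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(words("SimpleXMLParser"))); // [Simple, XML, Parser]
        System.out.println(Arrays.toString(words("GL11Version")));     // [GL, 11, Version]
    }
}
```

Every alternative needs a character on both sides, so nothing matches at the very start or end of the input and no empty words are produced.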

References

regular-expressions.info/Lookarounds

Related questions

Using zero-length matching lookarounds to split:

polygenelubricants 2010-04-01 11:35:13

I like your concern for readability

Yaneeve 2010-04-01 11:46:36

C'est chic. Oh la la.

Jonathan Feinberg 2010-04-01 12:02:29

Awesome. The trick of using look-behind regexes makes this a very elegant solution. Thank you!

Frederik 2010-04-01 12:42:57

Answer 5

A:

http://code.google.com/p/inflection-js/

You could chain the String.underscore().humanize() methods to take a CamelCase string and convert it into a human-readable string.

atomicguava 2010-05-03 14:43:38

inflection-js is in JavaScript. I'm looking for a Java solution.

Frederik 2010-05-04 10:06:26

Sorry about that.

atomicguava 2010-05-05 16:12:20

Answer 6

A:

I'm not a regex ninja, so I'd iterate over the string, keeping the indexes of the current position being checked & the previous position. If the current position is a capital letter, I'd insert a space after the previous position and increment each index.
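
Read literally, that loop might look like the sketch below. The names are made up for illustration, and the loop starts at index 1 on the assumption that a leading capital should not get a space in front of it.

```java
// Illustrative sketch; CapitalWalker and split are hypothetical names.
class CapitalWalker {
    static String split(String s) {
        StringBuilder out = new StringBuilder(s);
        // i is the current position; i - 1 is the previous one.
        for (int i = 1; i < out.length(); i++) {
            if (Character.isUpperCase(out.charAt(i))) {
                out.insert(i, ' '); // the space goes right after the previous character
                i++;                // step over the capital, which has shifted right by one
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(split("MyClass"));   // My Class
        System.out.println(split("PDFLoader")); // P D F Loader
    }
}
```

Like any rule that looks only at capitals, this splits acronyms letter by letter and ignores digits.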

Joel 2010-06-04 00:20:26

Answer 7

A:

Sorry, my solution is not correct. Deleted.

Peter Mucsi 2010-07-21 10:08:34
