I have a string like this "HelloWorldMyNameIsCarl" and I want it to become something like "Hello_World_My_Name_Is_Carl". How can I do this?

+2  A: 

Here's a hint to get you thinking along a possible solution:

  1. Find a way of splitting the string into parts at each capital letter
  2. Join the split strings back up with underscores between them

Useful keywords:

  • split
  • regular expression/regex
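
The two steps above could be sketched roughly as follows. This is only one possible reading of the hint, not necessarily what the answerer had in mind. It assumes Java 8 or later (for String.join), and the class and method names are made up for illustration. The pattern is zero-width (a lookahead), so splitting on it does not consume the capital letters.

```java
import java.util.regex.Pattern;

class SplitJoinSketch {
    // Zero-width position: not at the start of the input,
    // and immediately followed by an uppercase letter.
    private static final Pattern BEFORE_UPPER =
            Pattern.compile("(?<!^)(?=\\p{Lu})");

    // Hypothetical helper name, chosen for this example only.
    static String toUnderscores(String input) {
        // Step 1: split before each capital; nothing is removed
        // because the pattern matches a position, not characters.
        String[] parts = BEFORE_UPPER.split(input);
        // Step 2: join the parts back up with underscores.
        return String.join("_", parts);
    }

    public static void main(String[] args) {
        System.out.println(toUnderscores("HelloWorldMyNameIsCarl"));
        // prints Hello_World_My_Name_Is_Carl
    }
}
```

The `(?<!^)` part keeps the split from happening before the very first letter, so the result has no leading empty piece on any Java version.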
Rob
This is living proof of how great this quote really is: http://stackoverflow.com/questions/58640/great-programming-quotes/58646#58646 :-)
ChssPly76
What do you have against regular expressions, ChssPly76?
Stefan Kendall
String.split removes the matched (=splitting) characters and is therefore not really helpful here.
Henning
Some people, when confronted with a problem, think, "I know, I'll steal a quote from jwz". Now they have two problems. More seriously though, I don't care how the OP does the splitting; the point I was trying to make was to encourage him/her to break down the problem a little and find solutions to smaller problems. You can farm out the split-at-uppercase to a pack of trained monkeys if you so wish.
Rob
@Rob - I didn't steal anything, I've ...umm... quoted the quote :-) And I totally agree with your point; I just don't think regular expressions are the best way to solve this particular problem. That said, OP did tag this as "regex" (which I didn't notice when I provided my answer) so what I think is irrelevant :-)
ChssPly76
@ChssPly76: If the OP wrongly presupposes using a RegEx to solve the problem and you point out the pitfalls, then what you think is completely relevant.
Software Monkey
+3  A: 

Is this homework? To get you started:

  1. Create a StringBuffer.
  2. Iterate over your string.
  3. Check whether each character is uppercase (the java.lang.Character class will help).
  4. If so, and it is not the first character, append an underscore to the buffer.
  5. Append the current character to the buffer.
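
For reference, the steps above might look something like this. It is a sketch, not a definitive implementation: it swaps in StringBuilder for StringBuffer (same append API, no synchronization), the class and method names are invented for the example, and it works char by char, so it does not handle letters outside the Basic Multilingual Plane.

```java
class UppercaseLoopSketch {
    // Hypothetical helper name, chosen for this example only.
    static String toUnderscores(String input) {
        StringBuilder out = new StringBuilder(input.length() + 8);
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            // Skip index 0 so the result never starts with an underscore.
            if (i > 0 && Character.isUpperCase(c)) {
                out.append('_');
            }
            out.append(c);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(toUnderscores("HelloWorldMyNameIsCarl"));
        // prints Hello_World_My_Name_Is_Carl
    }
}
```

Character.isUpperCase follows the Unicode definition of uppercase, so accented capitals are covered without extra work.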
ChssPly76
Splitting or regex seem much more natural.
Stefan Kendall
He even asked for regex.
Stefan Kendall
+1 - This seems like a perfectly workable solution to me (particularly if this is homework).
Matthew Murdoch
Regex is not mentioned anywhere in the question, though it is indeed tagged thus (which I honestly hadn't noticed right up until your comment). That said, doing something like this with regex is completely ridiculous.
ChssPly76
No, it's not. Your method would require excessive documentation and preconditions to understand or test fully. I could read the regex and understand its intent immediately, and the comments would be simple.
Stefan Kendall
Regular expressions aid in pattern matching. We seem to be matching a pattern here. I find the application straightforward.
Stefan Kendall
It always makes me feel creepy when people won't just iterate over a damn list. This is the easy, natural obvious answer. Regex would be (significantly) slower, more prone to bugs and, in this case, probably less elegant/longer. Not that "Elegant" is a good measure of code except in those rare cases where it happens to coincide with "Readable"
Bill K
@Bill: Do you not use regex often? I use it all the time in personal text editors, validation, and the like, and it tends to save my life more often than not. I can only think that the regex naysayers just don't understand regex well enough to recommend it as a solution. Furthermore, speed won't be a consideration in a desktop application. I've spawned perl processes to run regex and parse text **in real time with the user's keystroke** and the performance hit was not noticeable. Matching was perceivably instantaneous, even when spawning a separate process and runtime to perform the match.
Stefan Kendall
I also use regular expressions all the time, but I wouldn't use them here, because a) it's homework anyway and b) you will probably miss some obscure international uppercase letters (yes, there is an uppercase i with the dot!). @Stefan: Flaming does not help. Neither does screaming in bold.
Henning
Bill mentioned a performance loss; I explained why he was **wrong** for most purposes. There's no "flaming" here. And even if it was homework, it was a) tagged with regex and b) internationalization and its applications in regular expressions were not the OP's question or likely use case.
Stefan Kendall
+14  A: 

Yes, regular expressions can do that for you:

"HelloWorldMyNameIsCarl".replaceAll("(.)([A-Z])", "$1_$2")

The expression [A-Z] will match every upper case letter and put it into the second group. The first group, (.), requires a character before the capital, which keeps the leading 'H' from getting an underscore.
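
To see what the first group buys you, here is a small comparison. The class and method names are invented for this example. The third variant, which uses a lookbehind, is not from this thread; it is one possible alternative for inputs with consecutive capitals, where the two-group pattern consumes the preceding capital and skips the next one.

```java
class GroupRoleSketch {
    // No guard: every capital, including the first, gets an underscore.
    static String withoutGuard(String s) {
        return s.replaceAll("([A-Z])", "_$1");
    }

    // The pattern from the answer: a capital only matches
    // when some character precedes it.
    static String withGuard(String s) {
        return s.replaceAll("(.)([A-Z])", "$1_$2");
    }

    // Assumed alternative: the lookbehind checks for a preceding
    // character without consuming it.
    static String withLookbehind(String s) {
        return s.replaceAll("(?<=.)([A-Z])", "_$1");
    }

    public static void main(String[] args) {
        String s = "HelloWorldMyNameIsCarl";
        System.out.println(withoutGuard(s)); // prints _Hello_World_My_Name_Is_Carl
        System.out.println(withGuard(s));    // prints Hello_World_My_Name_Is_Carl
        System.out.println(withGuard("ABC"));      // prints A_BC
        System.out.println(withLookbehind("ABC")); // prints A_B_C
    }
}
```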

As Piligrim pointed out, this solution does not work for arbitrary languages. To catch any uppercase letter defined by the Unicode standard, we need the Unicode 4.1 subproperty \p{Lu}, which matches all uppercase letters. So the more general solution looks like

"HelloWorldMyNameIsCarl".replaceAll("(.)(\\p{Lu})", "$1_$2")

Thanks Piligrim.
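
A quick way to compare the two patterns on non-English input is sketched below. The sample string and the class and method names are made up for illustration, and the accented letters are written as Unicode escapes so the source does not depend on file encoding.

```java
class UnicodeUpperSketch {
    // ASCII-only pattern from the first version of the answer.
    static String asciiOnly(String s) {
        return s.replaceAll("(.)([A-Z])", "$1_$2");
    }

    // Unicode-aware pattern using the \p{Lu} category.
    static String unicodeAware(String s) {
        return s.replaceAll("(.)(\\p{Lu})", "$1_$2");
    }

    public static void main(String[] args) {
        // "Hallo" + A-umlaut + "pfel" + O-umlaut + "l"
        String s = "Hallo\u00C4pfel\u00D6l";
        // [A-Z] does not cover the umlauts, so nothing is inserted.
        System.out.println(asciiOnly(s).equals(s)); // prints true
        // \p{Lu} matches them, so both get an underscore in front.
        System.out.println(unicodeAware(s));
    }
}
```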

Peter Kofler
Beat me to it. I couldn't remember if Java regex replacements were \1 or $1, so I had to look it up.
Stefan Kendall
Note that this solution (unlike the one given by ChssPly76) does not work for all uppercase letters, but only for the English ones: Ä Ö Ü Â Ê Î Ô Û and so on are all missing.
Henning
I'd argue that his example string implies English.
Stefan Kendall
Furthermore, this can be adapted with the correct regex engine to handle unicode uppercase characters. The concept is still correct.
Stefan Kendall
According to the Java spec, \p{Lu} will work for Unicode upper case letters. So the regex is "(.)(\\p{Lu})"
tulskiy
Thanks, that works great.
w4nderlust