Dear Stackoverflow,

I have a nice CamelCase string such as ImageWideNice or ImageNarrowUgly. Now I want to break that string into its substrings, such as Image, Wide or Narrow, and Nice or Ugly.

I thought this could be solved simply by

camelCaseString =~ /(Image)((Wide)|(Narrow))((Nice)|(Ugly))/

But strangely, this fills only $1 and $2, not $3.

Do you have a better idea for splitting that string?

+7  A: 
s = 'nowIsTheTime'

s.split /(?=[A-Z])/

=> ["now", "Is", "The", "Time"]
DigitalRoss
Have you tried `NowIsTheTime`?
splash
@splash: it still works fine
ryeguy
During my tests this regex results in `["", "Now", "Is", "The", "Time"]` if the first letter is uppercase. What am I doing wrong?
splash
splash: I tried it in 1.8.7 and 1.9 and it worked on NowIsTheTime. With what language/version did you get the zero-length first element?
DigitalRoss
Sorry @DigitalRoss, I forgot to mention that I tested with RegexBuddy. But I wonder why this is valid in Ruby? Java also gives an empty String for the first array element: `"NowIsTheTime".split("(?=[A-Z])")`
splash
You could always just add `reject {|x| x.empty? }`
Ollie Saunders
No, not necessary, it works fine in Ruby. It was with Java and with RegexBuddy that splash had troubles.
DigitalRoss
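For reference, the lookahead split and the `reject` suggestion from the comments can be combined in a small Ruby sketch. The helper name `split_camel` is hypothetical and not from the thread. In Ruby the `reject` step should be a no-op, as noted above; it is kept only as a guard for inputs or engines that yield a leading empty string.

```ruby
# Hypothetical helper (name not from the thread): split a CamelCase string
# at every position that is followed by an uppercase ASCII letter.
def split_camel(str)
  # (?=[A-Z]) is a zero-width lookahead, so no characters are consumed.
  # reject drops any empty strings, guarding against a leading "" element.
  str.split(/(?=[A-Z])/).reject { |part| part.empty? }
end

p split_camel('nowIsTheTime')  # => ["now", "Is", "The", "Time"]
p split_camel('NowIsTheTime')  # => ["Now", "Is", "The", "Time"]
```

Note that this only looks at ASCII uppercase letters, so a run of capitals such as `HTTPServer` would be split letter by letter.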
+2  A: 

Have you tried

camelCaseString =~ /(Image)(Wide|Narrow)(Nice|Ugly)/

?

pjmorse
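A short Ruby sketch of why dropping the inner parentheses helps: capture groups are numbered by the position of their opening parenthesis, so in the nested pattern from the question `$3` is the inner `(Wide)` group and stays `nil` when the string contains `Narrow`. The variable names below are illustrative only.

```ruby
# Pattern from the question: seven groups, counted by opening parenthesis.
nested = /(Image)((Wide)|(Narrow))((Nice)|(Ugly))/
# Simplified pattern from this answer: exactly three groups.
flat   = /(Image)(Wide|Narrow)(Nice|Ugly)/

# captures returns one entry per group; groups that did not take part are nil.
nested_groups = 'ImageNarrowUgly'.match(nested).captures
flat_groups   = 'ImageNarrowUgly'.match(flat).captures

p nested_groups  # => ["Image", "Narrow", nil, "Narrow", "Ugly", nil, "Ugly"]
p flat_groups    # => ["Image", "Narrow", "Ugly"]
```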
+2  A: 

Even though this is a Ruby regex question, and the answer by DigitalRoss is correct and shines by its simplicity, I want to add a Java answer:

// this regex doesn't work perfectly in Java and some other regex engines
"NowIsTheTime".split("(?=[A-Z])"); // ["", "Now", "Is", "The", "Time"]

// this regex works whether the first character is uppercase or lowercase
"NowIsTheTime".split("(?!(^|[a-z]|$))"); // ["Now", "Is", "The", "Time"]
"nowIsTheTime".split("(?!(^|[a-z]|$))"); // ["now", "Is", "The", "Time"]
splash