ansaurus

Question

Regular expression to identify CamelCased words

Answer 1

+10 A:

([A-Z][a-z0-9]+)+

Assuming English. Use appropriate character classes if you want it to be internationalizable. Note that this also matches single-capital words such as "This". If you want to match only words with at least two capitals, just use

([A-Z][a-z0-9]+){2,}
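To see how the two patterns above differ, here is a minimal Ruby sketch. The constant names and the sample words are made up for illustration, and the `\A`/`\z` anchors are an assumption added so that the whole string has to match (the patterns as posted are unanchored).

```ruby
# Hypothetical names; the anchors \A and \z are not part of the original patterns.
ANY_HUMPS = /\A([A-Z][a-z0-9]+)+\z/     # one or more capitalized chunks
TWO_HUMPS = /\A([A-Z][a-z0-9]+){2,}\z/  # at least two capitalized chunks

%w[This FooBar fooBar HTTPConnection].each do |word|
  # Print whether each pattern accepts the whole word.
  puts format('%-15s any=%-5s two=%s',
              word, ANY_HUMPS.match?(word), TWO_HUMPS.match?(word))
end
```

Both patterns require at least one lowercase letter or digit after every capital, so a word with consecutive capitals such as "HTTPConnection" is rejected by both.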

UPDATE: As I mentioned in a comment, a better version is:

[A-Z]([A-Z0-9]*[a-z][a-z0-9]*[A-Z]|[a-z0-9]*[A-Z][A-Z0-9]*[a-z])[A-Za-z0-9]*

It matches strings that start with an uppercase letter, contain only letters and numbers, and contain at least one lowercase letter and at least one other uppercase letter.
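A quick way to check that description is to run the updated pattern over a few words. This is only a sketch: the constant name, the helper method and the word list are hypothetical, and the `\A`/`\z` anchors are an assumption added so that partial matches inside a longer string don't count.

```ruby
# Updated pattern from the answer, wrapped in \A ... \z (the anchors are an assumption).
CAMEL = /\A[A-Z]([A-Z0-9]*[a-z][a-z0-9]*[A-Z]|[a-z0-9]*[A-Z][A-Z0-9]*[a-z])[A-Za-z0-9]*\z/

# Hypothetical helper: true if the whole word is CamelCase by this definition.
def camel_case?(word)
  CAMEL.match?(word)
end

%w[FooBar IFoo HTTPConnection FooB This HTTP fooBar Foo_Bar].each do |word|
  puts format('%-15s %s', word, camel_case?(word))
end
```

With these inputs the first four words are accepted. "This" has no second capital, "HTTP" has no lowercase letter, "fooBar" starts with a lowercase letter, and "Foo_Bar" contains a character that is neither a letter nor a digit, so those four are rejected.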

Adam Crume 2009-07-14 22:04:34

What about words with a subsequence of uppercase characters or ending with an uppercase character?

ephemient 2009-07-15 02:04:53

If you want to match only words with more than one uppercase character, it'd be something like this: ([A-Z][a-z0-9]*){2,}

Adam Crume 2009-07-15 12:43:17

Right, but that matches all-uppercase words too, which (IMO) shouldn't be considered CamelCase.

ephemient 2009-07-15 14:27:09

Okay, then: [A-Z]([A-Z0-9]*[a-z][a-z0-9]*[A-Z]|[a-z0-9]*[A-Z][A-Z0-9]*[a-z])[A-Za-z0-9]* It matches strings that start with an uppercase letter, contain only letters and numbers, and contain at least one lowercase letter and at least one other uppercase letter.

Adam Crume 2009-07-15 17:50:31

Answer 2

A:

([A-Z][a-z\d]+)+

Should do the trick for upper camel case. You can add leading underscores to it as well if you still want to consider something like _IsRunning upper camel case.
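One possible reading of "add leading underscores" is to prefix the pattern with `_*`. That prefix, the anchors and the constant name below are assumptions for illustration, not something the answer spells out.

```ruby
# Upper camel case with optional leading underscores (the `_*` prefix is an assumption).
UPPER_CAMEL = /\A_*([A-Z][a-z\d]+)+\z/

%w[IsRunning _IsRunning __IsRunning _isRunning Is_Running].each do |name|
  puts format('%-12s %s', name, UPPER_CAMEL.match?(name))
end
```

Only underscores at the very start are tolerated here, so "Is_Running" is still rejected, as is "_isRunning" because the first letter after the underscore is lowercase.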

Hawker 2009-07-14 22:06:50

Answer 3

+1 A:

This seems to do it:

/^[A-Z][a-z]+([A-Z][a-z]+)+/

I've included Ruby unit tests:

require 'test/unit'

REGEX = /^[A-Z][a-z]+([A-Z][a-z]+)+/

class RegExpTest < Test::Unit::TestCase
  # more readable helper
  def self.test(name, &block)
    define_method("test #{name}", &block)
  end

  test "matches camelcased word" do
    assert 'FooBar'.match(REGEX)
  end

  test "does not match words starting with lower case" do
    assert ! 'fooBar'.match(REGEX)
  end

  test "does not match words without camel hump" do
    assert ! 'Foobar'.match(REGEX)
  end

  test "matches multiple humps" do
    assert 'FooBarFizzBuzz'.match(REGEX)
  end
end

nakajima 2009-07-14 22:08:41

Adam's is better, and it passes all the tests I wrote.

nakajima 2009-07-14 22:32:33

Answer 4

+3 A:

Adam Crume's regex is close, but it won't match, for example, IFoo or HTTPConnection. I'm not sure about the others, but give this one a try:

\b[A-Z][a-z]*([A-Z][a-z]*)*\b

The same caveats apply as for Adam's answer regarding digits, I18N, underscores, etc.

You can test it out here.
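For trying the pattern locally, a minimal Ruby sketch follows. The sample sentence and the names are invented, and the inner group is written as non-capturing `(?:...)` only so that `String#scan` returns whole matches rather than the last captured group; the matching behavior is otherwise the same.

```ruby
# Word-boundary pattern from the answer, with the group made non-capturing for scan.
WORD_CAMEL = /\b[A-Z][a-z]*(?:[A-Z][a-z]*)*\b/

# Hypothetical helper: every word in the text that the pattern accepts.
def camel_words(text)
  text.scan(WORD_CAMEL)
end

p camel_words('Use IFoo with HTTPConnection, not fooBar or XML.')
# => ["Use", "IFoo", "HTTPConnection", "XML"]
```

Note that because every `[a-z]*` can be empty, the pattern also accepts single-capital words such as "Use" and all-uppercase words such as "XML", which may or may not be what you want.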

Vinay Sajip 2009-07-14 22:10:22
