I'm about to write a parser for a language that's supposed to have strict syntactic rules about naming of types, variables and such. For example all classes must be PascalCase, and all variables/parameter names and other identifiers must be camelCase.

For example, HTMLParser is not allowed and must be named HtmlParser. Any ideas for a regexp that matches a PascalCase name but rejects one with two consecutive capital letters in it?

+1  A: 
/^([A-Z][a-z]+)*[A-Z][a-z]*$/

But I have to say your naming choice stinks; HTMLParser should be allowed and preferred.
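A minimal Python sketch for trying this pattern out, assuming the whole identifier has to match. The helper name `is_pascal` and the sample names are made up for illustration. It uses `re.fullmatch`, because an unanchored search would happily find the single `H` inside `HTMLParser` and report a match.

```python
import re

# Body of the pattern from this answer, without the surrounding slashes.
PASCAL = re.compile(r"(?:[A-Z][a-z]+)*[A-Z][a-z]*")

def is_pascal(name: str) -> bool:
    """Return True only if the entire name matches the pattern."""
    return PASCAL.fullmatch(name) is not None

for name in ["HtmlParser", "HTMLParser", "AaA", "htmlParser"]:
    print(name, is_pascal(name))

# An unanchored search still finds a match inside a rejected name.
print(bool(PASCAL.search("HTMLParser")))
```

Note that `AaA` is accepted here, since it contains no two consecutive capitals.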

Roger Pate
+1 for a regex and a comment on the naming convention that both look suspiciously similar to what I was going to post, though I would simplify the regex to `/(?:[A-Z][a-z]+)+/` (I don't think the OP is concerned with allowing `AaA` as a class name).
Chris Lutz
Yeah, I considered that, but figured AaA doesn't have two consecutive uppercase letters. A bigger problem not yet addressed by this scheme is numbers: do they count as upper, lower, neither, or both?
Roger Pate
Also underscores.
Chris Lutz
It's missing some details, like numbers; other than that it seems to work.
Marcin
A: 

camelCase:

^[a-z]+(?:[A-Z][a-z]+)*$

PascalCase:

^[A-Z][a-z]+(?:[A-Z][a-z]+)*$
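As a rough check of the two patterns, here is a small Python sketch. The function names and sample identifiers are assumptions for illustration only. It uses `re.fullmatch` because in Python `$` also matches just before a trailing newline, which a parser probably doesn't want.

```python
import re

# The two patterns from this answer, copied as written.
CAMEL = re.compile(r"^[a-z]+(?:[A-Z][a-z]+)*$")
PASCAL = re.compile(r"^[A-Z][a-z]+(?:[A-Z][a-z]+)*$")

def is_camel(name: str) -> bool:
    """True if name is camelCase with no consecutive capitals."""
    return CAMEL.fullmatch(name) is not None

def is_pascal(name: str) -> bool:
    """True if name is PascalCase with no consecutive capitals."""
    return PASCAL.fullmatch(name) is not None

for name in ["htmlParser", "HtmlParser", "HTMLParser", "parseHTML", "A"]:
    print(f"{name}: camel={is_camel(name)} pascal={is_pascal(name)}")
```

Unlike the earlier answer, these patterns require at least one lowercase letter after every capital, so a one-letter class name such as `A` is rejected.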
Alix Axel
A: 

I don't believe identifiers can start with numbers (I think I read that somewhere, so take it with a grain of salt), so the best approach would be something like Roger Pate's with a few minor modifications, in my opinion:

/^([A-Z][a-z0-9]+)*[A-Z][a-z0-9]*$/

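It reads roughly as: look for a capital letter followed by one or more lowercase letters or digits, repeated any number of times, then a final capital letter followed by zero or more lowercase letters or digits. That means it also accepts a single capital letter on its own, since only the capital is required and the characters after it are optional.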
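A quick Python sketch of the digit-aware variant, assuming the whole name must match. The helper name and test names are hypothetical.

```python
import re

# Body of the digit-aware pattern, without the surrounding slashes.
PASCAL_DIGITS = re.compile(r"(?:[A-Z][a-z0-9]+)*[A-Z][a-z0-9]*")

def is_pascal_with_digits(name: str) -> bool:
    """True if the entire name matches; digits count like lowercase letters."""
    return PASCAL_DIGITS.fullmatch(name) is not None

for name in ["Utf8Decoder", "Html2Pdf", "A", "A1", "1A", "HTMLParser"]:
    print(name, is_pascal_with_digits(name))
```

One consequence of treating digits like lowercase letters is that a digit may directly follow a capital, so `A1` passes while `1A` does not.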

Good luck

onaclov2000