tags:

views:

52

answers:

4

I would like to capitalize names properly, which in this case means:

  • The first letter is capitalized.
  • The first letter after a space is capitalized ('Van Helsing', not 'Van helsing')
  • The first letter after a dash is capitalized ('Johnson-Smith', not 'Johnson-smith')
  • No other letters are capitalized.

The first and last requirements are easily handled:

CONCAT(LEFT(name, 1), LOWER(RIGHT(name, LENGTH(name) - 1)))

The others are harder. I've written a 54-times-nested REPLACE statement (not by hand of course, I used Excel)

REPLACE(REPLACE(REPLACE(REPLACE(...,' b',' B'),'-b','-B'),' a',' A'),'-a','-A')

but I feel like there must be a more elegant and maintainable solution. Any ideas?

If there's a built-in function that is similar but not identical to my requirements that would probably be fine.


Edit: This script will only run on names that have lost their capitalization already, so there's no danger in mis-handling obscure names. Handling apostrophes like spaces and dashes would be nice, though. A glance through the current data shows that many (~30%) of the names have at least one of [-' ].

+1  A: 

It won't be pretty, but you can used SUBSTRING_INDEX to find the first space and dash, and conditionally capitalise whatever succeeds it.

Marcelo Cantos
Good thought, +1 for that.
Charles
+4  A: 

What about "Van den Berg" (which has a lower case 'd'). What about "McDonald" or "O'Reilly".

It's generally considered a bad idea to over-validate people's names, because there's always going to be a legitimate name which breaks your validation.

See also my answer to a previous similar question here: http://stackoverflow.com/questions/3853346/how-to-validate-human-names-in-cakephp/3853820#3853820

Spudley
It would be acceptable for "Van den Berg" to be changed to "Van Den Berg". This is not a script that will run on all names but one which will run on names which are presently in all capital letters (for various reasons). We *will* need to have a secretary or the like review the names, but a script like this should cut down the work required dramatically.
Charles
Fair enough -- that's a pretty good argument.
Spudley
@Spudley: My fault -- I should have specified that initially. I've edited it into the question now.
Charles
@Charles: No worries. Given that, I've commented on one of the answers which I think will lead you in the right direction. MySQL doesn't have a function to do what you want, but you could write one (or more likely find one that's been written by someone else).
Spudley
+1  A: 

You could probably use an User Defined Function, that would be much easier to reuse.

David Mårtensson
Since MySQL doesn't support REGEXP_REPLACE, see this question http://stackoverflow.com/questions/986826/how-to-do-a-regular-expression-replace-in-mysql for a very similar answer, with a useful-looking link to get you started.
Spudley
+1  A: 

It looks like MySQL doesn't have an INITCAP function, but I found code for one here:

http://dev.mysql.com/doc/refman/5.1/en/string-functions.html

And the code too:

DELIMITER $$
DROP FUNCTION IF EXISTS `initcap` $$
CREATE FUNCTION `initcap`(x char(30)) RETURNS char(30) CHARSET utf8
BEGIN
SET @str='';
SET @l_str='';
WHILE x REGEXP ' ' DO
SELECT SUBSTRING_INDEX(x, ' ', 1) INTO @l_str;
SELECT SUBSTRING(x, LOCATE(' ', x)+1) INTO x;
SELECT CONCAT(@str, ' ', CONCAT(UPPER(SUBSTRING(@l_str,1,1)),LOWER(SUBSTRING(@l_str,2)))) INTO @str;
END WHILE;
RETURN LTRIM(CONCAT(@str, ' ', CONCAT(UPPER(SUBSTRING(x,1,1)),LOWER(SUBSTRING(x,2)))));
END $$
DELIMITER ;

Disclaimer: I didn't write this code, I haven't even tested it...

FrustratedWithFormsDesigner