tags:

views:

22

answers:

2

I've got a table in MySQL that looks roughly like:

value | count
-------------
Fred  | 7
FRED  | 1
Roger | 3
roger | 1

That is, it was created with string ops outside of MySQL, so the values are case- and trailing-whitespace-sensitive.

I want it to look like:

value | count
-------------
Fred  | 8
Roger | 4

That is, managed by MySQL, with value a primary key. It's not important which one (of "Fred" or "FRED") is kept.

I know how to do this in code. I also know how to generate a list of problem values (with a self-join). But I'd like to come up with a SQL update/delete to migrate my table, and I can't think of anything.

If I knew that no pair of records had variants of one value, with the same count (like ("Fred",4) and ("FRED",4)), then I think I can do it with a self-join to copy the counts, and then an update to remove the zeros. But I have no such guarantee.

Is there something simple I'm missing, or is this one of those cases where you just write a short function outside of the database?

Thanks!

+1  A: 

Strangely, MySQL seems to do this for you. I just tested this in MySQL 5.1.47:

create table c (value varchar(10), count int);
insert into c values ('Fred',7), ('FRED',1), ('Roger',3), ('roger',1);

select * from c;

+-------+-------+
| value | count |
+-------+-------+
| Fred  |     7 |
| FRED  |     1 |
| Roger |     3 |
| roger |     1 |
+-------+-------+

select value, sum(count) from c group by value;

+-------+------------+
| value | sum(count) |
+-------+------------+
| Fred  |          8 |
| Roger |          4 |
+-------+------------+

I was surprised to see MySQL transform the strings like that, and I'm not sure I can explain why it did that. I was expecting to have to get four distinct rows, and to have to use some string functions to map the values to a canonical form.

Bill Karwin
It's not changing the names, your GROUP BY clause is simply using the first occurrence as the value in the resultset; both of which are name case.
JYelton
have you tried with different amounts of trailing whitespaces?
mcabral
@mcabral: That should only matter if the data type is CHAR
OMG Ponies
You're right, I was thinking it was far more complex than it turned out to be. Great!
Ken
mcabral: Seems to work (i.e., do what I want) in that respect, too.
Ken
The behavior may be due to the collation
OMG Ponies
+2  A: 

As an example of how to obtain the results you are looking for with a SQL query alone:

SELECT UPPER(value) AS name, SUM(count) AS qty FROM table GROUP BY name;

If you make a new table to hold the correct values, you INSERT the above query to populate the new table as so:

INSERT INTO newtable (SELECT UPPER(value) AS name, SUM(count) AS qty FROM table GROUP BY name);
JYelton