I have a legacy table with about 100 columns (90% nullable). In those 90 columns I want to remove all empty strings and set them to null. I know I can:

update table set column = NULL where column = '';
update table set column2 = NULL where column2 = '';

But that is tedious and error-prone. Is there a way to do this for the whole table at once? Thanks!

Kyle

A: 

I think you'll need to pull each row into a language like C#, PHP, etc.

Something like:

rows = get-data()
foreach row in rows
    foreach col in row.cols
        if row[col] == ''
            row[col] = null
        end if
    end foreach
end foreach
save-data(rows)
Nate Bross
This will work, but it's doable in T-SQL using a loop (e.g. using cursors) over the list of columns. For a one-time task, I suppose it really doesn't matter how you do it, though.
Brian
I tend to gravitate away from SQL for this type of maintenance, since in most cases I end up needing to add more functionality over time, and I end up pulling it into a full OO language anyway. But that's a personal bias... ;)
Nate Bross
+1  A: 

There isn't a standard way - but you can interrogate the system catalog to get the relevant column names for the relevant table and generate the SQL to do it. You can also probably use a CASE expression to handle all the columns in a single pass - a bigger SQL statement.

UPDATE Table
   SET Column1 = CASE WHEN Column1 = ' ' THEN NULL ELSE Column1 END,
       ...

Note that once you've generated the big UPDATE statement, all the work is done down in the server. This is much more efficient than selecting data to the client application, changing it there, and writing the result back to the database.
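
As a rough illustration of generating that single statement, here is a minimal sketch, assuming a POSIX shell with awk and MySQL-style backtick quoting. The function name `build_update` is hypothetical (it is not part of any tool), and the sketch compares against the empty string `''` as in the question. It reads column names, one per line, on standard input and prints one UPDATE that covers all of them.

```shell
# Hypothetical helper: build_update TABLE < column-names
# Prints a single UPDATE that turns '' into NULL for every listed column.
# Exits non-zero (and prints nothing) if no column names were supplied.
build_update() {
    awk -v table="$1" '
        # \047 is the octal escape for a single quote, so \047\047 prints as two quotes
        NF  { body = body sep "    `" $1 "` = CASE WHEN `" $1 "` = \047\047 THEN NULL ELSE `" $1 "` END"
              sep = ",\n" }
        END { if (body == "") exit 1
              printf "UPDATE `%s` SET\n%s;\n", table, body }
    '
}

# Possible usage (assumes $DB and $TABLE are set; review the SQL before running it):
#   mysql --skip-column-names "$DB" -e "select column_name
#       from information_schema.columns
#       where table_schema='$DB' and table_name='$TABLE' and is_nullable='YES'" |
#     build_update "$TABLE"
```

Printing the statement rather than piping it straight back into the server leaves room to inspect it first, which seems prudent for a one-off change across roughly 90 columns.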

Jonathan Leffler
+6  A: 
UPDATE
    TableName
SET
    column01 = CASE column01 WHEN '' THEN NULL ELSE column01 END,
    column02 = CASE column02 WHEN '' THEN NULL ELSE column02 END,
    column03 = CASE column03 WHEN '' THEN NULL ELSE column03 END,
    ...,
    column99 = CASE column99 WHEN '' THEN NULL ELSE column99 END

This is still doing it manually, but is slightly less painful than what you have because it doesn't require you to send a query for each and every column. Unless you want to go to the trouble of scripting it, you will have to put up with a certain amount of pain when doing something like this.

Edit: Added the ENDs

Hammerite
Missing the END for each CASE, but other than that nice solution. +1
Justin K
@OMG: why? Presumably, you are referring to the ELSE columnXX END column name, but why bother with TRIM?
Jonathan Leffler
@Jonathan Leffler: This is what I meant: `CASE TRIM(column01) WHEN ''...` I should have been more clear that I meant the comparison, sorry.
OMG Ponies
@OMG: OK - I now know what you meant, but I reiterate my question: why? What is the possible benefit of trimming compared to simple comparison with a blank string? The DBMS is likely to end up doing blank padding of the trimmed string because the trimmed value might be as long as the untrimmed value, so you are just making it do vacuous work which it may or may not spot is vacuous.
Jonathan Leffler
@Jonathan Leffler: Because "\s\s\s" != "" unless you run TRIM/etc over it. Collation can affect that comparison.
OMG Ponies
@OMG: Where does "\s\s\s" == ""? In anything resembling standard SQL, a string containing backslashes cannot match the empty string. I suppose the question is tagged MySQL - but I find the concept that backslash-ess repeated 3 times is an empty string ... strange. Very strange.
Jonathan Leffler
@Jonathan Leffler: This comment system considers pressing the space bar 15 times to be a zero length string... It's a very common interpretation.
OMG Ponies
@OMG: In [MySQL](http://dev.mysql.com/doc/refman/5.5/en/string-syntax.html), "\s" is interpreted the same as "s". In SQL, a CHAR(40) value, say, is blank padded to full length; two strings `"abc"` and `"abc   "` compare equal if their types are CHAR(40). If the columns in the question are CHAR, then I don't see TRIM as being necessary - certainly in the DBMS I use, it would be unnecessary. If the string literal is interpreted as a VARCHAR(n) and if two VARCHAR(n) values are not compared with blank padding - and some SQL collations would not - then you may need to TRIM. [...continued...]
Jonathan Leffler
+5  A: 

One possible script:

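# List the table's nullable columns in this database, then emit one UPDATE per column.
for col in $(echo "select column_name from information_schema.columns
where table_schema='$DB' and table_name='$TABLE' and is_nullable='YES'" | mysql --skip-column-names "$DB")
do
    echo "update \`$TABLE\` set \`$col\` = NULL where \`$col\` = '';"
done | mysql "$DB"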
Matthew Flaschen
+1 Now that is really nice! :)
Dave Rix
Yeah, don't do anything that you can convince your computer to do for you... :)
GalacticCowboy