Hello,

I have a file that can contain from 3 to 5 columns of numerical values separated by commas. Empty fields are still delimited, except when they fall at the end of the row, in which case they are left out entirely:

1,2,3,4,5
1,2,3,,5
1,2,3

The following table was created in MySQL:

+-------+--------+------+-----+---------+-------+
| Field | Type   | Null | Key | Default | Extra |
+-------+--------+------+-----+---------+-------+
| one   | int(1) | YES  |     | NULL    |       | 
| two   | int(1) | YES  |     | NULL    |       | 
| three | int(1) | YES  |     | NULL    |       | 
| four  | int(1) | YES  |     | NULL    |       | 
| five  | int(1) | YES  |     | NULL    |       | 
+-------+--------+------+-----+---------+-------+

I am trying to load the data using the MySQL LOAD DATA command:

load data infile '/tmp/testdata.txt' into table moo fields terminated by "," lines terminated by "\n";

The resulting table:

+------+------+-------+------+------+
| one  | two  | three | four | five |
+------+------+-------+------+------+
|    1 |    2 |     3 |    4 |    5 | 
|    1 |    2 |     3 |    0 |    5 | 
|    1 |    2 |     3 | NULL | NULL | 
+------+------+-------+------+------+

The problem is that when a field is present but empty in the raw data, MySQL for some reason does not use the column's default value (which is NULL) and stores zero instead. NULL is used correctly when the field is missing altogether.

Unfortunately, I have to be able to distinguish between NULL and 0 at this stage, so any help would be appreciated.

Thanks S.

edit

The output of SHOW WARNINGS:

+---------+------+--------------------------------------------------------+
| Level   | Code | Message                                                |
+---------+------+--------------------------------------------------------+
| Warning | 1366 | Incorrect integer value: '' for column 'four' at row 2 | 
| Warning | 1261 | Row 3 doesn't contain data for all columns             | 
| Warning | 1261 | Row 3 doesn't contain data for all columns             | 
+---------+------+--------------------------------------------------------+
A: 

Preprocess your input CSV to replace blank entries with \N.

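Attempt at a regex: s/,,/,\\N,/g and s/,$/,\\N/g (the backslash is doubled so that sed or Perl emits a literal \N; apply the first substitution twice, since a single pass misses every second field in a run such as ,,,).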
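
If a regex feels fragile, the same preprocessing can be sketched in Python. This is only a minimal sketch, assuming the file holds plain unquoted numeric fields as in the question; the function names blanks_to_null and convert are made up for illustration. Splitting on the comma handles runs of consecutive empty fields in one pass.

```python
def blanks_to_null(line: str) -> str:
    """Return one CSV line with every empty field replaced by \\N.

    Assumes plain comma-separated values without quoting, as in the
    question's sample data.
    """
    text = line.rstrip("\r\n")
    if not text:
        # Leave completely blank lines untouched.
        return text
    fields = text.split(",")
    # r"\N" is the marker LOAD DATA INFILE reads as NULL.
    return ",".join(field if field.strip() else r"\N" for field in fields)


def convert(src_path: str, dst_path: str) -> None:
    """Write a copy of src_path to dst_path with blanks replaced by \\N."""
    with open(src_path) as src, open(dst_path, "w") as dst:
        for line in src:
            dst.write(blanks_to_null(line) + "\n")
```

For example, convert('/tmp/testdata.txt', '/tmp/testdata_null.txt') would leave the raw file untouched and write a copy to load instead (the second path is just an assumed name). Rows that are short, like 1,2,3, pass through unchanged, so MySQL still fills their missing columns with NULL.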

Good luck.

Sam Goldman
+2  A: 

MySQL manual says:

When reading data with LOAD DATA INFILE, empty or missing columns are updated with ''. If you want a NULL value in a column, you should use \N in the data file. The literal word “NULL” may also be used under some circumstances.

So you need to replace the blanks with \N like this:

1,2,3,4,5
1,2,3,\N,5
1,2,3
Janci
Thanks for the tip - I am reluctant to edit the raw source data, but if this is the only way around it I will try it out.
Spiros
I understand your scepticism; no one likes editing raw data, it just doesn't feel right. However, if you think about it for a minute, there has to be a way to distinguish between NULL and an empty string. If blank entries were translated to NULLs, you'd need a special sequence for the empty string. It would be nice to have a way to tell MySQL how to treat blank entries though, something like LOAD DATA INFILE '/tmp/testdata.txt' INTO TABLE moo TREAT BLANKS AS NULL...
Janci