Hi folks! I am developing an extract-transform-load script with SQLAlchemy. The scenario is as follows:

  • Take a text file with 30+ million rows (CSV, tab-delimited, or any other format).
  • Parse it and generate a file suitable for the MySQL 'LOAD DATA INFILE' import command (as described at http://dev.mysql.com/doc/refman/5.0/en/load-data.html).
  • From within the script, disable the MyISAM indexes, import the file with 'LOAD DATA INFILE', and recreate the indexes.
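
For reference, the third step could be sketched roughly like this. It only builds the SQL strings and does not execute them. The function name, table name, and file path are hypothetical. Using ALTER TABLE ... DISABLE KEYS / ENABLE KEYS is one assumed way to handle the MyISAM indexes, and it affects non-unique indexes only.

```python
def build_bulk_load_statements(table_name, data_path):
    """Return the SQL for the disable/load/enable sequence, in execution order.

    Quoting here is naive and for illustration only: table_name and
    data_path are assumed to come from trusted configuration.
    """
    return [
        # Stop MyISAM from maintaining non-unique indexes during the load.
        f"ALTER TABLE `{table_name}` DISABLE KEYS",
        # Bulk-load the pre-validated, tab-delimited file.
        f"LOAD DATA INFILE '{data_path}' INTO TABLE `{table_name}` "
        "FIELDS TERMINATED BY '\\t' LINES TERMINATED BY '\\n'",
        # Rebuild the indexes in one pass after the load.
        f"ALTER TABLE `{table_name}` ENABLE KEYS",
    ]


# Hypothetical table and path, just to show the shape of the output.
statements = build_bulk_load_statements("events", "/tmp/events.tsv")
for stmt in statements:
    print(stmt)
```

Each string would then be passed to whatever connection object the script already uses.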

The thing is that while parsing the text file, I'd like to check whether each string is suitable for its column's data type (according to the column's definition in the MySQL table). It would be a shame if the 30 millionth row contained a 50-character string while the MySQL column was declared as VARCHAR(49)...

I imagine the type checking could be implemented as follows:

# q is assumed to be a reflected Table; some_cool_type_checking_method is hypothetical
types = [col.type for col in q.columns]
text = ['a', 'b', 'asdasd', '', '45.5', 'čąęąčę', '2005-09-13 12:12:12']
if len(types) == len(text):
    for col_type, value in zip(types, text):
        assert col_type.some_cool_type_checking_method(value)

where some_cool_type_checking_method would be a simple method of the sqlalchemy.dialects.mysql.base.(INTEGER|VARCHAR|etc.) classes. It would take the string as an argument and return True if the string will be accepted according to the Column metadata retrieved from the DB via reflection, and False if the string is for some reason malformed for the given DB column (too many characters for a VARCHAR, mismatched charsets, a float where an integer is expected, etc.).

some_cool_type_checking_method is what I originally expected to find, but among the public methods available on the sqlalchemy.dialects.mysql.base.(INTEGER|VARCHAR|etc.) classes I can't find anything similar. Of course it's naive to hope that the developers think the same way I do :) so the question is: how should this kind of type checking be implemented following SQLAlchemy best practices?
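
To make the idea concrete, here is a minimal sketch of the kind of check described above, not an established SQLAlchemy recipe. It relies only on two attributes that SQLAlchemy type objects expose: python_type and, for string types, length. The helper value_fits and the StandInType class are hypothetical; the stand-in exists only so the example runs without a database, and reflected col.type objects would take its place. The datetime format is an assumption, and charset checks are left out.

```python
from datetime import datetime
from decimal import Decimal, InvalidOperation


def value_fits(col_type, raw):
    """Return True if the string `raw` looks loadable into a column of `col_type`.

    `col_type` is expected to behave like a SQLAlchemy type object:
    it exposes `python_type` and, for string types, `length`.
    """
    try:
        py_type = col_type.python_type
    except NotImplementedError:
        return True  # type without a Python equivalent: not checked in this sketch
    if py_type is str:
        max_len = getattr(col_type, "length", None)
        return max_len is None or len(raw) <= max_len
    try:
        if py_type is int:
            int(raw)
        elif py_type is float:
            float(raw)
        elif py_type is Decimal:
            Decimal(raw)
        elif py_type is datetime:
            # Assumed input format; adjust to whatever the source file uses.
            datetime.strptime(raw, "%Y-%m-%d %H:%M:%S")
    except (ValueError, InvalidOperation):
        return False
    return True


class StandInType:
    """Hypothetical stand-in for a reflected column type, for the demo only."""

    def __init__(self, python_type, length=None):
        self.python_type = python_type
        self.length = length


types = [StandInType(str, 49), StandInType(int), StandInType(datetime)]
row = ["x" * 50, "45.5", "2005-09-13 12:12:12"]

# Indexes of the columns whose value would be rejected by this check.
bad_columns = [i for i, (t, v) in enumerate(zip(types, row)) if not value_fits(t, v)]
print(bad_columns)
```

Python's parsers are more lenient than MySQL in places (int() accepts surrounding whitespace, for instance), so this should be read as a first filter rather than an exact model of what the server accepts.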

Thanks in advance for any ideas!