views:

440

answers:

6

What's a good language for validating CSV files?

Edit: Yes, I am looking for an excuse to learn a new language. The files often have extra blank rows or fields, or fields that are too long. I currently use a VBA script in Excel, but I want to try some other languages.

+3  A: 

Lacking better knowledge of your problem, I'd say Python, because of its built-in csv module.

Jimmy
+4  A: 

One that you already know, unless you are looking for an excuse to learn a new language, in which case I would suggest Python. Python has a built-in csv module, which can be useful if you're validating the data and not just the formatting.
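For the problems mentioned in the question (blank rows, extra fields, over-long fields), a minimal sketch with the csv module might look like the following. The function name `validate_csv` and the limits `EXPECTED_FIELDS` and `MAX_FIELD_LEN` are made up for illustration; the real column count and length limit would depend on your files.

```python
import csv
import io

EXPECTED_FIELDS = 3   # hypothetical: number of columns each row should have
MAX_FIELD_LEN = 20    # hypothetical: longest field value allowed

def validate_csv(lines, expected_fields=EXPECTED_FIELDS, max_len=MAX_FIELD_LEN):
    """Return a list of (line_number, message) tuples for each problem found.

    `lines` is any iterable of text lines, e.g. an open file object.
    When reading a real file, open it with newline='' as the csv docs advise.
    """
    problems = []
    reader = csv.reader(lines)
    for row in reader:
        line_number = reader.line_num  # lines consumed from the input so far
        # An empty line parses to [], and a line like ",," to ['', '', ''].
        if not any(field.strip() for field in row):
            problems.append((line_number, "blank row"))
            continue
        if len(row) != expected_fields:
            problems.append(
                (line_number, f"expected {expected_fields} fields, found {len(row)}")
            )
        for column, field in enumerate(row, start=1):
            if len(field) > max_len:
                problems.append(
                    (line_number, f"field {column} is longer than {max_len} characters")
                )
    return problems

if __name__ == "__main__":
    sample = (
        "name,city,age\n"
        "Alice,Paris,30\n"
        "\n"
        "Bob,Berlin,41,extra\n"
        "Carol,A very long city name indeed,25\n"
    )
    for line_number, message in validate_csv(io.StringIO(sample)):
        print(f"line {line_number}: {message}")
```

Collecting the problems in a list instead of stopping at the first one lets a single pass report everything that is wrong with a file.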

Dan Homerick
A: 

There are Perl modules that manipulate CSV files, e.g., Text::CSV.

David Norman
+1  A: 

Validate how? Is there some spec or standard that you are checking against? If you just need to work with CSV files, pick a dynamic language with good text processing facilities, such as Python, Perl, or Ruby.

Alex
A: 

Do this in two parts:

  1. To test the validity of the file format, use any common scripting language: PHP, Python, Perl, and Ruby come to mind.

  2. To test the data against business rules, use SQL. After validating the format, import it into tables, then run SELECT queries to find non-conforming rows:

    SELECT id FROM data_table WHERE age NOT REGEXP '^[1-9][0-9]{0,2}$'
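As a rough sketch of the second step, the same query can be tried out from Python using the standard-library sqlite3 module. This swaps in SQLite for whatever database the query above targets, which is an assumption on my part. SQLite has no built-in REGEXP implementation, so the sketch registers one. The helper names `regexp` and `find_bad_ages` and the two-column table layout are hypothetical.

```python
import csv
import io
import re
import sqlite3

def regexp(pattern, value):
    # SQLite evaluates "X REGEXP Y" by calling the user function regexp(Y, X).
    return value is not None and re.search(pattern, str(value)) is not None

def find_bad_ages(csv_text):
    """Load CSV text with 'id' and 'age' columns; return ids whose age fails the check."""
    conn = sqlite3.connect(":memory:")
    conn.create_function("REGEXP", 2, regexp)
    # age is stored as TEXT so the pattern is applied to the raw CSV value.
    conn.execute("CREATE TABLE data_table (id INTEGER, age TEXT)")
    rows = csv.DictReader(io.StringIO(csv_text))
    conn.executemany("INSERT INTO data_table (id, age) VALUES (:id, :age)", rows)
    bad = conn.execute(
        "SELECT id FROM data_table "
        "WHERE age NOT REGEXP '^[1-9][0-9]{0,2}$' ORDER BY id"
    ).fetchall()
    conn.close()
    return [row[0] for row in bad]

if __name__ == "__main__":
    sample = "id,age\n1,34\n2,0\n3,abc\n4,120\n5,1000\n"
    print(find_bad_ages(sample))
```

Keeping each business rule as its own SELECT makes it easy to add or drop rules without touching the import step.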

rooskie
Anyone storing age in a database instead of (or as well as) date of birth should be tied up in the town square and pelted with rotten vegetables. Your regex permits ages 1 to 999 but disallows 0 -- an application-dependent range would be a good idea. You have 20 tests, so you give the user 20 text files, each containing nothing but a user-meaningless id???
John Machin
Sounds like the concept of an example is utterly lost on you.
rooskie
Sounds like the concept of an example that's useful and meaningful and indicative of good practice hasn't occurred to you.
John Machin
A: 

If you want a challenge, try bash and sed or awk. A Unix shell and simple text parsing tools are powerful mojo. Perl was specifically designed for text processing and is highly portable.

Dave Jarvis