ansaurus

Question

Answer 1

+2 A:

Not SQL, but take a look at my own OSS project CSVfix, which does some of what you want.

anon 2009-02-27 14:28:47

looks very interesting... can one combine files?and is there a documentation/overview page of some sorts?

Toad 2009-02-27 15:55:04

you can join two csb files on specific fields. One of the downloads is the full manual in HTML format. No overview page as yet.

anon 2009-02-27 16:20:58

great tool Just read the documentation. I do have some tips and ideas. I'll email you those via your tool site.

Toad 2009-02-27 18:43:40

if you or anyone else else has ideas/comments on csvfix, please post in the support forum - address on the main site page - cheers!

anon 2009-02-27 18:53:51

Answer 2

A:

AutoMate (you can have trial version), but it seems as an excess to me using automation tool for such simple task.

Max Gontar 2009-02-27 14:32:24

Answer 3

+5 A:

You can import CSV into sqlite

Nerdling 2009-02-27 14:34:31

Answer 4

+1 A:

Partly it depends on what exactly you want to be able to do -- do you have some specific problems you'd like to solve? This kind of thing I tend to address using awk / grep, and if it gets complicated I'd write a script in Ruby or Python.

On the other hand, I had almost exactly the requirement you have recently and I solved it using MySQL. I also tried SQLite, but it was too slow (basically because I needed to run thousands of separate queries). SQLite is a good option, particularly if you're working with a language like Python, but I understand it works best if you can do one large query and then process the results in code, which might not really meet your needs.

Ben 2009-02-27 14:38:52

Answer 5

A:

If you have the sql server tools installed, you might be able to use the included command line query tool (osql.exe) and treat the file as an ole data source.

I expect you want something a little simpler to deploy, though, and you didn't even mention if you were on windows.

Joel Coehoorn 2009-02-27 14:42:11

I'm on server 2003But I'm really looking for some simple commandline thingy since it automates the easiest.

Toad 2009-02-27 15:57:20

Answer 6

A:

You don't say what OS you're on, which would help. I suspect from your stated command line preference that you may be on *nix, though.

If you're on Windows, you could investigate using the Microsoft Text Driver to create a data source targeted on a directory. Then each CSV can be treated as a table.

Mike Woodhouse 2009-02-27 14:49:30

i googled for 'microsoft test driver' but couldn't find any reference...do you have a link?

Toad 2009-02-27 15:55:58

Ah, nuts. I meant "Windows Text Driver". It should be visible as one of the options when you create a Data Source.

Mike Woodhouse 2009-02-27 21:21:18

Answer 7

+1 A:

If you're not afraid of Perl, DBI and DBD::CSV.

http://search.cpan.org/~jzucker/DBD-CSV-0.22/lib/DBD/CSV.pm

However, whether this handles aggregate functions is unknown to me. It is best to import it into a real database first (SQLite as has been mentioned is a good candidate)

Frakkle 2009-02-27 14:49:45

Answer 8

A:

I've stumbled upon pig and hadoop. It looks like it could do the job. I'll investigate this some more too.

Toad 2009-02-27 15:58:27

http://hadoop.apache.org/pig/hadoop makes it possible to handle gigantic files (using a virtual fielsystem).pig can run on (or without it) hadoop, and gives the user some kind of sql like syntax to do queries on huge files(mostly csv)at least this is what I got from it

Toad 2009-02-28 12:02:22

Answer 9

+1 A:

The Jet Database Engine is installed with Windows, or can be downloaded http://support.microsoft.com/kb/239114/en-us. You can use it with script to query a number of file types.

Dim cn: Set cn = CreateObject("ADODB.Connection")
Dim rs: Set rs = CreateObject("ADODB.Recordset")

cn.Open _
    "Provider = Microsoft.Jet.OLEDB.4.0; " & _
    "Data Source =c:\docs\;" _
    & "Extended Properties=""text;HDR=Yes;FMT=Delimited"";"

rs.Open "SELECT * FROM Test.csv", cn

a = rs.GetString
MsgBox a

Remou 2009-02-27 16:12:12

Answer 10

+2 A:

Log Parser

cindi 2009-02-27 16:59:11

wow! Exactly what I needed! Thanks!

Toad 2009-02-27 18:40:40

the only thing lacking is a join. but the csvfix (mentioned elsewhere) is capable of this.

Toad 2009-02-28 12:04:05

Answer 11

A:

There is a Groovy script, gcsvsql, which does exactly what you are asking for, including joins. You can do things like

gcsvsql "select name,age from /users/data/people.csv where age > 40"

gcsvsql "select sum(score) from people.csv where age < 40"

gcsvsql " select people.name,children.child from people.csv,children.csv where people.name=children.name and people.age < 40"

You can get it from Google code here:

http://code.google.com/p/gcsvsql/

Kolmogorov 2010-03-31 23:52:13

ansaurus

tags:

views:

answers:

SQL on csv files (without a database)

related questions