ansaurus

Question

Database suggestion for processing/reporting on large amount of log file type data

Answer 1

A:

Be aware that most RDBMSs have bulk-insert functionality. MySQL, for instance, accepts multiple rows in a single INSERT statement:

INSERT INTO YOUR_TABLE
VALUES
  (value, value2, value3),
  (value4, value5, value6),
  (value7, value8, value9);
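
If the log data is being loaded from a script, the same idea can be sketched in Python. This is only an illustration under assumptions: it uses the standard-library sqlite3 module as a stand-in rather than MySQL, and the table name log_entries, its three columns, and the helper bulk_insert are all hypothetical names made up for the example.

```python
import sqlite3


def bulk_insert(conn, rows, batch_size=1000):
    """Insert rows in batches, one transaction per batch; return rows inserted."""
    # Hypothetical table layout, chosen only for this example.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS log_entries "
        "(host TEXT, status INTEGER, bytes_sent INTEGER)"
    )
    inserted = 0
    for start in range(0, len(rows), batch_size):
        batch = rows[start:start + batch_size]
        # "with conn" commits the batch on success and rolls back on error.
        with conn:
            conn.executemany("INSERT INTO log_entries VALUES (?, ?, ?)", batch)
        inserted += len(batch)
    return inserted


conn = sqlite3.connect(":memory:")
# Made-up sample rows standing in for parsed log lines.
rows = [("web%d" % (i % 3), 200, i * 10) for i in range(2500)]
print(bulk_insert(conn, rows))
```

Batching many rows per transaction avoids paying the per-statement and per-commit overhead for every single log line, which is the same benefit the multi-row INSERT above is after.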

OMG Ponies 2010-10-28 01:55:58

Answer 2

+1 A:

We've had a similar problem at work and managed to solve it by dumping the data into a column-oriented database. These kinds of databases are much better at analytical queries of the kind you're describing. There are several options:

http://en.wikipedia.org/wiki/Column-oriented_DBMS

We've had good experience with InfiniDB:

http://infinidb.org/

Using this approach we managed to speed up the queries by approx. 10x. However, it is not a silver bullet, and eventually you'll run into the same problems again.

You might also want to look at partitioning your data to improve performance.
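
To give a rough feel for why a column-oriented layout suits analytical queries, here is a toy Python sketch. It is not how InfiniDB or any real engine stores data; the field names and the helper functions are hypothetical, and the point is only that an aggregate over one field needs to read a single column rather than every full record.

```python
# Row-oriented layout: all fields of a record are stored together.
rows = [
    {"host": "web1", "status": 200, "bytes_sent": 512},
    {"host": "web2", "status": 404, "bytes_sent": 128},
    {"host": "web1", "status": 200, "bytes_sent": 2048},
]


def to_columns(records):
    """Pivot a list of row dicts into one list of values per column."""
    columns = {}
    for record in records:
        for name, value in record.items():
            columns.setdefault(name, []).append(value)
    return columns


def total_bytes_row_store(records):
    # Every full record is visited even though only one field is needed.
    return sum(record["bytes_sent"] for record in records)


def total_bytes_column_store(columns):
    # Only the one column involved in the aggregate is read.
    return sum(columns["bytes_sent"])


columns = to_columns(rows)
print(total_bytes_row_store(rows), total_bytes_column_store(columns))
```

Both functions return the same total; the difference in a real system is how much data has to come off disk to compute it.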

srkiNZ84 2010-10-28 02:17:17

Answer 3

A:

There are a couple of reasons why I wouldn't necessarily look right away to a NoSQL solution:

Yours is a known schema which sounds like it won't be changing.
There doesn't seem to be a lot of denormalizing potential for you, as you've pretty much got a single flat table structure.
You haven't made any reference to application scalability (# of users), just the size of the query.

And those are three of the big 'wins' for NoSQL as I know it.

That being said, I'm no expert, and I don't know for sure that it wouldn't make for faster reads, so it's definitely worth a try!

LesterDove 2010-10-28 02:17:32

Good analysis and breakdown. Thanks! I'll give @srkiNZ84's suggestion of InfiniDB a shot and see where that takes us.

whatupwilly 2010-10-28 14:46:15
