I need a real DBA's opinion. Postgres 8.3 takes 200 ms to execute this query on my MacBook Pro, while Java and Python perform the same calculation over the same 350,000 rows in under 20 ms:
SELECT count(id), avg(a), avg(b), avg(c), avg(d) FROM tuples;
Is this normal behaviour when using a SQL database?
The schema (the table holds responses to a survey):
CREATE TABLE tuples (id integer primary key, a integer, b integer, c integer, d integer);
\copy tuples from '350,000 responses.csv' delimiter as ','
I wrote some tests in Java and Python for context and they crush SQL (except for pure Python); the numpy version is sketched after the timings:
java 1.5 threads ~ 7 ms
java 1.5 ~ 10 ms
python 2.5 numpy ~ 18 ms
python 2.5 ~ 370 ms
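The numpy test was essentially the following (a minimal sketch; the file name 'data.csv' and the loading details are stand-ins for what I actually ran, and only the aggregation itself is timed):

# Sketch of the numpy test; 'data.csv' is a placeholder for the real file.
import time
import numpy

# Load the four survey columns (a, b, c, d), skipping the id column.
data = numpy.loadtxt('data.csv', delimiter=',', usecols=(1, 2, 3, 4))

start = time.time()
count = data.shape[0]
averages = data.mean(axis=0)  # avg(a), avg(b), avg(c), avg(d) in one pass
elapsed = (time.time() - start) * 1000
print('%d rows, avgs %s, %.1f ms' % (count, averages, elapsed))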
Even sqlite3 is competitive with Postgres despite assuming all columns are strings (for contrast: even just switching to numeric columns instead of integers in Postgres results in a 10x slowdown).
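The sqlite3 comparison was along these lines (again a sketch, with the data load approximated; the real test read the CSV, and the in-memory database here is just for self-containment):

# Rough sketch of the sqlite3 comparison (details approximate).
import sqlite3
import time

conn = sqlite3.connect(':memory:')  # the real test used the CSV data
# No declared column types, mirroring the "all columns are strings" setup.
conn.execute('CREATE TABLE tuples (id, a, b, c, d)')
conn.executemany('INSERT INTO tuples VALUES (?, ?, ?, ?, ?)',
                 ((str(i), '1', '2', '3', '4') for i in range(350000)))

start = time.time()
row = conn.execute('SELECT count(id), avg(a), avg(b), avg(c), avg(d) '
                   'FROM tuples').fetchone()
print('%s in %.1f ms' % (row, (time.time() - start) * 1000))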
Tunings I've tried without success include (blindly following some web advice):
increased the shared memory available to Postgres to 256MB
increased the working memory to 2MB
disabled connection and statement logging
used a stored procedure via CREATE FUNCTION ... LANGUAGE SQL (sketched below)
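The stored-procedure attempt looked roughly like this (a sketch: psycopg2, the dbname, and the function name tuple_stats are stand-ins, not the exact originals; OUT parameters are used because RETURNS TABLE isn't available in 8.3):

# Sketch of the stored-function attempt; connection details are placeholders.
import psycopg2

conn = psycopg2.connect('dbname=survey')
cur = conn.cursor()
cur.execute("""
    CREATE OR REPLACE FUNCTION tuple_stats(
        OUT n bigint, OUT avg_a numeric, OUT avg_b numeric,
        OUT avg_c numeric, OUT avg_d numeric) AS $$
        SELECT count(id), avg(a), avg(b), avg(c), avg(d) FROM tuples;
    $$ LANGUAGE SQL;
""")
conn.commit()
cur.execute('SELECT * FROM tuple_stats()')
print(cur.fetchone())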
So my question is: is my experience here normal, and is this what I can expect when using a SQL database? I understand that ACID must come with costs, but this is kind of crazy in my opinion. I'm not asking for real-time game speed, but since Java can process millions of doubles in under 20 ms, I feel a bit jealous.
Is there a better way to do simple OLAP on the cheap (both in terms of money and server complexity)? I've looked into Mondrian and Pig + Hadoop, but I'm not super excited about maintaining yet another server application, and I'm not sure they would even help.