+1  A: 

For storing and retrieving very large data cubes, HDF5 works rather well (with h5py or PyTables as the Python interface). Your application can then either run on a machine with a local copy of the HDF5 database, or you can build an ad-hoc server in front of it (still in Python).
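
To give an idea of what this looks like with h5py, here is a minimal sketch. The cube shape, the dataset name "sales", the dimension names, and the chunk layout are all made up for illustration; pick chunks that match the slices you actually read.

```python
import os
import tempfile

import h5py
import numpy as np

# Hypothetical cube: 10 time steps x 20 regions x 30 products.
cube = np.arange(10 * 20 * 30, dtype="float64").reshape(10, 20, 30)

path = os.path.join(tempfile.mkdtemp(), "cube.h5")

# Write the cube as a chunked, gzip-compressed dataset.
# One chunk per time step, so reading a single step touches a single chunk.
with h5py.File(path, "w") as f:
    dset = f.create_dataset(
        "sales", data=cube, chunks=(1, 20, 30), compression="gzip"
    )
    dset.attrs["dims"] = "time,region,product"  # illustrative metadata

# Read back a slice; only the requested part is loaded into memory.
with h5py.File(path, "r") as f:
    dset = f["sales"]
    first_step = dset[0, :, :]               # array of shape (20, 30)
    total_per_region = dset[0].sum(axis=1)   # aggregate over products
```

Slicing the dataset returns an ordinary NumPy array, so any aggregation after that is plain NumPy.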

I have designed hybrid SQL / HDF5 storage strategies when needed, and they perform rather well.
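
One possible shape for such a hybrid, purely as an assumption about how the split could be done (not a description of any particular production setup): keep the bulk arrays in HDF5 and a small searchable catalog in SQL. SQLite stands in for the SQL side here, and the table, column, and dataset names are hypothetical.

```python
import os
import sqlite3
import tempfile

import h5py
import numpy as np

workdir = tempfile.mkdtemp()
h5_path = os.path.join(workdir, "cubes.h5")

# Bulk numeric data goes into HDF5 (intermediate groups are created as needed).
with h5py.File(h5_path, "w") as f:
    f.create_dataset("experiments/run_001", data=np.ones((4, 5, 6)))
    f.create_dataset("experiments/run_002", data=np.full((4, 5, 6), 2.0))

# Small, searchable metadata goes into SQL and points at the HDF5 datasets.
db = sqlite3.connect(os.path.join(workdir, "catalog.db"))
db.execute("CREATE TABLE runs (name TEXT, instrument TEXT, h5_dataset TEXT)")
db.executemany(
    "INSERT INTO runs VALUES (?, ?, ?)",
    [
        ("run_001", "probe_a", "experiments/run_001"),
        ("run_002", "probe_b", "experiments/run_002"),
    ],
)
db.commit()

# SQL answers "which cubes?", HDF5 delivers the numbers.
rows = db.execute(
    "SELECT h5_dataset FROM runs WHERE instrument = ?", ("probe_b",)
).fetchall()
with h5py.File(h5_path, "r") as f:
    totals = {name: float(f[name][...].sum()) for (name,) in rows}
db.close()
```

The SQL side stays small and easy to query, while the arrays never pass through the relational database.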

If you really need the MDX query language:

  • as an ORM (see earlier answers on Stack Overflow)

  • cubulus (although only a subset of MDX is implemented)

  • run the OLAP server of your choice separately and communicate with it through an ad-hoc interface (this might even be XML over HTTP).

lgautier
Perhaps HDF5 is a better solution for storage, but I absolutely need a database like PG + OLAP, with a simple query language like MDX or SQL, to manipulate, aggregate, and visualize dimensions easily. It's for beginner scientist users...
reyman64
+1  A: 

I don't know Python, but I am the author of Mondrian/olap4j.

If you can use py4j to access olap4j, great. If not, definitely consider XMLA. It may not be as slow as you think (unless Python's XML parsing is slow). The biggest problem is the complexity of constructing the SOAP requests and understanding the responses.
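
To show roughly how much work the request side is, here is a sketch of an XMLA Execute request built with Python's standard library only. The endpoint URL, data source string, catalog name, and MDX statement are placeholders you would replace with your own server's values; parsing the response is not covered.

```python
import urllib.request
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"
XMLA_NS = "urn:schemas-microsoft-com:xml-analysis"


def build_execute_request(mdx, data_source, catalog):
    """Return an XMLA Execute SOAP envelope as a string."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    execute = ET.SubElement(body, f"{{{XMLA_NS}}}Execute")

    # The MDX statement goes in Command/Statement.
    command = ET.SubElement(execute, f"{{{XMLA_NS}}}Command")
    ET.SubElement(command, f"{{{XMLA_NS}}}Statement").text = mdx

    # Connection properties go in Properties/PropertyList.
    props = ET.SubElement(execute, f"{{{XMLA_NS}}}Properties")
    plist = ET.SubElement(props, f"{{{XMLA_NS}}}PropertyList")
    ET.SubElement(plist, f"{{{XMLA_NS}}}DataSourceInfo").text = data_source
    ET.SubElement(plist, f"{{{XMLA_NS}}}Catalog").text = catalog
    ET.SubElement(plist, f"{{{XMLA_NS}}}Format").text = "Multidimensional"

    return ET.tostring(envelope, encoding="unicode")


def post_request(url, payload):
    """POST the envelope to an XMLA endpoint and return the raw response bytes."""
    request = urllib.request.Request(
        url,
        data=payload.encode("utf-8"),
        headers={
            "Content-Type": "text/xml; charset=utf-8",
            "SOAPAction": '"urn:schemas-microsoft-com:xml-analysis:Execute"',
        },
    )
    with urllib.request.urlopen(request) as response:
        return response.read()


# Placeholder values: adjust to whatever your server actually exposes.
mdx = "SELECT {[Measures].[Unit Sales]} ON COLUMNS FROM [Sales]"
payload = build_execute_request(mdx, "Provider=Mondrian", "FoodMart")
# raw = post_request("http://localhost:8080/mondrian/xmla", payload)  # assumed URL
```

The response comes back as a SOAP envelope as well, and the same ElementTree module can parse it, although mapping the axes and cells back into something usable is where most of the effort goes.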

Julian

Julian Hyde