Hi,
Please, excuse me in advance for my poor english as it is not my mother tongue.
We have a national application & the users would like to have accurate business statistics regarding some tables.
We are using tomcat, Spring Ws & hibernate on top of that.
We have think about many solutions :
plain old query for each user request
The problem is those tables contains millions of records. Every query will take many seconds at least. Solution never used.the actual solution used: create trigger.
But it is painful to create & difficult to maintain (no OO, no cool EDI, no real debug).The only helping part is the possibility to create Junit Test on a higher level to verify the expected result.
And for each different statistic on a table we have to create an other trigger for this table.Using the quartz framework to consolidate data after X minutes.
I have learn that databases are NOT DESIGNED for these heavy and complicated queries.
A separate data warehouse opyimize for reading only queries will be better. (OLAP??) But I don't have any clue where to start with postGresql. (pentaho is the solution or just a part?)
- How could we extract datas from the production database ? Using some extractor ?
- And when ?Every night ?
- If it is periodically - How will we manage to maintain near real time statistics if the data are just dumped on our datawarehouse one time per day ?
THanks in advance for your answer I really appreciate.