My company has a solution that takes a radically different approach from traditional warehousing and map/reduce. We turn the model upside down by starting with the Answer Set that is required and then working backwards to figure out the best way to load and store the data to maximize throughput and query speed.
We call it the ProcessOnce Toolkit. We take massive datasets and put them through a two-stage process: the first stage normalizes and compresses the data into what we call our Staging File. At that point you have a queryable flat table with all of the dimensions (columns) you would ever want to analyze.
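To give a rough feel for stage one, here's a simplified Python sketch of one way to normalize a flat table: dictionary-encoding each dimension column so the file stores compact integer codes plus a small lookup table per column. This is an illustration only, not our actual Staging File format, and the column names are made up:

```python
def build_staging(rows, dim_cols):
    """Dictionary-encode dimension values as integers; return coded rows plus per-column lookup tables."""
    dicts = {c: {} for c in dim_cols}  # column -> {value: integer code}
    staged = []
    for row in rows:
        coded = dict(row)
        for c in dim_cols:
            codes = dicts[c]
            # Reuse the existing code for a value, or assign the next one.
            coded[c] = codes.setdefault(row[c], len(codes))
        staged.append(coded)
    return staged, dicts

raw = [
    {"region": "East", "product": "A", "revenue": 120.0},
    {"region": "West", "product": "A", "revenue": 80.0},
    {"region": "East", "product": "B", "revenue": 45.0},
]
staging, dim_dicts = build_staging(raw, ["region", "product"])
print(staging)    # dimension strings replaced by compact integer codes
print(dim_dicts)  # {'region': {'East': 0, 'West': 1}, 'product': {'A': 0, 'B': 1}}
```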
The second stage is where the real magic happens: we use proprietary mathematical algorithms to pre-compute ALL of the combinations of Metrics and Dimensions (including distinct counts) as we build the file we call the Answer Set. This super-compressed Answer Set file is ODBC compatible, so you can point any front-end tool you want at it. Since all the heavy lifting is complete by that point, queries run at "the speed of thought" because all they are really doing is looking up answers, not scanning huge tables.
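Conceptually, the pre-computation resembles building a full data cube: one aggregate for every subset of dimensions, so any query reduces to a key lookup. Here's a simplified Python sketch of that idea; it's an illustration, not our proprietary algorithm (the combination count grows quickly with dimensions, which is exactly why the compression in the Answer Set format matters), and the column names are made up:

```python
from itertools import combinations

def build_answer_set(rows, dims, measure, distinct_col):
    """Materialize an aggregate for every subset of dimensions (a full data cube)."""
    answers = {}
    for row in rows:
        # Every subset of dimensions, from the grand total (empty subset)
        # up to the full group-by on all dimensions.
        for r in range(len(dims) + 1):
            for subset in combinations(dims, r):
                key = (subset, tuple(row[d] for d in subset))
                cell = answers.setdefault(key, {"sum": 0.0, "rows": 0, "distinct": set()})
                cell["sum"] += row[measure]
                cell["rows"] += 1
                cell["distinct"].add(row[distinct_col])  # supports distinct counts
    return answers

sales = [
    {"region": "East", "product": "A", "customer": "c1", "revenue": 120.0},
    {"region": "West", "product": "A", "customer": "c2", "revenue": 80.0},
    {"region": "East", "product": "B", "customer": "c1", "revenue": 45.0},
]
cube = build_answer_set(sales, ["region", "product"], "revenue", "customer")

# A "query" is now a dictionary lookup, not a table scan:
cell = cube[(("region",), ("East",))]
print(cell["sum"], len(cell["distinct"]))  # 165.0 revenue, 1 distinct customer in East
```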
With this methodology and toolkit, we can take multibillion-row tables and completely process them (from raw data to pre-computed Answer Set) in minutes to a few hours, all running on commodity Linux servers.
In another post, someone asked how many users you have. With our Toolkit, it doesn't matter how many, because you can think of the Answer Set file mentioned above as a PDF of the data. Just like a PDF, the file can be copied to any location or served across the Internet. Our client-side ODBC driver is the equivalent of Adobe Acrobat Reader. So you can have as many users as you want.
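Because the interface is standard ODBC, the client side looks like any other database connection. Here's what a query might look like from Python through a generic ODBC library (pyodbc here); the DSN name and table are made-up examples, not our shipped configuration:

```python
import pyodbc

# "AnswerSet" is a hypothetical DSN configured to use the client-side driver.
conn = pyodbc.connect("DSN=AnswerSet;")
cursor = conn.cursor()
cursor.execute("SELECT region, SUM(revenue) FROM sales GROUP BY region")
for region, total in cursor.fetchall():
    print(region, total)
conn.close()
```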
You mentioned that you have hierarchies in your data. We fully support hierarchies and have a feature we call KeyBack, which lets you drill down to the detailed record level behind any query. Essentially, the Answer Set talks back to the Staging File to get the detailed records only when needed.
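In spirit, KeyBack works like the simplified sketch below: each answer cell keeps keys back into the Staging File, and the detail rows are fetched only when a drill-down asks for them. Again, this is an illustration of the idea, not our actual implementation:

```python
def build_cube_with_keys(staging_rows, dim):
    """Aggregate by one dimension, keeping row ids that 'key back' to the staging file."""
    cube = {}
    for i, row in enumerate(staging_rows):
        cell = cube.setdefault(row[dim], {"sum": 0.0, "row_ids": []})
        cell["sum"] += row["revenue"]
        cell["row_ids"].append(i)  # pointer back into the Staging File
    return cube

def drill_down(cube, staging_rows, value):
    """Return the detail records behind one aggregate cell, fetched on demand."""
    return [staging_rows[i] for i in cube[value]["row_ids"]]

staging = [
    {"region": "East", "product": "A", "revenue": 120.0},
    {"region": "West", "product": "A", "revenue": 80.0},
    {"region": "East", "product": "B", "revenue": 45.0},
]
cube = build_cube_with_keys(staging, "region")
print(cube["East"]["sum"])                 # aggregate answered from the cube: 165.0
print(drill_down(cube, staging, "East"))   # detail rows fetched only when needed
```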
For your application, our Toolkit would handle the daily processing in approximately 2 hours, running on a $3,000 Linux server, with a Toolkit license cost of $5,000/month. You can serve the data to users from any old web server you want. The license covers unlimited data volume on that 4-processor server configuration and an unlimited number of users accessing the data.
The company is called BigEye Analytics; we developed this Toolkit and have been using it internally for many years to service our consulting clients in the database marketing arena. In addition, a limited number of large telecom companies have been using the Toolkit for network analysis. We have a placeholder up at www.bigeyeanalytics.com as we get ready to launch the Toolkit commercially in the next month.
You can catch me on Skype if you'd like more details: davegoulden1.