I've really been struggling to make SQL Server into something that, quite frankly, it will never be. I need a database engine for my analytical work. The DB needs to be fast and does NOT need all the logging and other overhead found in typical databases (SQL Server, Oracle, DB2, etc.)
Yesterday I listened to Michael Stonebraker speak at the Money:Tech conference and I kept thinking, "I'm not really crazy. There IS a better way!" He talks about using column stores instead of row oriented databases. I went to the Wikipedia page for column stores and I see a few open source projects (which I like) and a few commercial/open source projects (which I don't fully understand).
My question is this: In an applied analytical environment, how do the different column based DB's differ? How should I be thinking about them? Anyone have practical experience with multiple column based systems? Can I leverage my SQL experience with these DBs or am I going to have to learn a new language?
I am ultimately going to be pulling data into R for analysis.
EDIT: I was requested for some clarification in what exactly I am trying to do. So, here's an example of what I would like to do: Create a table that has 4 million rows and 20 columns (5 dims, 15 facts). Create 5 aggregation tables that calculate max, min, and average for each of the facts. Join those 5 aggregations back to the starting table. Now calculate the percent deviation from mean, percent deviation of min, and percent deviation from max for each row and add it to the original table. This table data does not get new rows each day, it gets TOTALLY replaced and the process is repeated. Heaven forbid if the process must be stopped. And the logs... ohhhhh the logs! :)