I have a database containing tables with more than 600 million records and a set of stored procedures that perform complex search operations on the database. The performance of the stored procedures is very slow, even with suitable indexes on the tables. The design of the database is a normal relational design. I want to change the database design to be multidimensional and use MDX queries instead of the traditional T-SQL queries, but the question is: is an MDX query better than a traditional T-SQL query with regard to performance? And if yes, to what extent would that improve the performance of the queries?

Thanks for any help.

+1  A: 

"The performance of the stored procedures is so slow even with suitable indexes"

I'd be surprised if the stored procedures are the real problem; maybe the way the procedures are used is slow, but a stored procedure is not slow by definition. Have you found out what about your procedures is slow? Have you profiled them? I would take a long, hard look at that before redesigning my database. Multi-dimensional databases are for OLAP. Is your database strictly an OLAP database, or is it a hybrid of OLAP and OLTP? Maybe you need to de-normalize and replicate data from your OLTP design into a de-normalized structure? 600 million records in a table is not by any means huge - it's not small, but that doesn't lead me to believe that dropping stored procedures will magically make things fast. Profile your stored procs and see where the performance bottlenecks are before jumping into a bigger project to fix the issue.
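As a rough sketch of where to start (assuming SQL Server 2008 or later; the DMV below only reflects plans currently in the cache, so run it after a representative workload):

-- Find the most expensive stored procedures by total CPU time.
SELECT TOP (20)
    OBJECT_NAME(ps.object_id, ps.database_id) AS procedure_name,
    ps.execution_count,
    ps.total_worker_time   AS total_cpu_us,
    ps.total_elapsed_time  AS total_elapsed_us,
    ps.total_logical_reads
FROM sys.dm_exec_procedure_stats AS ps
ORDER BY ps.total_worker_time DESC;

-- For a single suspect query, measure I/O and timing directly:
SET STATISTICS IO ON;
SET STATISTICS TIME ON;
-- ...run the query here and read the Messages output...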

StarShip3000
A simple query like [select id from articles where CategoryName in ('A','B','C')] with an index on CategoryName takes about 60 seconds to return the result. By the way, the database contains only static data, but it was designed as an OLTP database.
SubPortal
What query plan does that give you? How many rows does it return? Is column id indexed? The IN on ('A','B','C') may not be able to use the index efficiently.
StarShip3000
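(As a sketch only, reusing the table and column names from the comment above: a covering non-clustered index would let that query be answered from the index alone, with no lookups into the base table. If id is already the clustered index key, it is carried in every non-clustered index automatically and the INCLUDE is redundant.)

CREATE NONCLUSTERED INDEX IX_articles_CategoryName_id
    ON dbo.articles (CategoryName)
    INCLUDE (id);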
Here's a link that has some high level tips that might be useful http://blogs.techrepublic.com.com/datacenter/?p=173
StarShip3000
+1  A: 

An MS SSAS OLAP cube can be used in several storage modes:

  1. Relational OLAP (ROLAP) - the data and metadata stay in your DB and a few more materialized views are added. May or may not be faster.

  2. Hybrid OLAP (HOLAP) - metadata and (pre-calculated) aggregations are stored on a new server running an SSAS instance. This should speed up all the queries that use aggregations, like "total employee hours for last year by month", but queries which drill through to specific records may be as slow as before.

  3. Multi-dimensional OLAP (MOLAP) where all your data plus metadata and aggregations are copied to the SSAS server. This is usually the fastest, but duplicates storage.

Before starting this, you should consider optimizing your table layout for reporting and analytics; in other words, use a data warehouse (DW) -- put your data into Kimball-style star dimension and fact tables. Then load the DW periodically using ETL (SSIS) and point your reporting and analytics to the DW. You may not need SSAS at all -- SQL queries running against a star table layout are usually considerably faster than against a normalized (operational) database. If this is still too slow, build SSAS cubes on top of the DW. Once you start loading your DW, you may be able to remove records from your operational database, making it faster for everyday use.
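To make the star layout concrete, here is a minimal sketch; the table and column names are invented for illustration, and your real dimensions and facts depend on what you search and aggregate on:

-- Dimension: one row per category, with a surrogate key.
CREATE TABLE dbo.DimCategory (
    CategoryKey   int IDENTITY(1,1) PRIMARY KEY,
    CategoryName  varchar(100) NOT NULL
);

-- Dimension: one row per calendar day.
CREATE TABLE dbo.DimDate (
    DateKey       int PRIMARY KEY,   -- e.g. 20100315
    CalendarDate  date NOT NULL,
    [Year]        smallint NOT NULL,
    [Month]       tinyint NOT NULL
);

-- Fact: one narrow row per event, keyed by the dimensions.
CREATE TABLE dbo.FactArticle (
    ArticleId     bigint NOT NULL,
    CategoryKey   int NOT NULL REFERENCES dbo.DimCategory (CategoryKey),
    DateKey       int NOT NULL REFERENCES dbo.DimDate (DateKey),
    SomeMeasure   decimal(18,2) NOT NULL
);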

To summarize, my rule of thumb would be:
1. Build a DW and set up your ETL process.
2. Try T-SQL reports against the DW; it may be good enough (see the sketch after this list).
3. If it is still too slow, build SSAS cubes (on top of the DW) in HOLAP mode and use MDX to query them.
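For step 2, a typical report against the hypothetical star tables sketched above would look roughly like this (names are illustrative only):

-- Monthly totals per category: the shape of query a star schema
-- (and, later, an SSAS cube) is built to answer quickly.
SELECT
    d.[Year],
    d.[Month],
    c.CategoryName,
    SUM(f.SomeMeasure) AS TotalMeasure,
    COUNT_BIG(*)       AS RowCnt
FROM dbo.FactArticle AS f
JOIN dbo.DimDate     AS d ON d.DateKey     = f.DateKey
JOIN dbo.DimCategory AS c ON c.CategoryKey = f.CategoryKey
GROUP BY d.[Year], d.[Month], c.CategoryName
ORDER BY d.[Year], d.[Month], c.CategoryName;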

Damir Sudarevic
+6  A: 

Apples and oranges: an Analysis Services OLAP cube is a fundamentally different type of storage than a SQL Server database, and they are designed to do different things. Technically, MDX is not "faster" than T-SQL, or vice versa -- they are just languages, designed for different needs.

Having said that, a cube is usually what works best for numeric analysis of static data, such as aggregating large numbers of sales/transactions/whatever records over time. In contrast, a traditional relational database generally works just fine for search, provided the schema and indexes are well constructed. A simple way to judge: if your SQL queries have to do a lot of

select grock, sum(foo), min(foo), max(foo), avg(foo)
from bar
group by grock -- Ideal Analysis Services problem

then a cube may help (it's designed for aggregate math functions - sum() and group by). OTOH if your queries do a lot of

select cols 
from foo 
where <complicated search> -- Not so much

then a cube probably will not help, and I would focus instead on tuning the schema, the queries and indexing, and perhaps table partitioning if the data can be suitably partitioned.
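If partitioning does apply, a bare-bones sketch looks something like this (assuming, purely for illustration, that the big table can be split on a date column):

-- Partition function: one range per year.
CREATE PARTITION FUNCTION pfByYear (date)
    AS RANGE RIGHT FOR VALUES ('2008-01-01', '2009-01-01', '2010-01-01');

-- Partition scheme: map every partition to PRIMARY here;
-- in practice you would spread them across filegroups.
CREATE PARTITION SCHEME psByYear
    AS PARTITION pfByYear ALL TO ([PRIMARY]);

-- Build the large table on the scheme, clustered on the
-- partitioning column plus the key.
CREATE TABLE dbo.BigFacts (
    FactDate  date   NOT NULL,
    FactId    bigint NOT NULL,
    Payload   varchar(200) NULL,
    CONSTRAINT PK_BigFacts PRIMARY KEY CLUSTERED (FactDate, FactId)
) ON psByYear (FactDate);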

Do you have a clustered index and covering non-clustered indexes that match the queries?

onupdatecascade
A: 

Have you considered PowerPivot (the Excel add-in)? It uses columnar compression to shrink the data by roughly 95%, locally, so you can analyze to your heart's content.

http://technet.microsoft.com/en-us/library/ee210692.aspx

WoundedEgo