views: 823

answers: 6
I have been interested in SSD drives for quite some time. I do a lot of work with databases, and I'd very much like to find benchmarks such as TPC-H performed with and without SSD drives.

On the surface it sounds like there would be one, but unfortunately I have not been able to find one. The closest I've found to an answer is the first comment on this blog post:

http://dcsblog.burtongroup.com/data_center_strategies/2008/11/intels-enterprise-ssd-performance.html

The fellow who wrote it seemed to be a pretty big naysayer about SSD technology in the enterprise, citing a lack of performance under mixed read/write workloads.

There have been other benchmarks, such as this and this, that show absolutely ridiculous numbers. While I don't doubt them, I am curious whether what the commenter in the first link said is in fact true.

Anyway, if anybody can find benchmarks of databases on SSDs, that would be excellent.

+1  A: 

A whitepaper highlighted by Paul Randal

My personal and humble view (in a SQL Server sense): useful for hosting tempdb and maybe the log files.

gbn
+4  A: 

I've been testing and using them for a while, and whilst I have my own opinions (which are very positive), I think Anandtech.com's testing write-up is far better than anything I could have written. See what you think:

http://www.anandtech.com/show/2739

Regards,

Phil.

Chopper3
A: 

The formal TPC benchmarks will probably take a while to appear using SSD because there are two parts to a TPC result - the speed (transactions per unit time) and the cost per (transaction per unit time). With the high speed of SSD, you have to scale the size of the DB even larger, thus using more SSD, and thus costing more. So, even though you might get superb speed, the cost is still prohibitive for a fully-scaled (auditable, publishable) TPC benchmark. This will remain true for a few years yet, while SSD remains more expensive than the corresponding quantity of spinning disk.
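
To make the cost argument concrete, here is a minimal sketch of how the two TPC metrics interact. All system figures are hypothetical, invented purely to illustrate why a faster but pricier SSD configuration can still lose on price/performance:

    # Hypothetical illustration of the two TPC metrics: raw throughput and
    # price/performance. All numbers are made up for the example.

    def price_performance(system_cost_usd: float, throughput: float) -> float:
        """Cost per unit of throughput, e.g. $/tpmC or $/QphH."""
        return system_cost_usd / throughput

    # Spinning-disk system: slower, but cheap storage dominates the bill.
    hdd_cost, hdd_tpm = 500_000, 250_000      # $500k system, 250k transactions/min
    # SSD system: say 3x the throughput, but the fully scaled database needs
    # proportionally more of the expensive flash.
    ssd_cost, ssd_tpm = 2_500_000, 750_000    # $2.5M system, 750k transactions/min

    print(f"HDD: ${price_performance(hdd_cost, hdd_tpm):.2f}/tpm")  # $2.00/tpm
    print(f"SSD: ${price_performance(ssd_cost, ssd_tpm):.2f}/tpm")  # $3.33/tpm

In this made-up example the SSD configuration wins on raw speed yet still loses the audited price/performance comparison, which is exactly the scaling problem described above.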

Jonathan Leffler
+2  A: 

The issue with SSDs is that they make real sense only when the schema is normalized to 3NF or 5NF, thus removing "all" redundant data. Moving a "denormalized for speed" mess to SSD will not be fruitful; the mass of redundant data will make the required SSD capacity too cost-prohibitive.

Doing that for an existing application means redefining the existing tables as views, encapsulating the normalized tables behind the curtain. There is a CPU penalty on the engine to synthesize the denormalized rows. The more denormalized the original schema, the greater the benefit of refactoring before the move to SSD. Even on SSD, these denormalized schemas will likely run slower, due to the mass of data which must be retrieved and written.
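
Here is a minimal sketch of that refactoring, using sqlite3 from the Python standard library. The table and column names are invented for illustration: a denormalized orders table is split into normalized tables, and a view with the old name keeps existing application queries working unchanged:

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        -- Normalized tables: customer data stored once, not on every order row.
        CREATE TABLE customers (
            customer_id INTEGER PRIMARY KEY,
            name        TEXT NOT NULL,
            city        TEXT NOT NULL
        );
        CREATE TABLE orders_n (
            order_id    INTEGER PRIMARY KEY,
            customer_id INTEGER NOT NULL REFERENCES customers(customer_id),
            amount      REAL NOT NULL
        );
        -- Compatibility view with the old denormalized shape; the engine pays
        -- some CPU on every query to synthesize these rows.
        CREATE VIEW orders AS
            SELECT o.order_id, o.amount,
                   c.name AS customer_name, c.city AS customer_city
            FROM orders_n AS o
            JOIN customers AS c USING (customer_id);
    """)

    conn.execute("INSERT INTO customers VALUES (1, 'Acme Corp', 'Springfield')")
    conn.execute("INSERT INTO orders_n VALUES (100, 1, 42.50)")

    # Existing application code still queries "orders" exactly as before.
    print(conn.execute("SELECT * FROM orders").fetchall())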

Putting logs on SSD is not indicated: logging is a sequential, write-mostly (write-only, under normal circumstances) operation, and the physics of flash-type SSDs make it a poor fit. (RAM-based SSDs are a different matter; a company named Texas Memory Systems has been building RAM-based sub-systems for a long time.) Conventional spinning-rust drives, duly buffered, will do fine.
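
As a rough illustration of the access-pattern difference, here is a minimal sketch contrasting the append-only log pattern with random in-place data writes. The file names and sizes are invented, and because it does not bypass the OS cache (no O_DIRECT, no per-write fsync) it only shows the shape of such a test, not real drive numbers:

    import os, random, time

    BLOCK = 8192      # 8 KiB writes
    OPS = 5_000

    # Sequential, append-only writes: the transaction-log pattern.
    start = time.perf_counter()
    with open("log_pattern.bin", "wb", buffering=0) as f:
        for _ in range(OPS):
            f.write(b"x" * BLOCK)
    seq = time.perf_counter() - start

    # Random in-place writes: the data-file pattern.
    size = OPS * BLOCK
    with open("data_pattern.bin", "wb") as f:
        f.truncate(size)                      # pre-size the file
    start = time.perf_counter()
    with open("data_pattern.bin", "r+b", buffering=0) as f:
        for _ in range(OPS):
            f.seek(random.randrange(0, size - BLOCK, BLOCK))
            f.write(b"x" * BLOCK)
    rnd = time.perf_counter() - start

    print(f"sequential append: {OPS / seq:,.0f} writes/s")
    print(f"random in-place:   {OPS / rnd:,.0f} writes/s")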

Note the AnandTech articles: the Intel drive was the only one that worked right. That will likely change by the end of 2009, but as of now only the Intel drives qualify for serious use.

A: 

Commenting on:

"...quite interested to find benchmarks such as TPC-H performed with and without SSD drives."

(FYI and full disclosure, I am pseudonymously "J Scouter", the "pretty big naysayer when it came to SSD technology in the enterprise" referred to and linked above.)

So... here's the first clue to emerge.

Dell and Fusion-IO have published the first EVER audited benchmark using a Flash-memory device for storage.

The benchmark is TPC-H, which is a "decision support" benchmark. This is important because TPC-H entails an almost exclusively "read-only" workload pattern -- a perfect context for SSD, as it completely avoids the write-performance problem.

In the scenarios painted for us by the Flash SSD hypesters, this application represents a soft pitch, a gentle lob right over the plate, and an easy "home run" for a Flash-SSD database application.

The results? The very first audited benchmark for a flash-SSD-based database application, and a READ ONLY one at that, resulted in (drum roll here)... a fifth-place finish among comparable (100 GB) systems tested.

This Flash SSD system produced about 30% as many Queries-per-hour as a disk-based system result published by Sun... in 2007.

Surely, though, it will be in price/performance that this Flash-based system wins, right?

At $1.46 per Query-per-hour, the Dell/Fusion-IO system finishes in third place, at more than twice the cost per Query-per-hour of the best disk-based price/performance system.

And again, remember this is for TPC-H, a virtually "read-only" application.

This is pretty much exactly in line with what the MS Cambridge Research team discovered over a year ago -- that there are no enterprise workloads where Flash makes ROI sense from an economic or energy standpoint.

Can't wait to see TPC-C, TPC-E, or SPC-1 results, but according to the research paper linked above, SSDs will need to become orders of magnitude cheaper before they ever make sense in enterprise apps.

+2  A: 

I've been running a fairly large SQL Server 2008 database on SSDs for nine months now (600 GB, over 1 billion rows, 500 transactions per second). I would say that most SSD drives I tested are too slow for this kind of use. But if you go with the upper-end Intels and carefully pick your RAID configuration, the results will be awesome. We're talking 20,000+ random reads/writes per second. In my experience, you get the best results if you stick with RAID 1. I can't wait for Intel to ship the 320 GB SSDs! They are expected to hit the market in September 2009...
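
For anyone wanting to sanity-check numbers like the 20,000+ random operations per second quoted above, here is a minimal random-read sketch in Python. It is not the poster's test harness, the file path is invented, and without O_DIRECT it largely measures the page cache; real benchmarking uses a tool such as fio with deep queues and long runs:

    import os, random, time

    PATH = "testfile.bin"            # hypothetical path on the drive under test
    SIZE = 256 * 1024 * 1024         # 256 MiB test file (real tests use far more)
    BLOCK = 4096                     # 4 KiB random reads
    OPS = 20_000

    # Create the test file once if it does not already exist.
    if not os.path.exists(PATH):
        with open(PATH, "wb") as f:
            f.write(os.urandom(SIZE))

    with open(PATH, "rb", buffering=0) as f:
        start = time.perf_counter()
        for _ in range(OPS):
            f.seek(random.randrange(0, SIZE - BLOCK, BLOCK))  # aligned offset
            f.read(BLOCK)
        elapsed = time.perf_counter() - start

    print(f"{OPS / elapsed:,.0f} random {BLOCK}-byte reads/second")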

Dennis Kashkin