I like InnoDB's safety, consistency, and self-checking.

But I need MyISAM's speed and light weight.

How can I make MyISAM less prone to corruption due to crashes, bad data, etc.? It takes forever to go through a check (either CHECK TABLE or myisamchk).

I'm not asking for transactional security -- that's what InnoDB is for. But I do want a database I can restart quickly rather than hours (or days!) later.

UPDATE: I'm not asking how to load data into tables faster. I've beat my head against that already, and determined that using the MyISAM tables for my LOAD DATA is simply much faster. What I'm after now is mitigating the risks of using MyISAM tables. That is, reducing chances of damage, increasing speed of recovery.

A: 

Are you married to MySQL? Postgres is ACID-compliant (like InnoDB) and (when well-tuned) nearly as speedy as MyISAM.

Tim Howland
Yes I'm married to MySQL. Thanks for playing...
JBB
+3  A: 

MyISAM's supposed speed benefits can actually go away pretty quickly - the fact that it lacks row-level locking means small updates can cause large amounts of data to be locked, and queries to block. Because of that, I'm skeptical of claimed MyISAM speed benefits: start doing several UPDATEs, and the queries per second will tank.

I think you're better off asking "How can applications backed with InnoDB be made faster?" and the answer then deals with caching data, perhaps at the object level, in lightweight caches - there is a cost for ACID, and for, say, web applications, it's not really needed.

If UPDATEs are rare (if they aren't, MyISAM isn't a good choice) then you can even use the MySQL query cache.

memcached (http://www.danga.com/memcached/) is a very popular option for object caching. Depending on your application you have other options as well (HTTP caches, etc.)

Daniel Papasian
No, the major problem is the amazingly disk-intensive initial import of data into the table. MyISAM time: 12 minutes. InnoDB time: 3+ hrs. After my initial load, UPDATEs are non-existent and INSERTs are rare. No known solution to InnoDB's disappointing load operation.
JBB
Where is the time being spent? Are you disabling the keys on the tables, loading the data, and then enabling keys? Is the time spent getting the data into the table, or rebuilding the keys? Are you using multiple INSERT statements, or using LOAD DATA?
Daniel Papasian
Time's being spent writing to the disk. "disable keys" doesn't work for innodb tables, only for myisam. innodb builds and rebuilds the keys as you insert data. Times for enhanced INSERT statements and LOAD DATA INFILE statements (via mysqlimport) are very similar.
JBB
Have you followed everything on http://dev.mysql.com/doc/refman/5.0/en/innodb-tuning.html ? Turning off autocommit, grouping many inserts into a single commit, turning off unique checks and foreign key checks, and so on?
Daniel Papasian
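For reference, the tuning steps named in the last comment can be sketched roughly like this. It is only a minimal sketch in Python that builds the SQL statements for one load session in order; the table name `big_table` and the file path are hypothetical, and whether it closes the gap with MyISAM on a given data set is something only a benchmark will show.

```python
def innodb_bulk_load_statements(table, data_path):
    """Return the SQL statements for one bulk-load session, in order.

    `table` and `data_path` are placeholders supplied by the caller.
    """
    return [
        "SET autocommit = 0",          # group the whole load into one commit
        "SET unique_checks = 0",       # skip uniqueness checks during the load
        "SET foreign_key_checks = 0",  # skip foreign key checks during the load
        f"LOAD DATA INFILE '{data_path}' INTO TABLE {table}",
        "COMMIT",
        "SET unique_checks = 1",       # restore the checks afterwards
        "SET foreign_key_checks = 1",
    ]


if __name__ == "__main__":
    # Hypothetical table and file names, for illustration only.
    for stmt in innodb_bulk_load_statements("big_table", "/tmp/big_table.csv"):
        print(stmt + ";")
```

The statements would be run in a single client session, since the `SET` variables are per-session.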
+1  A: 

The performance advantages of MyISAM are actually pretty minimal in some cases; you need to benchmark your own application with MyISAM vs InnoDB. Using the InnoDB transactional engine exclusively gives other benefits too.

In my testing, InnoDB typically uses about 150% more disc space than MyISAM; this is because of its block structure and lack of index compression.

If you can afford it, just use InnoDB instead.

As far as answering your actual question goes: If you partition your table into multiple MyISAM tables, the amount of repair needed in a crash will be much less; if your data are large, this might be a good idea anyway for other reasons.

MarkR
A: 

Your comment:

No, the major problem is the amazingly disk-intensive initial import of data into the table. MyISAM time: 12 minutes. InnoDB time: 3+ hrs. After my initial load, UPDATEs are non-existent and INSERTs are rare. No known solution to InnoDB's disappointing load operation.

suggests that dropping constraints and indexes, then enabling or rebuilding them after the load, may significantly speed it up. I assume you tried that? Did that improve things?

Tim Howland
A: 

Given you talk of hours or days, I reckon you are managing some very large datasets.

I did some googling, but only found this post on the MySQL Performance blog.

Basically this tells you how to trade crash recovery time (length of accumulated operations in log-file) against operational speed (constantly flushing the log to disk or not).

Looks like this optimization can be very time consuming though. Hope this helps.

pi
A: 

This really depends a lot on how you use the tables. If they are write heavy, then you may want to consider removing indexes, which will speed up the recovery time. If they are read heavy, you may want to consider using replication, which will serialise all writes to your tables, minimising the recovery time for your read copy after a crash.

One thing you could do is write to an InnoDB copy of the table, and then replicate to a MyISAM copy. The performance benefits of MyISAM are mostly read-oriented anyway.

Using replication, of course, you will have lag time between reads and writes.

Marc Gear
A: 

Get a good UPS, with decent power conditioning. Run on stable and redundant hardware.

I don't trust MyISAM tables to ever survive a crash during a write, so I think your best bet is on reducing the occurrence of crashes (and writes).

Daniel Papasian
+1  A: 

in normal practice, you shouldn't get corruption. if you are getting corruption, you need to look at things like bad memory, bad hard drive, bad drive controller, or possibly a mysql bug.

if you want to side-step all that, you could set up a replication slave. when the master dies, stop the replication on the slave and make it your new master. then clear the data off your old master and set it up as a slave. user down-time will be limited to the amount of time it takes to detect that the master died and bring the slave up.

this has the added benefit of being a good way to achieve a zero-downtime backup: shut down the slave process and back up the slave.
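
a rough sketch of that switch-over, written in Python only to lay out the SQL statements in order. the host name and replication account are hypothetical, the promoted slave is assumed to have binary logging enabled, and the binary log coordinates are left out on purpose (you would read them from SHOW MASTER STATUS on the new master).

```python
def promote_slave_statements():
    """Statements to run on the surviving slave to make it the new master."""
    return [
        "STOP SLAVE",    # stop applying events from the dead master
        "RESET MASTER",  # start a fresh binary log on the new master
    ]


def rejoin_as_slave_statements(new_master_host, repl_user, repl_password):
    """Statements to run on the rebuilt old master so it follows the new one.

    All three arguments are placeholders. Log file and position are omitted
    and would need to be added for a real setup.
    """
    return [
        "CHANGE MASTER TO "
        f"MASTER_HOST='{new_master_host}', "
        f"MASTER_USER='{repl_user}', "
        f"MASTER_PASSWORD='{repl_password}'",
        "START SLAVE",
    ]


if __name__ == "__main__":
    # Hypothetical host and credentials, for illustration only.
    steps = promote_slave_statements() + rejoin_as_slave_statements(
        "db2.example.com", "repl", "secret"
    )
    for stmt in steps:
        print(stmt + ";")
```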

A: 

While I agree with the innodb comments, I will give a solution to your MyISAM problem.

A good way to prevent corruption and increase speed would be to use MERGE tables.

You can use 2 or more MyISAM tables. One usually holds older, backed-up data that isn't used that often, and the other holds newer data. Then you will have two sets of MyISAM table files (.frm, .MYD and .MYI) on your hard disk, and one set will be protected. Usually you compress the old MyISAM tables, and then they will definitely not be corrupted, since they become read-only.

This technique is usually used to speed up big MyISAM tables, but you can apply it here as well.

Hope that helps with your question. While I realize it doesn't really crash-proof MyISAM, it does give quite a bit of protection.

Jonathan
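A minimal sketch of the MERGE layout described above, written in Python only to generate the DDL. The table names and the column list are hypothetical; the one hard requirement assumed here is that the underlying MyISAM tables share an identical definition.

```python
# Hypothetical column list; every underlying table must use the same one.
COLUMNS = "id INT NOT NULL, payload VARCHAR(255), KEY (id)"


def merge_table_ddl(merge_name, parts, columns=COLUMNS):
    """Return CREATE TABLE statements for the MyISAM parts and the MERGE table."""
    ddl = [f"CREATE TABLE {part} ({columns}) ENGINE=MyISAM" for part in parts]
    ddl.append(
        f"CREATE TABLE {merge_name} ({columns}) ENGINE=MERGE "
        f"UNION=({', '.join(parts)}) INSERT_METHOD=LAST"
    )
    return ddl


if __name__ == "__main__":
    # "events_old" would hold the rarely used data, "events_new" the recent rows.
    for stmt in merge_table_ddl("events", ["events_old", "events_new"]):
        print(stmt + ";")
```

With `INSERT_METHOD=LAST`, new rows inserted through the MERGE table go to the last table in the UNION list, so the older table is never written to and a crash during a write only leaves the newer, smaller table to check.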