All depends on the level of your paranoia and the sensitivity of the key/data. In the extreme cases, as soon as you have an unencrypted key in memory, one can retrieve it using coldboot techniques. There is an interesting development at frozencache to try to defeat that. I merely casually read it, did not try it in practice, but it seems like an interesting approach to try.
With the tinfoil hat off, though - (1), (2), (3) do seem reasonable. (4) won't cut it precisely for the reason you mentioned. (Not only it is slow, but assuming you read into the stack, with different stack depths the key might become visible more than once).
Assuming the decrypted data is worth it, and it would be in the swappable memory, you definitely should encrypt the swap itself as well. Also, the root, /tmp partitions should also be encrypted. This is a fairly standard setup which is readily available in most guides for the OSes.
And then, of course, you want to ensure the high level of physical security for the machine itself and minimize the functions that it performs - the less code runs, the less the exposure is. You also might want to see how you can absolutely minimize the possibilities for the remote access to this machine as well - i.e. use the RSA-keys based ssh, which would be blocked by another ACL controlled from another host. portknocking can be used as one of the additional vectors of authentications before being able to log in to that second host. To ensure that if the host is compromised, it is more difficult to get the data out, ensure this host does not have the direct routable connection to the internet.
In general, the more painful you make it to get to the sensitive data, the less chance someone is going to going to get there, however there this is also going to make the life painful for the regular users - so there needs to be a balance.
In case the application is serious and the amount of things at stake is high, it is best to build the more explicit overall threat model and see what are the possible attack vectors that you can foresee, and verify that your setup effectively handles them. (and don't forget to include the human factor :-)
Update: and indeed, you might use the specialized hardware to deal with the encryption/decryption. Then you don't have to deal with the storage of the keys - See Hamish' answer.