I have to store e-mail messages for use with our application. I keep the "metadata" for all messages in a relational database, but I don't feel comfortable keeping the message content (gigabytes to terabytes of e-mail data) in the database as well. I'm currently using IMAP as the storage layer, but I have doubts about whether I chose correctly. First, there is the problem of UIDVALIDITY and how to keep a permanent reference to a message inside IMAP. Second, I'm not sure this is the most robust solution in terms of backup/restore strategies, store corruption, replication, and so on. On the positive side, I can query IMAP by headers, because that data is mostly indexed.
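To make the UIDVALIDITY concern concrete, here is a minimal sketch of the kind of reference I would have to keep next to each metadata row. It assumes the usual IMAP rule that a UID is only meaningful together with the UIDVALIDITY value of its mailbox. The names `ImapRef` and `resolve` are hypothetical and only for illustration, and keeping the Message-ID header as a fallback search key is my own assumption, not something IMAP guarantees to be unique.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ImapRef:
    """Hypothetical record stored next to the metadata row in the database."""
    mailbox: str
    uidvalidity: int  # value the server reported when the mailbox was selected
    uid: int          # UID of the message under that UIDVALIDITY
    message_id: str   # Message-ID header, kept as a fallback search key (assumption)


def resolve(ref: ImapRef, current_uidvalidity: int) -> Optional[int]:
    """Return the stored UID if it can still be trusted, else None.

    If the server now reports a different UIDVALIDITY for the mailbox,
    every stored UID for that mailbox is stale, and the message has to be
    located again, for example by searching on the Message-ID header.
    """
    if ref.uidvalidity == current_uidvalidity:
        return ref.uid
    return None
```

So whenever the server changes UIDVALIDITY, `resolve` returns `None` for every reference into that mailbox, and I would have to re-discover each message and update the database. This is the part that makes me doubt IMAP as a permanent store.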
I don't know whether key-value stores (Cassandra, Tokyo Cabinet, Redis) are a better approach. How do they handle storing values ranging from 1 KB to 50 MB? How do they prevent corruption, and when corruption or a device failure does happen, how can I repair the store?