We have a console application (currently .NET) that sends out mail in bulk to people who have subscribed to a mailing list. We are re-implementing this application because of limitations in the current version.
One of the features it must have is the ability to resume after an unexpected interruption. This means that every time it successfully sends an e-mail, it has to record its progress in a way that lets it pick up right where it left off. It'll get the information it needs (basically the list of recipients, which are identified by a numeric id) from a different server, which hosts the database containing this information.
Our setup is simple: we have one Windows-based web/database server that contains the recipients, and we have the SMTP-server running Debian.
We have come up with several options that would solve this:
- Send a signal back to the database after every send operation
- Keep track in a small file by writing only the last id of the recipient to this file (overwriting its contents with each write) after every send operation.
- Keep track in a database that runs on the host machine (mysql, postgresql, sqlite, etc)
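To make the second option concrete, here is a minimal sketch of what I have in mind. It is written in Python purely for illustration (the real application is .NET), and the function names, the file name and the "batch id, space, recipient id" format are all assumptions of mine. It writes to a temporary file and renames it over the old one, so an interruption in the middle of a write should not leave a half-written progress file behind.

```python
import os
import tempfile


def save_checkpoint(path, batch_id, recipient_id):
    """Record the last recipient that was sent successfully (hypothetical helper)."""
    directory = os.path.dirname(os.path.abspath(path))
    # Write to a temporary file in the same directory so the rename below
    # stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".checkpoint-")
    try:
        with os.fdopen(fd, "w") as tmp:
            tmp.write(f"{batch_id} {recipient_id}\n")
            tmp.flush()
            os.fsync(tmp.fileno())  # ask the OS to push the bytes to disk
        os.replace(tmp_path, path)  # swap the new file in place of the old one
    except BaseException:
        # Clean up the temporary file if anything went wrong before the swap.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise


def load_checkpoint(path):
    """Return (batch_id, recipient_id), or None if nothing was recorded yet."""
    try:
        with open(path) as f:
            batch_id, recipient_id = f.read().split()
    except FileNotFoundError:
        return None
    return int(batch_id), int(recipient_id)
```

On startup the application would call `load_checkpoint` and skip every recipient up to and including the stored id; after each successful send it would call `save_checkpoint`. The `os.fsync` call is the part I expect to cost the most time per mail.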
The main constraint is that the application has to send mails fast. The number of mails varies from several hundred to several tens of thousands per batch, and there can be several batches per day. Overall it's usually between 1,000 and 50,000 mails a day, but this will grow. It must also be able to resume accurately, so I can't wait until, say, 50 mails are sent and only then write the progress to a file or database.
This is what I came up with so far regarding the above solutions:
- Signal back to the database: our current application does this. But the new application will run on the mail server rather than on the database server, and the two aren't on the same network, so I can't imagine this being the most efficient solution.
- Small file: this could be very fast, but wouldn't it strain the hard drive to the point where its lifespan could be severely shortened? (This server is an older Opteron, I believe. It may pre-date SATA, but if so, not by much.)
- Local database: this may be very fast and efficient, but is it worth setting up a database just to store 2 numbers (the id of the batch and the id of the last recipient within that batch)? Would the overhead slow things down?
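For the local database option, this is roughly the amount of code I imagine it would take with SQLite. Again it is a Python sketch for illustration only; the table name `progress`, its columns and the helper names are hypothetical, not something we already have.

```python
import sqlite3


def open_progress_db(path):
    """Open (or create) the progress database with one row per batch."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS progress ("
        " batch_id INTEGER PRIMARY KEY,"
        " last_recipient_id INTEGER NOT NULL)"
    )
    conn.commit()
    return conn


def record_sent(conn, batch_id, recipient_id):
    """Store the last successfully sent recipient for a batch."""
    with conn:  # commits on success, rolls back if an exception is raised
        conn.execute(
            "INSERT OR REPLACE INTO progress (batch_id, last_recipient_id)"
            " VALUES (?, ?)",
            (batch_id, recipient_id),
        )


def last_sent(conn, batch_id):
    """Return the last recorded recipient id for a batch, or None."""
    row = conn.execute(
        "SELECT last_recipient_id FROM progress WHERE batch_id = ?",
        (batch_id,),
    ).fetchone()
    return row[0] if row else None
```

Each `record_sent` call is its own transaction, which gives the "write after every send" behaviour I need, but it also means one commit per mail, and that is where I would expect any overhead to show up.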
Apart from the above solutions, are there other options I haven't yet considered, to keep track without really slowing the application down? Are my assumptions accurate?