I'd go for an option that hasn't been mentioned yet (except perhaps in the comments), which is the subject of my blog post about blobstreams: set up a processing pipeline of streams that take care of downloading and interpreting the file you need. Then read interpreted records from this compound stream and do the necessary inserts/updates in your database inside one transaction (per file or per record, depending on your functional requirements).
This kind of scenario is where `Stream`-based classes excel. It means you never have the entire file on disk or in memory at any point during processing. As you mentioned that downloading the file takes minutes, it could be big. Can your system handle intermediate storage of the full file (possibly more than once: in memory and on disk)? Even when multiple files are processed concurrently?
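
To make that concrete, here is a rough sketch in Java (the same pattern works in any class library with composable streams). The URL, the gzip compression, the line-per-record CSV format, and the `records` table are all assumptions for illustration, not part of your setup:

```java
import java.io.BufferedReader;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.util.zip.GZIPInputStream;

public class StreamPipeline {
    public static void main(String[] args) throws Exception {
        // Pipeline: HTTP download -> gunzip -> character decoding -> line records.
        // Only a small buffer of the file is in memory at any moment.
        try (InputStream download = new URL("https://example.com/data.csv.gz").openStream();
             InputStream unzipped = new GZIPInputStream(download);
             BufferedReader records = new BufferedReader(
                     new InputStreamReader(unzipped, StandardCharsets.UTF_8));
             // Hypothetical connection string; use whatever your database needs.
             Connection db = DriverManager.getConnection("jdbc:postgresql://localhost/mydb")) {

            db.setAutoCommit(false); // one transaction per file
            try (PreparedStatement insert =
                         db.prepareStatement("INSERT INTO records(id, payload) VALUES (?, ?)")) {
                String line;
                while ((line = records.readLine()) != null) {
                    String[] fields = line.split(",", 2); // assumed two-column records
                    insert.setLong(1, Long.parseLong(fields[0]));
                    insert.setString(2, fields[1]);
                    insert.executeUpdate();
                }
                db.commit(); // all rows of the file, or none
            } catch (Exception e) {
                db.rollback();
                throw e;
            }
        }
    }
}
```

Each stream in the chain only pulls bytes from the one before it as the record reader asks for them, which is exactly why the full file never has to exist anywhere at once.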
Also, if you find out in practice that the chain is not reliable enough for you, and you'd like to temporarily store the downloaded file on disk so you can repeat processing without downloading it again, that is easy to add. All it takes is an extra `Stream` in the pipeline that checks whether the file is already in your "already downloaded files" cache (in some folder, in isolated storage, whatever) and returns the bytes from there, instead of looping a downloading `Stream` into your processing pipeline.
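
A minimal sketch of that caching front, again in Java. The `openCached` name, the `.part` suffix, and the cache layout are my own illustrative choices, and a production version would need locking if two workers can fetch the same file concurrently:

```java
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public final class CachingStreams {

    // Serve from the cache folder if present; otherwise download and tee the
    // bytes into the cache as they stream past the rest of the pipeline.
    public static InputStream openCached(URL url, Path cacheDir, String fileName)
            throws IOException {
        Files.createDirectories(cacheDir);
        Path cached = cacheDir.resolve(fileName);
        if (Files.exists(cached)) {
            return Files.newInputStream(cached); // cache hit: no download at all
        }
        Path partial = cacheDir.resolve(fileName + ".part");
        OutputStream sink = Files.newOutputStream(partial);
        return new FilterInputStream(url.openStream()) {
            private boolean complete = false;

            @Override public int read() throws IOException {
                int b = super.read();
                if (b >= 0) sink.write(b); else complete = true;
                return b;
            }

            @Override public int read(byte[] buf, int off, int len) throws IOException {
                int n = super.read(buf, off, len);
                if (n > 0) sink.write(buf, off, n); else if (n < 0) complete = true;
                return n;
            }

            @Override public void close() throws IOException {
                super.close();
                sink.close();
                if (complete) {
                    // Only a fully read download becomes a valid cache entry.
                    Files.move(partial, cached, StandardCopyOption.REPLACE_EXISTING);
                } else {
                    Files.deleteIfExists(partial); // don't poison the cache
                }
            }
        };
    }
}
```

The pipeline from the earlier sketch would then call `openCached(...)` in place of `url.openStream()`, and nothing downstream of it needs to change.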