I'm interested in getting solution ideas for a problem we have.

Background:

We have software tools that run on laptops and flash data onto hardware components. This software reads in a series of data files to program the hardware. It's used in a manufacturing environment and runs continuously throughout the day.

Problem:

Currently, there's a central repository that the software connects to in order to read the data files. The software reads the files and holds a lock on them throughout the entire flashing process. This runs all day on different hardware components, so it's feasible that these files could be "locked" for most of the day.

There are new requirements stating that the data files the software reads need to be updated in real time, with minimal impact to the end user who is doing the flashing. We will be writing the service that drops the files out there in real time.

The software is developed by a third-party vendor and is not modifiable by us. However, it only expects a location to look in for the data files, so everything up to the point of flashing is our process, which we're free to change.

Question:

What approach would you take to solve this from a solution programming standpoint? We're not sure how to drop files out there in real time given the locks that will be present on them throughout the day. We'll settle for an "as soon as possible" solution if that is significantly easier.

A: 

"As soon as possible" is your only option. You can't update a file that's locked, that's the whole point of a lock.


Edit: Would it be possible to put the new file in a different location and then tell the third-party software to look in that location the next time it needs the file?
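
For instance, a minimal sketch of that staging idea in Python. The paths, function name, and polling interval are placeholders, and it assumes Windows sharing semantics, where os.replace() on a file another process holds open raises PermissionError:

    import os
    import time

    def swap_in_when_unlocked(staged_path, target_path, poll_seconds=5):
        # Stage the new file elsewhere, then retry an atomic swap until
        # the flashing tool releases its lock on the target.
        while True:
            try:
                # Atomic on the same volume; readers never see a half-written file.
                os.replace(staged_path, target_path)
                return
            except PermissionError:
                time.sleep(poll_seconds)  # still locked; try again shortly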

Rory
Edit++: Or keep track of the locked file, and when the lock is removed, copy over the new file being held in a different location?
runrunraygun
+1  A: 

The only way out of this conundrum seems to be the introduction of an extra file repository, along with a service-like piece of logic in charge of keeping the two repositories synchronized.

In other words, the file upload takes place in one of the repositories (call it the "input repository"), and the flashing process uses the other (call it the "output repository"). The synchronization logic continuously polls the input repository for new files (based on file timestamps or otherwise), and when it finds one, it attempts to copy it to the output repository; the copy either takes place immediately, when the flashing logic hasn't locked the corresponding file in the output repository, or is deferred until the file gets unlocked.

Note: During the file copy, the synchronization logic can/should lock the file itself, very temporarily preventing it from being overwritten by new uploads while ensuring full integrity of the copied file. The difference from the existing system is that this lock is held for a much shorter amount of time.
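
A sketch of that polling loop in Python. The repository paths, the ".incoming" suffix, and the 30-second interval are all placeholders, and as above it assumes Windows semantics where replacing a file the flashing tool holds open raises PermissionError:

    import os
    import shutil
    import time

    INPUT_REPO = r"\\server\input"    # where new uploads land (hypothetical)
    OUTPUT_REPO = r"\\laptop\output"  # where the flashing tool reads (hypothetical)

    def sync_pass(seen_mtimes):
        # One polling pass over the input repository.
        for name in os.listdir(INPUT_REPO):
            src = os.path.join(INPUT_REPO, name)
            mtime = os.path.getmtime(src)
            if seen_mtimes.get(name) == mtime:
                continue  # already copied this version
            # Copy to a temporary name so the output file is never half-written,
            # then swap it into place atomically.
            tmp = os.path.join(OUTPUT_REPO, name + ".incoming")
            shutil.copy2(src, tmp)
            try:
                os.replace(tmp, os.path.join(OUTPUT_REPO, name))
                seen_mtimes[name] = mtime
            except PermissionError:
                os.remove(tmp)  # target still locked; defer to a later pass

    seen = {}
    while True:
        sync_pass(seen)
        time.sleep(30)  # polling interval; tune to the real-time requirement

The copy-to-temp-then-replace step is what implements the short-lived lock described above: only the brief atomic swap ever contends with the flashing tool.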

The drawback of this system is the full duplication of the repository, which could be a problem if the repository is very big. However, there don't appear to be many alternatives, since we do not have control over the flashing process.

mjv
I'm going to accept this answer. This is actually what is in place today, except it's clunky because of the coordination that needs to happen between the repositories. Not to get off topic from the original question, but the unmentioned additional piece of complexity is that the "input" and "output" repositories are geographically distributed. The files number around 15,000 and are ~500MB in size, and they are being transferred between repositories continuously throughout the day, across the country.
The Matt