Hi! I want to build an Azure application that has two worker roles and NO web roles. When the worker roles first start up I want ONLY ONE of the roles to do the following a single time:

  • Download and parse a master file then enqueue multiple "child" tasks based on the contents of the master file
  • Enqueue a single master file download "child" task to run the next day

The "child" tasks would then be processed by both workers until the task queue was exhausted. Think of the whole thing as "priming the pump".

This sort of thing is really easy if I manually add the first "master" task to a queue by calling a web role, but it seems to be really hard to do in an auto-start mode.

Any help in this regard would be greatly appreciated!

Thanks.....

A: 

One possibility: instead of calling a web role, just load the queue directly. (It sounds like this is the sort of application you'll want to automatically spin up to do some work and then shut down again... if you're automating that, it should be trivial to also automate loading the queue.)

A (perhaps) better option: Use some sort of locking mechanism to make sure only one worker instance does the initialization work. One way to do this is to try to create the queue (or a blob, or an entity in a table). If it already exists, then the other instance is handling initialization. If the create succeeds, then it's this instance's job.

Note that it's always better to use a lease than a lock, in case the instance that's doing the initialization fails. Consider using a timeout (e.g. storing a timestamp in table storage or in the metadata of the blob or in the name of the queue...).
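As a rough illustration of the "whoever creates it first does the initialization" idea, here is a minimal Python sketch. `FakeBlobStore` and `create_if_absent` are hypothetical in-memory stand-ins, not Azure storage calls; the only property assumed of the real storage service is that creating an already-existing object fails atomically.

```python
import threading

class FakeBlobStore:
    """Hypothetical in-memory stand-in for a storage container (not an Azure API)."""

    def __init__(self):
        self._blobs = {}
        self._guard = threading.Lock()

    def create_if_absent(self, name, content):
        """Atomically create the entry; return True only for the first caller."""
        with self._guard:
            if name in self._blobs:
                return False
            self._blobs[name] = content
            return True

def run_worker(worker_id, store, initialized_by):
    # The instance whose create succeeds owns the one-time initialization.
    if store.create_if_absent("init-marker", worker_id):
        # Stands in for "download the master file and enqueue child tasks".
        initialized_by.append(worker_id)

store = FakeBlobStore()
initialized_by = []
workers = [
    threading.Thread(target=run_worker, args=(f"worker-{i}", store, initialized_by))
    for i in range(2)
]
for w in workers:
    w.start()
for w in workers:
    w.join()
```

After both threads finish, exactly one worker id is in `initialized_by`. This plain marker has the weakness the note above warns about: if the winner crashes mid-initialization, nobody retries, which is why a lease with a timeout is preferable.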

smarx
Thanks for your quick reply! Excuse me for sounding dense, though, if I ask "How do I automate loading the queue?" Could you show me some code? In terms of your second idea, I may be missing something, but I'm confused as to how you can use an entity like a blob to create a lock without being snagged by race conditions. As before, some code might help me understand it. Again, thanks for your quick reply and help.
lsb
In the first idea, I just meant that instead of calling a web role to kick off the initialization, run some code locally that does the initialization. (Whatever code would have been in that web service, run it as a standalone client tool.)
smarx
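A standalone "priming" tool along those lines might look roughly like the sketch below. Everything here is an assumption made for illustration: the master file is taken to be plain text with one item per line, a `deque` stands in for the real task queue, and the message shapes (`type`, `item`, `not_before`) are invented.

```python
from collections import deque
from datetime import datetime, timedelta

def prime_queue(master_text, queue, now):
    """Enqueue one child task per non-empty line, plus tomorrow's master download."""
    for line in master_text.splitlines():
        item = line.strip()
        if item:
            queue.append({"type": "child", "item": item})
    # Schedule the next master-file download for one day later.
    queue.append({"type": "master-download", "not_before": now + timedelta(days=1)})
    return queue

# Example run with made-up master file contents.
queue = deque()
prime_queue("a.csv\nb.csv\n\nc.csv\n", queue, datetime(2010, 1, 1))
```

Run once from a client machine (or a deployment script), this leaves three child tasks and one delayed master task in the queue for the workers to pick up.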
For the second idea, see if you can spot a race condition, but how about:

  1. Read the entity, which should have a "leasename" property (string) and a "done" property (bool).
  2. If done is true, initialization is done; move on.
  3. If leasename is empty, set it to a random GUID and update the entity (optimistic concurrency will prevent the edit if someone else has already edited it in the meantime). Now do your initialization.
  4. If leasename is set but the timestamp on the entity is "old" (define a good timeout, being wary of clock drift), consider the lease available, so do as in (3).

Make sense?
smarx
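The four steps in the comment above can be sketched as follows. This is only an illustration under stated assumptions: `FakeTable` is a hypothetical in-memory stand-in whose integer ETag mimics optimistic concurrency, the 300-second `LEASE_TIMEOUT` is arbitrary, times are plain floats in seconds, and `mark_done` is an added helper for setting the "done" flag (the comment implies that step but does not spell it out).

```python
import uuid

class ConflictError(Exception):
    """Raised when the entity changed between read and update."""

class FakeTable:
    """Hypothetical single-entity table with ETag-style optimistic concurrency."""

    def __init__(self, now):
        self._entity = {"leasename": "", "done": False, "timestamp": now}
        self._etag = 0

    def read(self):
        return dict(self._entity), self._etag

    def update(self, entity, etag, now):
        if etag != self._etag:
            raise ConflictError("entity changed since it was read")
        self._entity = dict(entity, timestamp=now)
        self._etag += 1

LEASE_TIMEOUT = 300.0  # seconds; an arbitrary value chosen for this sketch

def try_acquire(table, now):
    """Return a lease name if the caller should initialize, else None."""
    entity, etag = table.read()                          # step 1
    if entity["done"]:                                   # step 2
        return None
    expired = now - entity["timestamp"] > LEASE_TIMEOUT  # step 4
    if entity["leasename"] and not expired:
        return None                                      # someone holds a live lease
    entity["leasename"] = str(uuid.uuid4())              # step 3
    try:
        table.update(entity, etag, now)
    except ConflictError:
        return None                                      # another instance won the race
    return entity["leasename"]

def mark_done(table, leasename, now):
    """Set done=True, but only if the caller still holds the lease."""
    entity, etag = table.read()
    if entity["leasename"] != leasename:
        return False
    entity["done"] = True
    table.update(entity, etag, now)
    return True

table = FakeTable(now=0.0)
first = try_acquire(table, now=1.0)     # gets the lease
second = try_acquire(table, now=2.0)    # lease is held and fresh
third = try_acquire(table, now=302.0)   # first lease has expired, so it is taken over
```

The race the question worries about is handled by the ETag check in `update`: two instances can both read an empty lease, but only the first update succeeds, and the loser gets a conflict and backs off.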
A: 

We ended up with the exact same sort of problem, which is why we introduced an O/C mapper (object to cloud). Basically, you want to introduce two types of cloud services:

  1. QueueService that consumes messages whenever available.
  2. ScheduledService that triggers operations on a scheduled basis.

Then, as others suggested, in the cloud you really want to use leases instead of locks, so that a temporary hardware (or infrastructure) issue cannot leave your cloud app frozen forever.
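To make the split between the two service types concrete, here is a minimal Python sketch. It is not the O/C mapper's actual API: the class shapes, the `run_once` methods, and the float-seconds clock are all assumptions made for illustration, with a `deque` standing in for the cloud queue.

```python
from collections import deque

class QueueService:
    """Consumes one message whenever any are available."""

    def __init__(self, queue, handler):
        self.queue = queue
        self.handler = handler

    def run_once(self):
        if not self.queue:
            return False
        self.handler(self.queue.popleft())
        return True

class ScheduledService:
    """Triggers an action each time its interval has elapsed."""

    def __init__(self, interval, action):
        self.interval = interval
        self.action = action
        self.next_due = 0.0

    def run_once(self, now):
        if now < self.next_due:
            return False
        self.action()
        self.next_due = now + self.interval
        return True

queue = deque()
processed = []
consumer = QueueService(queue, processed.append)
# Daily "master" job: here it simply enqueues two made-up child tasks.
daily = ScheduledService(86400.0, lambda: queue.extend(["child-1", "child-2"]))

daily.run_once(now=0.0)      # primes the queue
while consumer.run_once():   # drain the queue
    pass
```

With several worker instances, the scheduled service would itself need a lease around `run_once` so that only one instance fires per interval.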

Joannes Vermorel