ansaurus

Question

Weighted, load-balancing resource scheduling algorithm

Answer 1

+1 A:

As you point out, this is a load-balancing problem. It's not really a scheduling problem, since you're not trying to minimise anything (total time, number of concurrent workers, etc.). There are no special constraints (job duration, time clashes, skill sets to match etc.) So really your problem boils down to selecting an appropriate weighting function.

You say there are some situations you want to avoid, like user weightings that are too close together. Can you provide more details? For example, what's wrong with making the chance of assignment just proportional to the current workload, normalised by the workload of the other workers? You can visualise this as a sequence of blocks of different lengths (the tasks), being packed into a set of bins (the workers), where you're trying to keep the total height of the bins as even as possible.

With more information, we could make specific recommendations of functions that could work for you.

Edit: example load-balancing functions

Based on your comments, here are some example of simple functions that can give you different balancing behaviour. A basic question is whether you want deterministic or probabilistic behaviour. I'll give a couple of examples of each.

To use the example in the question - there are 4 + 5 + 0 + 7 + 9 = 25 jobs currently assigned. You want to pick who gets job 26.

1) Simple task farm. For each job, always pick the worker with the least jobs currently pending. Fast workers get more to do, but everyone finishes at about the same time.

2) Guarantee fair workload. If workers work at different speeds, and you don't want some doing more than others, then track the number of completed + pending jobs for each worker. Assign the next job to keep this number evenly spread (fast workers get free breaks).

3) Basic linear normalisation. Pick a maximum number of jobs each worker can have. Each worker's workload is normalised to that number. For example, if the maximum number of jobs/worker is 15, then 50 more jobs can be added before you reach capacity. So for each worker the probability of being assigned the next job is

P(A) = (15 - 4)/50 = 0.22  
P(B) = (15 - 5)/50 = 0.2  
P(C) = (15 - 0)/50 = 0.3  
P(D) = (15 - 7)/50 = 0.16  
P(E) = (15 - 9)/50 = 0.12

If you don't want to use a specific maximum threshold, you could use the worker with the highest current number of pending jobs as the limit. In this case, that's worker E, so the probabilities would be

P(A) = (9 - 4)/20 = 0.25  
P(B) = (9 - 5)/20 = 0.2  
P(C) = (9 - 0)/20 = 0.45 
P(D) = (9 - 7)/20 = 0.1  
P(E) = (9 - 9)/20 = 0

Note that in this case, the normalisation ensures worker E can't be assigned any jobs - he's already at the limit. Also, just because C doesn't have anything to do doesn't mean he is guaranteed to be given a new job (it's just more likely).

You can easily implement the choice function by generating a random number r between 0 and 1 and comparing it to these boundaries. So if r is < 0.25, A gets the job, 0.25< r < 0.45, B gets the job, etc.

4) Non-linear normalisation. Using a log function (instead of the linear subtraction) to weight your numbers is an easy way to get a non-linear normalisation. You can use this to skew the probabilities, e.g. to make it much more likely that workers without many jobs are given more.

The point is, the number of ways of doing this are practically unlimited. What weighting function you use depends on the specific behaviour you're trying to enable. Hopefully that's given you some ideas which you can use as a starting point.

ire_and_curses 2009-09-11 09:53:16

I'm fine with that approach; I'm just not just not sure of a good normalization technique to use. I tried a handful of methods (don't remember exactly, sorry - but things like using the mean workload of all workers, the difference in workload between a worker and the highest workload, etc.), but I didn't like the numbers across varying sets of data. For example, if I have 50 workers with loads up to 15, I might end up with little difference between a load of 1 and a load of 15.I'm sure this is simple and I'm just not seeing the solution. My stats and OR classes were a long time ago! :-)

2009-09-11 14:46:33

@unknown (google): Ok. I've added a few specific examples to my answer. Hope this is helpful!

ire_and_curses 2009-09-11 16:19:34

Very helpful, thanks. Quick question, though: how did you come up with the value 20 in this example:P(A) = (9 - 4)/20 = 0.25P(B) = (9 - 5)/20 = 0.2[...]

2009-09-11 20:22:05

It's the number of free job slots available (9 is the maximum in that example): (9-4) + (9-5) + (9-0) + (9-7) + (9-9) = 20

ire_and_curses 2009-09-11 20:29:44

Most excellent; thanks!

2009-09-14 20:30:06

ansaurus

tags:

views:

answers:

Weighted, load-balancing resource scheduling algorithm

Edit: example load-balancing functions

related questions