I'm not aware of the literature on problems like this. I assume there is some, though, since queueing theory is a large academic area, and this doesn't sound like a ridiculously contrived situation. Mind you, the fact that you care about average latency, rather than worst-case latency or Nth percentile latency, might put you in the minority.
My first instinct is that since there seem to be plenty of jobs around, a good solution would keep several "flexible" workers continuously employed. This is a set of workers who, between them, can complete most of the common types of job with acceptable latency. The lower you want latency to be, the more resources you need in this set and the more time they spend idle. Also, the more "bursty" your input is (assuming bursts are unpredictable), the more idle time you need in order to prevent high latency during bursts.
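To get a feel for that tradeoff, here's a minimal simulation sketch. It isn't anything from the literature, and every number in it (arrival rate, burst size, service time, worker counts) is made up purely for illustration:

```python
import heapq
import random

def simulate(num_workers, burstiness=1, arrival_rate=1.0, service_time=1.0,
             num_jobs=100_000, seed=0):
    """Crude queue simulation: jobs arrive in bursts of size `burstiness`,
    any worker can run any job, and each job takes `service_time`.
    Returns (average latency, fraction of worker-time spent idle)."""
    rng = random.Random(seed)
    free_at = [0.0] * num_workers            # when each worker is next free
    heapq.heapify(free_at)
    now = total_latency = busy_time = 0.0
    jobs_done = 0
    while jobs_done < num_jobs:
        # bursts arrive at the same long-run rate regardless of burst size
        now += rng.expovariate(arrival_rate / burstiness)
        for _ in range(burstiness):
            start = max(now, heapq.heappop(free_at))   # wait for a free worker
            finish = start + service_time
            heapq.heappush(free_at, finish)
            total_latency += finish - now
            busy_time += service_time
            jobs_done += 1
    makespan = max(free_at)
    idle_fraction = 1 - busy_time / (makespan * num_workers)
    return total_latency / jobs_done, idle_fraction

# More workers => lower latency but more idle time, and burstier input
# needs even more headroom to keep latency down.
for workers in (2, 3, 5, 8):
    for burst in (1, 5):
        latency, idle = simulate(workers, burstiness=burst)
        print(f"workers={workers} burst={burst}: "
              f"avg latency={latency:.2f}, idle={idle:.0%}")
```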
Then there are two occasions on which you hire additional "specialised" workers (there's a rough sketch of both triggers in code after this list):
1) A rare type of job comes in which your current set of hires can only handle at high time cost, or not at all. So you hire (roughly speaking) whoever can shift it and can also take on as much of the rest of your queue as possible.
2) No such job has come in, but you spot an opportunity to hire someone who happens to be able to take some combination of jobs off the current queue and do them more cheaply than your current hires, without leaving your current hires idle. So you hire that resource.
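Very roughly, and with entirely invented data structures (a worker here is just a table of "cost to do one job of each kind", which is a gross simplification of a real schedule), those two triggers might look something like this:

```python
from dataclasses import dataclass

@dataclass
class Worker:
    name: str
    cost_per_kind: dict   # cost per job of each kind; absent kind = can't do it

def cheapest_current_cost(kind, current_workers):
    """Best price any currently-hired worker offers for this job kind
    (None if nobody can do it)."""
    costs = [w.cost_per_kind[kind] for w in current_workers
             if kind in w.cost_per_kind]
    return min(costs) if costs else None

def should_hire(candidate, queued_kinds, current_workers, min_jobs_per_worker=1):
    """Sketch of the two triggers above; the thresholds are invented."""
    # Trigger 1: a queued job no current hire can do, but the candidate can.
    for kind in queued_kinds:
        if (cheapest_current_cost(kind, current_workers) is None
                and kind in candidate.cost_per_kind):
            return True

    # Trigger 2: the candidate undercuts the current hires on some jobs,
    # and handing those jobs over still leaves everyone else enough work.
    undercut = []
    for kind in queued_kinds:
        if kind not in candidate.cost_per_kind:
            continue
        current = cheapest_current_cost(kind, current_workers)
        if current is not None and candidate.cost_per_kind[kind] < current:
            undercut.append(kind)
    remaining = len(queued_kinds) - len(undercut)
    enough_work_left = remaining >= min_jobs_per_worker * len(current_workers)
    return bool(undercut) and enough_work_left

# Example: nobody currently hired can do "translation", so a candidate who
# can (even expensively) fires trigger 1.
current = [Worker("A", {"typing": 5, "filing": 3})]
print(should_hire(Worker("B", {"translation": 20}),
                  ["typing", "translation"], current))   # True
```

In reality the "cost" would have to come out of a provisional schedule (can this person actually fit these jobs in before their deadlines?), which is where it stops being a ten-line function.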
As for the actual algorithm: it's almost certainly not computationally feasible to find the best solution, so the right answer depends on how much computing power you can spend on the scheduler itself, and you're looking at heuristics and solving partial problems. As long as everyone you hire is busy, and you can't hire anyone else without causing significant idle time at some point in the future, you're probably in the vicinity of a good solution, and somewhere near the "most bang per buck" point of the latency/cost tradeoff. Hiring more resources after that gives diminishing returns, but that doesn't mean you won't do it to hit a specified latency target and/or spend a specified budget.
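In pseudocode terms, that stopping rule is roughly a greedy loop like the one below. `estimate_avg_latency` and `estimate_idle_fraction` are stand-ins for whatever model or simulation you actually trust (something like the toy simulation earlier, say); they're assumptions, not a known recipe:

```python
def grow_workforce(candidates, workforce, jobs,
                   estimate_avg_latency, estimate_idle_fraction,
                   max_idle_fraction=0.1):
    """Greedy sketch: keep adding the hire that helps average latency most,
    and stop as soon as the next hire would leave people significantly idle."""
    candidates = list(candidates)
    while candidates:
        # pick the remaining candidate that cuts estimated latency the most
        best = min(candidates,
                   key=lambda c: estimate_avg_latency(workforce + [c], jobs))
        trial = workforce + [best]
        if estimate_idle_fraction(trial, jobs) > max_idle_fraction:
            break            # diminishing returns: the next hire mostly sits idle
        workforce = trial
        candidates.remove(best)
    return workforce
```

If you do have an explicit latency target or budget, you'd just swap the stopping condition for "estimated latency is at or below the target" or "total cost is within budget".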
But it depends on what the incoming jobs look like: if you have a resource that can only do one type of job, and that job only comes in once a day/week/year, then it's probably better to hire them once a day than to wait until you have enough of that job to fill their minimum possible timeslice. (This is why firefighters spend most of their time playing card games, whereas typists spend most of their time typing: there's always enough typing to keep at least one typist busy, but fires are rare. Furthermore, we don't want the "most bang per buck" solution for fires; we want lower latency than that.) So there are probably opportunities to tweak the algorithm for your particular set of resources and pattern of incoming jobs, if you're solving one particular instance of the problem rather than writing general scheduling software.
Plus, presumably, if the resources are human beings, you can't actually guarantee that hiring will succeed. They aren't going to sit around all day only getting paid when there happens to be work on a minute-by-minute basis, are they?