I'm developing an application that optimally assigns shifts to nurses in a hospital. I believe this is a linear programming problem with discrete variables, and therefore probably NP-hard:

  • For each day, each nurse (ca. 15-20) is assigned a shift
  • There is a small number (ca. 6) of different shifts
  • There is a considerable number of constraints and optimization criteria, either concerning a day, or concerning an employee, e.g.:
    • There must be a minimum number of people assigned to each shift every day
    • Some shifts overlap so that it's OK to have one less person in early shift if there's someone doing intermediate shift
    • Some people prefer early shift, some prefer late shift, but a minimum of shift changes is required to still get the higher shift-work pay.
    • It's not allowed for one person to work late shift one day and early shift the next day (due to minimum resting time regulations)
    • Meeting assigned working week lengths (different for different people)
    • ...

So basically there is a large number (about 20*30 = 600) of variables that can each take a small number of discrete values.

Currently, my plan is to use a modified Min-conflicts algorithm

  • start with random assignments
  • have a fitness function for each person and each day
  • select the person or day with the worst fitness value
  • select at random one of the assignments for that day/person and set it to the value that results in the optimal fitness value
  • repeat until either a maximum number of iterations is reached or no improvement can be found for the selected day/person
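The steps above can be sketched roughly as follows. This is only an illustration under assumptions: the three-value shift set, the two penalty functions (`person_penalty`, `day_penalty`) and the `min_staff` threshold are all made up, and the loop stops on a zero penalty or an iteration cap rather than on a "no improvement" test.

```python
import random

SHIFTS = ["early", "late", "off"]  # hypothetical shift set


def person_penalty(row):
    """Per-person fitness: count late shifts followed by an early shift."""
    return sum(1 for a, b in zip(row, row[1:]) if a == "late" and b == "early")


def day_penalty(roster, day, min_staff=1):
    """Per-day fitness: staff missing from each working shift."""
    column = [row[day] for row in roster]
    return sum(max(0, min_staff - column.count(s)) for s in ("early", "late"))


def total_penalty(roster):
    days = range(len(roster[0]))
    return (sum(person_penalty(row) for row in roster)
            + sum(day_penalty(roster, d) for d in days))


def best_shift(roster, n, d):
    """Try every shift in cell (n, d) and return the one with the lowest total."""
    scores = {}
    for s in SHIFTS:
        roster[n][d] = s
        scores[s] = total_penalty(roster)
    return min(SHIFTS, key=scores.get)


def min_conflicts(start, max_iter=1000, seed=0):
    rng = random.Random(seed)
    roster = [row[:] for row in start]  # work on a copy
    n_nurses, n_days = len(roster), len(roster[0])
    for _ in range(max_iter):
        if total_penalty(roster) == 0:
            break
        worst_n = max(range(n_nurses), key=lambda n: person_penalty(roster[n]))
        worst_d = max(range(n_days), key=lambda d: day_penalty(roster, d))
        # Pick a random cell in the worst row or the worst column.
        if person_penalty(roster[worst_n]) >= day_penalty(roster, worst_d):
            n, d = worst_n, rng.randrange(n_days)
        else:
            n, d = rng.randrange(n_nurses), worst_d
        roster[n][d] = best_shift(roster, n, d)
    return roster
```

Because the current value of the chosen cell is always among the candidates, the total penalty never increases from one iteration to the next, which is exactly why this kind of search can stall in a local optimum.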

Any better ideas? I am somewhat worried that it will get stuck in a local optimum. Should I use some form of simulated annealing? Or consider not only changes in one variable at a time, but specifically switches of shifts between two people (the main component in the current manual algorithm)? I want to avoid tailoring the algorithm to the current constraints since those might change.
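To make the two ideas in this paragraph concrete, here is a minimal sketch of simulated annealing whose only move is swapping the shifts of two people on the same day. The `penalty` function, the geometric cooling schedule and all parameter values are assumptions chosen for illustration, not tuned recommendations.

```python
import math
import random


def penalty(roster):
    """Hypothetical cost: late->early transitions plus days with nobody on early."""
    rest = sum(1 for row in roster for a, b in zip(row, row[1:])
               if a == "late" and b == "early")
    cover = sum(1 for d in range(len(roster[0]))
                if all(row[d] != "early" for row in roster))
    return rest + cover


def anneal(roster, steps=5000, t_start=2.0, t_end=0.01, seed=0):
    rng = random.Random(seed)
    current = [row[:] for row in roster]
    cur_cost = penalty(current)
    best, best_cost = [row[:] for row in current], cur_cost
    for k in range(steps):
        # Geometric cooling from t_start towards t_end.
        temp = t_start * (t_end / t_start) ** (k / steps)
        d = rng.randrange(len(current[0]))
        a, b = rng.sample(range(len(current)), 2)
        # Swap move: two people exchange their shifts on day d.
        current[a][d], current[b][d] = current[b][d], current[a][d]
        new_cost = penalty(current)
        delta = new_cost - cur_cost
        # Always accept improvements; accept worsening moves with
        # probability exp(-delta / temp), which shrinks as temp falls.
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            cur_cost = new_cost
            if cur_cost < best_cost:
                best, best_cost = [row[:] for row in current], cur_cost
        else:
            # Rejected: undo the swap.
            current[a][d], current[b][d] = current[b][d], current[a][d]
    return best, best_cost
```

A swap keeps the number of people on each shift per day unchanged, so daily coverage constraints that already hold stay satisfied; the occasional acceptance of worse rosters is what lets the search leave a local optimum.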

Edit: it's not necessary to find a strictly optimal solution; the roster is currently done manually, and I'm pretty sure the result is considerably sub-optimal most of the time - shouldn't be hard to beat that. Short-term adjustments and manual overrides will also definitely be necessary, but I don't believe this will be a problem; marking past and manual assignments as "fixed" should actually simplify the task by reducing the solution space.

A: 

Dynamic programming à la Bellman? Kinda sounds like there's a place for it: overlapping subproblems, optimal substructure.

Charlie Martin
I don't think the technique is applicable, since the subproblems influence each other and are therefore not independent, and the solutions are not reusable.
Michael Borgwardt
+3  A: 

Umm, did you know that some ILP solvers do quite a good job? Try AIMMS, Mathematica or the GNU Linear Programming Kit (GLPK)! 600 variables is of course a lot more than Lenstra's theorem will solve easily, but sometimes these ILP solvers have a good handle on such problems, and in AIMMS you can modify the branching strategy a little. Plus, there's a really fast 100%-approximation for ILPs.
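For illustration, one possible way to write a few of the constraints from the question as an ILP is sketched below. All symbols are assumptions introduced here: $x_{n,d,s}$ is 1 if nurse $n$ works shift $s$ on day $d$ (with a day off treated as one more value of $s$), $c_{n,d,s}$ is a preference cost, and $m_s$ is the minimum staffing of shift $s$.

```latex
\begin{aligned}
\min_{x}\quad & \sum_{n}\sum_{d}\sum_{s} c_{n,d,s}\, x_{n,d,s} \\
\text{s.t.}\quad & \sum_{s} x_{n,d,s} = 1 && \forall n, d \\
& \sum_{n} x_{n,d,s} \ge m_{s} && \forall d, s \\
& x_{n,d,\text{late}} + x_{n,d+1,\text{early}} \le 1 && \forall n, d \\
& x_{n,d,s} \in \{0, 1\}
\end{aligned}
```

The first constraint assigns exactly one shift per nurse and day, the second enforces minimum staffing, and the third forbids a late shift followed by an early one. Note that this encoding uses one binary variable per (nurse, day, shift) triple, so the 600 discrete variables become several thousand binaries.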

nes1983
thanks for the pointer - I'll look into these and see if I can use them.
Michael Borgwardt
Good pointer, those solvers have been studied in much more depth than the OP will study his problem. You should probably point out that the ILP solvers use heuristic searches as well (I think branch and bound).
ldog
A: 

One thing you can do is to try to look for symmetries in the problem. E.g. can you treat all nurses as equivalent for the purposes of the problem? If so, then you only need to consider nurses in some arbitrary order -- you can avoid considering solutions such that any nurse i is scheduled before any nurse j where i > j. (You did say that individual nurses have preferred shift times, which contradicts this example, although perhaps that's a less important goal?)

j_random_hacker
Preferred shifts are not very important, but there are other differences between people that cannot be ignored, such as someone being exempt from night shifts due to medical conditions.
Michael Borgwardt
+7  A: 

This is a difficult problem to solve well. There have been many academic papers on this subject, particularly in the Operations Research field - see for example nurse rostering papers 2007-2008 or just google "nurse rostering operations research". The complexity also depends on aspects such as: how many days to solve; what type of "requests" the nurses can make; whether the roster is "cyclic"; and whether it is a long-term plan or needs to handle short-term rostering "repair" such as sickness and swaps, etc.

The algorithm you describe is a heuristic approach. You may find you can tweak it to work well for one particular instance of the problem but as soon as "something" is changed it may not work so well (e.g. local optima, poor convergence).

However, such an approach may be adequate depending on your particular business needs - e.g. how important is it to get the optimal solution, is the problem outline you describe expected to stay the same, what are the potential savings (money and resources), how important is the nurses' perception of the quality of their rosters, what is the budget for this work etc.

luapyad
Wow, I hadn't thought that this would be the subject of that much research work. Thanks, I'll definitely look at some of those papers. Will edit the question to specify some of the factors you mention.
Michael Borgwardt
+2  A: 

I solved a shift assignment problem for a large manufacturing plant recently. First we tried generating purely random schedules and returning any one which passed the is_schedule_valid test - the fallback algorithm. This was, of course, slow and indeterminate.

Next we tried genetic algorithms (as you suggested), but couldn't find a good fitness function that closed on any viable solution (because the smallest change can make the entire schedule RIGHT or WRONG - no points for almost).

Finally we chose the following method (which worked great!):

  1. Randomize the input set (i.e. jobs, shift, staff, etc.).
  2. Create a valid tuple and add it to your tentative schedule.
  3. If no valid tuple can be created, rollback (and increment) the last tuple added.
  4. Pass the partial schedule to a function that tests could_schedule_be_valid, that is, could this schedule be valid if the remaining tuples were filled in a possible way.
  5. If !could_schedule_be_valid, simply rollback (and increment) the tuple added in (2).
  6. If schedule_is_complete, return schedule
  7. Goto (2)

You incrementally build a partial shift this way. The benefit is that some tests for valid schedule can easily be done in Step 2 (pre-tests), and others must remain in Step 5 (post-tests).
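The procedure can be sketched as a recursive backtracking search. This is not the original PHP code; the rules inside `is_valid_tuple` and `could_schedule_be_valid` (one job per worker per shift, enough free workers left in each shift) are hypothetical stand-ins, and a slot here is a `(shift, job)` pair that gets a worker assigned.

```python
import random


def is_valid_tuple(schedule, worker, slot):
    """Pre-test (hypothetical rule): a worker holds at most one job per shift."""
    return all(not (w == worker and s[0] == slot[0]) for s, w in schedule.items())


def could_schedule_be_valid(schedule, slots, workers):
    """Post-test: every shift still needs enough free workers for its open jobs."""
    for shift in {s[0] for s in slots}:
        open_jobs = sum(1 for s in slots if s[0] == shift and s not in schedule)
        busy = sum(1 for s in schedule if s[0] == shift)
        if open_jobs > len(workers) - busy:
            return False
    return True


def build_schedule(slots, workers, seed=0):
    rng = random.Random(seed)
    slots, workers = slots[:], workers[:]
    rng.shuffle(slots)      # step 1: randomize the input set
    rng.shuffle(workers)
    schedule = {}           # maps (shift, job) -> worker

    def extend(i):
        if i == len(slots):                 # step 6: schedule is complete
            return True
        slot = slots[i]
        for worker in workers:              # "increment" = try the next worker
            if not is_valid_tuple(schedule, worker, slot):   # step 2
                continue
            schedule[slot] = worker
            # Steps 4 and 5: only go deeper if the partial schedule can still work.
            if could_schedule_be_valid(schedule, slots, workers) and extend(i + 1):
                return True
            del schedule[slot]              # rollback
        return False                        # step 3: caller rolls back its tuple

    return schedule if extend(0) else None
```

For example, `build_schedule([("early", "press"), ("early", "weld")], ["ann", "bob"])` returns a dict that gives the two jobs to different workers, while the same call with a single worker returns `None`.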

Good luck. We wasted days trying the first two algorithms, but got the recommended algorithm generating valid schedules instantly in under 5 hours of development.

Also, we supported pre-fixing and post-fixing of assignments that the algorithm would respect. You simply don't randomize those slots in Step 1. You'll find that the solution doesn't have to be anywhere near optimal. Our solution is O(N*M) at a minimum but executes in PHP(!) in less than half a second for an entire manufacturing plant. The beauty is in ruling out bad schedules quickly using a good could_schedule_be_valid test.

The people that are used to doing it manually don't care if it takes an hour - they just know they don't have to do it manually any more.

Andy
My requirements are different - the result is not either right or wrong, it can be anywhere on a continuous scale from "horrible" over "acceptable" to "perfect" - actually, a combination of individual scales for each person and day.
Michael Borgwardt
I assume that by "tuple" you mean the combination of assignments for one day or one worker? You describe a basic backtracking algorithm, with step 4 apparently a crucial optimization, but I don't see how to implement it efficiently - I guess it might be very specific to your requirements.
Michael Borgwardt
tuple = [worker, shift, job]
Andy
Yea, what I'm saying is that this algorithm is not optimal but is "good enough". It's quite simple, but works. It is O(N*M), at the very least. For this problem that doesn't equate to long runtimes because N,M are small. The optimization is in not building entire schedules before eliminating them.
Andy
If your schedules vary on a continuous scale, you can still use a branch and bound algorithm to keep track of the score of the current solution, backtracking when it gets worse than the current "best score". Update the global "best score" any time you find a better solution.
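A minimal sketch of that idea, assuming the score is a cost to be minimized that can only grow as more cells are filled in (otherwise the pruning step is not safe). The `roster_cost` function, its preference list and its weights are invented for the example.

```python
def branch_and_bound(cost, n_cells, shifts):
    """Depth-first search; cost(prefix) must never decrease as prefix grows."""
    best = {"score": float("inf"), "plan": None}

    def search(prefix):
        score = cost(prefix)
        if score >= best["score"]:
            return  # bound: this branch cannot beat the best score so far
        if len(prefix) == n_cells:
            best["score"], best["plan"] = score, list(prefix)
            return
        for shift in shifts:
            prefix.append(shift)
            search(prefix)
            prefix.pop()  # backtrack

    search([])
    return best["plan"], best["score"]


PREFERRED = ["early", "late", "early"]  # hypothetical wishes for three days


def roster_cost(prefix):
    """1 per unmet wish, 5 per late shift followed by an early shift."""
    wishes = sum(1 for got, want in zip(prefix, PREFERRED) if got != want)
    rest = sum(5 for a, b in zip(prefix, prefix[1:])
               if a == "late" and b == "early")
    return wishes + rest


plan, score = branch_and_bound(roster_cost, len(PREFERRED), ["early", "late"])
```

The tighter the bound (for instance, by adding an optimistic estimate of the cost of the unfilled cells), the more of the search tree is cut off early.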
j_random_hacker
+1  A: 

Mike,

Don't know if you ever got a good answer to this, but I'm pretty sure that constraint programming is the ticket. While a GA might give you an answer, CP is designed to give you many answers or tell you if there is no feasible solution. A search on "constraint programming" and scheduling should bring up lots of info. It's a relatively new area and CP methods work well on many types of problems where traditional optimization methods bog down.

Grembo