views:

1973

answers:

10

Hello all.

I've been wondering if there are known solutions for algorithm of creating a school timetable. Basically, it's about optimizing "hour-dispersion" (both in teachers and classes case) for given class-subject-teacher associations. We can assume that we have sets of classes, lesson subjects and teachers associated with each other at the input and that timetable should fit between 8AM and 4PM.

I guess that there is probably no accurate algorithm for that, but maybe someone knows a good approximation or hints for developing it.

P.S. I know, there was http://stackoverflow.com/questions/1259686/school-timetable-generation-algorithm-closed , but unlike in that case, I'm not looking for actual implementation, rather for thoughts on how to do it or why it's impossible.

+4  A: 

UPDATE: from comments ... should have heuristics too!

I'd go with Prolog ... then use Ruby or Perl or something to cleanup your solution into a prettier form.

teaches(Jill,math).
teaches(Joe,history).

involves(MA101,math).
involves(SS104,history).

myHeuristic(D,A,B) :- [test_case]->D='<';D='>'.
createSchedule :- findall(Class,involves(Class,Subject),Classes),
                  predsort(myHeuristic,Classes,ClassesNew),
                  createSchedule(ClassesNew,[]).
createSchedule(Classes,Scheduled) :- [the actual recursive algorithm].

I am (still) in the process of doing something similar to this problem but using the same path as I just mentioned. Prolog (as a functional language) really makes solving NP-Hard problems easier.

Reed Debaets
Prolog is certainly a great language for expressing the required problems, however as you point out: the problem IS NP-complete, if not NP-Hard. This means that Prolog may not be fast enough for a practical implementation.
Poindexter
if it has anything to do with NP and we won't be satisfied by approximation, any deterministic algorithm will exponentially suck:)
Gabriel Ščerbák
The goal then, is to implement effective heuristics ... for example a simple 9 task scheduling algorithm takes 3.078s to complete, but with a smallestWindowFirst heuristic implemented the same problem only takes: 0.123s
Reed Debaets
+1  A: 

Generally, constraint programming is a good approach to this type of scheduling problem. A search on "constraint programming" and scheduling or "constraint based scheduling" both within stack overflow and on Google will generate some good references. It's not impossible - it's just a little hard to think about when using traditional optimization methods like linear or integer optimization. One output would be - does a schedule exist that satisfies all the requirements? That, in itself, is obviously helpful.

Good luck !

Grembo
+2  A: 

Genetic algorithms are often used for such scheduling.

Found this example (Making Class Schedule Using Genetic Algorithm) which matches your requirement pretty well.

Christian Vik
+3  A: 

Here are a few links I found:

School timetable - Lists some problems involved

A Hybrid Genetic Algorithm for School Timetabling

Scheduling Utilities and Tools

Niyaz
+16  A: 

This problem is NP-Complete!
In a nutshell one needs to explore all possible combinations to find the list of acceptable solutions. Because of the variations in the circumstances in which the problem appears at various schools (for example: Are there constraints with regards to classrooms?, Are some of the classes split in sub-groups some of the time?, Is this a weekly schedule? etc.) there isn't a well known problem class which corresponds to all the scheduling problems. Maybe, the Knapsack problem has many elements of similarity with these problems at large.

A confirmation that this is both a hard problem and one for which people perennially seek a solution, is to check this (long) list of (mostly commercial) software scheduling tools

Because of the big number of variables involved, the biggest source of which are, typically, the faculty member's desires ;-)..., it is typically impractical to consider enumerating all possible combinations. Instead we need to choose an approach which visits a subset of the problem/solution spaces.
- Genetic Algorithms, cited in another answer is (or, IMHO, seems) well equipped to perform this kind of semi-guided search (The problem being to find a good evaluation function for the candidates to be kept for the next generation)
- Graph Rewriting approaches are also of use with this type of combinatorial optimization problems.

Rather than focusing on particular implementations of an automatic schedule generator program, I'd like to suggest a few strategies which can be applied, at the level of the definition of the problem.
The general rationale is that in most real world scheduling problems, some compromises will be required, not all constraints, expressed and implied: will be satisfied fully. Therefore we help ourselves by:

  • Defining and ranking all known constraints
  • Reducing the problem space, by manually, providing a set of additional constraints.
    This may seem counter-intuitive but for example by providing an initial, partially filled schedule (say roughly 30% of the time-slots), in a way that fully satisfies all constraints, and by considering this partial schedule immutable, we significantly reduce the time/space needed to produce candidate solutions.
    Another way additional constraints help is for example "artificially" adding a constraint which prevent teaching some subjects on some days of the week (if this is a weekly schedule...); this type of constraints results in reducing the problem/solution spaces, without, typically, excluding a significant number of good candidates.
  • Ensuring that some of the constraints of the problem can be quickly computed. This is often associated with the choice of data model used to represent the problem; the idea is to be able to quickly opt-for (or prune-out) some of the options.
  • Redefining the problem and allowing some of the constraints to be broken, a few times, (typically towards the end nodes of the graph). The idea here is to either remove some of constraints for filling-in the last few slots in the schedule, or to have the automatic schedule generator program stop shy of completing the whole schedule, instead providing us with a list of a dozen or so plausible candidates. A human is often in a better position to complete the puzzle, as indicated, possibly breaking a few of the contraints, using information which is not typically shared with the automated logic (eg "No mathematics in the afternoon" rule can be broken on occasion for the "advanced math and physics" class; or "It is better to break one of Mr Jones requirements than one of Ms Smith ... ;-) )

In proof-reading this answer , I realize it is quite shy of providing a definite response, but it none the less full of practical suggestions. I hope this help, with what is, after all, a "hard problem".

mjv
Great, accurate and elaborate answer, thanks for the hints and mention about NP-Completeness (it was my guess too).
cand
+5  A: 

It's a mess. a royal mess. To add to the answers, already very complete, I want to point out my family experience. My mother was a teacher and used to be involved in the process.

Turns out that having a computer to do so is not only difficult to code per-se, it is also difficult because there are conditions that are difficult to specify to a pre-baked computer program. Examples:

  • a teacher teaches both at your school and at another institute. Clearly, if he ends the lesson there at 10.30, he cannot start at your premises at 10.30, because he needs some time to commute between the institutes.
  • two teachers are married. In general, it's considered good practice not to have two married teachers on the same class. These two teachers must therefore have two different classes
  • two teachers are married, and their child attends the same school. Again, you have to prevent the two teachers to teach in the specific class where their child is.
  • the school has separate facilities, like one day the class is in one institute, and another day the class is in another.
  • the school has shared laboratories, but these laboratories are available only on certain weekdays (for security reasons, for example, where additional personnel is required).
  • some teachers have preferences for the free day: some prefer on Monday, some on Friday, some on Wednesday. Some prefer to come early in the morning, some prefer to come later.
  • you should not have situations where you have a lesson of say, history at the first hour, then three hours of math, then another hour of history. It does not make sense for the students, nor for the teacher.
  • you should spread the arguments evenly. It does not make sense to have the first days in the week only math, and then the rest of the week only literature.
  • you should give some teachers two consecutive hours to do evaluation tests.

As you can see, the problem is not NP-complete, it's NP-insane.

So what they do is that they have a large table with small plastic insets, and they move the insets around until a satisfying result is obtained. They never start from scratch: they normally start from the previous year timetable and make adjustments.

Stefano Borini
"NP-insane" - great name ;) I agree that it's really complex problem, thanks for comments on "real world" treatment of this problem. My father and my girlfriend are teachers as well and I know that most of schools have tables with plastic insets - it lead me to thinking of possible algorithm for this problem - because, if a man can solve it, maybe it will be possible to write it down as an algorithm.
cand
what you want to write is an expert system: a system made out of a bunch of heuristic rules. Expert systems aside, this is a field where genetic algorithms are among the best bets. The difficulty does not lie in solving the problem, not only. The difficulty also lies in stating the problem and its constraints.
Stefano Borini
You're right, the problem requires providing complex set of conditions and constraints to fulfill, probably with rating of "acceptable" solution. I agree about genetic algorithms, they should fit well with this problem, also I think that it'll be better to start developing with simple set of conditions, and enhancing it as correct answer for them is obtained.
cand
You could also pretty directly translate these constraints and goals into MAXSAT. MAXSAT algorithms are generally more reliable than genetic algorithms, but you may have clause explosion due to goals like the math classes should be spread out over the week.
Jules
+1 for NP-insane
drachenstern
+5  A: 

One of my half-term assignments was an genetic-algorithm school table generation.

Whole table is one "organism". There were some changes and caveats to the generic genetic algorithms approach:

  • Rules were made for "illegal tables": two classes in the same classroom, one teacher teaching two groups at the same time etc. These mutations were deemed lethal immediately and a new "organism" was sprouted in place of the "deceased" immediately. The initial one was generated by a series of random tries to get a legal (if senseless) one. Lethal mutation wasn't counted towards count of mutations in iteration.

  • "Exchange" mutations were much more common than "Modify" mutations. Changes were only between parts of the gene that made sense - no substituting a teacher with a classroom.

    • Small bonuses were assigned for bundling certain 2 hours together, for assigning same generic classroom in sequence for the same group, for keeping teacher's work hours and class' load continuous. Moderate bonuses were assigned for giving correct classrooms for given subject, keeping class hours within bonds (morning or afternoon), and such. Big bonuses were for assigning correct number of given subject, given workload for a teacher etc.

    • Teachers could create their workload schedules of "want to work then", "okay to work then", "doesn't like to work then", "can't work then", with proper weights assigned. Whole 24h were legal work hours except night time was very undesired.

    • The weight function... oh yeah. The weight function was huge, monstrous product (as in multiplication) of weights assigned to selected features and properties. It was extremely steep, one property easily able to change it by an order of magnitude up or down - and there were hundreds or thousands of properties in one organism. This resulted in absolutely HUGE numbers as the weights, and as a direct result, need to use a bignum library (gmp) to perform the calculations. For a small testcase of some 10 groups, 10 teachers and 10 classrooms, the initial set started with note of 10^-200something and finished with 10^+300something. It was totally inefficient when it was more flat. Also, the values grew a lot wider distance with bigger "schools".

    • Computation time wise, there was little difference between a small population (100) over a long time and a big population (10k+) over less generations. The computation over the same time produced about the same quality.

    • The calculation (on some 1GHz CPU) would take some 1h to stabilize near 10^+300, generating schedules that looked quite nice, for said 10x10x10 test case.

    • The problem is easily paralellizable by providing networking facility that would exchange best specimens between computers running the computation.

The resulting program never said daylight outside getting me a good grade for the semester. It showed some promise but I never got enough motivation to add any GUI and make it usable to general public.

SF.
So open it and advertise it and try and get some academics into it? Re-use it for further projects? Technically a program like that for 300 staff alone would be worth money to schools to produce optimum schedules, and they don't mind spending a few days to genetically calculate optimum schedules. Think batch processing. Hello hardware and software contracts ;)
drachenstern
A: 

I have designed commercial algorithms for both class timetabling and examination timetabling. For the first I used integer programming; for the second a heuristic based on maximizing an objective function by choosing slot swaps, very similar to the original manual process that had been evolved. They main things in getting such solutions accepted are the ability to represent all the real-world constraints; and for human timetablers to not be able to see ways to improve the solution. In the end the algorithmic part was quite straightforward and easy to implement compared with the preparation of the databases, the user interface, ability to report on statistics like room utilization, user education and so on.

Permaquid
+1  A: 

This problem is tougher than it seems.

As others have alluded to, this is a NP-complete problem, but let's analyse what that means.

Basically, it means you have to look at all possible combinations.

But "look at" doesn't tell you much what you need to do.

Generating all possible combinations is easy. It might produce a huge amount of data, but you shouldn't have much problems understanding the concepts of this part of the problem.

The second problem is the one of judging whether a given possible combination is good, bad, or better than the previous "good" solution.

For this you need more than just "is it a possible solution".

For instance, is the same teacher working 5 days a week for X weeks straight? Even if that is a working solution, it might not be a better solution than alternating between two people so that each teacher does one week each. Oh, you didn't think about that? Remember, this is people you're dealing with, not just a resource allocation problem.

Even if one teacher could work full-time for 16 weeks straight, that might be a sub-optimal solution compared to a solution where you try to alternate between teachers, and this kind of balancing is very hard to build into software.

To summarize, producing a good solution to this problem will be worth a lot, to many many people. Hence, it's not an easy problem to break down and solve. Be prepared to stake out some goals that aren't 100% and calling them "good enough".

Lasse V. Karlsen
Well, I'd argue that it's rather hard to know all of the constraints at the beginning, it needs experience and investigation of the matter. I'd rather divide the problem into two separate parts and develop them simultaneously. First will be general algorithm structure - saying how it should populate "next timetable generation", rather draft of mechanism, without too much "subject logic" behind (probably genetic algorithm). Second one will be a rule provider with set of constraints which check the "correctness" of timetable - it can be simple at first and enhanced later.
cand
+2  A: 

This paper describes the school timetable problem and their approach to the algorithm pretty well: "The Development of SYLLABUS—An Interactive, Constraint-Based Scheduler for Schools and Colleges."[PDF]

The author informs me the SYLLABUS software is still being used/developed here: http://www.scientia.com/uk/

Leftium
Thanks a lot. I'll surely read it!
cand