Hi there,
I'm doing a Simulated Annealing algorithm to optimise a given allocation of students and projects.
This is language-agnostic pseudocode from Wikipedia:
s ← s0; e ← E(s) // Initial state, energy.
sbest ← s; ebest ← e // Initial "best" solution
k ← 0 // Energy evaluation count.
while k < kmax and e > emax // While time left & not good enough:
snew ← neighbour(s) // Pick some neighbour.
enew ← E(snew) // Compute its energy.
if enew < ebest then // Is this a new best?
sbest ← snew; ebest ← enew // Save 'new neighbour' to 'best found'.
if P(e, enew, temp(k/kmax)) > random() then // Should we move to it?
s ← snew; e ← enew // Yes, change state.
k ← k + 1 // One more evaluation done
return sbest // Return the best solution found.
The following is an adaptation of the technique. My supervisor said the idea is fine in theory.
First I pick up some allocation (i.e. an entire dictionary of students and their allocated projects, including the ranks for the projects) from entire set of randomised allocations, copy it and pass it to my function. Let's call this allocation aOld
(it is a dictionary). aOld
has a weight related to it called wOld
. The weighting is described below.
The function does the following:
- Let this allocation,
aOld
be thebest_node
- From all the students, pick a random number of students and stick in a list
- Strip (DEALLOCATE) them of their projects ++ reflect the changes for projects (
allocated
parameter is nowFalse
) and lecturers (free up slots if one or more of their projects are no longer allocated) - Randomise that list
- Try assigning (REALLOCATE) everyone in that list projects again
- Calculate the weight (add up ranks, rank 1 = 1, rank 2 = 2... and no project rank = 101)
- For this new allocation
aNew
, if the weightwNew
is smaller than the allocation weightwOld
I picked up at the beginning, then this is thebest_node
(as defined by theSimulated Annealing
algorithm above). Apply the algorithm toaNew
and continue. - If
wOld < wNew
, then apply the algorithm toaOld
again and continue.
The allocations/data-points are expressed as "nodes" such that a node = (weight, allocation_dict, projects_dict, lecturers_dict)
Right now, I can only perform this algorithm once, but I'll need to try for a number N (denoted by kmax
in the Wikipedia snippet) and make sure I always have with me, the previous node
and the best_node
.
So that I don't modify my original dictionaries (which I might want to reset to), I've done a shallow copy of the dictionaries. From what I've read in the docs, it seems that it only copies the references and since my dictionaries contain objects, changing the copied dictionary ends up changing the objects anyway. So I tried to use copy.deepcopy()
.These dictionaries refer to objects that have been mapped with SQLA.
Questions:
I've been given some solutions to the problems faced but due to my über green-ness with using Python, they all sound rather cryptic to me.
Deepcopy isn't playing nicely with SQLA. I've been told thatdeepcopy on ORM objects probably has issues that prevent it from working as you'd expect. Apparently I'd be better off "building copy constructors, i.e. def copy(self): return FooBar(....)." Can someone please explain what that means?
I checked and found out that
deepcopy
has issues because SQLAlchemy places extra information on your objects, i.e. an_sa_instance_state
attribute, that I wouldn't want in the copy but is necessary for the object to have. I've been told: "There are ways to manually blow away the old_sa_instance_state
and put a new one on the object, but the most straightforward is to make a new object with__init__()
and set up the attributes that are significant, instead of doing a full deep copy." What exactly does that mean? Do I create a new, unmapped class similar to the old, mapped one?An alternate solution is that I'd have to "implement
__deepcopy__()
on your objects and ensure that a new _sa_instance_state is set up, there are functions in sqlalchemy.orm.attributes which can help with that." Once again this is beyond me so could someone kindly explain what it means?A more general question: given the above information are there any suggestions on how I can maintain the information/state for the
best_node
(which must always persist through my while loop) and theprevious_node
, if my actual objects (referenced by the dictionaries, therefore the nodes) are changing due to the deallocation/reallocation taking place? That is, without using copy?