ansaurus

Question

Answer 1

+2 A:

I think that your analysis is incorrect:

walking through the list to find out the average is O(n)
making lists of children with too many or too few data chunks is also O(n)
moving data is proportional to the amount of data

How did you arrive at O(n!)?

You can sort the list, which is O(n lg n) in the number of children, so that children with too much work are at the front and children with too little work are at the end. Then traverse the list from both ends simultaneously: one iterator points to a child with excess data, the other to a child that lacks data. Transfer data, then move either the first iterator forward or the second one backward.
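The sort-then-meet-in-the-middle idea can be sketched roughly as follows in Python. This is only an illustration under assumptions that are not in the question: workloads are plain integers, the function name `rebalance_sorted` is made up, and a "transfer" is just recorded as a `(source, destination, amount)` tuple. When the total does not divide evenly, the sketch lets the currently heaviest children keep the one extra chunk.

```python
def rebalance_sorted(workloads):
    """Even out integer workloads; returns (final_loads, transfers).

    Hypothetical helper: each transfer is (src_index, dst_index, amount).
    """
    n = len(workloads)
    if n == 0:
        return [], []
    loads = list(workloads)
    avg, extra = divmod(sum(loads), n)          # O(n) pass for the average

    # Sort child indices by load, heaviest first: O(n lg n).
    order = sorted(range(n), key=lambda i: loads[i], reverse=True)

    # Exact targets: the `extra` heaviest children keep avg + 1, the rest avg.
    target = [avg] * n
    for i in order[:extra]:
        target[i] = avg + 1

    transfers = []
    front, back = 0, n - 1                      # front: donors, back: receivers
    while front < n and back >= 0:
        src, dst = order[front], order[back]
        if loads[src] <= target[src]:           # nothing left to give
            front += 1
            continue
        if loads[dst] >= target[dst]:           # nothing left to take
            back -= 1
            continue
        amount = min(loads[src] - target[src], target[dst] - loads[dst])
        loads[src] -= amount
        loads[dst] += amount
        transfers.append((src, dst, amount))
    return loads, transfers
```

After the sort, each loop iteration either advances one iterator or completes one child, so the traversal itself is linear and the sort dominates the cost.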

zvrba 2008-09-26 13:57:34

Nicholas Mancuso 2008-09-26 14:18:07

Answer 2

+4 A:

@zvrba: You do not even have to sort the list. When traversing the list the second time, just move all items with less than the average workload to the end of the list (you can keep a pointer to the last item from your first traversal). The order does not have to be perfect; it only changes when the iterators have to be advanced or moved back in your last step.

See previous answer

The last step would look something like:

In the second step, keep a pointer, child2, to the first item with less than the average workload (this avoids the need for a doubly linked list).

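for each child in list {
  while child.workload > avg + 1 {
    // child2 must still point at a child that can take more work
    if child2 == nil then assert("Error in logic");
    // send as much as child2 can take, but no more than child's excess
    sendwork(child, child2, min(avg + 1 - child2.workload, child.workload - (avg + 1)))
    if child2.workload == avg + 1 then child2 = child2.next;
  }
}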
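For comparison, here is a rough Python sketch of the same no-sort idea. It is not a line-by-line translation of the pseudocode above: it assumes integer workloads, uses the made-up name `rebalance_linear`, builds separate donor and receiver lists instead of moving list nodes, and assigns exact targets (the first `extra` children in list order are arbitrarily allowed avg + 1) so that the result ends up balanced to within one chunk.

```python
def rebalance_linear(workloads):
    """Even out integer workloads without sorting; returns (final_loads, transfers).

    Hypothetical helper: each transfer is (src_index, dst_index, amount).
    """
    n = len(workloads)
    if n == 0:
        return [], []
    loads = list(workloads)
    avg, extra = divmod(sum(loads), n)            # first traversal: average

    # Assumption: the first `extra` children (in list order) may hold avg + 1.
    target = [avg + 1 if i < extra else avg for i in range(n)]

    # Second traversal: split into children above and below their target.
    donors = [i for i in range(n) if loads[i] > target[i]]
    receivers = [i for i in range(n) if loads[i] < target[i]]

    transfers = []
    r = 0                                         # plays the role of child2
    for d in donors:
        while loads[d] > target[d]:
            dst = receivers[r]
            amount = min(loads[d] - target[d], target[dst] - loads[dst])
            loads[d] -= amount
            loads[dst] += amount
            transfers.append((d, dst, amount))
            if loads[dst] == target[dst]:         # receiver is full, move on
                r += 1
    return loads, transfers
```

Because total surplus equals total deficit, the receiver index never runs past the end of the list while a donor still has excess, and every transfer completes at least one donor or one receiver, so the number of transfers is linear in the number of children.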

Ralph Rickenbach 2008-09-26 14:21:28

Answer 3

+1 A:

The code you have posted has complexity O(n^2), where n is the number of items in the children list. Still, it is possible to do it in linear time, as malach has observed.

Consider: the inner loop has n iterations, and it is executed at most n times. n*n = n^2.

zvrba 2008-09-26 14:34:56

Are you sure? I would see it being O(n^2) if the inner loop were starting at child.pos + 1, but it's starting at the beginning of the loop each time, and must, to ensure even load. It would make more sense to sort the list like you said; then the inner loop must start at child.pos + 1.

Nicholas Mancuso 2008-09-26 14:39:01

Yes, I'm sure. It's O(n^2).

zvrba 2008-09-26 14:42:02

I concur with zvrba - that is an O(n^2) algorithm.

Rob 2008-09-26 14:53:34

You're right. I wasn't thinking. For each n, n: n * n = n^2. ;)

Nicholas Mancuso 2008-09-26 15:09:05

Answer 4

+2 A:

You may want to try a completely different approach, such as consistent hashing.

See here for a relatively easy introduction to the topic: http://www8.org/w8-papers/2a-webserver/caching/paper2.html

(There are deeper papers available as well, starting with Karger et al.)

I have created a working implementation of consistent hashing in Erlang that you can examine if you wish:

http://distributerl.googlecode.com/svn/trunk/chash.erl
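For readers who have not seen the technique, a minimal consistent-hashing ring might look like the Python sketch below. It is independent of the Erlang implementation linked above and makes its own assumptions: the class name `HashRing`, the use of SHA-256 truncated to 64 bits, and 64 virtual points per node are all illustrative choices.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    """Map a string to a 64-bit position on the ring (illustrative choice)."""
    return int.from_bytes(hashlib.sha256(key.encode("utf-8")).digest()[:8], "big")


class HashRing:
    """Hypothetical minimal consistent-hash ring with virtual points."""

    def __init__(self, nodes=(), replicas=64):
        self.replicas = replicas      # virtual points per node
        self._points = []             # sorted ring positions
        self._owner = {}              # ring position -> node name
        for node in nodes:
            self.add(node)

    def add(self, node):
        for r in range(self.replicas):
            p = _hash(f"{node}#{r}")
            if p not in self._owner:  # skip the (very unlikely) collision
                bisect.insort(self._points, p)
                self._owner[p] = node

    def remove(self, node):
        self._points = [p for p in self._points if self._owner[p] != node]
        self._owner = {p: n for p, n in self._owner.items() if n != node}

    def lookup(self, key):
        """Return the node owning the first ring point after the key's hash."""
        if not self._points:
            raise LookupError("ring is empty")
        i = bisect.bisect_right(self._points, _hash(key)) % len(self._points)
        return self._owner[self._points[i]]
```

The relevant property for load distribution is that removing a node only reassigns the keys that node owned; keys held by the other nodes stay where they are.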

Justin Sheehy 2008-09-26 14:42:55


Balanced Distribution Algorithm
