Hi,

I have had this problem for a while and am still trying to work out a solution.

What is the best possible method for evenly distributing items in a list, with a low discrepancy?

Say we have a list of elements in an array:

Red, Red, Red, Red, Red, Blue, Blue, Blue, Green, Green, Green, Yellow

So ideally the output would produce something like:

Red, Blue, Red, Green, Red, Yellow, Blue, Red, Green, Red, Blue, Green.

Where each instance is "as far away" from another instance of itself as possible...

When I first attempted to solve this problem, I must admit I was naive: I just used a seeded random number generator to shuffle the list, but this leads to clumping of instances.

One suggestion was to start with the item with the highest frequency, so Red would be placed at positions n*12/5 for n from 0 to 4 inclusive.

Then place the next most repeated element (Blue) at positions n*12/3 + 1 for n from 0 to 2 inclusive. If something is already placed there, just put it in the next empty spot, and so on. However, when I jot it out on paper, this doesn't work in all circumstances.

Say the list is only

Red, Red, Red, Red, Red, Blue

It will fail, since either of these results has three same-color adjacencies:

Red, Red, Blue, Red, Red, Red
Red, Red, Red, Blue, Red, Red
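
For anyone who wants to try the suggested placement scheme rather than jot it out on paper, here is a minimal Python sketch. The function names `count_adjacent_repeats` and `place_by_frequency` are made up for illustration. The suggestion does not say how to treat fractional positions or what to do when the search for an empty spot runs off the end, so rounding with `round()` and wrapping around to the start are both assumptions.

```python
from collections import Counter


def count_adjacent_repeats(seq):
    """Count neighbouring pairs that hold the same value."""
    return sum(1 for a, b in zip(seq, seq[1:]) if a == b)


def place_by_frequency(items):
    """Place each value at evenly spaced target slots, most frequent first."""
    total = len(items)
    slots = [None] * total
    # offset is 0 for the most frequent value, 1 for the next, and so on
    for offset, (value, count) in enumerate(Counter(items).most_common()):
        for n in range(count):
            # Assumption: fractional targets are rounded to the nearest slot
            pos = (round(n * total / count) + offset) % total
            # "If something is already placed there, use the next empty spot"
            # (assumption: the search wraps around at the end of the list)
            while slots[pos] is not None:
                pos = (pos + 1) % total
            slots[pos] = value
    return slots


if __name__ == "__main__":
    small = place_by_frequency(["Red"] * 5 + ["Blue"])
    print(small, count_adjacent_repeats(small))
```

Under these assumptions the six-item list comes out as Red, Red, Red, Blue, Red, Red, with three adjacent repeats. For what it's worth, a single Blue can split five Reds into at most two runs, so no ordering of this particular input has fewer than three adjacent repeats.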

So please, any ideas or implementations of how to do this would be awesome.

If it matters, I'm working in Objective-C, but right now all I care about is the methodology.

+4  A: 

Just a quick idea: use a separate list for each type of item. Then, much like the merge step of a merge sort, insert one item from each list into a new list, always in the same order. Skip empty lists.

This of course does not yield the perfect solution, but it is very easy to implement and should be fast. A simple improvement is to sort the lists by size, largest first. This gives slightly better results than a random order of lists.
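
A minimal Python sketch of this round-robin idea, assuming the items are hashable so they can be counted with `collections.Counter`. The function name `round_robin_spread` is made up for illustration, and the counts stand in for the separate lists.

```python
from collections import Counter


def round_robin_spread(items):
    """Take one item of each kind per round, largest group first."""
    remaining = Counter(items)
    # Fixed visiting order: kinds sorted by size, largest first
    order = [value for value, _ in remaining.most_common()]
    result = []
    while len(result) < len(items):
        for value in order:
            if remaining[value] > 0:  # skip "empty lists"
                result.append(value)
                remaining[value] -= 1
    return result


if __name__ == "__main__":
    colours = ["Red"] * 5 + ["Blue"] * 3 + ["Green"] * 3 + ["Yellow"]
    print(round_robin_spread(colours))
```

On the twelve-colour example from the question this produces Red, Blue, Green, Yellow, Red, Blue, Green, Red, Blue, Green, Red, Red: the leftover items of the largest group clump at the end, which is the imperfection mentioned above.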

Update: perhaps this could make it better. Get the size of the largest list when the algorithm starts and call it LARGEST_SIZE; that list gets its turn in every round. Every other list should be used in only starting_size_of_the_list/LARGEST_SIZE of the rounds. I hope you know what I mean. This way you should be able to space all the items evenly. Nevertheless, it still is not perfect!

OK, so I will try to be more specific. Say you have 4 lists of sizes: 30 15 6 3

The first list is used in 30/30 = 1 of the rounds, i.e. every round. The second list is used in 15/30 = 0.5 of the rounds, so every 2 rounds. Third list: 6/30, so every 5 rounds. Last list: 3/30, so every 10 rounds. This should really give you a nice spacing of items.

These are of course convenient numbers; for other numbers it gets a bit uglier. For very small numbers of items this won't give you perfect results. However, for large numbers of items it should work quite nicely.
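
One way to handle the "uglier" numbers is to give each list a running credit instead of a fixed interval. This is only a sketch of one possible reading of the update, not a definitive implementation: the name `paced_round_robin` and the credit counter are assumptions, and the arithmetic is kept in integers to avoid rounding trouble.

```python
from collections import Counter


def paced_round_robin(items):
    """Run LARGEST_SIZE rounds; each kind takes a turn in size/LARGEST_SIZE of them."""
    counts = Counter(items).most_common()  # largest group first
    if not counts:
        return []
    largest = counts[0][1]  # LARGEST_SIZE
    credit = {value: 0 for value, _ in counts}
    result = []
    for _ in range(largest):
        for value, size in counts:
            # Each round a kind earns `size` credit; it takes a turn
            # once the credit reaches LARGEST_SIZE.
            credit[value] += size
            if credit[value] >= largest:
                credit[value] -= largest
                result.append(value)
    return result


if __name__ == "__main__":
    lists = ["A"] * 30 + ["B"] * 15 + ["C"] * 6 + ["D"] * 3
    print(paced_round_robin(lists)[:8])
```

With sizes 30, 15, 6 and 3 this reproduces the intervals above: A appears in every round, B in every 2nd, C in every 5th and D in every 10th. For sizes that do not divide evenly, the credit counter spreads the turns as evenly as whole rounds allow.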

PeterK
this way you'll never build: blue,blue,red,blue,blue
RnR
I know. I never said this is perfect, it's just a quick solution that yields good results.
PeterK
Thanks. About the first suggestion: this is one of the things I tried, and as you say it works for the most part. I tried using itertools and a round robin in Python. About your update: sorry, I am sure it makes sense to most people, but I am pretty poor at maths :(
S1syphus
Updated the answer to clarify things a bit. Hope this helps.
PeterK
That is a lot clearer, thank you very much.
S1syphus
+1  A: 

I think you'd need to optimize some kind of improvement function: calculate how much "better" it would be to insert Blue at a certain position, do that for all possible insert positions, then insert at any location with the maximum value of this "gain" function and continue.
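
A rough Python sketch of such a greedy insertion. The gain function here is an assumption chosen for illustration, since the idea above leaves it open: prefer the position that leaves the fewest adjacent repeats, and break ties by the distance to the nearest equal item. All function names are made up.

```python
from collections import Counter


def count_adjacent_repeats(seq):
    """Count neighbouring pairs that hold the same value."""
    return sum(1 for a, b in zip(seq, seq[1:]) if a == b)


def nearest_same_distance(seq, index):
    """Distance from seq[index] to the closest equal item (len(seq) if none)."""
    value = seq[index]
    gaps = [abs(index - j) for j, other in enumerate(seq)
            if other == value and j != index]
    return min(gaps, default=len(seq))


def greedy_insert_spread(items):
    """Insert items one at a time at the position with the highest gain."""
    result = []
    for value, count in Counter(items).most_common():
        for _ in range(count):
            best_pos, best_gain = 0, None
            for pos in range(len(result) + 1):
                candidate = result[:pos] + [value] + result[pos:]
                # Hypothetical gain: fewer repeats first, then a larger gap
                gain = (-count_adjacent_repeats(candidate),
                        nearest_same_distance(candidate, pos))
                if best_gain is None or gain > best_gain:
                    best_pos, best_gain = pos, gain
            result.insert(best_pos, value)
    return result


if __name__ == "__main__":
    colours = ["Red"] * 5 + ["Blue"] * 3 + ["Green"] * 3 + ["Yellow"]
    print(greedy_insert_spread(colours))
```

Trying every position for every item makes this roughly cubic in the list length as written, so it is only practical for short lists. A different gain function may give more even spacing.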

RnR
+2  A: 

You could do an inverse of K-means clustering, aiming to either:

  • maximise the number of clusters
  • define the proximity of the items to similar items using some sort of inverse function, so that clusters are created from similar items that are further apart rather than close together.
Ian Turner
+1  A: 

Sort the list using a dynamic score function that, for each element in the list, returns the distance to the closest element with the same value.

Mau
+1  A: 

I'll post here the solution that I've used in a few cases for this problem in algorithm contests.

You'll have a max heap of pairs (COUNTER, COLOUR), ordered by COUNTER, so the colour with the biggest COUNTER is on top. Each step has two cases. If the colour on top is not equal to the last element in the list, remove the pair (COUNTERx, COLOURx) from the heap, add COLOURx to the end of the list, and add the pair (COUNTERx - 1, COLOURx) back to the heap if COUNTERx - 1 != 0. Otherwise, take the pair with the second greatest COUNTER from the heap instead of the first and do the same with it. The time complexity is O(S log N), where N is the number of colours and S is the size of the list.
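
A minimal Python sketch of the heap approach described above. Python's `heapq` is a min-heap, so counters are stored negated. The function name `heap_spread` is made up, ties between equal counters fall back to comparing the colours themselves (so they must be comparable, e.g. strings), and appending the top colour anyway when it is the only one left is an assumption, since that case is not covered above.

```python
import heapq
from collections import Counter


def heap_spread(items):
    """Repeatedly emit the most frequent remaining colour that differs from the last one."""
    # Negated counters turn heapq's min-heap into a max-heap on COUNTER
    heap = [(-count, colour) for colour, count in Counter(items).items()]
    heapq.heapify(heap)
    result = []
    while heap:
        neg_count, colour = heapq.heappop(heap)
        if result and result[-1] == colour and heap:
            # Top colour equals the last element: take the second greatest instead
            second = heapq.heappop(heap)
            heapq.heappush(heap, (neg_count, colour))
            neg_count, colour = second
        # Assumption: if no other colour is left, a repeat is unavoidable
        result.append(colour)
        if neg_count + 1 != 0:
            heapq.heappush(heap, (neg_count + 1, colour))
    return result


if __name__ == "__main__":
    colours = ["Red"] * 5 + ["Blue"] * 3 + ["Green"] * 3 + ["Yellow"]
    print(heap_spread(colours))
```

This avoids same-colour neighbours whenever the counts allow it, but it does not by itself try to maximise the distance between equal colours.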

Teodor Pripoae
+1  A: 
Svante