views:

825

answers:

12

I need to create a list of numbers from a range (for example from x to y) in a random order so that every order has an equal chance.

I need this for a music player I write in C#, to create play lists in a random order.

Any ideas?

Thanks.

EDIT: I'm not interested in changing the original list, just pick up random indexes from a range in a random order so that every order has an equal chance.

Here's what I've wrriten so far:

    public static IEnumerable<int> RandomIndexes(int count)
    {
        if (count > 0)
        {
            int[] indexes = new int[count];
            int indexesCountMinus1 = count - 1;

            for (int i = 0; i < count; i++)
            {
                indexes[i] = i;
            }

            Random random = new Random();

            while (indexesCountMinus1 > 0)
            {
                int currIndex = random.Next(0, indexesCountMinus1 + 1);
                yield return indexes[currIndex];

                indexes[currIndex] = indexes[indexesCountMinus1];
                indexesCountMinus1--;
            }

            yield return indexes[0];
        }
    }

It's working, but the only problem of this is that I need to allocate an array in the memory in the size of count. I'm looking for something that dose not require memory allocation.

Thanks.

+24  A: 

This can actually be tricky if you're not careful (i.e., using a naïve shuffling algorithm). Take a look at the Fisher-Yates/Knuth shuffle algorithm for proper distribution of values.

Once you have the shuffling algorithm, the rest should be easy.

Here's more detail from Jeff Atwood.

Lastly, here's Jon Skeet's implementation and description.

EDIT

I don't believe that there's a solution that satisfies your two conflicting requirements (first, to be random with no repeats and second to not allocate any additional memory). I believe you may be prematurely optimizing your solution as the memory implications should be negligible, unless you're embedded. Or, perhaps I'm just not smart enough to come up with an answer.

With that, here's code that will create an array of evenly distributed random indexes using the Knuth-Fisher-Yates algorithm (with a slight modification). You can cache the resulting array, or perform any number of optimizations depending on the rest of your implementation.

  private static int[] BuildShuffledIndexArray( int size ) {

     int[] array = new int[size];
     Random rand = new Random();
     for ( int currentIndex = array.Length - 1; currentIndex > 0; currentIndex-- ) {
        int nextIndex = rand.Next( currentIndex + 1 );
        Swap( array, currentIndex, nextIndex );
     }
     return array;
  }

  private static void Swap( IList<int> array, int firstIndex, int secondIndex ) {

     if ( array[firstIndex] == 0 ) {
        array[firstIndex] = firstIndex;
     }
     if ( array[secondIndex] == 0 ) {
        array[secondIndex] = secondIndex;
     }
     int temp = array[secondIndex];
     array[secondIndex] = array[firstIndex];
     array[firstIndex] = temp;
  }

NOTE: You can use ushort instead of int to half the size in memory as long as you don't have more than 65,535 items in your playlist. You could always programmatically switch to int if the size exceeds ushort.MaxValue. If I, personally, added more than 65K items to a playlist, I wouldn't be shocked by increased memory utilization.

Remember, too, that this is a managed language. The VM will always reserve more memory than you are using to limit the number of times it needs to ask the OS for more RAM and to limit fragmentation.

EDIT

Okay, last try: we can look to tweak the performance/memory trade off: You could create your list of integers, then write it to disk. Then just keep a pointer to the offset in the file. Then every time you need a new number, you just have disk I/O to deal with. Perhaps you can find some balance here, and just read N-sized blocks of data into memory where N is some number you're comfortable with.

Seems like a lot of work for a shuffle algorithm, but if you're dead-set on conserving memory, then at least it's an option.

Ryan Emerle
So you also have an iPod Shuffle? ;)
Frankie
I love the ubiquity of the naïve solution. I love hearing the same song three times before I hear half of the other songs. :)
Ryan Emerle
@Ryan Emerle, deleted my answer as there is no need to clutter. The link to Jon Skeet's implementation is really very enlightening. +1!
Frankie
Thanks, that's very nice, but it changes the elements in the a list, and I don't want to change the elements in the list. I just want random indexes... Thanks anyway.
TTT
@Alon - you can create a list of indexes, and apply the shuffling algorithm against that.. just a thought.
Ryan Emerle
@Ryan - it's a memory waste.
TTT
@Alon, while I agree it's not the most elegant solution, the memory utilization would be negligible. For an int[] of 100,000 indexes, you'd be looking at around 400K of memory. Alternatively, you could pick a random index, then keep a separate list of indexes already used, but then you'd have the same problem with a sloppier implementation.If you can, post whatever you've chosen as an answer to this question; I'm interested to see what you come up with.
Ryan Emerle
Anyone want to fix mplayer so that -shuffle doesn't force me to hear the same songs very often with a *.mp3 playlist? lol
Earlz
If @Alon wouldn't be so miserly about memory (400k, really?), he could store the Order along with the song in whatever structure the songs are held in, you then shuffle those around. At worst you add 1 `Int32` per song...which is the **best** you can hope to do. Spending any additional time on this problem is a waste.
sixlettervariables
@sixlettervariables: Yes, I'm very miserly about memory, and there's a good reason: I want to create fast software. My program is for slow computers. I don't to write a program like Windows Live Messenger, which is totally slow.
TTT
Thank you so much, sorry I haven't checked this answer before - I didn't have time until now. Thank you!!!
TTT
+5  A: 

Personally, for a music player, I wouldn't generate a shuffled list, and then play that, then generate another shuffled list when that runs out, but do something more like:

IEnumerable<Song> GetSongOrder(List<Song> allSongs)
{
    var playOrder = new List<Song>();
    while (true)
    {
        // this step assigns an integer weight to each song,
        // corresponding to how likely it is to be played next.
        // in a better implementation, this would look at the total number of
        // songs as well, and provide a smoother ramp up/down.
        var weights = allSongs.Select(x => playOrder.LastIndexOf(x) > playOrder.Length - 10 ? 50 : 1);

        int position = random.Next(weights.Sum());
        foreach (int i in Enumerable.Range(allSongs.Length))
        {
            position -= weights[i];
            if (position < 0)
            {
                var song = allSongs[i];
                playOrder.Add(song);
                yield return song;
                break;
            }
        }

        // trim playOrder to prevent infinite memory here as well.
        if (playOrder.Length > allSongs.Length * 10)
            playOrder = playOrder.Skip(allSongs.Length * 8).ToList();
    }    
}

This would make songs picked in order, as long as they haven't been recently played. This provides "smoother" transitions from the end of one shuffle to the next, because the first song of the next shuffle could be the same song as the last shuffle with 1/(total songs) probability, whereas this algorithm has a lower (and configurable) chance of hearing one of the last x songs again.

FryGuy
+1  A: 

I think you should stick to your current solution (the one in your edit).

To do a re-order with no repetitions & not making your code behave unreliable, you have to track what you have already used / like by keeping unused indexes or indirectly by swapping from the original list.

I suggest to check it in the context of the working application i.e. if its of any significance vs. the memory used by other pieces of the system.

eglasius
+3  A: 

Unless you shuffle the original song list (which you said you don't want to do), you are going to have to allocate some additional memory to accomplish what you're after.

If you generate the random permutation of song indices beforehand (as you are doing), you obviously have to allocate some non-trivial amount of memory to store it, either encoded or as a list.

If the user doesn't need to be able to see the list, you could generate the random song order on the fly: After each song, pick another random song from the pool of unplayed songs. You still have to keep track of which songs have already been played, but you can use a bitfield for that. If you have 10000 songs, you just need 10000 bits (1250 bytes), each one representing whether the song has been played yet.

I don't know your exact limitations, but I have to wonder if the memory required to store a playlist is significant compared to the amount required for playing audio.

Donnie DeBoer
A: 

you could use a trick we do in sql server to order sets in random like this with the use of guid. the values are always distributed equaly random.

private IEnumerable<int> RandomIndexes(int startIndexInclusive, int endIndexInclusive)
{
    if (endIndexInclusive < startIndexInclusive)
     throw new Exception("endIndex must be equal or higher than startIndex");

    List<int> originalList = new List<int>(endIndexInclusive - startIndexInclusive);
    for (int i = startIndexInclusive; i <= endIndexInclusive; i++)
     originalList.Add(i);

    return from i in originalList
        orderby Guid.NewGuid()
        select i;
}
Mladen Prajdic
Neat, but you still have an extra list and you've also introduced (IMHO) unnecessary overhead.
Ryan Emerle
A: 

From a logical standpoint, it is possible. Given a list of n songs, there are n! permutations; if you assign each permutation a number from 1 to n! (or 0 to n!-1 :-D) and pick one of those numbers at random, you can then store the number of the permutation that you are currently using, along with the original list and the index of the current song within the permutation.

For example, if you have a list of songs {1, 2, 3}, your permutations are:

0: {1, 2, 3}
1: {1, 3, 2}
2: {2, 1, 3}
3: {2, 3, 1}
4: {3, 1, 2}
5: {3, 2, 1}

So the only data I need to track is the original list ({1, 2, 3}), the current song index (e.g. 1) and the index of the permutation (e.g. 3). Then, if I want to find the next song to play, I know it's third (2, but zero-based) song of permutation 3, e.g. Song 1.

However, this method relies on you having an efficient means of determining the ith song of the jth permutation, which until I've had chance to think (or someone with a stronger mathematical background than I can interject) is equivalent to "then a miracle happens". But the principle is there.

FacticiusVir
Interesting, but this only works with very small values of _n_.. 50! = 3.04140932 × 10^64
Ryan Emerle
It is quite easy and fast to generate the permutation and select the correct index - you just need a little integer division and modulo calculation. However, that will also require modifying the existing list or creating a new one, since you need to keep track of the ones you've already selected. There is also the issue Ryan mentions about the size of these numbers, but if you can generate and manipulate these accurately, the algorithm will still work, because you can generate the permutation directly - but you will likely have extra overhead with non-native types, making it slower.
Michael Madsen
A: 

There are a number of methods of generating permutations without needing to store the state. See this question.

FryGuy
+4  A: 

If you use a maximal linear feedback shift register, you will use O(1) of memory and roughly O(1) time. See here for a handy C implementation (two lines! woo-hoo!) and tables of feedback terms to use.

And here is a solution:

public class MaximalLFSR
{
    private int GetFeedbackSize(uint v)
    {
        uint r = 0;

        while ((v >>= 1) != 0)
        {
          r++;
        }
        if (r < 4)
            r = 4;
        return (int)r;
    }

    static uint[] _feedback = new uint[] {
        0x9, 0x17, 0x30, 0x44, 0x8e,
        0x108, 0x20d, 0x402, 0x829, 0x1013, 0x203d, 0x4001, 0x801f,
        0x1002a, 0x2018b, 0x400e3, 0x801e1, 0x10011e, 0x2002cc, 0x400079, 0x80035e,
        0x1000160, 0x20001e4, 0x4000203, 0x8000100, 0x10000235, 0x2000027d, 0x4000016f, 0x80000478
    };

    private uint GetFeedbackTerm(int bits)
    {
        if (bits < 4 || bits >= 28)
            throw new ArgumentOutOfRangeException("bits");
        return _feedback[bits];
    }

    public IEnumerable<int> RandomIndexes(int count)
    {
        if (count < 0)
            throw new ArgumentOutOfRangeException("count");

        int bitsForFeedback = GetFeedbackSize((uint)count);

        Random r = new Random();
        uint i = (uint)(r.Next(1, count - 1));

        uint feedback = GetFeedbackTerm(bitsForFeedback);
        int valuesReturned = 0;
        while (valuesReturned < count)
        {
            if ((i & 1) != 0)
            {
                i = (i >> 1) ^ feedback;
            }
            else {
                i = (i >> 1);
            }
            if (i <= count)
            {
                valuesReturned++;
                yield return (int)(i-1);
            }
        }
    }
}

Now, I selected the feedback terms (badly) at random from the link above. You could also implement a version that had multiple maximal terms and you select one of those at random, but you know what? This is pretty dang good for what you want.

Here is test code:

    static void Main(string[] args)
    {
        while (true)
        {
            Console.Write("Enter a count: ");
            string s = Console.ReadLine();
            int count;
            if (Int32.TryParse(s, out count))
            {
                MaximalLFSR lfsr = new MaximalLFSR();
                foreach (int i in lfsr.RandomIndexes(count))
                {
                    Console.Write(i + ", ");
                }
            }
            Console.WriteLine("Done.");
        }
    }

Be aware that maximal LFSR's never generate 0. I've hacked around this by returning the i term - 1. This works well enough. Also, since you want to guarantee uniqueness, I ignore anything out of range - the LFSR only generates sequences up to powers of two, so in high ranges, it will generate wost case 2x-1 too many values. These will get skipped - that will still be faster than FYK.

plinth
+1 to you for solving the OPs problem the way he wants to
RCIX
Careful mathematical analysis is required to have any confidence a LFSR generates numbers that are sufficiently "random". Outside of that, though, i'd be interested to see how this would work in practice.
Ryan Emerle
A: 

If memory was really a concern after a certain number of records and it's safe to say that if that memory boundary is reached, there's enough items in the list to not matter if there are some repeats, just as long as the same song was not repeated twice, I would use a combination method.

Case 1: If count < max memory constraint, generate the playlist ahead of time and use Knuth shuffle (see Jon Skeet's implementation, mentioned in other answers).

Case 2: If count >= max memory constraint, the song to be played will be determined at run time (I'd do it as soon as the song starts playing so the next song is already generated by the time the current song ends). Save the last [max memory constraint, or some token value] number of songs played, generate a random number (R) between 1 and song count, and if R = one of X last songs played, generate a new R until it is not in the list. Play that song.

Your max memory constraints will always be upheld, although performance can suffer in case 2 if you've played a lot of songs/get repeat random numbers frequently by chance.

BarrettJ
A: 

You're going to have to allocate some memory, but it doesn't have to be a lot. You can reduce the memory footprint (the degree by which I'm unsure, as I don't know that much about the guts of C#) by using a bool array instead of int. Best case scenario this will only use (count / 8) bytes of memory, which isn't too bad (but I doubt C# actually represents bools as single bits).

    public static IEnumerable<int> RandomIndexes(int count) {
        Random rand = new Random();
        bool[] used = new bool[count];

        int i;
        for (int counter = 0; counter < count; counter++) {
            while (used[i = rand.Next(count)]); //i = some random unused value
            used[i] = true;
            yield return i;
        }
    }

Hope that helps!

Ian Henry
Trouble with this is that execution time is unpredictable. It is possible it will keep randomly choosing array items that are true for a very long time.
ICR
A: 

As many others have said you should implement THEN optimize, and only optimize the parts that need it (which you check on with a profiler). I offer a (hopefully) elegant method of getting the list you need, which doesn't really care so much about performance:

using System;
using System.Collections.Generic;
using System.Linq;

namespace Test
{
    class Program
    {
        static void Main(string[] a)
        {
            Random random = new Random();
            List<int> list1 = new List<int>(); //source list
            List<int> list2 = new List<int>();
            list2 = random.SequenceWhile((i) =>
                 {
                     if (list2.Contains(i))
                     {
                         return false;
                     }
                     list2.Add(i);
                     return true;
                 },
                 () => list2.Count == list1.Count,
                 list1.Count).ToList();

        }
    }
    public static class RandomExtensions
    {
        public static IEnumerable<int> SequenceWhile(
            this Random random, 
            Func<int, bool> shouldSkip, 
            Func<bool> continuationCondition,
            int maxValue)
        {
            int current = random.Next(maxValue);
            while (continuationCondition())
            {
                if (!shouldSkip(current))
                {
                    yield return current;
                }
                current = random.Next(maxValue);
            }
        }
    }
}
RCIX
A: 

It is pretty much impossible to do it without allocating extra memory. If you're worried about the amount of extra memory allocated, you could always pick a random subset and shuffle between those. You'll get repeats before every song is played, but with a sufficiently large subset I'll warrant few people will notice.

const int MaxItemsToShuffle = 20;
public static IEnumerable<int> RandomIndexes(int count)
{
    Random random = new Random();

    int indexCount = Math.Min(count, MaxItemsToShuffle);
    int[] indexes = new int[indexCount];

    if (count > MaxItemsToShuffle)
    {
        int cur = 0, subsetCount = MaxItemsToShuffle;
        for (int i = 0; i < count; i += 1)
        {
            if (random.NextDouble() <= ((float)subsetCount / (float)(count - i + 1)))
            {
                indexes[cur] = i;
                cur += 1;
                subsetCount -= 1;
            }
        }
    }
    else
    {
        for (int i = 0; i < count; i += 1)
        {
            indexes[i] = i;
        }
    }

    for (int i = indexCount; i > 0; i -= 1)
    {
        int curIndex = random.Next(0, i);
        yield return indexes[curIndex];

        indexes[curIndex] = indexes[i - 1];
    }
}
ICR