views:

161

answers:

5

I've a method which returns a generic list collection(List) from the database. This collection has got order details i.e., Order Id, order name, product details etc.

Also, method the method returns a collection having only the top 5 orders sorted by order date descending.

My requirement is that each time the client calls this method, I need to return collection which has got 5 random orders.

How do I achieve this using C#?

A: 

return collection.Where(()=>Random.Next(100) > (5 / collection.Count * 100)));

RA
As with mine, you can get dups (and might not get exactly five)
Ruben Bartelink
+1  A: 
return myList.OfType<Order>().OrderBy(o => Guid.NewGuid()).Take(5);
David Hedlund
This 'visits' each item of the array (yes, I know you knew that)
Ruben Bartelink
yes. i don't get where the *generics* requirement comes into play in the original question. if we just had a list of orders, we could drop the `OfType` and the above query would work just as well for a list as for a linq to sql table. If it was a linq to sql table, the `OrderBy` clause would actually resolve to `order by newid()` randomization at the database level, which is totally desirable (as you've pointed out)
David Hedlund
@David: I think the asker means List<Order> hence OfType is irrelevant. Problem is NewGuid is less efficient than Random() (and remember, *every Guid is sacred, every Guid is great*. Hadnt realised the default L2S [and presumably EF and LLBLGP etc] default OrderBy translation - which makes this worty a +1. (You should have said that in the post, shouldnt you :D)
Ruben Bartelink
well i guess =) i got confused by the getting-five-orders-out-of-a-generic-list part. a blatant waste of guid's, tho, i'd hate to contribute to making them less unique
David Hedlund
+6  A: 

I wrote a TakeRandom extension method a while back which does this using a Fisher-Yates shuffle. It's pretty efficient as it only bothers to randomise the number of items that you actually want to return, and is guaranteed to be unbiased.

public static IEnumerable<T> TakeRandom<T>(this IEnumerable<T> source, int count)
{
    var array = source.ToArray();
    return ShuffleInternal(array, Math.Min(count, array.Length)).Take(count);
}

private static IEnumerable<T> ShuffleInternal<T>(T[] array, int count)
{
    for (var n = 0; n < count; n++)
    {
        var k = ThreadSafeRandom.Next(n, array.Length);
        var temp = array[n];
        array[n] = array[k];
        array[k] = temp;
    }

    return array;
}

An implementation of ThreadSafeRandom can be found at the PFX team blog.

Greg Beech
+1: That's a pretty complete answer :D
Ruben Bartelink
Although it does seem to be copying the list (efficiency?) or modifying the existing one (not readily obvious from the signature).
Vilx-
@Vilk: as my crap answer proves, it's not easy to avoid duplicates efficiently so I suspect it's hard to beat it for efficiency
Ruben Bartelink
@Vilk - Yes, you're right, it did modify the original list. I've changed it so it doesn't now. It does unfortunately have to copy the collection to an array but that's pretty difficult to avoid (though I'm sure there is a way...)
Greg Beech
+1 for ThreadSafeRandom
Ryan Emerle
@Ryan: What difference would that make? It's returning random items, so whether the array is traversed forwards or backwards makes no difference.
LukeH
@Ryan - The Durstenfeld implementation can be written to reverse the list in either direction. While the reverse implementation is more commonly documented, the forward implementation is better here as it means it only has to shuffle the first n elements rather than all of them, as it doesn't need to shuffle elements that won't be returned.
Greg Beech
PS: I did check that the algorithm was unbiased... see: http://gregbeech.com/blog/determining-the-bias-of-a-shuffle-algorithm
Greg Beech
@Luke, these simple algorithms are easy to get wrong, and simple mistakes, such as running through the list forward, can introduce bias. But as Greg notes, he verified his solution. I have never seen this written this way, sorry for "_jump_ing to conclusions"
Ryan Emerle
@Ryan: Stepping forwards won't affect bias so long as the algorithm is implemented correctly, and it's possible to badly implement the backward-stepping algorithm too. Whether or not the results are biased is determined by the *correctness* of the implementation, *not* by the direction of traversal.
LukeH
I had wrongly commented that the list should be traversed in reverse. I deleted that comment.
Ryan Emerle
@Luke, right and I naively assumed that the implementation was incorrect. I have retracted the comment and feel sufficiently sheepish :)
Ryan Emerle
+2  A: 

You really should do this in the database - no point in returning a big stack of stuff only to drop all but five on the floor. You should amend your question to explain what typew of data access stack is involved so people can give better answers. For instance, you could do an ORDER BY RAND():

SELECT TOP 5 ... FROM orders
ORDER BY RAND()

But that visits every row, which you don't want. If you're using SQL Server [and would like to be tied to it :P], you could use TABLESAMPLE.

If you're using LINQ to SQL, go here

EDIT: Just pretend the rest of this isnt here - it's not efficient and hence Greg's answer is far more desirable if you do want to sort client-side.

But, for completeness, paste the following into LINQPad:

var orders = new[] { "a", "b", "c", "d", "e", "f" };
var random = new Random();
var result = Enumerable.Range(1,5).Select(i=>orders[random.Next(5)])
result.Dump();

EDIT: Brute force answer to Greg's point (Yes, not efficient or pretty)

var orders = new[] { "a", "b", "c", "d", "e", "f" };

var random = new Random();

int countToTake = 5;

var taken = new List<int>( countToTake);

var result = Enumerable.Range(1,countToTake)
 .Select(i=>{
  int itemToTake; 
  do { 
   itemToTake = random.Next(orders.Length); 
  } while (taken.Contains(itemToTake)); 
  taken.Add(itemToTake); 
  return orders[itemToTake];
 });

result.Dump();
Ruben Bartelink
That will return duplicates though, which probably isn't what he's after.
Greg Beech
@Greg Beech: Good point, will fix (Real problem is that his sorting needs to happen in the database)
Ruben Bartelink
Up vote. Definitely should do this on the database.
Firestrand
@Firestrand: Your second ever upvote, I'm honoured :D
Ruben Bartelink
A: 

Here's how I would do it, using an extension method to perform a Fisher-Yates-Durstenfeld shuffle.

(This is obviously very similar to Greg's answer, but there are a few differences that, in my opinion, merit a separate answer.)

List<Orders> listOfOrders = GetTheOrdersFromSomewhere();
var fiveRandomOrders = listOfOrders.Shuffle().Take(5);

// ...

public static class ExtensionMethods
{
    public static IEnumerable<T> Shuffle<T>(this IEnumerable<T> source)
    {
        if (_threadStaticRng == null)
            _threadStaticRng = new Random();

        return source.Shuffle(_threadStaticRng);
    }

    public static IEnumerable<T> Shuffle<T>(
        this IEnumerable<T> source, Random rng)
    {
        // parameter null checking removed for brevity

        T[] sourceArray = source.ToArray();
        for (int n = 0; n < sourceArray.Length; n++)
        {
            int k = rng.Next(n, sourceArray.Length);
            yield return sourceArray[k];

            sourceArray[k] = sourceArray[n];
        }
    }

    [ThreadStatic]
    private static Random _threadStaticRng;
}
LukeH