tags:

views:

334

answers:

6

I am trying to think of an elegant way of getting a random subset from a set in F#

Any thoughts on this?

Perhaps this would work: say we have a set of 2x elements and we need to pick a subset of y elements. Then if we could generate an x sized bit random number that contains exactly y 2n powers we effectively have a random mask with y holes in it. We could keep generating new random numbers until we get the first one satisfying this constraint but is there a better way?

+2  A: 

Not having a really good grasp of F# and what might be considered elegant there, you could just do a shuffle on the list of elements and select the first y. A Fisher-Yates shuffle even helps you in this respect as you also only need to shuffle y elements.

Joey
F# sets don't allow for random access time O(1) so this algorithm would be ineffective without the set being converted into an array.
gradbot
Ok, would lie in the same complexity as your solution then, if I see that correctly. But you have code, I didn't get around learning F# so far, so you get my +1 :)
Joey
+1  A: 

Agree with @JohannesRossel. There's an F# shuffle-an-array algorithm here you can modify suitably. Convert the Set into an array, and then loop until you've selected enough random elements for the new subset.

Brian
It seems that @usernameIdentifier is a common convention here for referring to others (that I believe has been co-opted from another medium), so I am just attempting to go with the flow.
Brian
+2  A: 

If you don't want to convert to an array you could do something like this. This is O(n*m) where m is the size of the set.

open System

let rnd = Random(0);
let set = Array.init 10 (fun i -> i) |> Set.of_array

let randomSubSet n set =
    seq { 
        let i = set |> Set.to_seq |> Seq.nth (rnd.Next(set.Count))
        yield i
        yield! set |> Set.remove i 
        }
    |> Seq.take n
    |> Set.of_seq

let result = set |> randomSubSet 3 

for x in result do
    printfn "%A" x
gradbot
Since this approach is bound to perform poorly, you're better off converting to an array, shuffling, and then taking the first m results as a set - it's probably easier to read to boot.And if you _really_ don't want to convert your original set to an array, you could still generate a random boolean mask using an appropriately shuffled array (with m true's and n-m false's), and then simply zip the array with set, filter on the masks, and map back into the set - without ever converting the original set to an array, and still maintaining O(n) performance.
Eamon Nerbonne
A: 

rnd must be out of subset function.

let rnd = new Random()
let rec subset xs = 
    let removeAt n xs = ( Seq.nth (n-1) xs, Seq.append (Seq.take (n-1) xs) (Seq.skip n xs) )
    match xs with 
    | [] -> []
    | _ -> let (rem, left) = removeAt (rnd.Next( List.length xs ) + 1) xs
           let next = subset (List.of_seq left)
           if rnd.Next(2) = 0 then rem :: next else next
The_Ghost
A: 

Using Seq.fold to construct using lazy evaluation random sub-set:

let rnd = new Random()

let subset2 xs = let insertAt n xs x = Seq.concat [Seq.take n xs; seq [x]; Seq.skip n xs]
                 let randomInsert xs = insertAt (rnd.Next( (Seq.length xs) + 1 )) xs
                 xs |> Seq.fold randomInsert Seq.empty |> Seq.take (rnd.Next( Seq.length xs ) + 1)
The_Ghost
A: 

Do you mean a random subset of any size?

For the case of a random subset of a specific size, there's a very elegant answer here:

http://stackoverflow.com/questions/48087/select-a-random-n-elements-from-listt-in-c/48089#48089

Here it is in pseudocode:

RandomKSubset(list, k):
  n = len(list)
  needed = k
  result = {}
  for i = 0 to n:
    if rand() < needed / (n-i)
      push(list[i], result)
      needed--
  return result
dreeves