views:

101

answers:

2

I have below a description of a data structure I need and I want to implement it using immutable data structures. I'm trying to determine... is there an existing data structure out there that will support what I'm trying to do here or do I need to create one--and if I need to create it, what would be a good place to start (building blocks)?


I have a steady stream of incoming values of a certain type. I want to add them to a persistent/immutable data structure to hold a history of them, and on each add, it will review the history and determine if one or more oldest items will be removed (for example, if the history is > a certain length or a value has a certain property).

+2  A: 

Without knowing more about your requirements, I'd just say a vanilla Set<'a> does a more than adequate job. I'd prefer a 'Set' over a 'List' so you always have O(lg n) access to the largest and smallest items, allowing you to ordered your set by insert date/time for efficient access to the newest and oldest items.

Seems very easy to wrap up a set so that its Add/Remove methods invoke your callbacks:

type AwesomeSet(internalSet : Set<'a>, insertCallback : 'a -> unit, removeCallback : 'a -> unit) =
    member this.Add(x) =
        insertCallback(x)
        AwesomeSet(internalSet.Add x, insertCallback, removeCallback)

    member this.Remove(x) =
        removeCallback(x)
        AwesomeSet(internalSet.Remove x, insertCallback, removeCallback)

    member this.Count = internalSet.Count
    member this.Min = internalSet.MinimumElement
    member this.Max = internalSet.MaximumElement
Juliet
Lists are singly linked lists, right? So, access and removal on the tail would be very inefficient, so that wouldn't work well.I hadn't really considered set since I assumed it was unordered (silly me). But an ordered set... right, that would probably do the job. Except, every new one will always be added to the head. And if it's using binary trees, that means it will just end up being a doubly linked list and be very inefficient for my use case since removing off the end would involve copying all nodes.
taotree
@taotree: internal implementation `Set<'a>` is a red-black tree, so it supports very efficient O(lg n) for random access, insert, and delete. Since the tree is balanced, it doesn't turn into a linked list and doesn't require copying the entire tree.
Juliet
Okay, I had seen this:http://msdn.microsoft.com/en-us/library/ee353619.aspxand it said binary tree and I was thinking, uh oh...but then I saw at the very bottom of this:http://en.wikibooks.org/wiki/F_Sharp_Programming/Sets_and_Mapsright... red-black tree = performance goodness.Microsoft should be more specific in their doc.Thanks!
taotree
@Juliet: AVL not red black, IIRC.
Jon Harrop
If AVL, that makes the information at both the links I put in my comment incorrect or incomplete.
taotree
A: 

Thanks to Juliet's kind information, I have implemented what I need and I put it here in case anyone else might find it useful.

let rec removeLast (s : Set<'a>, num : int) : Set<'a> = 
    match num with
    | 0 -> s
    | _ -> removeLast(s.Remove(s.MinimumElement), num-1)


type History<'a when 'a : comparison>(underlying : Set<'a>, removal : History<'a> -> int) =
    member this.Add(x) =
        History(removeLast(underlying, removal(this)).Add x, removal)

    member this.Count = underlying.Count
    member this.Min = underlying.MinimumElement
    member this.Max = underlying.MaximumElement
    member this.Under = underlying

let maxHist = 2
let maxCountRemover (h : History<int>) =
    if h.Count >= maxHist
    then h.Count - maxHist + 1
    else 0


let testHistory =
    let s = History(Set.empty, r)
    let s = s.Add(1);
    printfn "%i: %i - %i" s.Count s.Min s.Max
    let s = s.Add(2);
    printfn "%i: %i - %i" s.Count s.Min s.Max
    let s = s.Add(3);
    printfn "%i: %i - %i" s.Count s.Min s.Max
    let s = s.Add(4);
    printfn "%i: %i - %i" s.Count s.Min s.Max
    let s = s.Add(5);
    printfn "%i: %i - %i" s.Count s.Min s.Max
    printfn "%A" s.Under
taotree
I recommend `printfn "%i %i" s.Min s.Max` and `printfn "%A" s.Under` in place of `System.Console.WriteLine(...)` :)
Juliet
Thanks. I was wondering about the canonical way of doing print formatting. I'll fix that sample code.
taotree