Hi,

I have two sequences (of tuples) on which I need to do a join:

  • Seq 1: [(City1 * Pin1), (City2 * Pin2), (City1 * Pin3), (City1 * Pin4)]
  • Seq 2: [(Pin1 * ProductA), (Pin2 * ProductB), (Pin1 * ProductC), (Pin2 * ProductA)]

into the sequence (of tuples):

  • [(City1 * ProductA), (City2 * ProductB), (City1 * ProductC), (City2 * ProductA)...]

In C# I could do this using the Linq Join extension method like:

seq1.Join(seq2, t => t.Item2, t=> t.Item1,
    (t,u) => Tuple.Create(t.Item1, u.Item2))

How do I accomplish this in F#? I cannot find join on Seq there.

+2  A: 

F# Interactive session:

> let seq1 = seq [("city1", "pin1"); ("city2", "pin2")];;

val seq1 : seq<string * string> = [("city1", "pin1"); ("city2", "pin2")]

> let seq2 = seq [("pin1", "product1"); ("pin2", "product2")];;

val seq2 : seq<string * string> = [("pin1", "product1"); ("pin2", "product2")]

> Seq.zip seq1 seq2;;
val it : seq<(string * string) * (string * string)> =
  seq
    [(("city1", "pin1"), ("pin1", "product1"));
     (("city2", "pin2"), ("pin2", "product2"))]
> Seq.zip seq1 seq2 |> Seq.map (fun (x,y) -> (fst x, snd y));;
val it : seq<string * string> =
  seq [("city1", "product1"); ("city2", "product2")]

Also, you should be able to use the LINQ extension methods on sequences; just make sure your project references the assembly that contains System.Linq (System.Core on the .NET Framework) and that you have opened the namespace with open System.Linq.

UPDATE: in a complex scenario you can use sequence expressions as follows:

open System

let seq1 = seq [("city1", "pin1"); ("city2", "pin2"); ("city1", "pin3"); ("city1", "pin4")]
let seq2 = seq [("pin1", "product1"); ("pin2", "product2"); ("pin1", "product3"); ("pin2", "product1")]

let joinSeq = seq { for x in seq1 do
                        for y in seq2 do
                            let city, pin = x
                            let pin1, product = y
                            if pin = pin1 then
                                yield(city, product) }
for (x, y) in joinSeq do
    printfn "%s: %s" x y

Console.ReadKey() |> ignore
Artem K.
3 seconds before me! Should have not added links...
Callum Rogers
Cities would have more than one pin and pins would be many to many with products.
SharePoint Newbie
So, here is what the last command does: Seq.zip builds a sequence of pairs of tuples, as the previous command shows, and Seq.map then maps each pair to a new tuple made of the first element of the first tuple (fst x) and the second element of the second tuple (snd y).
Artem K.
@SharePoint Newbie: Can you explain further with examples?
Callum Rogers
I cannot figure out the syntax as I'm new to F#. Could you help converting the C# line to F#?
SharePoint Newbie
Please check update
Artem K.
+4  A: 

Edit: Actually, you can just use LINQ:

> open System.Linq;;
> let ans = seq1.Join(seq2, (fun t -> snd t), (fun t -> fst t), (fun t u -> (fst t, snd u)));;

Why not use F#'s native Seq functions? If you look at the docs and at this question you can simply use these instead of LINQ. Take the Seq.map2 function for example:

> Seq.map2 (fun a b -> (fst a, snd b)) seq1 seq2;;

val it : seq<string * string> =
  seq [("city1", "product1"); ("city2", "product2")]

should give you what you want, where seq1 and seq2 are your first and second sequences.

Callum Rogers
Cities would be one to many with pin and pins would be many to many with products. Could you explain how it would work?
SharePoint Newbie
Do you mean that you could have `[(City1 * Pin1 * Pin2), (City2 * Pin2)]` and `[(Pin1 * ProductA), (Pin2 * ProductB * ProductC)]`, i.e. using tuples with more than 2 elements?
Callum Rogers
No, I mean I could have multiple items in the sequence with same city and different pin. Similarly I could have multiple items with same pin and different product or vice versa in seq 2. The tuple will always have 2 items though.
SharePoint Newbie
I haven't checked map2 before, thanks. Your code looks more readable.
Artem K.
@Share: So more like `[(City1 * Pin1), (City1 * Pin2), (City2 * Pin2)]` and `[(Pin1 * Product1), (Pin2 * Product1), (Pin3 * Product2)]`?
Callum Rogers
A pin cannot be in more than one city, but yes somewhat like that.
SharePoint Newbie
+1  A: 

I think it is not exactly clear what results you are expecting, so the answers are a bit confusing. Your example could be interpreted in two ways (either as zipping or as joining), and they are dramatically different.

  • Zipping: If you have two lists of the same length and you want to align corresponding items (e.g. 1st item from the first list with 1st item from the second list; 2nd item from the first list with 2nd item from the second list, etc.), then look at the answers that use either Seq.zip or Seq.map2.

    However, this would mean that the lists are sorted by pins and that pins are unique. In that case you don't need to use Join, and even in C#/LINQ you could use the Zip extension method.

  • Joining: If the lists may have different lengths, pins may not be sorted or not unique, then you need to write a real join. A simplified version of the code by Artem K would look like this:

    seq { for city, pin1 in seq1 do 
            for pin2, product in seq2 do 
              if pin1 = pin2 then yield city, product }
    

    This may be less efficient than Join in LINQ, because it loops through all the items in seq2 for every item in seq1, so the complexity is O(seq1.Length * seq2.Length). I'm not sure, but I think that Join could use some hashing to be more efficient. Instead of using the Join method directly, I would probably define a little helper:

    open System.Linq
    module Seq = 
      // Key selectors come first so that (seq1, seq2) ||> Seq.join k1 k2 type-checks.
      let join k1 k2 (seq1:seq<_>) seq2 =
        seq1.Join(seq2, (fun t -> k1 t), (fun t -> k2 t), (fun t u -> t, u)) 
    

    Then you can write something like this:

    (seq1, seq2) 
       ||> Seq.join snd fst 
       |> Seq.map (fun (t, u) -> fst t, snd u)
    

Finally, if you know that there is exactly one unique city for every product (the sequences have the same length and pins are unique in both of them), then you could just sort both sequences by pins and then use zip - this may be more efficient than using join (especially if you could keep the sequence sorted from some earlier operations).
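
To make the complexity point above concrete, here is a minimal sketch of a hash-based join written with F#'s own Seq functions. The name hashJoin is hypothetical (it is not part of FSharp.Core), and this only illustrates the general idea of indexing one side by key; it is not a claim about how LINQ's Join is implemented internally.

```fsharp
// Hypothetical helper (not part of FSharp.Core): joins (city, pin) pairs with
// (pin, product) pairs by first indexing the second sequence by pin.
let hashJoin cityPins pinProducts =
    // Build the index once: pin -> list of products registered for that pin.
    let productsByPin =
        pinProducts
        |> Seq.groupBy fst
        |> Seq.map (fun (pin, pairs) -> pin, pairs |> Seq.map snd |> List.ofSeq)
        |> dict
    seq {
        for city, pin in cityPins do
            // One dictionary lookup per item instead of a scan over pinProducts.
            match productsByPin.TryGetValue pin with
            | true, products ->
                for product in products do
                    yield city, product
            | _ -> ()   // pins with no products contribute nothing (inner join)
    }

let seq1 = [ "city1", "pin1"; "city2", "pin2"; "city1", "pin3"; "city1", "pin4" ]
let seq2 = [ "pin1", "product1"; "pin2", "product2"; "pin1", "product3"; "pin2", "product1" ]

let joined = hashJoin seq1 seq2 |> List.ofSeq
```

With the sample data, joined should contain four pairs: city1 with product1 and product3, and city2 with product2 and product1. The pins pin3 and pin4 have no products, so they drop out, as in an inner join.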

Tomas Petricek
Hi, I wanted a join (cross product), and it's implemented using a dictionary in LINQ, which makes it faster.
SharePoint Newbie