views:

166

answers:

2

I'm writing a data access layer. It will have C# 2 and C# 3 clients, so I'm compiling against the 2.0 framework. Although encouraging the use of stored procedures, I'm still trying to provide a fairly complete ability to perform ad-hoc queries. I have this working fairly well, already.

For the convenience of C# 3 clients, I'm trying to provide as much compatibility with LINQ query syntax as I can. Jon Skeet noticed that LINQ query expressions are duck typed, so I don't have to have an IQueryable and IQueryProvider (or IEnumerable<T>) to use them. I just have to provide methods with the correct signatures.

So I got Select, Where, OrderBy, OrderByDescending, ThenBy, and ThenByDescending working. Where I need help are with Join and GroupJoin. I've got them working, but only for one join.

A brief compilable example of what I have is this:

// .NET 2.0 doesn't define the Func<...> delegates, so let's define some workalikes
delegate TResult FakeFunc<T, TResult>(T arg);
delegate TResult FakeFunc<T1, T2, TResult>(T1 arg1, T2 arg2);

abstract class Projection{
    public static Condition operator==(Projection a, Projection b){
        return new EqualsCondition(a, b);
    }
    public static Condition operator!=(Projection a, Projection b){
        throw new NotImplementedException();
    }
}
class ColumnProjection : Projection{
    readonly Table  table;
    readonly string columnName;

    public ColumnProjection(Table table, string columnName){
        this.table      = table;
        this.columnName = columnName;
    }
}
abstract class Condition{}
class EqualsCondition : Condition{
    readonly Projection a;
    readonly Projection b;

    public EqualsCondition(Projection a, Projection b){
        this.a = a;
        this.b = b;
    }
}
class TableView{
    readonly Table        table;
    readonly Projection[] projections;

    public TableView(Table table, Projection[] projections){
        this.table       = table;
        this.projections = projections;
    }
}
class Table{
    public Projection this[string columnName]{
        get{return new ColumnProjection(this, columnName);}
    }

    public TableView Select(params Projection[] projections){
        return new TableView(this, projections);
    }
    public TableView Select(FakeFunc<Table, Projection[]> projections){
        return new TableView(this, projections(this));
    }
    public Table     Join(Table other, Condition condition){
        return new JoinedTable(this, other, condition);
    }
    public TableView Join(Table inner,
                          FakeFunc<Table, Projection> outerKeySelector,
                          FakeFunc<Table, Projection> innerKeySelector,
                          FakeFunc<Table, Table, Projection[]> resultSelector){
        Table join = new JoinedTable(this, inner,
            new EqualsCondition(outerKeySelector(this), innerKeySelector(inner)));
        return join.Select(resultSelector(this, inner));
    }
}
class JoinedTable : Table{
    readonly Table     left;
    readonly Table     right;
    readonly Condition condition;

    public JoinedTable(Table left, Table right, Condition condition){
        this.left      = left;
        this.right     = right;
        this.condition = condition;
    }
}

This allows me to use a fairly decent syntax in C# 2:

Table table1 = new Table();
Table table2 = new Table();

TableView result =
    table1
    .Join(table2, table1["ID"] == table2["ID"])
    .Select(table1["ID"], table2["Description"]);

But an even nicer syntax in C# 3:

TableView result =
    from t1 in table1
    join t2 in table2 on t1["ID"] equals t2["ID"]
    select new[]{t1["ID"], t2["Description"]};

This works well and gives me identical results to the first case. The problem is if I want to join in a third table.

TableView result =
    from t1 in table1
    join t2 in table2 on t1["ID"] equals t2["ID"]
    join t3 in table3 on t1["ID"] equals t3["ID"]
    select new[]{t1["ID"], t2["Description"], t3["Foo"]};

Now I get an error (Cannot implicitly convert type 'AnonymousType#1' to 'Projection[]'), presumably because the second join is trying to join the third table to an anonymous type containing the first two tables. This anonymous type, of course, doesn't have a Join method.

Any hints on how I can do this?

+2  A: 

This is a very interesting design, I like it! As you're saying, the problem is that your definition of the Join method is too specific. The key difference between your definition and the one in LINQ is following:

public static IEnumerable<TResult> Join<TOuter, TInner, TKey, TResult>(
    /* cut */, Func<TOuter, TInner, TResult> resultSelector)

public TableView Join(
     /* cut */, FakeFunc<Table, Table, Projection[]> resultSelector)

When LINQ compiles a query with multiple join clauses, it calls them in a sequence and it generates resultSelector for the first one automatically - and the generated code returns a simple anonymous type containing the elements from both of the source tables. So, if I'm correct, in your case, the generated anonymous type would look like this:

new { t1 : Projection; t2 : Projection }

This is unfortunately incompatible with Projection[] (even though semantically, the difference isn't big). I'm afraid that the only way to solve this would be to use dynamic type casts and Reflection.

  • You need to modify Join so that it has generic type parameter TResult used in the resultSelector.

  • In the Join method you'd run TResult res = resultSelector(...) and then you need to do something with the res value.

  • If res is Projection[] then you can use your existing code (this is the case that will be used in a query that contains a single join clause)

  • In the other case, res is an anonymous type like the one above. This means that you'll need to use reflection to get values of the properties of the type and turn them into an array of Projection values (and then do the same thing as you're doing now).

I didn't try implementing that, but I think it may work...

Tomas Petricek
Thank you, you pointed me down the right path. Taking your suggestion into account, I started mocking things up with IEnumerables and poking around in Reflector. After enough poking, I figured out what I needed to do. Good news: there's no runtime reflection required. It requires an additional class, though. I'll post full details in another answer. You get the accepted answer and a +1, though.
P Daddy
@P Daddy: I'm glad my answer helped - I didn't realize that you can recover from anonymous types by adding an overload that takes an array - that's a nice trick!
Tomas Petricek
A: 

The following is a bit long. If you're only interested in getting this to work, and don't care about the why or how, then skip to the last two code sections.


Tomas Petricek's answer was the right direction to go, but only about half way there. The TResult of resultSelector does indeed need to be generic. The joins are indeed chained, with the intermediate results containing an anonymous type consisting of the left and right parts of each join (these are referred to as outer and inner, respectively).

Let's look at my query from before:

TableView result =
    from t1 in table1
    join t2 in table2 on t1["ID"] equals t2["ID"]
    join t3 in table3 on t1["ID"] equals t3["ID"]
    select new[]{t1["ID"], t2["Description"], t3["Foo"]};

This is translated into something like this:

var intermediate =
    table1.Join(
        table2, t1=>t1["ID"], t2=>t2["ID"],
        (t1, t2)=>new{t1=t1, t2=t2}
    );
TableView result =
    intermediate.Join(
        table3, anon=>anon.t1["ID"], t3=>t3["ID"],
        (anon, t3)=>new[]{anon.t1["ID"], anon.t2["ID"], t3["Foo"]}
    );

Adding an additional join makes the pattern clearer:

TableView result =
    from t1 in table1
    join t2 in table2 on t1["ID"] equals t2["ID"]
    join t3 in table3 on t1["ID"] equals t3["ID"]
    join t4 in table4 on t1["ID"] equals t4["ID"]
    select new[]{t1["ID"], t2["Description"], t3["Foo"], t4["Bar"]};

This translates roughly to:

var intermediate1 =
    table1.Join(
        table2, t1=>t1["ID"], t2=>t2["ID"],
        (t1, t2)=>new{t1=t1, t2=t2}
    );
var intermediate2 =
    intermediate1.Join(
        table3, anon1=>anon1.t1["ID"], t3=>t3["ID"],
        (anon1, t3)=>new{anon1=anon1, t3=t3}
    );                 
TableView result =
    intermediate2.Join(
        table4, anon2=>anon2.anon1.t1["ID"], t4=>t4["ID"],
        (anon2, t3)=>new[]{
            anon2.anon1.t1["ID"], anon2.anon1.t2["ID"],
            anon2.t3["Foo"], t4["Bar"]
        }
    );

So the return value of resultSelector will be two different things. For the final join, the result is the select list, and this is the case I was already handling. For every other join, it will be an anonymous type containing the joined tables, which are given names according to the aliases I've assigned them in the query. LINQ apparently takes care of the indirection in the anonymous type, stepping through as necessary.

It became clear that I needed not one, but two Join methods. The one I had worked just fine for the final join, and I needed to add another for the intermediate joins. Remember that the method I already had returns a TableView:

public TableView Join(Table inner,
                      FakeFunc<Table, Projection> outerKeySelector,
                      FakeFunc<Table, Projection> innerKeySelector,
                      FakeFunc<Table, Table, Projection[]> resultSelector){
    Table join = new JoinedTable(this, inner,
        new EqualsCondition(outerKeySelector(this), innerKeySelector(inner)));
    return join.Select(resultSelector(this, inner));
}

Now I needed to add one that returned something with a Join method, so it could be called in a chain:

public Table Join<T>(Table inner,
                     FakeFunc<Table, Projection> otherKeySelector,
                     FakeFunc<Table, Projection> innerKeySelector,
                     FakeFunc<Table, Table, T> resultSelector){
    Table join = new JoinedTable(this, inner,
        new EqualsCondition(outerKeySelector(this), innerKeySelector(inner)));
    // calling resultSelector(this, inner) would give me the anonymous type,
    // but what would I do with it?
    return join;
}

Adding this method made my joins almost work. I could finally join three or more tables, but I lost my aliases:

TableView result =
    from t1 in table1
    join t2 in table2 on t1["ID"] equals t2["ID"]
    join t3 in table3 on t1["ID"] equals t3["ID"]
                       // ^  error, 't1' isn't a member of 'Table'
    select new[]{t1["ID"], t2["Description"], t3["Foo"]};
               // ^         ^  error, 't1' & 't2' aren't members of 'Table'

I almost stopped here. I could, after all, work around it by foregoing these lost aliases:

TableView result =
    from t1 in table1
    join t2 in table2 on t1["ID"] equals t2["ID"]
    join t3 in table3 on table1["ID"] equals t3["ID"]
    select new[]{table1["ID"], table2["Description"], t3["Foo"]};

This compiled, ran, and produced the expected result. Woo-hoo! Success, sort of. Losing the aliases was less than ideal, though. In a real query, the tables might be more complex:

TableView result =
    from t1 in table1
    join t2 in (
        from t in table2
        where t["Amount"] > 20
        select new[]{t["ID"], t["Description"]
    ).AsSubQuery() on t1["ID"] equals t2["ID"]
    join t3 in table3 on t1["ID"] equals t3["ID"]
    select new[]{table1["ID"], t2["Description"], t3["Foo"]};
                             // ^ error, 't2' isn't a member of 'Table'

Here, I couldn't do without the alias for t2 (well, I could, but it would involve moving the subquery out into another variable declared before the main query, but I'm trying to go for fluency, here).

After seeing the "'t1' isn't a member of 'Table'" message enough times, I finally realized that the secret was in the outerKeySelector parameter to Join. LINQ was simply looking for a property called t1 (or whatever) that was a member of the argument to this lambda. My outerKeySelector parameters were both declared like this:

FakeFunc<Table, Projection> outerKeySelector

The Table class certainly didn't have a property called t1, or any other alias. How could I add this property? If I were using C# 4, I could maybe use dynamic to do this, but if I were using C# 4, then the entire design of this would be different (and I do plan to redo this later in C# 4, for .NET 4.0 clients only, taking full advantage of dynamic typing to provide column projections as actual properties of the tables). In .NET 2.0, though, I have no dynamic types. So how could I make a type that had the table aliases as properties?

What a minute. Hold the phone. The resultSelector already returns one to me! Somehow I need to hold on to this object and pass it to the outerKeySelector in the next join. But how? I can't just store it in my JoinedTable class, because that class won't know how to cast it.

That's when it hit me. I needed a generic intermediate class to hold these intermediate join results. It would store a reference to the JoinedTable instance describing the actual join, as well as a reference to this anonymous type that held the aliases. Eureka!

The final, fully functional version of the code, then, adds this class:

class IntermediateJoin<T>{
    readonly JoinedTable table;
    readonly T           aliases;

    public IntermediateJoin(JoinedTable table, T aliases){
        this.table   = table;
        this.aliases = aliases;
    }

    public TableView Join(Table inner,
                          FakeFunc<T, Projection> outerKeySelector,
                          FakeFunc<Table, Projection> innerKeySelector,
                          FakeFunc<T, Table, Projection[]> resultSelector){
        var join = new JoinedTable(table, inner,
            new EqualsCondition(outerKeySelector(aliases), innerKeySelector(inner)));
        return join.Select(resultSelector(aliases, inner));
    }
    public IntermediateJoin<U> Join<U>(Table inner,
                                       FakeFunc<T, Projection> outerKeySelector,
                                       FakeFunc<Table, Projection> innerKeySelector,
                                       FakeFunc<T, Table, U> resultSelector){
        var join = new JoinedTable(table, inner,
            new EqualsCondition(outerKeySelector(aliases), innerKeySelector(inner)));
        var newAliases = resultSelector(aliases, inner);
        return new IntermediateJoin<U>(join, newAliases);
    }
}

and this method to the Table class:

public IntermediateJoin<T> Join<T>(Table inner,
                      FakeFunc<Table, Projection> outerKeySelector,
                      FakeFunc<Table, Projection> innerKeySelector,
                      FakeFunc<Table, Table, T> resultSelector){
    var join = new JoinedTable(this, inner,
        new EqualsCondition(outerKeySelector(this), innerKeySelector(inner)));
    var x = resultSelector(this, inner);
    return new IntermediateJoin<T>(join, x);
}

This provides fully functional join syntax!*

Thanks again to Tomas Petricek for taking time to read and understand my question, and to give me a thoughtful answer.

* GroupJoin and SelectMany are still to come, but I think I know the secrets well enough to tackle these, now.

P Daddy