tags:

views:

64

answers:

2

I want to perform a JOIN on two Datatables in a Dataset. For example, I am working on the AdventureWorks database. I need data from the tables [Person].[Address] and [Person].[StateProvince] and once the data is in the corresponding Datatables, I have to perform a JOIN on StateProvinceID column.

Data Structure -

Address {Address,A_StateID}
State {S_StateID,Name}

My search on the internet led me to a lot of examples which said I will have to use a Datarelation object in order to perform a JOIN. The examples I found were like -

DataColumn childcolumn = dsAdd.Tables["Address"].Columns["A_StateID"];
DataColumn parentcolumn = dsAdd.Tables["State"].Columns["S_StateID"];
DataRelation relStateID = new DataRelation("StateRel", parentcolumn, childcolumn,false);
dsAdd.Relations.Add(relStateID);

But, I have no idea what I have to do after I have added a DataRelation to perform the JOIN. So, I decided to do it myself -

    //I added a new column to my Address table obtain the State Name after performing the `JOIN`
    DataColumn A_State = new DataColumn("State");
    A_State.DataType = typeof(string);
    dsAdd.Tables["Address"].Columns.Add(A_State);

    foreach (DataRow dr in dsAdd.Tables["Address"].Rows)
    {           
        //for each row in the Address table I obtain the StateID
        string stateid = dr.ItemArray[1].ToString();
        string expression = "S_StateID =" + stateid;

        //Obtain the corresponding row from State table and update the value in new column in Address table 
        DataRow[] newdr;
        newdr = dsAdd.Tables["State"].Select(expression);
        string statename = newdr[0].ItemArray[1].ToString();
        dr.SetField("State", statename.ToString());
    }

    grdJoins.DataSource = dsAdd.Tables["Address"];
    grdJoins.DataBind();

This works fine but is there a better way to do this? How is a JOIN performed using a Datarelation object?

A: 

So if I understand what you're asking correctly, once you've added a DataRelation you can make a call to GetChildRows on a given row to get the joined data. The documentation has an example of what the code should look like.

R0MANARMY
+3  A: 

The best way to do this is to do it on the database. Joins are relatively expensive and slow operations; databases exist as entities separate from mere collections of tables for two reasons: enforcing safe concurrent access and to allow you do joins efficiently. They use all kinds of tricks like indexing, materialized views, caching, etc, to make sure most of the heavy lifting is already done for you.

The next best way is to use linq (to objects):

grdJoins.DataSource = dsAdd.Tables["Address"].AsEnumerable()
                .Join(dsAdd.Tables["State"].AsEnumerable(),
        add => add["A_StateID"], st => st["S_StateID"],
        (add, st) => new 
      { 
          Address = add["Address"],
          City = add["City"],
          StateProvince = add["StateProvince"],
          State = st["State"]
      });

grdJoins.DataBind();
Joel Coehoorn
Did you mean [Linq to DataSet](http://msdn.microsoft.com/en-us/library/bb386977.aspx) instead of Linq to Objects (since you're starting out with a DataSet instead of a regular collection)?
R0MANARMY
@R0MANARMY only in the sense that linq to dataset is really just a part of linq to objects. Note the call to AsEnumerable() - it's using the same pieces as any other linq to objects enumerable.
Joel Coehoorn
@R0MANARMY - linq to datasets is just a straight enumeration over a dataset. It's just a marketing name for what's already there. The linq to datasets samples all have to call AsEnumerale() as well.
Joel Coehoorn
Fair enough, thanks for clarifying. Thankfully I haven't had to use them very heavily (at all really).
R0MANARMY