views:

336

answers:

2

Scenario

I need to build a table of data from a list of objects given a specification. In example, I have a list/array of Order objects, and a string specification detailing what the data table should contain e.g "ID;Date;Customer.ID;Customer.Name;Orderlines.Product.ID;Orderlines.Quantity;Orderlines.UnitPrice".

The order class class contains a list (detail) of Orderlines and the Orderline class a reference to a Product and so on. A very object oriented design by all means.

I need to create a generic procedure that takes a list of objects and a string specification and then finds all the joins. E.g AddToDataTableWithJoins(DataTable table, object[] objects, string specification).

If there exists two orders in the array, each with three orderlines the result would be a datatable with 6 rows.

e.g

{1,'2009-12-12 00:00',14,'John','DRY14',12.00,19.99}
{1,'2009-12-12 00:00',14,'John','DRY15',9.00,12.00}
{1,'2009-12-12 00:00',14,'John','DRY16',3,3.00}
{2,'2009-12-13 00:00',17,'Mike','ALR',10.00,16.00}
{2,'2009-12-13 00:00',17,'Mike','BBR',1.00,11.50}
{2,'2009-12-13 00:00',17,'Mike','PPQ',4,6.00}

But then again, the Order class may have more than one list (detail) and I must admit, that even though I'm familiar with reflection and simple recursion I'm at a loss on this on.

Any advice on how to create this algorithm is greatly appreciated.

Ideas

A restriction must be implemented so that no more than one list exists in each level of the specification, and no list exists in a different branch. e.g If the Customer class has defined a list of Order objects the following specification cannot be allowed: "ID;Date;Customer.ID;Customer.Orders.ID;Orderlines.Product.ID".

Then I believe, the following approach must be used:

  1. Determine the branch that contains one or more one-to-many relationships.
  2. Traverse every root object in the collection (the Order objects).
  3. For every property in the root object, store the values of every property not involved in the one-to-many relationships in an array.
  4. Use recursion and traverse every object in the child collection copying the array.
  5. When reaching the outermost 'node' add a new row in the DataTable.

These points may be revised as they are only thoughts at this point, but I think I'm close to something.

Thanks, Stefan

+1  A: 

This sounds to me more like a flattening projection than either a union or a join. If this is the case, you should be able to do something like this:

var q = from o in orders
        from ol in o.OrderLines
        select new { o.Id, o.Date, o.Customer.Name, ol.Product.Id, ol.Quantity }

(I left out some properties in the projection, but you should get the general idea)

This will give you an IEnumerable of an anonymous type, and you can now loop through it to print out the data (or whatever you want to do):

foreach(var item in q)
{
    Console.Write(item.Id);
    Console.Write(item.Date);
    // etc.
}
Mark Seemann
It might indeed be a flattening projection if that's what your Linq query is. And that query looks much like what I'm trying to do, except it ain't generic. I haven't much experience with Linq I'm afraid.
Stefan
A: 

A rough outline, this is pseudo code:

void AddToDataTableWithJoins(DataTable table, object[] objects,
  string specification)
{
  // 1. Split specification into parts on semicolon separator
  string[] specificationParts = ...

  // 2. Split parts into name lists (split on dot)
  string[][] specificationPartsNameLists = ...

  // 3. Set up columns (use first object's field types as example)
  for (int c=0; c<specificationParts.length; c++) {
    string mungedSpecPart = // might replace "." with something, does "_" work?
    table.Columns.Add(mungedSpecPart,
      getTypeForPath(specificationPartsNameLists[c],
      objects[0]));
  }

  // 4. Set up row values container
  object[] rowItems = new object[specificationParts.length];

  for (int d=0; d < objects.length; d++) {
    object obj = objects[d];
    for (int c=0; c < specificationParts.length; c++) {
      // 5. Add row values
      rowItems[c] = getValueForPath(specificationPartsNameLists[c], obj);
    }
    // 6. Invoke row add
    SomeInvokerFramework.invoke(table.Rows, "Add", rowItems);
  }
  // 7. Return
}

object getTypeForPath(string[] path, object inObject) {
  // do reflection-ey stuff to retrieve named data path and return type
}

object getValueForPath(string[] path, object object) {
  // do reflection-ey stuff to retrieve named data path and return value
}

You might also want to add error checking / handling for if types of later object's fields mismatch or fields are not present (!) or objects are null. And you might want to add type check assertions as you proceed through the rows.

The code could search through all objects til it finds a non-NULL field for a column, to infer column type from (if you want to start supporting NULLs). Bear in mind that the type cannot be set up for a field if it is NULL in all rows as the routine then has nothing to infer type from. If you need to suport NULLs you may need to supply an array of types, or default an all-NULL column to type string or something.

Edit: Reformatted source code. Changed typeof call to call to getTypeForPath().

Edit: You added the requirement to do a SQL-join-like operation, basically where a data path includes a one-to-many join to repeat the row for each of child objects in the array for the one-to-many relationship. Presumably if there are several you want to sort by left-most one-to-many relationship first, then the second left-most etc.

Something like this, I suggest. As I said before this is just pseudo-code, and I'm really trying to illustrate the shape of the function and an approach, as its quite a hard problem, not write it for you. The following code probably contains errors and probably has a few mistakes in it:

void AddToDataTableWithJoins(DataTable table, object[] objects,
  string specification)
{
  // 1. Split specification into parts on semicolon separator
  string[] specificationParts = ...

  // 2. Split parts into name lists (split on dot)
  string[][] specificationPartsNameLists = ...

  // 2a. Set up data for whether field is simple or to be iterated
  boolean[][] specPartIsToBeIterated = ...

  // 3. Set up columns (use first object's field types as example)
  for (int c=0; c<specificationParts.length; c++) {
    string mungedSpecPart = // might replace "." with something, does "_" work?
    table.Columns.Add(mungedSpecPart,
      getTypeForPath(specificationPartsNameLists[c],
      objects));
    // 3a. set up should iterate flags
    for (int d=1; d < specificationPartsNameLists[c].length; d++) {
      string[] temp = new string[e];
      for (int e=0; e < d; e++) temp[e] = specificationPartsNameLists[c][e];
      specPartIsToBeIterated[c][d] = isDataPathOneToMany(temp, objects);
    }
  }

  // 4. Set up row values container
  object[] rowItems = new object[specificationParts.length];

  // 4a. Set up index positions container for one-to-many subelement iterations
  int[] rowIndices = new int[specificationParts.length];

  for (int d=0; d < objects.length; d++) {
    // 4b. Set up one-to-many position counters
    for (int e=0; e < rowIndices.length; e++) rowIndices[e] = 0;

    // 4c. Start subscript iterator loop
    for (;;) {

      object obj = objects[d];
      for (int c=0; c < specificationParts.length; c++) {
        // 5. Add row values
        rowItems[c] = getValueForPath(specificationPartsNameLists[c],
          rowIndices, obj);
      }
      // 6. Invoke row add
      SomeInvokerFramework.invoke(table.Rows, "Add", rowItems);

      // 6a. Work out whether we need to iterate more rows
      for (int e=rowIndices.length-1; e>=0; e--) {
        boolean domore=false;
        if (specPartIsToBeIterated[e]) {
          string[] pathToGetIndex = // calc string[] to get count of objects
          int count = getCountForPath(pathToGetIndex, rowIndices, obj);
          if (rowIndices[e]<(count-1)) {
            rowIndices[e]++; domore=true; break;
            for (e++; e<rowIndices.length; e++) {
              if (specPartIsToBeIterated[e]) rowIndices[e]=0;
            }
          }
        }
      }
      // 6b. Break to next object if we're done on this one
      if (!domore) break;
    }
  }
  // 7. Return
}

object getTypeForPath(string[] path, object[] inObjects) {
  // do reflection-ey stuff to retrieve named data path and return type
}

boolean isDataPathOneToMany(string[] path, object[] inObjects) {
  // do reflection-ey stuff to retrieve named data path and return type
}

object getValueForPath(string[] path, int[] rowIndices, object object) {
  // do reflection-ey stuff to retrieve named data path and return value
  // where there are one-to-many relationships corresponding item in rowIndices
  // array identifies which subelement in the array
  // etc
}

object getCountForPath(string[] path, int[] rowIndices, object object) {
  // do reflection-ey stuff to retrieve named data path and return count
  // where there are one-to-many relationships corresponding item in rowIndices
  // array identifies which subelement in the array.  for convenience function
  // accepts an over-long rowIndices array
}

Edit: Added "and probably has a few mistakes in it" :-)

martinr
As far as I can tell, this code only creates a single row for each 'master'. The easy part of this project is traversing every master and detail objects, the hard part (in my opinion) is creating the master/detail/detail.... rows (if that makes sense).
Stefan
It doesn't. I guessed what you wanted, I think I was at least close though union threw me off for a bit as its a term most often used in SQL, and now I'm more confused. Try adding a picture of the UI you want.
martinr
Well, there isn't a UI for the result as such, but rather an UI to create the specification. I'll edit my post and add some pseudo code myself of how I think the algorithm must work.
Stefan