views:

383

answers:

1

Hi I am trying to learn Linq, so I am not sure if this can be done.

I am working on an import project So I decided to import data using DataSets.

My challenge at this point: Having 2 DataTables with different schema, one of which contains my destination schema, and the other one my source schema.

What I need to do is perform some column matching where I can identify my source columns that are "somewhat similar" to my destination columns. I am looking for something at this point where if any part of destination column name is contained in the source, it's a possible match I don't know of any way to determine likeliness.

For example source [firstname, lastname, address] - > destination [fname, lname, addr1]

So, is LINQ a potential candidate for this job? Or regular expressions? I started with this

Having a source DataTable dt

var Lcols = from c in dt.Columns.Cast<System.Data.DataColumn>()
select c.ColumnName;

I am not sure where to go from here...

Thank you!

+1  A: 

LINQ could be a candidate here, but I think what you need to define (if you haven't done so already) is how columns can be similar. What is the cutoff point? Once you have the conditions under which you can map one column to another, you can apply those conditions to the cartesian product of the two sets (the columns from the source is the first set, while the second is the columns from the destination) to get a list of potential matches.


Assuming you had the source and destination columns, like so:

IEnumerable<DataColumn> sourceColumns = 
  sourceDt.Columns.Cast<System.Data.DataColumn>();
IEnumerable<DataColumn> destColumns =
  destDt.Columns.Cast<System.Data.DataColumn>();

And an IDictionary<string, string> which mapped names of the columns in the source to names of columns in the destination, you could do this:

var map =
  from s in sourceColumns
  from d in destColumns
where
  // colMap is the IDictionary<string, string>
  // The map has to contain the key.
  colMap.ContainsKey(s.ColumnName) &&

  // The destination column has to match the column map.
  // The OrdinalIgnoreCase option means to ignore case.
  string.Compare(colMap[s.ColumnName], 
      d.ColumnName, CompareOptions.OrdinalIgnoreCase) == 0
select
  new { SourceColumn = s, DestinationColumn = d };

And that will give you the possible mappings based on the items in the dictionary.

casperOne
That's an interesting idea - To be able to select every possible combination from each column name on both tables and then do a join. I am not sure how to do this though. I am thinking a series of loops to build my set, but there must be a more efficient way.
kiev
suppose that I had a hashtable or table that held most common source column names and the destination, how would I build my query then?
kiev
@kiev: Appended answer to reflect comments.
casperOne