views:

163

answers:

3

Hello all!

I have a method which in essence converts a datatable to a list of objects which I call 'Bags'. This code is called many times per session, with many sessions concurrently running and with sometimes thousands of rows. Because of this I need it to be as quick as possible. I have an xml file which contains the DataColumn to Property mappings. The main method to optimize is ConvertRowToBag - the type parameter passed in is a type which derives from BagBase.

It's a long bit of code, but any tips would be much appreciated.

public class BagBase
{
    /// <summary>
    /// Dictionary of properties and names
    /// </summary>
    private static Dictionary<string, PropertyInfo> propertyDictionary = new Dictionary<string, PropertyInfo>();

    /// <summary>
    /// Table of column/property mappings
    /// </summary>
    private static DataTable mappings = new DataTable("Mappings");

    /// <summary>
    /// Returns true if the map exists
    /// </summary>
    /// <param name="columnName"></param>
    /// <param name="type"></param>
    /// <returns></returns>
    private static bool MappingExists(string columnName, Type type)
    {
        DataRow [] rows = BagBase.mappings.Select(String.Format("Type = '{0}' and ColumnName = '{1}'", type.Name, columnName));
        return (rows != null && rows.Length > 0);
    }

    /// <summary>
    /// Converts the table to bags.
    /// </summary>
    /// <param name="table">The table.</param>
    /// <param name="outputType">Type of the output.</param>
    /// <returns></returns>
    protected static List<BagBase> ConvertTableToBags(DataTable table, Type outputType)
    {
        Trace.TraceInformation(String.Format("ConvertTableToBags : table={0} Type={1}", table.TableName, outputType.Name));

        // Create an empty list
        List<BagBase> result = new List<BagBase>();

        // Iterate through the rows
        foreach (DataRow row in table.Rows)
        {
            // Add to the list
            result.Add(ConvertRowToBag(outputType, row));
        }

        Trace.TraceInformation("ConvertTableToBags Finished.");

        return result;
    }

    /// <summary>
    /// Converts the row to bag.
    /// </summary>
    /// <param name="outputType">Type of the output.</param>
    /// <param name="row">The row.</param>
    /// <returns></returns>
    protected static BagBase ConvertRowToBag(Type outputType, DataRow row)
    {
        // Create an instance of the child class and store in the base
        BagBase bag = Activator.CreateInstance(outputType) as BagBase;

        // Iterate through the columns
        foreach (DataColumn column in row.Table.Columns)
        {
            // If this column has been mapped
            if (BagBase.MappingExists(column.ColumnName, outputType))
            {
                PropertyInfo property;

                string columnProperty = String.Format("{0}={1}", column.ColumnName, outputType.Name);

                // Get the property as defined in the map
                if (!propertyDictionary.ContainsKey(columnProperty))
                {
                    // Get the property
                    property = outputType.GetProperty(BagBase.GetColumnMapping(column.ColumnName, outputType));

                    // Add the property to the dictionary
                    propertyDictionary.Add(columnProperty, property);
                }
                else
                {
                    property = propertyDictionary[columnProperty];
                }

                if (property != null)
                {
                    if (!row.IsNull(column))
                    {
                        // Set the value to the in the table
                        if (property.PropertyType.BaseType != null && property.PropertyType.BaseType == typeof(Enum))
                        {
                            if (column.DataType != typeof(String))
                            {
                                property.SetValue(bag, Enum.ToObject(property.PropertyType, row[column]), null);
                            }
                            else
                            {
                                property.SetValue(bag, Enum.ToObject(property.PropertyType, Convert.ToChar(row[column])), null);
                            }
                        }
                        else if (property.PropertyType == typeof(DateTime?))
                        {
                            property.SetValue(bag, (DateTime?)row[column], null);
                        }
                        else
                        {
                            property.SetValue(bag, Convert.ChangeType(row[column], property.PropertyType), null);
                        }
                    }
                    else // No nulls
                    {
                        if (column.DataType == typeof(String))
                        {
                            property.SetValue(bag, String.Empty, null);
                        }
                    }

                    // Generate the unique class.property name
                    string propertyKey = String.Format("{0}.{1}", outputType.Name, property.Name);
                    if (!columnCaptions.ContainsKey(propertyKey))
                    {
                        // Add to the caption map
                        columnCaptions.Add(propertyKey, column.Caption);
                    }
                }
            }
            else
            {
                // If this column isn't mapped, add it to Other information
                if (bag.OtherInformation == null)
                {
                    bag.OtherInformation = new Dictionary<string, string>();
                }

                bag.OtherInformation.Add(column.ColumnName, !row.IsNull(column) ? row[column].ToString() : String.Empty);
            }
        }

        return bag;
    }
}
+5  A: 

Use a profiler. There's no way for us to know what actually takes the most time in your code.

There's just really no use trying to optimize line-by-line and many people seem to not know this. Computers are always waiting for a resource, sometimes it's CPU or disk IO, and often it's the user. To make any piece of code faster, find the bottlenecks using a profiler and work on making that code faster.

Ben S
You can earn so much reputation on this site just by saying "use a profiler" for this type of question, because it's *always* the right answer. Have another 10 points.
Greg Beech
+1 "use a profiler" is a correct, but not directly helpful answer.
Chris Ballance
+1, I think it's helpful. A lot of people obviously don't realize it.
erikkallen
Having spent the better part of the last two weeks looking for bottlenecks in our core product at work, I can say with certainty that "Use a profiler" is absolutely entirely the best answer to this type of question. I've refactored and optimized about 20 odd pieces of code and in only 1 of those I was spot on with my own intuition before I ran the profiler, the rest was just "Whoah, it spends time *there*?"
Lasse V. Karlsen
@Lasse V. Karlsen - Do you mind if I ask which profiler you used?
Philip Wallace
@PhilipW - My currently preferred profiler is Red Gate Ants. It's really easy to use and gives you line-level performance information.
Greg Beech
A: 

Remove the foreach loop and replace it with another type.

Woot4Moo
+2  A: 

Aside from the general advise of "use a profiler", there is probably not one bottleneck but either a series of slow calls or the very structure of the procedure is creating unnecessary iterations. At a glance:

  • The Select against a datatable is generally not very performant.
  • Reflection carries with it a lot of overhead, it looks like you are dependent on it but if you could limit its scope you will probably get better overall performance.
flatline
It sounds like you actually read the code! Thanks for the suggestions... :)
Philip Wallace