views:

731

answers:

1

I have a List<MyObject> with a million elements. (It is actually a SubSonic Collection but it is not loaded from the database).

I'm currently using SqlBulkCopy as follows:

private string FastInsertCollection(string tableName, DataTable tableData)
{
    string sqlConn = ConfigurationManager.ConnectionStrings[SubSonicConfig.DefaultDataProvider.ConnectionStringName].ConnectionString;
    using (SqlBulkCopy s = new SqlBulkCopy(sqlConn, SqlBulkCopyOptions.TableLock))
    {
        s.DestinationTableName = tableName;
        s.BatchSize = 5000;
        s.WriteToServer(tableData);
        s.BulkCopyTimeout = SprocTimeout;
        s.Close();
    }
    return sqlConn;
}

I use SubSonic's MyObjectCollection.ToDataTable() to build the DataTable from my collection. However, this duplicates objects in memory and is inefficient. I'd like to use the SqlBulkCopy.WriteToServer method that uses an IDataReader instead of a DataTable so that I don't duplicate my collection in memory.

What's the easiest way to get an IDataReader from my list? I suppose I could implement a custom data reader (like here http://blogs.microsoft.co.il/blogs/aviwortzel/archive/2008/05/06/implementing-sqlbulkcopy-in-linq-to-sql.aspx) , but there must be something simpler I can do without writing a bunch of generic code.

Edit: It does not appear that one can easily generate an IDataReader from a collection of objects. Accepting current answer even though I was hoping for something built into the framework.

+4  A: 

Get the latest version from the code on this post

Nothing like code churn in plain sight: Here is a pretty complete implementation. You can instantiate an IDataReader over IList IEnumerable, IEnumerable (ergo IQueryable). There is no compelling reason to expose a generic type parameter on the reader and by omitting it, I can allow IEnumerable<'a> (anonymous types). See tests.

The source, less xmldocs, is short enough to include here with a couple tests. The rest of the source, with xmldocs, and tests is here under Salient.Data.


using System;
using System.Linq;
using NUnit.Framework;

namespace Salient.Data.Tests
{
    [TestFixture]
    public class EnumerableDataReaderEFFixture
    {
        [Test]
        public void TestEnumerableDataReaderWithIQueryableOfAnonymousType()
        {
            var ctx = new NorthwindEntities();

            var q =
                ctx.Orders.Where(o => o.Customers.CustomerID == "VINET").Select(
                    o =>
                    new
                        {
                            o.OrderID,
                            o.OrderDate,
                            o.Customers.CustomerID,
                            Total =
                        o.Order_Details.Sum(
                        od => od.Quantity*((float) od.UnitPrice - ((float) od.UnitPrice*od.Discount)))
                        });

            var r = new EnumerableDataReader(q);
            while (r.Read())
            {
                var values = new object[4];
                r.GetValues(values);
                Console.WriteLine("{0} {1} {2} {3}", values);
            }
        }
    }
}

using System;
using System.Collections;
using System.Collections.Generic;
using System.Data;
using NUnit.Framework;

namespace Salient.Data.Tests
{
    public class DataObj
    {
        public string Name { get; set; }
        public int Age { get; set; }
    }

    [TestFixture]
    public class EnumerableDataReaderFixture
    {

        private static IEnumerable<DataObj> DataSource
        {
            get
            {
                return new List<DataObj>
                           {
                               new DataObj {Name = "1", Age = 16},
                               new DataObj {Name = "2", Age = 26},
                               new DataObj {Name = "3", Age = 36},
                               new DataObj {Name = "4", Age = 46}
                           };
            }
        }

        [Test]
        public void TestIEnumerableCtor()
        {
            var r = new EnumerableDataReader(DataSource, typeof(DataObj));
            while (r.Read())
            {
                var values = new object[2];
                int count = r.GetValues(values);
                Assert.AreEqual(2, count);

                values = new object[1];
                count = r.GetValues(values);
                Assert.AreEqual(1, count);

                values = new object[3];
                count = r.GetValues(values);
                Assert.AreEqual(2, count);

                Assert.IsInstanceOf(typeof(string), r.GetValue(0));
                Assert.IsInstanceOf(typeof(int), r.GetValue(1));

                Console.WriteLine("Name: {0}, Age: {1}", r.GetValue(0), r.GetValue(1));
            }
        }

        [Test]
        public void TestIEnumerableOfAnonymousType()
        {
            // create generic list
            Func<Type, IList> toGenericList =
                type => (IList)Activator.CreateInstance(typeof(List<>).MakeGenericType(new[] { type }));

            // create generic list of anonymous type
            IList listOfAnonymousType = toGenericList(new { Name = "1", Age = 16 }.GetType());

            listOfAnonymousType.Add(new { Name = "1", Age = 16 });
            listOfAnonymousType.Add(new { Name = "2", Age = 26 });
            listOfAnonymousType.Add(new { Name = "3", Age = 36 });
            listOfAnonymousType.Add(new { Name = "4", Age = 46 });

            var r = new EnumerableDataReader(listOfAnonymousType);
            while (r.Read())
            {
                var values = new object[2];
                int count = r.GetValues(values);
                Assert.AreEqual(2, count);

                values = new object[1];
                count = r.GetValues(values);
                Assert.AreEqual(1, count);

                values = new object[3];
                count = r.GetValues(values);
                Assert.AreEqual(2, count);

                Assert.IsInstanceOf(typeof(string), r.GetValue(0));
                Assert.IsInstanceOf(typeof(int), r.GetValue(1));

                Console.WriteLine("Name: {0}, Age: {1}", r.GetValue(0), r.GetValue(1));
            }
        }

        [Test]
        public void TestIEnumerableOfTCtor()
        {
            var r = new EnumerableDataReader(DataSource);
            while (r.Read())
            {
                var values = new object[2];
                int count = r.GetValues(values);
                Assert.AreEqual(2, count);

                values = new object[1];
                count = r.GetValues(values);
                Assert.AreEqual(1, count);

                values = new object[3];
                count = r.GetValues(values);
                Assert.AreEqual(2, count);

                Assert.IsInstanceOf(typeof(string), r.GetValue(0));
                Assert.IsInstanceOf(typeof(int), r.GetValue(1));

                Console.WriteLine("Name: {0}, Age: {1}", r.GetValue(0), r.GetValue(1));
            }
        } 
        // remaining tests omitted for brevity
    }
}

/*!
 * Project: Salient.Data
 * File   : EnumerableDataReader.cs
 * http://spikes.codeplex.com
 *
 * Copyright 2010, Sky Sanders
 * Dual licensed under the MIT or GPL Version 2 licenses.
 * See LICENSE.TXT
 * Date: Sat Mar 28 2010 
 */


using System;
using System.Collections;
using System.Collections.Generic;

namespace Salient.Data
{
    /// <summary>
    /// Creates an IDataReader over an instance of IEnumerable&lt;> or IEnumerable.
    /// Anonymous type arguments are acceptable.
    /// </summary>
    public class EnumerableDataReader : ObjectDataReader
    {
        private readonly IEnumerator _enumerator;
        private readonly Type _type;
        private object _current;

        /// <summary>
        /// Create an IDataReader over an instance of IEnumerable&lt;>.
        /// 
        /// Note: anonymous type arguments are acceptable.
        /// 
        /// Use other constructor for IEnumerable.
        /// </summary>
        /// <param name="collection">IEnumerable&lt;>. For IEnumerable use other constructor and specify type.</param>
        public EnumerableDataReader(IEnumerable collection)
        {
            // THANKS DANIEL!
            foreach (Type intface in collection.GetType().GetInterfaces())
            {
                if (intface.IsGenericType && intface.GetGenericTypeDefinition() == typeof (IEnumerable<>))
                {
                    _type = intface.GetGenericArguments()[0]; 
                }
            }

            if (_type ==null && collection.GetType().IsGenericType)
            {
                _type = collection.GetType().GetGenericArguments()[0];

            }


            if (_type == null )
            {
                throw new ArgumentException(
                    "collection must be IEnumerable<>. Use other constructor for IEnumerable and specify type");
            }

            SetFields(_type);

            _enumerator = collection.GetEnumerator();

        }

        /// <summary>
        /// Create an IDataReader over an instance of IEnumerable.
        /// Use other constructor for IEnumerable&lt;>
        /// </summary>
        /// <param name="collection"></param>
        /// <param name="elementType"></param>
        public EnumerableDataReader(IEnumerable collection, Type elementType)
            : base(elementType)
        {
            _type = elementType;
            _enumerator = collection.GetEnumerator();
        }

        /// <summary>
        /// Helper method to create generic lists from anonymous type
        /// </summary>
        /// <param name="type"></param>
        /// <returns></returns>
        public static IList ToGenericList(Type type)
        {
            return (IList) Activator.CreateInstance(typeof (List<>).MakeGenericType(new[] {type}));
        }

        /// <summary>
        /// Return the value of the specified field.
        /// </summary>
        /// <returns>
        /// The <see cref="T:System.Object"/> which will contain the field value upon return.
        /// </returns>
        /// <param name="i">The index of the field to find. 
        /// </param><exception cref="T:System.IndexOutOfRangeException">The index passed was outside the range of 0 through <see cref="P:System.Data.IDataRecord.FieldCount"/>. 
        /// </exception><filterpriority>2</filterpriority>
        public override object GetValue(int i)
        {
            if (i < 0 || i >= Fields.Count)
            {
                throw new IndexOutOfRangeException();
            }

            return Fields[i].Getter(_current);
        }

        /// <summary>
        /// Advances the <see cref="T:System.Data.IDataReader"/> to the next record.
        /// </summary>
        /// <returns>
        /// true if there are more rows; otherwise, false.
        /// </returns>
        /// <filterpriority>2</filterpriority>
        public override bool Read()
        {
            bool returnValue = _enumerator.MoveNext();
            _current = returnValue ? _enumerator.Current : _type.IsValueType ? Activator.CreateInstance(_type) : null;
            return returnValue;
        }
    }
}

// <copyright project="Salient.Data" file="ObjectDataReader.cs" company="Sky Sanders">
// This source is a Public Domain Dedication.
// Please see http://spikes.codeplex.com/ for details.   
// Attribution is appreciated
// </copyright> 
// <version>1.0</version>


using System;
using System.Collections.Generic;
using System.Data;
using Salient.Reflection;

namespace Salient.Data
{
    public abstract class ObjectDataReader : IDataReader
    {
        protected bool Closed;
        protected IList<DynamicProperties.Property> Fields;

        protected ObjectDataReader()
        {
        }

        protected ObjectDataReader(Type elementType)
        {
            SetFields(elementType);
            Closed = false;
        }

        #region IDataReader Members

        public abstract object GetValue(int i);
        public abstract bool Read();

        #endregion

        #region Implementation of IDataRecord

        public int FieldCount
        {
            get { return Fields.Count; }
        }

        public virtual int GetOrdinal(string name)
        {
            for (int i = 0; i < Fields.Count; i++)
            {
                if (Fields[i].Info.Name == name)
                {
                    return i;
                }
            }

            throw new IndexOutOfRangeException("name");
        }

        object IDataRecord.this[int i]
        {
            get { return GetValue(i); }
        }

        public virtual bool GetBoolean(int i)
        {
            return (Boolean) GetValue(i);
        }

       public virtual byte GetByte(int i)
        {
            return (Byte) GetValue(i);
        }

        public virtual char GetChar(int i)
        {
            return (Char) GetValue(i);
        }

        public virtual DateTime GetDateTime(int i)
        {
            return (DateTime) GetValue(i);
        }

        public virtual decimal GetDecimal(int i)
        {
            return (Decimal) GetValue(i);
        }

        public virtual double GetDouble(int i)
        {
            return (Double) GetValue(i);
        }

        public virtual Type GetFieldType(int i)
        {
            return Fields[i].Info.PropertyType;
        }

        public virtual float GetFloat(int i)
        {
            return (float) GetValue(i);
        }

        public virtual Guid GetGuid(int i)
        {
            return (Guid) GetValue(i);
        }

        public virtual short GetInt16(int i)
        {
            return (Int16) GetValue(i);
        }

        public virtual int GetInt32(int i)
        {
            return (Int32) GetValue(i);
        }

        public virtual long GetInt64(int i)
        {
            return (Int64) GetValue(i);
        }

        public virtual string GetString(int i)
        {
            return (string) GetValue(i);
        }

        public virtual bool IsDBNull(int i)
        {
            return GetValue(i) == null;
        }

        object IDataRecord.this[string name]
        {
            get { return GetValue(GetOrdinal(name)); }
        }


        public virtual string GetDataTypeName(int i)
        {
            return GetFieldType(i).Name;
        }


        public virtual string GetName(int i)
        {
            if (i < 0 || i >= Fields.Count)
            {
                throw new IndexOutOfRangeException("name");
            }
            return Fields[i].Info.Name;
        }

        public virtual int GetValues(object[] values)
        {
            int i = 0;
            for (; i < Fields.Count; i++)
            {
                if (values.Length <= i)
                {
                    return i;
                }
                values[i] = GetValue(i);
            }
            return i;
        }

        public virtual IDataReader GetData(int i)
        {
            // need to think about this one
            throw new NotImplementedException();
        }

        public virtual long GetBytes(int i, long fieldOffset, byte[] buffer, int bufferoffset, int length)
        {
            // need to keep track of the bytes got for each record - more work than i want to do right now
            // http://msdn.microsoft.com/en-us/library/system.data.idatarecord.getbytes.aspx
            throw new NotImplementedException();
        }

        public virtual long GetChars(int i, long fieldoffset, char[] buffer, int bufferoffset, int length)
        {
            // need to keep track of the bytes got for each record - more work than i want to do right now
            // http://msdn.microsoft.com/en-us/library/system.data.idatarecord.getchars.aspx
            throw new NotImplementedException();
        }

        #endregion

        #region Implementation of IDataReader

        public virtual void Close()
        {
            Closed = true;
        }


        public virtual DataTable GetSchemaTable()
        {
            var dt = new DataTable();
            foreach (DynamicProperties.Property field in Fields)
            {
                dt.Columns.Add(new DataColumn(field.Info.Name, field.Info.PropertyType));
            }
            return dt;
        }

        public virtual bool NextResult()
        {
            throw new NotImplementedException();
        }


        public virtual int Depth
        {
            get { throw new NotImplementedException(); }
        }

        public virtual bool IsClosed
        {
            get { return Closed; }
        }

        public virtual int RecordsAffected
        {
            get
            {
                // assuming select only?
                return -1;
            }
        }

        #endregion

        #region Implementation of IDisposable

        public virtual void Dispose()
        {
            Fields = null;
        }

        #endregion

        protected void SetFields(Type elementType)
        {
            Fields = DynamicProperties.CreatePropertyMethods(elementType);
        }
    }
}

// <copyright project="Salient.Reflection" file="DynamicProperties.cs" company="Sky Sanders">
// This source is a Public Domain Dedication.
// Please see http://spikes.codeplex.com/ for details.   
// Attribution is appreciated
// </copyright> 
// <version>1.0</version>
using System;
using System.Collections.Generic;
using System.Reflection;
using System.Reflection.Emit;

namespace Salient.Reflection
{
    /// <summary>
    /// Gets IL setters and getters for a property.
    /// 
    /// started with http://jachman.wordpress.com/2006/08/22/2000-faster-using-dynamic-method-calls/
    /// </summary>
    public static class DynamicProperties
    {
        #region Delegates

        public delegate object GenericGetter(object target);

        public delegate void GenericSetter(object target, object value);

        #endregion

        public static IList<Property> CreatePropertyMethods(Type T)
        {
            var returnValue = new List<Property>();

            foreach (PropertyInfo prop in T.GetProperties())
            {
                returnValue.Add(new Property(prop));
            }
            return returnValue;
        }


        public static IList<Property> CreatePropertyMethods<T>()
        {
            var returnValue = new List<Property>();

            foreach (PropertyInfo prop in typeof (T).GetProperties())
            {
                returnValue.Add(new Property(prop));
            }
            return returnValue;
        }


        /// <summary>
        /// Creates a dynamic setter for the property
        /// </summary>
        /// <param name="propertyInfo"></param>
        /// <returns></returns>
        public static GenericSetter CreateSetMethod(PropertyInfo propertyInfo)
        {
            /*
            * If there's no setter return null
            */
            MethodInfo setMethod = propertyInfo.GetSetMethod();
            if (setMethod == null)
                return null;

            /*
            * Create the dynamic method
            */
            var arguments = new Type[2];
            arguments[0] = arguments[1] = typeof (object);

            var setter = new DynamicMethod(
                String.Concat("_Set", propertyInfo.Name, "_"),
                typeof (void), arguments, propertyInfo.DeclaringType);
            ILGenerator generator = setter.GetILGenerator();
            generator.Emit(OpCodes.Ldarg_0);
            generator.Emit(OpCodes.Castclass, propertyInfo.DeclaringType);
            generator.Emit(OpCodes.Ldarg_1);

            if (propertyInfo.PropertyType.IsClass)
                generator.Emit(OpCodes.Castclass, propertyInfo.PropertyType);
            else
                generator.Emit(OpCodes.Unbox_Any, propertyInfo.PropertyType);

            generator.EmitCall(OpCodes.Callvirt, setMethod, null);
            generator.Emit(OpCodes.Ret);

            /*
            * Create the delegate and return it
            */
            return (GenericSetter) setter.CreateDelegate(typeof (GenericSetter));
        }


        /// <summary>
        /// Creates a dynamic getter for the property
        /// </summary>
        /// <param name="propertyInfo"></param>
        /// <returns></returns>
        public static GenericGetter CreateGetMethod(PropertyInfo propertyInfo)
        {
            /*
            * If there's no getter return null
            */
            MethodInfo getMethod = propertyInfo.GetGetMethod();
            if (getMethod == null)
                return null;

            /*
            * Create the dynamic method
            */
            var arguments = new Type[1];
            arguments[0] = typeof (object);

            var getter = new DynamicMethod(
                String.Concat("_Get", propertyInfo.Name, "_"),
                typeof (object), arguments, propertyInfo.DeclaringType);
            ILGenerator generator = getter.GetILGenerator();
            generator.DeclareLocal(typeof (object));
            generator.Emit(OpCodes.Ldarg_0);
            generator.Emit(OpCodes.Castclass, propertyInfo.DeclaringType);
            generator.EmitCall(OpCodes.Callvirt, getMethod, null);

            if (!propertyInfo.PropertyType.IsClass)
                generator.Emit(OpCodes.Box, propertyInfo.PropertyType);

            generator.Emit(OpCodes.Ret);

            /*
            * Create the delegate and return it
            */
            return (GenericGetter) getter.CreateDelegate(typeof (GenericGetter));
        }

        #region Nested type: Property

        public class Property
        {
            public GenericGetter Getter;
            public PropertyInfo Info;
            public GenericSetter Setter;

            public Property(PropertyInfo info)
            {
                Info = info;
                Setter = CreateSetMethod(info);
                Getter = CreateGetMethod(info);
            }
        }

        #endregion

        ///// <summary>
        ///// An expression based Getter getter found in comments. untested.
        ///// Q: i don't see a reciprocal setter expression?
        ///// </summary>
        ///// <typeparam name="T"></typeparam>
        ///// <param name="propName"></param>
        ///// <returns></returns>
        //public static Func<T> CreateGetPropValue<T>(string propName)
        //{
        //    var param = Expression.Parameter(typeof(object), "container");
        //    var func = Expression.Lambda(
        //    Expression.Convert(Expression.PropertyOrField(Expression.Convert(param, typeof(T)), propName), typeof(object)), param);
        //    return (Func<T>)func.Compile();
        //}
    }
}

Sky Sanders
Looks like this code does exactly what I need it to do, but I am surprised that there isn't a solution that does not require creating your own datareader.
Jason Kealey
well let me know if it works for ya. BTW the subsonic collections; they do implement IList<T>?
Sky Sanders
Yes; you can do .GetList() on a subsonic collection to get to the inner IList<T>
Jason Kealey
Verdict: I initially got some "Unhandled Exception: System.InvalidOperationException: The given ColumnMapping does not match up with any column in the source or destination." using my SubSonic objects directly. However, the import worked fine using regular objects that didn't have SubSonic's extra properties.
Jason Kealey
Jason, go get the full implementation linked at the head of the answer and see if that works. I implemented all but the most irrelevant methods of IDataReader. It might work. Also, If you could break out a test including the subsonic layer and some DDL I am interested to see if I can make this work. It is a good idea...
Sky Sanders
Jason Kealey
Glad I could help. I know you could have done it easily but it looked like a good thing to have around as I have run into the same issue with bulk insertion from memory before. just think of me as the shoemaker's elves.
Sky Sanders
A word of caution with this code: it only works with the SqlBulkCopy if the database fields are precisely in the right order, of the right type, none are missing, none are extra. A nice enhancement would allow the IDataReader to work on the collection given a target schema (and now I understand why it isn't built into the framework).
Jason Kealey
Also, it might be interesting to have it work on IEnumerable<T> instead of / in addition to IList<T> because one could pass it a linq statement that would be consumed as the reader goes through instead of loading a full List in memory.
Jason Kealey
Jason, the bulk copy behavior you are talking about is not related to the reader, ANY IDataRecord or Reader you push into a bulk copy must match exactly. Same as if you were supplying a file input. A first clue is that the only 'get' that bulk copy uses is GetValue(i). It does not query the schema at all, ever.
Sky Sanders
I updated my codebase and the new IEnumerable version works great. Excellent work! You should publish this in a blog post somewhere!
Jason Kealey
Thanks, Jason. Glad it helps.
Sky Sanders
The constructor which takes an IEnumerable<T> should be modified to get the generic argument of the interface, not the concrete type. This will let it work with enumerables returned from methods with `yield return` in them: `foreach (var intface in collection.GetType().GetInterfaces()) { if (intface.IsGenericType /* ... */ } }`
Daniel Plaisted
@Daniel, I had implemented that way, but after a few test cases I realized that there is absolutely no compelling reason to use a generic constraint and many reasons not to. It works just fine with sequences of any sort. I am feeding SqlBulkCopy from a yielded Xml stream at blazing speeds with a tiny footprint. Here is an example. http://meta.stackoverflow.com/questions/44330/do-you-want-to-quickly-and-easily-import-the-so-data-dump-into-your-sql-serverIt also accepts anonymous types. You can get the latest source from the project linked. cheers.BUt thanks for the feedback.
Sky Sanders
@Daniel, I think I misunderstood you. Let me ponder that one for a couple hours.
Sky Sanders
@Daniel + for you. Somehow I got the wrong impression from your comment. Indeed, augmenting the ctor by checking interfaces improves the usability of the class. Thank you so much.
Sky Sanders