tags:

views:

86

answers:

3

So, I'm importing data from one database into another. There are about 5,000 records (so nothing ridiculous, but not small enough to eyeball). Is there an easy way to automatically truncate data that is too long - specifically, varchar fields? I don't want truncation to be silent since the too-long fields will likely require attention, but it would be really nice if a name that is 2 characters too long wouldn't fail on insert, and throw a totally non-specific exception.

The solution I'd like to implement is one that will truncate the data, insert it, and log this. Has anyone else done something similar?

+2  A: 

Linq2Sql will generate a property like this:

    [Column(Storage="_Name", DbType="NVarChar(50) NOT NULL")]
    public string Name
    {
        get
        {
            return this._Name;
        }
        set
        {
            if ((this._Name != value))
            {
                this.OnNameChanging(value);
                this.SendPropertyChanging();
                this._Name = value;
                this.SendPropertyChanged("Name");
                this.OnNameChanged();
            }
        }
    }

See how it calls a function called OnNameChanged? Just create a function with that name to do the truncation and logging:

void OnNameChanged()
{
    if (Name.Length > 50)
    {
        Name = Name.Substring(0, 50);
        LogSomehow("Name truncated");
    }
}
Gabe
A: 

I would make an extension iterator method that you could throw inline with your query, doing something like this:

public static IEnumerable<string> Truncater(this IEnumerable<string> s, int len)
{
    foreach( var str in s )
    {
        if( str.Length > len )
        {
            string outStr = str.Substring(0, len);
            Console.WriteLine(String.Format("Truncated {0} to {1}", str, outStr));
            yield return outStr;
        }
        else
            yield return str;
    }
}
Rob H
A: 

Here's a more automatic method. It looks at the type/length of each column being inserted, on the fly.

Use it like this:

            foreach (object insert in context.GetChangeSet().Inserts)
            {
                FindLongStrings(update);
            }

            context.SubmitChanges();

And here's the method: (It's horribly inefficient so I wouldn't leave it in production code, but if you've got a one-time conversion/import (and it sound like you do), it might do the trick.)

    public static void FindLongStrings(object testObject)
    {
        foreach (PropertyInfo propInfo in testObject.GetType().GetProperties())
        {
            foreach (ColumnAttribute attribute in propInfo.GetCustomAttributes(typeof(ColumnAttribute), true))
            {
                if (attribute.DbType.ToLower().Contains("varchar"))
                {
                    // kinda crude, but it works...
                    string dbType = attribute.DbType.ToLower();
                    int numberStartIndex = dbType.IndexOf("varchar(") + 8;
                    int numberEndIndex = dbType.IndexOf(")", numberStartIndex);
                    string lengthString = dbType.Substring(numberStartIndex, (numberEndIndex - numberStartIndex));
                    int maxLength = 0;
                    int.TryParse(lengthString, out maxLength);

                    string currentValue = (string)propInfo.GetValue(testObject, null);

                    // Do your logging and truncation here
                    if (!string.IsNullOrEmpty(currentValue) && currentValue.Length > maxLength)
                        Console.WriteLine(testObject.GetType().Name + "." + propInfo.Name + " " + currentValue + " Max: " + maxLength);

                }
            }
        }
    }
shaunmartin