What is the fastest C# function that takes an int and returns a string containing a letter or letters for use in an Excel function? For example, 1 returns "A", 26 returns "Z", 27 returns "AA", etc.

This is called tens of thousands of times and is taking 25% of the time needed to generate a large spreadsheet with many formulas.

public string Letter(int intCol) {

    int intFirstLetter = ((intCol) / 676) + 64;
    int intSecondLetter = ((intCol % 676) / 26) + 64;
    int intThirdLetter = (intCol % 26) + 65;

    char FirstLetter = (intFirstLetter > 64) ? (char)intFirstLetter : ' ';
    char SecondLetter = (intSecondLetter > 64) ? (char)intSecondLetter : ' ';
    char ThirdLetter = (char)intThirdLetter;

    return string.Concat(FirstLetter, SecondLetter, ThirdLetter).Trim();
}
A: 

Can you post your current function? Something as simple as this should not take 25% of your processing time.

Neil N
I second this. I agree with others that a lookup table would be perfect because of the fixed maximum count (256). But I'm also surprised this is what's slowing you down.
Matthew Flaschen
Actually, Excel 2007 apparently supports 16384. It would still make sense to do at least the beginning in a lookup table, and optionally the whole thing.
Matthew Flaschen
+3  A: 

You could pre-generate all the values into an array of strings. This would take very little memory and could be calculated on the first call.
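
A minimal sketch of this idea, assuming 1-based column numbers and the Excel 2007 limit of 16,384 columns (XFD) mentioned in the comments above. The class and member names (`ColumnNames`, `Get`, `Build`) are hypothetical, not from the question:

```csharp
using System;

public static class ColumnNames
{
    // Assumption: Excel 2007+ limit of 16,384 columns ("A" to "XFD").
    private const int MaxColumns = 16384;

    // Built on first use; slot 0 is unused so that lookups are 1-based.
    private static readonly Lazy<string[]> Table = new Lazy<string[]>(Build);

    private static string[] Build()
    {
        var names = new string[MaxColumns + 1];
        var buffer = new char[3]; // "XFD" is the longest name under this limit
        for (int n = 1; n <= MaxColumns; n++)
        {
            int pos = buffer.Length;
            // Bijective base 26: subtract 1 before extracting each letter.
            for (int rem = n; rem > 0; rem = (rem - 1) / 26)
            {
                buffer[--pos] = (char)('A' + (rem - 1) % 26);
            }
            names[n] = new string(buffer, pos, buffer.Length - pos);
        }
        return names;
    }

    // After the first call this is a plain array index.
    public static string Get(int column)
    {
        return Table.Value[column];
    }
}
```

The one-off cost of filling the table is paid on the first `Get` call; out-of-range column numbers throw `IndexOutOfRangeException` rather than returning a name.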

JDunkerley
+1  A: 

The absolute fastest approach would be to capitalize on the fact that an Excel spreadsheet has only a fixed number of columns, so you can use a lookup table. Declare a constant string array of 256 entries and prepopulate it with the strings from "A" to "IV". Then you simply do a straight index lookup.

Doug
I don't know where you are getting that Excel has a fixed 256 columns. I started scrolling and adding text, and I am on AEW and have given up.
esac
It used to, before Excel 2007, which allows 16384 (XFD). Still, don't blame you for quitting. ;)
Matthew Flaschen
I'm running Office 2005. Apparently the 256 column limit has been extended (or lifted entirely?) in newer versions. Which, of course, makes it more important to programmatically seed your lookup table.
Doug
+10  A: 

I can tell you that the fastest function will not be the prettiest function. Here it is:

private string[] map = new string[]
    { 
        "A", "B", "C", "D", "E" .............
    };

public string getColumn(int number)
{
    return map[number];
}
womp
A good point to draw attention to the array approach, though defining it manually wouldn't be such a great idea. Pre-generation is the way to go.
Noldorin
Hey, he asked for the fastest! Any code you add to automatically pre-populate it is going to be slower ;)
womp
@womp: This is true... though it's a one-off operation, so it's effectively discountable. What's the difference between generating the enormous array as code or during initialisation, except messiness? I know, you're just being pedantic and taking it literally for the fun of it (unless I'm mistaken).
Noldorin
Nope, you're right :) Clearly it would be MUCH more practical to pre-generate the array with some code. But this is technically the absolute fastest. Just for the record, I voted up some other answers ;)
womp
Ok, good to know you're (kind of) kidding. :)
Noldorin
By the way, this map should be static, so that it is not duplicated (and potentially regenerated) for each instance.
Thomas Levesque
A: 

Try this function.

// Returns name of column for specified 0-based index.
public static string GetColumnName(int index)
{
    var name = new char[3]; // Assumes 3-letter column name max.
    int rem = index;
    int div = 17576; // 26 ^ 3

    for (int i = 2; i >= 0; i++)
    {
        name[i] = alphabet[rem / div];
        rem %= div;
        div /= 26;
    }

    if (index >= 676)
        return new string(name, 3);
    else if (index >= 26)
        return new string(name, 2);
    else
        return new string(name, 1);
}

Now it shouldn't take up that much memory to pre-generate each column name for every index and store them in a single huge array, so you shouldn't need to look up the name for any column twice.

If I can think of any further optimisations, I'll add them later, but I believe this function should be pretty quick, and I doubt you even need this sort of speed if you do the pre-generation.
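
For reference, here is a hedged sketch of the same fixed-buffer idea that addresses the problems raised in the comments below: the loop has to count down, and `string` has no `(char[], int)` constructor, so the `(char[], int, int)` overload is used instead. It assumes a 0-based index as in the function above and at most three letters; the class name is hypothetical:

```csharp
public static class ColumnNameSketch
{
    // Returns the column name for a 0-based index (0 -> "A", 25 -> "Z", 26 -> "AA").
    public static string GetColumnName(int index)
    {
        var name = new char[3]; // Assumes 3-letter column name max (index <= 18277, "ZZZ").
        int pos = name.Length;
        int rem = index + 1;    // Work 1-based: column names have no zero digit.

        // Fill the buffer from the right, one letter per iteration.
        while (rem > 0)
        {
            name[--pos] = (char)('A' + (rem - 1) % 26);
            rem = (rem - 1) / 26;
        }

        // Copy only the letters actually written.
        return new string(name, pos, name.Length - pos);
    }
}
```

An index above 18277 would need a fourth letter and overruns the buffer, so callers working beyond "ZZZ" would have to enlarge it.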

Noldorin
@esac: You're absolutely right. (And there was even another with the for loop. :P) I shouldn't be writing code at this hour, frankly... So yeah, that did deserve a downvote in fairness. Thanks for having the courtesy to remove it though. :) +1 to you for the corrections.
Noldorin
return new string(name, 3); there is no overload for string(char[], int). Maybe you meant "new string(name)". Also, you get an index-out-of-bounds exception for the case index = (26 % 100) on the line name[i] = alphabet[rem / div]; (yes, I have alphabet defined as earlier)
esac
+5  A: 
    public static string ExcelColumnName(int count)
    {
        return new string((char)('A'+(((count-1)%26))), ((count-1) / 26) + 1);
    }

Or if you would like to cache it for further lookups to make it faster :)

public class Test
{
    private Dictionary<int, string> m_Cache = new Dictionary<int, string>();

    public string ExcelColumnName(int count)
    {
        if (m_Cache.ContainsKey(count) == false)
        {
            m_Cache.Add(count, new string((char)('A' + (((count - 1) % 26))), ((count - 1) / 26) + 1));

        }

        return m_Cache[count];
    }

    public static void Main(string[] args)
    {
        Test t = new Test();

        Console.WriteLine(t.ExcelColumnName(1));
        Console.WriteLine(t.ExcelColumnName(26));
        Console.WriteLine(t.ExcelColumnName(27));
        Console.WriteLine(t.ExcelColumnName(1));
        Console.WriteLine(t.ExcelColumnName(26));
        Console.WriteLine(t.ExcelColumnName(27));

    }
}
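
One caveat worth checking: the `string(char, int)` constructor repeats a single character, so the expression above yields "AA" for 27 but "BB" for 28, where Excel uses "AB". The sketch below compares it against a digit-by-digit conversion and reports where the two first disagree. The names `RepeatCheck`, `Repeated`, `DigitByDigit` and `FirstMismatch` are hypothetical:

```csharp
using System;
using System.Text;

public static class RepeatCheck
{
    // The expression from the answer above, reproduced for comparison.
    public static string Repeated(int count)
    {
        return new string((char)('A' + ((count - 1) % 26)), ((count - 1) / 26) + 1);
    }

    // Digit-by-digit conversion (1-based, bijective base 26).
    public static string DigitByDigit(int count)
    {
        var sb = new StringBuilder();
        for (int rem = count; rem > 0; rem = (rem - 1) / 26)
        {
            // Prepend the next letter, least significant first.
            sb.Insert(0, (char)('A' + (rem - 1) % 26));
        }
        return sb.ToString();
    }

    // Returns the first column number where the two disagree, or -1 if none.
    public static int FirstMismatch(int max)
    {
        for (int n = 1; n <= max; n++)
        {
            if (Repeated(n) != DigitByDigit(n)) return n;
        }
        return -1;
    }
}
```

`RepeatCheck.FirstMismatch(100)` returns 28, so the timing comparison in the comments only holds for names made of one repeated letter.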
esac
Even without caching, generating 10 million column names with your method takes 2449 milliseconds, whereas mine takes 48 milliseconds. No need to keep an expensive pre-generated array around in memory :)
esac
Just a point, the OP said that he was using 1 based, so that is how I coded mine, however his posted algorithm uses 0 based.
esac
+1  A: 

Once your function has run, let it cache the result in a dictionary so that it won't have to do the calculation again.

e.g. Convert(27) will check if 27 is mapped/stored in dictionary. If not, do the calculation and store "AA" against 27 in the dictionary.
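
A rough sketch of that flow, assuming 1-based column numbers and single-threaded use. `Convert` is the name used above; `CachedColumnNames` and `Compute` are hypothetical:

```csharp
using System.Collections.Generic;

public static class CachedColumnNames
{
    // Not thread-safe: assumes all calls come from a single thread.
    private static readonly Dictionary<int, string> Cache = new Dictionary<int, string>();

    public static string Convert(int column)
    {
        string name;
        // TryGetValue does one lookup, unlike ContainsKey followed by the indexer.
        if (!Cache.TryGetValue(column, out name))
        {
            name = Compute(column);
            Cache[column] = name; // e.g. stores "AA" against 27
        }
        return name;
    }

    // Recursive 1-based conversion: the prefix letters, then the last letter.
    private static string Compute(int column)
    {
        char last = (char)('A' + (column - 1) % 26);
        return column <= 26
            ? last.ToString()
            : Compute((column - 1) / 26) + last;
    }
}
```

Whether the dictionary beats recomputing depends on how often the same column repeats, so it is worth measuring against the uncached version.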

shahkalpesh
+6  A: 

Don't convert it at all. Excel can work in R1C1 notation just as well as in A1 notation.

So (apologies for using VBA rather than C#):

Application.Worksheets("Sheet1").Range("B1").Font.Bold = True

can just as easily be written as:

Application.Worksheets("Sheet1").Cells(1, 2).Font.Bold = True

The Range property takes A1 notation whereas the Cells property takes (row number, column number).

To select multiple cells: Range(Cells(1, 1), Cells(4, 6)) (NB would need some kind of object qualifier if not using the active worksheet) rather than Range("A1:F4")

The Columns property can take either a letter (e.g. F) or a number (e.g. 6)

barrowc
+1  A: 

Your first problem is that you are declaring 6 variables in the method. If a method is going to be called thousands of times, just moving those to class scope instead of function scope will probably cut your processing time by more than half right off the bat.

Neil N
A: 

Caching really does cut the runtime of 10,000,000 random calls to 1/3 its value though:

    static Dictionary<int, string> LetterDict = new Dictionary<int, string>(676);
    public static string LetterWithCaching(int index)
    {
        int intCol = index - 1;
        if (LetterDict.ContainsKey(intCol)) return LetterDict[intCol];
        int intFirstLetter = ((intCol) / 676) + 64;
        int intSecondLetter = ((intCol % 676) / 26) + 64;
        int intThirdLetter = (intCol % 26) + 65;
        char FirstLetter = (intFirstLetter > 64) ? (char)intFirstLetter : ' ';
        char SecondLetter = (intSecondLetter > 64) ? (char)intSecondLetter : ' ';
        char ThirdLetter = (char)intThirdLetter;
        String s = string.Concat(FirstLetter, SecondLetter, ThirdLetter).Trim();
        LetterDict.Add(intCol, s);
        return s;
    }

I think caching in the worst case (hitting every value) couldn't take up more than 250 KB: 17576 possible values * (sizeof(int)=4 + sizeof(char)*3 + string overhead=2).
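
Worked through with those per-entry figures (a 4-byte key, three 2-byte chars, 2 bytes of overhead), the estimate comes out under the stated bound. Real .NET string and dictionary overhead is larger than 2 bytes per entry, so this is best read as a lower bound under the answer's own assumptions:

```latex
17576 \times (4 + 3 \times 2 + 2) = 17576 \times 12 = 210{,}912 \text{ bytes} \approx 206 \text{ KB}
```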

foson
A: 

It is recursive, fast, and right:

class ToolSheet
{


    // Not the prettiest, but surely the fastest:
    static string[] ColName = new string[676];


    public ToolSheet()
    {
        ColName[0] = "A";
        for (int index = 1; index < 676; ++index) Recurse(index, index);

    }

    private int Recurse(int i, int index)
    {
        if (i < 1) return 0;
        ColName[index] = ((char)(65 + i % 26)).ToString() + ColName[index];

        return Recurse(i / 26, index);
    }

    public string GetColName(int i)
    {
        return ColName[i - 1];
    }



}
A: 

Sorry, there was an off-by-one shift in my previous answer. Corrected:

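class ToolSheet
{
    // Not the prettiest, but surely the fastest: a table that is built once.
    // ColName[n] holds the name of 1-based column n; slot 0 is unused.
    static readonly string[] ColName = new string[676];

    // Static constructor, so the shared table is filled exactly once
    // instead of being prepended to again for every new instance.
    static ToolSheet()
    {
        for (int index = 1; index < 676; ++index)
        {
            Recurse(index, index);
        }
    }

    private static int Recurse(int i, int index)
    {
        if (i < 1)
        {
            // Multiples of 26 end in "Z": take the previous name and swap its last letter.
            if (index % 26 == 0) ColName[index] = ColName[index - 1].Substring(0, ColName[index - 1].Length - 1) + "Z";

            return 0;
        }

        ColName[index] = ((char)(64 + i % 26)).ToString() + ColName[index];

        return Recurse(i / 26, index);
    }

    // Valid for 1 <= i <= 675 ("A" to "YY").
    public string GetColName(int i)
    {
        return ColName[i];
    }
}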
+2  A: 

I currently use this, with Excel 2007

public static string ExcelColumnFromNumber(int column)
{
    string columnString = "";
    int columnNumber = column; // integer arithmetic is sufficient here
    while (columnNumber > 0)
    {
        // Column names have no zero digit, so shift by one before taking the remainder.
        int currentLetterNumber = (columnNumber - 1) % 26;
        char currentLetter = (char)(currentLetterNumber + 65);
        columnString = currentLetter + columnString;
        columnNumber = (columnNumber - (currentLetterNumber + 1)) / 26;
    }
    return columnString;
}

and

public static int NumberFromExcelColumn(string column)
{
    int retVal = 0;
    // Invariant casing avoids culture-specific surprises (e.g. the Turkish "i").
    string col = column.ToUpperInvariant();
    // Accumulate left to right: 'A' = 1 ... 'Z' = 26, each step multiplies by 26.
    foreach (char colPiece in col)
    {
        int colNum = colPiece - 64;
        retVal = retVal * 26 + colNum;
    }
    return retVal;
}

As mentioned in other posts, the results can be cached.

astander
+1 : Might not be the fastest, but useful for the stuff I need to do :)
Ian