views:

1927

answers:

2

I want to scan an excel sheet, and replace any occurrences of social security numbers with zeros... I would like to do this using Excel.Interop if at all possible, but I'm open to anything at this point... here's some of my code... I'm banging my head on the desk for the past few months...

 // Get range and convert it to a string variable 
            Excel.Range _range =(Excel.Range)_excelApp.get_Range("A1:K1",Type.Missing);

            // convert the value of our cells in our range
            // to the string variable
            string myString = _range.Cells.Value2.ToString();          

            // match any SSN e.g. 1236780909, 123-33-2445
            if (Regex.IsMatch(myString, @"\b\d{3}\b\d{2}\b\d{4}"));
            {                
                _range.Cells.Value2 = replaceSSN;
            }

            // save our workbook with a new name and create a backup
            _excelWorkbook.SaveAs("Results.xls", Type.Missing, Type.Missing, Type.Missing, Type.Missing, true, Excel.XlSaveAsAccessMode.xlNoChange, Type.Missing, Type.Missing, Type.Missing, Type.Missing, Type.Missing);

            // close workbook
            _excelWorkbook.Close(false, Type.Missing, Type.Missing);

            // send quit signal to app
            _excelApp.Quit();

            // report success            
            MessageBox.Show("File masked successfully.", "Mask Data", MessageBoxButtons.OK);

            // release memory
            //System.Runtime.InteropServices.Marshal.ReleaseComObject(_excelApp);

            // change label back to being blank
            lblActivity.Text = "";
        }      
+5  A: 

Hi Woody The problem is here

        string myString = _range.Cells.Value2.ToString();

And here

        // match any SSN e.g. 1236780909, 123-33-2445
        if (Regex.IsMatch(myString, @"\b\d{3}\b\d{2}\b\d{4}"));
        {                
            _range.Cells.Value2 = replaceSSN;
        }

I think you are misunderstanding what Value2 is, Its the array reprentation of the range you have defined above.

Excel.Range _range =(Excel.Range)_excelApp.get_Range("A1:K1",Type.Missing);

Value2 Property of the Range class,returns an array of values. what you need to do is perhaps declare a empty array and get the Value2 of the range, loop each item in the array and run the regex, if it finds a match replace the item in the array. Then you could set the array back to the Value2 of the Range, which will update the cells values.

EDIT: Please find some sample code below

                var excelApp = new Application();
                excelApp.Workbooks.Open("c:\\Test.xls",Type.Missing,Type.Missing,
                                                       Type.Missing,Type.Missing,
                                                       Type.Missing,Type.Missing,
                                                       Type.Missing,Type.Missing,
                                                       Type.Missing,Type.Missing,
                                                       Type.Missing,Type.Missing,
                                                       Type.Missing,Type.Missing);
                var ws = excelApp.Worksheets;
                var worksheet =(Worksheet) ws.get_Item("Sheet1");
                Range range = worksheet.UsedRange;
                // In the following cases Value2 returns different types
                // 1. the range variable points to a single cell
                // Value2 returns a object
                // 2. the range variable points to many cells
                // Value2 returns object[,]

                object[,] values = (object[,])range.Value2;

                for (int row = 1; row <= values.GetUpperBound(0); row++)
                    for (int col = 1; col <= values.GetUpperBound(1); col++)
                {
                    string value = Convert.ToString(values[row, col]);
                    //Also used a different regex, found yours not to match on your given criteria
                    if (Regex.IsMatch(value, @"^\d{3}-\d{2}-\d{4}$"))
                    {
                        range.Cells.set_Item(row,col,"0");
                    }
                }

            excelApp.Save("C:\\Out.xls");
            excelApp.Quit();

            Marshal.ReleaseComObject(worksheet);
            Marshal.ReleaseComObject(ws);
            Marshal.ReleaseComObject(excelApp);
almog.ori
Works perfectly. Thank you so much Ori... you've saved my head from more headaches...lol... I also completely understand what you've done... a learning experience as well... thanks again!
Woody
no problem anytime :)
almog.ori
Hey Ori, I've just encountered a bug. I get an Unable to cast object of type 'System.Double' to type 'System.Object[,]'. whenever only one cell is in the range. If there is more than one cell that is populated in the range, then it works fine... any ideas?
Woody
I used the same code searching through a set of Excel documents using more or less the exact same code, just looking for credit card numbers (see this one for the same job in word: http://stackoverflow.com/questions/2235388/regex-to-find-credit-card-number-in-documents-does-not-work)Thank you for this answer. It saved a lot of work for me! +1
sonstabo
+2  A: 

Here you can have a small understanding of code snippet but I have not tried it so it may contain syntax errors.

Excel.WorkSheet currentSheet = ApplicationInstance.ActiveSheet as Excel.Worksheet;

foreach(Exce.Range r1 in currentSheet.UsedRange)
{
     if(Regex.IsMatch(r1.Value2.ToString(), @"\b\d{3}\b\d{2}\b\d{4}")) // check r1.Value2 should not be null
     {
         r1.Value2 = Your_New_Value;
     }
}
Tumbleweed
I just tried that as well as doing something like Ori has mentioned, and I keep getting object errors. Like for example, the code you just posted came back with -- NullReferenceException: Object reference not set to an instance of an object :(
Woody
If there is a null value in one of the cells somewhere in the contiguious block that is the UsedRange calling ToString() will throw
almog.ori
yes any how Woody needs to use worksheet.UsedRange and check for his condition and take the decision.
Tumbleweed
absolutly, that comment was for woody
almog.ori
yes !! It was @ Woody :)
Tumbleweed