views:

433

answers:

1

I got an application that automates Microsoft Excel (2003) using Office Automation. One of the things it does is replacing values. The problem I got is that when replacing values, all rich formatting is lost as explained here: http://support.microsoft.com/kb/177194

To clarify, I replace valueX with valueY in a cell looking like this:
valueX is great

This results in:
valueY is great


I tried getting the formatting within the cell before replacing, so I could save which characters are Bold and apply the formatting after the value is replaced. But I couldn't find a way to get the formatting for each character. (C#)

Is there any way to solve this problem, without using the resolution give on the website above? In C# or some workaround in Excel

+2  A: 

In your question, you mention that you can't figure out how to get the formatting for each character in C# but that you might be able to use a workaround in Excel. I'm not sure what objects in Excel are exposed to C# via interop, but perhaps my showing you what can be done in VBA will be of some assistance.

I'll provide three parts to my answer:

  1. How to find formatting of characters in Excel using VBA.
  2. The complexity of resetting the format of a replaced cell's contents.
  3. A possible way to use VBA in Excel in a before/after situation.


How to find formatting of characters in Excel using VBA

The formatting of a given character is exposed through the Range.Characters.Font object. Let's take a look. Imagine you have the following in cell A1 in Sheet1 of an Excel sheet:

Testing

You can see from the example that the 3rd and 4th characters are bold. How might we find those properties and other font properties of those characters? Here's a VBA function that demonstrates some of the properties you can expose through Range.Characters and Range.Characters.Font:

Sub IterateCharacters()
Dim rng As Range
Dim lngLen As Long
Dim lngCount As Long
Dim chr As Characters
Set rng = ThisWorkbook.Worksheets("Sheet1").Range("A1")
lngCount = 1
lngLen = Len(rng.Value)

Debug.Print "Count", "Text", "Font", "Color", "Size", "Bold"    

Do While lngCount <= lngLen
    Set chr = rng.Characters(lngCount, 1)
    Debug.Print lngCount, chr.Text, chr.Font.Name, chr.Font.Color, chr.Font.Size, chr.Font.Bold
    lngCount = lngCount + 1
Loop

End Sub

The above code will produce the following output:

Count         Text          Font          Color         Size          Bold
 1            T             Arial          0             10           False
 2            e             Arial          0             10           False
 3            s             Arial          0             10           True
 4            t             Arial          0             10           True
 5            i             Arial          0             10           False
 6            n             Arial          0             10           False


The complexity of resetting the format of a replaced cell's contents

I imagine that the tricky part of this problem is not actually in finding the formatting of each individual character (assuming, of course, that it is even possible in your situation). The hard part will be "remembering" the formatting of each character and re-applying the formatting after you've replaced the text.

In the easiest case scenario, your replacement strings are always of the same length as your destination string. If valueX is going to simply be replaced by valueY, and it's always the same length, retaining the formatting is quite easy. Simply get the formatting before running the replace method, then re-apply the formatting using the exact same settings as before for each character.

Slightly more difficult is if you are replacing one possible value in a cell. In this case, you need to offset your replacement formatting by a difference in the length between the replacement and destination strings and only do so after the replaced string.

In the ugliest scenario, you have n strings that will be replaced, in which case, when you parse back through the characters to apply formatting, you'll have to do so with something like n * difference for each subsequent replacement.

I'm sure there are better algorithms for this task, but maybe that would be best suited for a separate SO question.


A possible way to use VBA in Excel in a before/after situation

Ideally, you'll find a way to access the necessary objects straight from C# so you don't have to deal with VBA code and Excel. But you may not be so lucky.

If you have to do your formatting in Excel, here's one way you might do it.

  1. Set up your Excel sheet into 3 sheets: Original,Copy,ReplacedValues
  2. Before running your C# code, make sure that Copy is a duplicate of Original.
  3. When you run your C# code to replace cell contents, paste a value of the replacement string in the ReplacedValues worksheet. I'd recommend something simple where your Range Address (e.g. B2) is in Column A, the original value is in Column B, and the replacement value is in column C.
  4. After running your C# code, write a method that will iterate through all of the listed ranges in ReplacedValues.
  5. For each range, you'll be able to reference the original value in Copy (perhaps my naming scheme is a little confusing) and the updated value in Original.
  6. Find what the original formatting was in Copy. Use the difference of the string lengths to offset and apply your formatting back to the original (this is where you get to write some fancy code parsers to do all of this work).

There's probably a better method than the above process, but I think that would work quite nicely.

Ben McCormack
Ben, what an increadible answer, holy cow. Xta, this approach would have no problem working using C#, although some of these property names might be prefixed with "get_" or "set_", but not many I'd expect.
Mike Rosenblum
Thanks! I remember how long it took it me to find out how to format characters within an Excel range. I haven't used C# .NET interop before, but that's something I'd probalby like to try at some point since the language is much more elegant to use.
Ben McCormack
Wow, thank you very much! I haven't actually programmed it yet, but I'm sure I can get it to work now with this detailed answer.
@Ben: regarding your reply "I haven't used C# .NET interop before, but that's something I'd probalby like to try at some point since the language is much more elegant to use." Beware of using C# with Excel, it truly is much, much LESS elegant when using Excel with C# 3.5; or below. C# 4.0, however, is just around the corner and should help a LOT. If you want to go this route, I would strongly recommend either waiting for C# 4.0 to be formally released, or use the Visual Studio 2010 beta, which is very, very stable at this point.
Mike Rosenblum