Hi,

I am trying to retrieve the font name and size of all the headings in a Word document. Any idea how to do this?

A: 

The most straightforward solution is to open the document in Word and access the object model. This is traditionally done using VBA, but you can also use .NET (e.g. C# or VB.NET) by using VSTO (Visual Studio Tools for Office). Personally I find C#/VB.NET much better languages than VBA.

Once you have access to the object model you will have to enumerate paragraphs in the document. When you find a heading (perhaps defined by the style) you will have to figure out the formatting of the heading.
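
As a rough sketch of that enumeration in VBA (run from inside Word): it assumes a paragraph counts as a heading when its outline level is anything other than body text, which holds for the built-in heading styles but also picks up any paragraph given a custom outline level. The procedure name ListHeadingFonts is made up for illustration.

```vba
Public Sub ListHeadingFonts()
    Dim para As Paragraph

    For Each para In ActiveDocument.Paragraphs
        ' Built-in heading styles carry outline levels 1-9;
        ' ordinary text reports wdOutlineLevelBodyText.
        If para.OutlineLevel <> wdOutlineLevelBodyText Then
            ' Font.Name / Font.Size describe the whole paragraph range, so
            ' they are only meaningful when the heading is uniformly formatted.
            Debug.Print para.Style.NameLocal & ": " & _
                        para.Range.Font.Name & ", " & _
                        para.Range.Font.Size & " pt"
        End If
    Next para
End Sub
```

The output goes to the Immediate window of the VBA editor (Ctrl+G).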

Martin Liversage
A: 

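This is what I got from a brief skim of the MSDN page on "HeadingStyles". As far as I can tell, HeadingStyles hangs off a table of contents rather than off the document itself, so this assumes the document contains one:

MsgBox ActiveDocument.TablesOfContents(1).HeadingStyles(1).Style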
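
If the goal is the font defined by the built-in heading styles themselves, a sketch along these lines might be closer. It reads the style definitions through the Styles collection using the wdStyleHeading1 to wdStyleHeading9 constants, so it reports what each style specifies, not any manual formatting applied on top of it. The procedure name is hypothetical.

```vba
Public Sub ShowHeadingStyleFonts()
    Dim level As Long
    Dim headingStyle As Style

    For level = 1 To 9
        ' wdStyleHeading1 is -2, wdStyleHeading2 is -3, ..., wdStyleHeading9 is -10,
        ' so subtracting (level - 1) steps through the nine built-in heading styles.
        Set headingStyle = ActiveDocument.Styles(wdStyleHeading1 - (level - 1))
        Debug.Print headingStyle.NameLocal & ": " & _
                    headingStyle.Font.Name & ", " & _
                    headingStyle.Font.Size & " pt"
    Next level
End Sub
```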
Anthony
A: 

The basic structure will be something like below:

Public Sub ShowFontAndSize()
    Dim singleLine As Paragraph

    ' Print the font name and size of every paragraph in the document.
    For Each singleLine In ActiveDocument.Paragraphs
        Debug.Print singleLine.Range.Font.Name
        Debug.Print singleLine.Range.Font.Size
    Next singleLine
End Sub

The catch is that this won't detect different fonts and sizes within the same paragraph. If that's a possibility, you will need to add another loop, For Each singleCharacter In singleLine.Range.Characters, inside the paragraphs loop.
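
A sketch of that nested loop is below. It relies on Word's behaviour, as far as I know, that Font.Size comes back as wdUndefined and Font.Name as an empty string when a range mixes several values, and only drops down to the character level in that case, because walking every character is slow on long documents. The procedure and variable names are illustrative.

```vba
Public Sub ShowFontRuns()
    Dim para As Paragraph
    Dim ch As Range
    Dim current As String
    Dim previous As String

    For Each para In ActiveDocument.Paragraphs
        If para.Range.Font.Size = wdUndefined Or para.Range.Font.Name = "" Then
            ' Mixed formatting: walk the characters and print each change.
            previous = ""
            For Each ch In para.Range.Characters
                current = ch.Font.Name & ", " & ch.Font.Size & " pt"
                If current <> previous Then
                    Debug.Print current
                    previous = current
                End If
            Next ch
        Else
            ' Uniform formatting: one line is enough for the whole paragraph.
            Debug.Print para.Range.Font.Name & ", " & para.Range.Font.Size & " pt"
        End If
    Next para
End Sub
```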

Edit: A trickier problem is what to do with this data once you've collected it. Building up an array seems like the natural fit, but VBA arrays are borderline useless for this, since there is no basic method like .append() and growing an array means calling ReDim Preserve on the whole thing. See http://www.cpearson.com/excel/VBAArrays.htm for more info if you would like to go down that road.
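
One way to sidestep arrays, offered as a sketch and not the only option, is the built-in Collection object, which grows on its own. Using the font/size pair as the key also keeps the list unique, because adding a duplicate key raises an error that can simply be ignored. The function name CollectFontPairs and the "|" separator are arbitrary choices.

```vba
Public Function CollectFontPairs() As Collection
    Dim pairs As New Collection
    Dim para As Paragraph
    Dim pair As String

    For Each para In ActiveDocument.Paragraphs
        pair = para.Range.Font.Name & "|" & para.Range.Font.Size
        ' A duplicate key makes Add fail; skipping that error
        ' leaves exactly one entry per distinct font/size pair.
        On Error Resume Next
        pairs.Add pair, pair
        On Error GoTo 0
    Next para

    Set CollectFontPairs = pairs
End Function
```

Looping over the result with For Each and Debug.Print lists every distinct pair found at paragraph level.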

anschauung
Is there a similar loop that can simply return all fonts and font size pairs used in the document without looping through each character? I would think that this meta-info would be stored in such a way that you could retrieve it without going letter by letter. If it can be returned in that way, maybe he could use a certain amount of guessing that the smallest font-size is not a header and thus all others ARE headers? A big guess, I know, but a fairly safe one in most situations.
Anthony
There is, and you touched on it in your answer earlier. But, it's unfortunately useless in any typical working environment. Word tries to store all the fonts/sizes/etc using the HeadingStyles collection, but only the most meticulous users use that feature consistently. Everyone else uses overrides, since it's easier to override than to use the headers, especially in Word 2003 and earlier.
anschauung
