I am working with misspellings in Microsoft Word. With just a few misspellings, accessing the SpellingErrors collection becomes gawdawful slow (at least with For...Next or For Each loops).

Is there a way to get to the list quickly (make a copy, copy the entries, stop the dynamic nature of the collection)? I just need a list, a snapshot of it, and for it not to be dynamic or real time.

+4  A: 

Here's how I would simulate creating and checking spelling errors:

Sub GetSpellingErrors()
    ' Turn off auto-spellchecking
    Application.Options.CheckSpellingAsYouType = False
    ' Set document
    Dim d As Document
    Set d = ActiveDocument
    ' Insert misspelled text (this replaces the whole document body)
    d.Range.Text = "I wantedd to beet hym uup to rite some rongs."
    ' Get spelling errors
    Dim spellErrs As ProofreadingErrors
    Set spellErrs = d.SpellingErrors
    ' Dump spelling errors to Immediate window
    Dim spellErr As Long
    For spellErr = 1 To spellErrs.Count
        Debug.Print spellErrs(spellErr).Text
    Next spellErr
    ' Turn auto-spellchecking back on
    Application.Options.CheckSpellingAsYouType = True
End Sub

This runs extremely fast when I test it, both in Word 2003 and Word 2010. Note that this will give you six spelling errors, not four. Although "beet" and "rite" are words in English, they are considered "misspelled" in the context of this sentence.

Notice the Application.Options.CheckSpellingAsYouType = False. This turns off automatic spelling error detection (red squigglies). It is an application-wide setting, not a per-document one, so it is best to turn it back on afterward if that is what the end user expects in Word, as I've done at the end.

Now if detection is on in Word 2007/2010 (this doesn't work for 2003 and earlier), you can simply read the misspelled words from the XML (WordprocessingML). This solution is more complicated to set up and manage, and should really only be used if you're programming against Open XML rather than in VBA. A simple LINQ to XML query would suffice to get an IEnumerable of all the misspelled words: you would collect the .Value of the text that sits between each pair of <w:proofErr/> elements marked w:type="spellStart" and w:type="spellEnd". The document produced above has this paragraph in WordprocessingML:

<w:p w:rsidR="00A357E4" w:rsidRDefault="0008442E">
  <w:r>
    <w:t xml:space="preserve">I </w:t>
  </w:r>
  <w:proofErr w:type="spellStart"/>
  <w:r>
    <w:t>wa</w:t>
  </w:r>
  <w:bookmarkStart w:id="0" w:name="_GoBack"/>
  <w:bookmarkEnd w:id="0"/>
  <w:r>
    <w:t>ntedd</w:t>
  </w:r>
  <w:proofErr w:type="spellEnd"/>
  <w:r>
    <w:t xml:space="preserve"> to </w:t>
  </w:r>
  <w:proofErr w:type="spellStart"/>
  <w:r w:rsidR="003F2F98">
    <w:t>b</w:t>
  </w:r>
  <w:r w:rsidR="005D3127">
    <w:t>eet</w:t>
  </w:r>
  <w:proofErr w:type="spellEnd"/>
  <w:r w:rsidR="005D3127">
    <w:t xml:space="preserve"> </w:t>
  </w:r>
  <w:proofErr w:type="spellStart"/>
  <w:r w:rsidR="005D3127">
    <w:t>hym</w:t>
  </w:r>
  <w:proofErr w:type="spellEnd"/>
  <w:r w:rsidR="005D3127">
    <w:t xml:space="preserve"> </w:t>
  </w:r>
  <w:proofErr w:type="spellStart"/>
  <w:r w:rsidR="005D3127">
    <w:t>uup</w:t>
  </w:r>
  <w:proofErr w:type="spellEnd"/>
  <w:r w:rsidR="005D3127">
    <w:t xml:space="preserve"> to </w:t>
  </w:r>
  <w:proofErr w:type="spellStart"/>
  <w:r w:rsidR="005D3127">
    <w:t>rite</w:t>
  </w:r>
  <w:proofErr w:type="spellEnd"/>
  <w:r w:rsidR="005D3127">
    <w:t xml:space="preserve"> some </w:t>
  </w:r>
  <w:proofErr w:type="spellStart"/>
  <w:r w:rsidR="005D3127">
    <w:t>rongs</w:t>
  </w:r>
  <w:proofErr w:type="spellEnd"/>
  <w:r w:rsidR="005D3127">
    <w:t xml:space="preserve">. </w:t>
  </w:r>
</w:p>
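
For anyone reading this outside of .NET, here is a minimal sketch of the same idea in Python using only the standard library, rather than the Linq-to-XML query mentioned above. It walks a paragraph in document order and joins the w:t text found between each spellStart/spellEnd pair, which also handles a word split across several runs (like "wa" + "ntedd"). The function name misspelled_words and the trimmed SAMPLE string are my own for illustration. I'm also assuming the fragment declares the standard WordprocessingML namespace, which the excerpt above omits; in a real .docx you would read this markup from word/document.xml inside the zip package.

```python
import xml.etree.ElementTree as ET

# Standard WordprocessingML main namespace, in ElementTree's {uri}name form.
W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def misspelled_words(paragraph_xml):
    """Return the text between each spellStart/spellEnd proofErr pair.

    Hypothetical helper: assumes the markers are properly paired and
    not nested, as in the paragraph shown above.
    """
    root = ET.fromstring(paragraph_xml)
    words = []
    current = None  # list of text pieces while inside a flagged span
    for elem in root.iter():  # iter() visits elements in document order
        if elem.tag == W + "proofErr":
            kind = elem.get(W + "type")
            if kind == "spellStart":
                current = []
            elif kind == "spellEnd" and current is not None:
                words.append("".join(current))
                current = None
        elif elem.tag == W + "t" and current is not None:
            current.append(elem.text or "")
    return words


# Trimmed copy of the paragraph above, with the namespace declared.
SAMPLE = """<w:p xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main">
  <w:r><w:t xml:space="preserve">I </w:t></w:r>
  <w:proofErr w:type="spellStart"/>
  <w:r><w:t>wa</w:t></w:r>
  <w:bookmarkStart w:id="0" w:name="_GoBack"/>
  <w:bookmarkEnd w:id="0"/>
  <w:r><w:t>ntedd</w:t></w:r>
  <w:proofErr w:type="spellEnd"/>
  <w:r><w:t xml:space="preserve"> to </w:t></w:r>
  <w:proofErr w:type="spellStart"/>
  <w:r><w:t>b</w:t></w:r>
  <w:r><w:t>eet</w:t></w:r>
  <w:proofErr w:type="spellEnd"/>
</w:p>"""

print(misspelled_words(SAMPLE))  # ['wantedd', 'beet']
```

Keep in mind this only reads what Word already stored in the file, so the markers are only there if spell checking had run before the document was saved.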
Otaku
Thanks! This is turning out to be more interesting than I thought. The time it takes to access SpellingErrors does increase with more words, and with more than just the number of words. The same five misspelled words pasted 2000 times get accessed at about 2 seconds per word in the loop. In my document, it's 20 to 33 seconds per word. It looks like that can't be worked around in VBA. That leaves Open XML, which makes me think we'll talk again. Thanks for the information.
ForEachLoop