I have a large collection of MS Word documents (approximately 40,000), all produced by mail merges (same main document, different data sources).
One of the merge fields is a text field whose value is either "Yes" or "No".
Is there an easy way to list which of the documents have that merge field set to the value "Yes"? (I'm expecting approximately 10,000 "Yes" documents.)
I'd be interested in any approach, whether using Word itself, Office Automation, hexdumping the binary files and grepping for certain magic, or any ready-made tools (Perl scripts, .NET apps, etc.) which can do this sort of thing.
The files are on a network share accessible from both Linux and Windows boxes (and I can probably steal a Mac for a little while if necessary), so I'm not too worried about which platform the tools run on...
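
To make the "grep the binary files" idea concrete, here is a rough Python sketch of what I have in mind. It rests on several assumptions that may not hold: that the files are binary .doc files (not zipped .docx), that the merged value sits next to some fixed label in the main document so that a phrase such as "Approved: Yes" (a made-up label, not my real one) is distinctive, and that Word stores that phrase contiguously as either 8-bit or UTF-16LE text. The function names are hypothetical too.

```python
import os

def contains_marker(data: bytes, marker: str) -> bool:
    """Return True if marker occurs in data as cp1252 or UTF-16LE text."""
    # Assumption: text inside a binary .doc is stored in one of these two
    # encodings and the marker is not split across separate text pieces.
    candidates = (marker.encode("cp1252"), marker.encode("utf-16-le"))
    return any(candidate in data for candidate in candidates)

def find_matching_docs(root: str, marker: str):
    """Yield paths of .doc files under root whose raw bytes contain marker."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.lower().endswith(".doc"):
                continue
            path = os.path.join(dirpath, name)
            # Read the whole file; fine for typical letter-sized documents.
            with open(path, "rb") as fh:
                if contains_marker(fh.read(), marker):
                    yield path

if __name__ == "__main__":
    # Hypothetical share path and label; substitute the real ones.
    for doc_path in find_matching_docs("/mnt/share/merged", "Approved: Yes"):
        print(doc_path)
```

If the "Yes" has no fixed text around it, or Word splits the phrase internally, a byte scan like this would miss documents or report false matches, which is partly why I'm asking whether there is a more reliable way.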