Our application's "documents" are single binary files.

Our customers have asked if we can add MS Office-like document properties to our document files so that they are easier for users to manage. By easier to manage, I mean the ability for Windows Explorer to display common document properties in tooltips.

My research suggests we should consider OLE Structured Storage as the basis for our data files. I've seen this technique variously described as MS Structured Storage, OLE 2 Compound Document Format, and Windows file metadata.

I have two concerns about using OLE Structured Storage. First, Office 2007 and 2010 no longer appear to use this file format. Second, OLE Structured Storage appears to require registering a DSOFILE.DLL ActiveX component, which many of our customers will not be able to do because they run our software on locked-down workstations where users do not have admin rights to install software. (Our application software is a pure XCOPY deployment.)

I would appreciate hearing ideas about what our options are.

Thank you, Malcolm

+3  A: 

I'm pretty sure your best answer is to use the OLE compound document.

Microsoft may have stopped using this, but that is because they have gone to an XML file format. Unless you are willing to convert from your current file format to XML, I do not think that the new standard for tags will be interesting for you.
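
To make the difference concrete, here is a minimal Python sketch of where the newer Office files keep their properties. It assumes the file is an Office Open XML package (a ZIP archive) and that the core properties part sits at the conventional path docProps/core.xml; strictly speaking the package relationships decide that location. The function names read_core_properties and build_sample_package are hypothetical, made up for this illustration.

```python
import io
import zipfile
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Namespaces used by the core properties part of an Office Open XML package.
NS = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Assumption: Office writes the part here; other producers may use another path.
CORE_PART = "docProps/core.xml"


def read_core_properties(source):
    """Return a dict of selected properties from a package (path or file object)."""
    with zipfile.ZipFile(source) as package:
        root = ET.fromstring(package.read(CORE_PART))
    wanted = (
        ("title", "dc:title"),
        ("author", "dc:creator"),
        ("description", "dc:description"),
    )
    props = {}
    for key, path in wanted:
        element = root.find(path, NS)
        if element is not None and element.text:
            props[key] = element.text
    return props


def build_sample_package(title, author):
    """Build a tiny in-memory package holding only a core properties part.

    This is not a valid Word document; it exists only to exercise the reader.
    """
    core_xml = (
        '<?xml version="1.0" encoding="UTF-8"?>'
        '<cp:coreProperties xmlns:cp="{cp}" xmlns:dc="{dc}">'
        "<dc:title>{title}</dc:title>"
        "<dc:creator>{author}</dc:creator>"
        "</cp:coreProperties>"
    ).format(title=escape(title), author=escape(author), **NS)
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        package.writestr(CORE_PART, core_xml)
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    print(read_core_properties(build_sample_package("Quarterly report", "Malcolm")))
```

The point of the sketch is only that the tags live in an XML part inside a ZIP container, so adopting that scheme would mean changing your container format, not just adding a field to your binary file.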

You could possibly make your application save two files, the XML one just for tags and the binary data one, but that just means pain for your users. The whole point of the OLE compound document format was to allow multiple "files" bound together in one file.

Also, I would be very surprised if modern Windows did not have support for OLE compound documents built right in. I'm pretty sure that as far back as Microsoft Word 6.0, over a decade ago, documents were saved in this OLE compound document format. Why would Windows XP or newer require an extra .DLL file to be able to parse the tags out?

The best thing about using the OLE compound document format is that the user tags will go with the file, no matter what: if the user writes the file to a file server, if the user drops the file in an email, if the user burns the file to a CD, whatever. (The first answer I wrote, which I deleted, was bad; even if it had worked it would have put the user tags outside the file, and the more I think about it, the less happy I am at that thought.)

So, I suggest that you try creating an OLE compound document, and then just look at the file in Windows Explorer in a standard install of Windows XP. See if you can see the tags without needing to download and install an ActiveX .DLL. I'm pretty sure it will work. (But I don't really do Windows much anymore so I cannot conveniently test this for you.)

EDIT: Okay, I just did a test. I'm at work and I have a Windows computer here. I used Word 2007 to make a document, and I saved the document as Word 97 format. I looked at the document properties in Windows Explorer; the author name was visible in the tags. I added text to "comments" and then opened the file in Word 2007. I was then able to view the comments (click on the "office" icon circle in the upper left, choose "Prepare", choose "Properties").

So, my theory has some evidence to support it: I did not have to install any special software, my Windows Explorer just worked with the OLE compound document format Word file with the tags. (It could be that Microsoft Office installs some special .DLL to use the tags with Windows Explorer; I do have Microsoft Office 2007 installed on that computer. But your customers likely have Microsoft Office too, so even if that is the case, I still think this is the best solution.)

I suggest you Google search for "OLE compound document format" and see how to write this format. I found an example of how to read the tags here: http://support.microsoft.com/kb/186898
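
If you want to confirm that a test file really is an OLE compound document before you look at it in Windows Explorer, a quick check of the file header should be enough. This is a minimal Python sketch, not a parser: it only compares the first eight bytes with the compound file signature, and the name is_compound_file is hypothetical.

```python
# The first eight bytes of an OLE compound file (structured storage) header.
OLE_SIGNATURE = bytes.fromhex("D0CF11E0A1B11AE1")


def is_compound_file(path):
    """Return True if the file at `path` starts with the compound file signature."""
    with open(path, "rb") as handle:
        return handle.read(len(OLE_SIGNATURE)) == OLE_SIGNATURE
```

A document saved in Word 97 format should pass this check, while one saved in the newer XML-based format should not, because that one is a ZIP archive.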

steveha
Steve: You mention a "new standard for tags" based on an XML file format. We can convert our files to an XML format if this would allow Windows to automatically extract the content of certain elements and display them in a tooltip. Is this possible? (I googled this approach without success.) Or are you suggesting that we write our own shell extension to extract metadata from our files for tooltip display? Thank you for your help!
Malcolm
What I meant is, Microsoft Office now uses an XML-based file format, rather than using the OLE compound document format. If you want to do tags the same way the new Microsoft Office does tags, you would need to copy what they are doing. But I'm not recommending that. I think that the OLE compound document is just what you need. It lets you have multiple files bound together as one file, so you can have your current binary format in one internal file, and the special tags as another file, together in the compound document. That is what I recommend you look into.
steveha
I believe that Microsoft probably wrote a shell extension for Windows Explorer, and this new shell extension tells Windows Explorer how to understand and update the tags in an XML Microsoft Office file. So, if you try to do the XML file format, you might need to install the shell extension as well. On the other hand, the OLE compound document format is well over a decade old, and I'll bet that Windows Explorer has had support for the file tags built in since at least Windows XP (and perhaps even in older versions of Windows).
steveha
I tested my theories and they seem to work. See the "EDIT:" section I added in the answer.
steveha
Thank you Steve.
Malcolm
A: 

On Windows 2000 or later, instead of using an OLE compound document, you can also store the summary information in the NTFS file metadata so that applications such as Windows Explorer or Windows Desktop Search can use the attributes in property pages, tooltips, columns, and searches.

Sheng Jiang 蒋晟
Thank you Sheng Jiang.
Malcolm