I'm no programmer but I'm sure this community can help.

I have thousands of VSD files on a LAN share that I want to build a simple searchable index for. I want the contents of each VSD file in clear text so that a Windows or Unix shell script can manipulate and search the clear-text output.

Can any of you help?

A: 

Microsoft provides a nice interface for manipulating the content of Visio documents. It is possible to create a customized tool that goes through every VSD file in your share, extracts the information that interests you, and saves that information in whatever textual format you desire.

Start by defining what information interests you in those VSD files.

M.A. Hanin
A: 

There are several options you can explore:

#1 Use the Built-in Searching Capabilities in Windows

This requires an IFilter that can index the Visio format for you. The link below is an IFilter provided by Microsoft.

Visio IFilter 2003 Add-in: Text Search in Visio Files

Comments:
* Requires no coding
* Should integrate well with the desktop search feature (I have not verified this)
* The search features are driven by the IFilter implementation, so it may not index what you want

DISCLAIMER: I have never installed the IFilter so I cannot comment on how well it works.

#2 Getting the clear text using the Visio object model (as answered by M.A. Hanin)

If all you need is the plain text of shapes, this is very straightforward. If you need text from things like custom properties, it will be a little more complex. If you go down this path, I built a library that makes the Visio 2007 object model easier to use - look for a project called VisioAutomation on Codeplex.com

Comments:
* Requires coding and knowledge of the Visio object model (it will not be too complicated)
* If you really have thousands of files, this may take a while
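As a rough illustration of option #2, here is a minimal Python sketch that walks a document's pages and shapes and collects each shape's `Text`. It assumes Windows with Visio installed and the third-party pywin32 package for COM automation. The function names are made up for this example, and the code assumes that a non-group shape exposes an empty `Shapes` collection. It has not been run against a real share, so treat it as a starting point.

```python
def shape_texts(shapes):
    """Yield the non-empty text of each shape, descending into group shapes."""
    for shape in shapes:
        text = (shape.Text or "").strip()
        if text:
            yield text
        # Assumption: group shapes expose their members through their own
        # Shapes collection, and non-group shapes expose an empty one.
        yield from shape_texts(shape.Shapes)


def document_texts(doc):
    """Yield the text of every shape on every page of an open document."""
    for page in doc.Pages:
        yield from shape_texts(page.Shapes)


def extract_with_visio(path):
    """Open one VSD file through Visio's COM interface and return its shape text."""
    # Imported here so the helpers above can be used without pywin32.
    import win32com.client  # assumption: pywin32 is installed

    visio = win32com.client.Dispatch("Visio.Application")
    visio.Visible = False
    doc = visio.Documents.Open(path)
    try:
        return list(document_texts(doc))
    finally:
        doc.Close()
        # Caution: this also closes a Visio instance that was already running.
        visio.Quit()
```

For thousands of files it would probably be faster to start Visio once and loop over the files, rather than calling `extract_with_visio` (which starts and quits Visio) for each one.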

#3 Getting the clear text using VDX Files

This technique means keeping (or converting) the Visio files as VDX files, which is an XML format. You can easily get the plain text from the XML.

Comments:
* Requires coding and very little knowledge of the Visio object model (to perform the export) - mostly the work will involve XML coding
* If you really have thousands of files, generating VDX files can take a while

I have experience working with the VDX format directly - it is very easy to write code to process it.
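To give an idea of how little code option #3 needs, here is a minimal Python sketch using only the standard library. It assumes shape text is stored in `<Text>` elements inside the VDX file and matches them by local name, so it does not depend on the exact XML namespace. The function names and the choice to write a `.txt` file next to each `.vdx` are illustrative assumptions.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def local_name(tag):
    """Strip any '{namespace}' prefix from an element tag."""
    return tag.rsplit("}", 1)[-1]


def vdx_text(xml_source):
    """Return the text of every <Text> element, one string per element."""
    root = ET.fromstring(xml_source)
    texts = []
    for elem in root.iter():
        if local_name(elem.tag) == "Text":
            # itertext() also picks up text that follows formatting
            # child elements inside <Text>.
            text = "".join(elem.itertext()).strip()
            if text:
                texts.append(text)
    return texts


def dump_folder(folder):
    """Write a .txt file next to each .vdx file found under the folder."""
    for vdx in Path(folder).rglob("*.vdx"):
        lines = vdx_text(vdx.read_bytes())
        vdx.with_suffix(".txt").write_text("\n".join(lines), encoding="utf-8")
```

Once the `.txt` files exist, ordinary tools such as `findstr` on Windows or `grep` on Unix can search them.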

A: 

Thanks for your help, guys. I've gone with a solution using Server 2003 with the Indexing Service and the Visio IFilter installed. The index runs against a UNC share, and I have pinched a guide from a website to build an IIS front end for the indexing query engine. It seems to work quite well on a test cell, but I'm yet to set it up against the main repository.

Your suggestions will of course give me ideas for further reading.

Many thanks to all!

Jon