What is the fastest way for a .NET program to index all images on a user's computer?

Using open-source .NET libraries and native .NET classes.

It should work on everything from Windows XP SP3 through Windows 7.

  • Fastest - lowest elapsed time (in seconds) on the same PC configuration

  • .NET program - preferably C#

  • To index - get a list of absolute paths (like C:\bla-bla\file) and save them into a file (index.txt)

  • All images (like JPEGs, PNGs, BMPs)

+2  A: 

Can you use Lucene .NET? http://lucene.apache.org/lucene.net/

Kris Krause
I don't actually understand what a full-text search engine has to do with generating an index of image files on a PC? Especially given that it was the selected answer, there must be more to Lucene than I'm getting?
hemp
+6  A: 

You will have to scan with Directory.GetFiles(), recursively.

You can optimize by using as many threads as you have disks (not partitions).

.NET 4 has a new 'streaming' version of GetFiles(), Directory.EnumerateFiles(), that can help to keep memory use down.

But it will take a looong time on a modern computer. Doing it at full steam will certainly hinder normal use of the PC, so you might want to go slower than you can.
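A minimal sketch of such a recursive scan, assuming images are identified by extension only. The names ImageIndexer, FindImages and WriteIndex are made up for illustration. It walks one directory at a time instead of calling Directory.EnumerateFiles() with SearchOption.AllDirectories, because that call aborts the whole enumeration at the first folder it is not allowed to read. Results are still yielded lazily, so only one directory listing is held in memory at a time.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class ImageIndexer
{
    // Extensions treated as images; an assumption, extend as needed.
    private static readonly HashSet<string> ImageExtensions =
        new HashSet<string>(StringComparer.OrdinalIgnoreCase)
        { ".jpg", ".jpeg", ".png", ".bmp" };

    // Lazily yields the absolute path of every image file under root.
    public static IEnumerable<string> FindImages(string root)
    {
        var pending = new Stack<string>();
        pending.Push(Path.GetFullPath(root));

        while (pending.Count > 0)
        {
            string dir = pending.Pop();
            string[] files;
            string[] subDirs;
            try
            {
                files = Directory.GetFiles(dir);
                subDirs = Directory.GetDirectories(dir);
            }
            // Skip folders we may not read or that vanished mid-scan.
            catch (UnauthorizedAccessException) { continue; }
            catch (IOException) { continue; }

            foreach (string sub in subDirs)
                pending.Push(sub);

            foreach (string file in files)
                if (ImageExtensions.Contains(Path.GetExtension(file)))
                    yield return file;
        }
    }

    // Writes one absolute path per line to indexPath.
    public static void WriteIndex(string root, string indexPath)
    {
        File.WriteAllLines(indexPath, FindImages(root));
    }
}
```

Calling something like ImageIndexer.WriteIndex(@"C:\", "index.txt") would produce the index file from the question. The sketch does not treat junctions or symbolic links specially and does nothing to throttle itself, so both of those are left open.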

Henk Holterman
+3  A: 

You iterate over all the files; your real challenge is deciding what is an image and what is not. This could be as simple as examining the file extension, or as expensive as opening the file and examining its contents, e.g. the magic bytes at the start of the file.
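To make the expensive variant concrete, here is a rough sketch that reads the first few bytes of a file and compares them against the well-known JPEG, PNG and BMP signatures. ImageSniffer and LooksLikeImage are hypothetical names, and the check only says "this starts like an image", not that the file actually decodes.

```csharp
using System;
using System.IO;

public static class ImageSniffer
{
    // Leading bytes of the three formats named in the question.
    private static readonly byte[][] Signatures =
    {
        new byte[] { 0xFF, 0xD8, 0xFF },                               // JPEG
        new byte[] { 0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A }, // PNG
        new byte[] { 0x42, 0x4D }                                      // BMP ("BM")
    };

    // Returns true if the file starts with one of the signatures above.
    public static bool LooksLikeImage(string path)
    {
        var header = new byte[8];
        int read = 0;
        using (FileStream stream = File.OpenRead(path))
        {
            // Read may return fewer bytes than requested, so loop.
            int n;
            while (read < header.Length &&
                   (n = stream.Read(header, read, header.Length - read)) > 0)
            {
                read += n;
            }
        }

        foreach (byte[] signature in Signatures)
            if (StartsWith(header, read, signature))
                return true;
        return false;
    }

    private static bool StartsWith(byte[] data, int length, byte[] prefix)
    {
        if (length < prefix.Length) return false;
        for (int i = 0; i < prefix.Length; i++)
            if (data[i] != prefix[i]) return false;
        return true;
    }
}
```

The two-byte BMP signature is weak (any file beginning with the letters "BM" matches), so a sensible compromise might be to filter by extension first and only sniff the files that pass.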

Finally, you'll want to 'watch' for changes and keep your set up to date.
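For the watching part, a small sketch built on FileSystemWatcher. ImageWatcher, Watch and the two callbacks are illustrative names, not anything from the framework. It reports every changed path and leaves the "is this an image?" decision to the caller.

```csharp
using System;
using System.IO;

public static class ImageWatcher
{
    // Starts watching root (and its subfolders) and reports changed paths.
    // Dispose the returned watcher to stop.
    public static FileSystemWatcher Watch(
        string root, Action<string> onChange, Action<Exception> onError)
    {
        var watcher = new FileSystemWatcher(root)
        {
            IncludeSubdirectories = true,
            NotifyFilter = NotifyFilters.FileName | NotifyFilters.LastWrite
        };

        watcher.Created += (sender, e) => onChange(e.FullPath);
        watcher.Changed += (sender, e) => onChange(e.FullPath);
        watcher.Deleted += (sender, e) => onChange(e.FullPath);
        watcher.Renamed += (sender, e) =>
        {
            onChange(e.OldFullPath); // old path leaves the index
            onChange(e.FullPath);    // new path may enter it
        };
        // Raised, for example, when the internal buffer overflows and
        // events were lost; a full rescan is the safe reaction.
        watcher.Error += (sender, e) => onError(e.GetException());

        watcher.EnableRaisingEvents = true;
        return watcher;
    }
}
```

Events arrive on a thread-pool thread, and a single save can raise several Changed events, so the callback should be cheap and idempotent, for instance just queueing the path for later processing.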

This is going to be IO-bound, dominated by whether you decide on images by extension or whether you examine their contents.

Will
+1 for FileSystemWatcher.
Moron
+4  A: 

The fastest way to index all the images would be to use an existing index. For instance, the Windows Search Service offers programmatic access to its index through multiple interfaces.

The easiest way to get access to those interfaces from .NET would likely be the Windows API Code Pack.

hemp
Agreed. Indexing is not trivial, hence the PhDs that Google, Yahoo, Microsoft and Apple employ to work on indexing algorithms. It's quite likely a trivial `Directory.GetFiles` will be orders of magnitude slower.
Igor Zevaka