I would like to store some Application-Related Metadata for Files, and NTFS Alternate Data Streams (AltDS) would allow me to store this metadata directly on the files rather than in a separate database.

I just don't feel like this is a good idea. I know that this only works on NTFS, but at least if the user copies/moves the files to a non-NTFS drive they get a warning from Windows (yeah, yeah, no one reads warnings, I know).

But also, storing additional data on a file can become very wasteful, as the AltDS stay even if my Application is uninstalled. It's like a decade ago when people used "Registry Cleaners" to remove useless entries from the registry after uninstalling a program to make their system run faster (and less stable when the cleaner cleaned too much...).

I just wonder what they can be reasonably used for? Should they be completely left for Microsoft Apps to use? Or is there some sort of common policy what types of apps may use them (apart from malware)?

Making this CW as it's likely subjective, but I believe it's valid as both Programming Related and not completely S/A.

Edit: Just to clarify what my idea was. I'm in the early stages of writing a small document management system for myself. Because I want to have the freedom to move files around, I want to store metadata on the file so that if I move/rename/modify them, my app still recognizes them. It could either be the entire Metadata or just a GUID that works with a separate database.
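For illustration, the GUID variant could look roughly like the sketch below. It relies on the `file:stream` path syntax that NTFS uses to address a named stream; the stream name `myapp.docid` and the helper names are made up for this example. On a non-NTFS Windows volume I'd expect the `open` call to fail with an `OSError`, and on other operating systems the colon path is just an ordinary separate file, so treat this as a sketch rather than a portable solution.

```python
import uuid

STREAM_NAME = "myapp.docid"  # hypothetical stream name, not any standard


def stream_path(file_path, stream=STREAM_NAME):
    """Build the 'file:stream' path used to address a named stream on NTFS."""
    return f"{file_path}:{stream}"


def tag_file(file_path):
    """Return the GUID stored with the file, creating one if none exists yet."""
    target = stream_path(file_path)
    try:
        # Reuse an existing ID so the file keeps its identity across runs.
        with open(target, "r", encoding="ascii") as fh:
            return fh.read().strip()
    except FileNotFoundError:
        # No stream yet: generate a fresh GUID and attach it.
        doc_id = str(uuid.uuid4())
        with open(target, "w", encoding="ascii") as fh:
            fh.write(doc_id)
        return doc_id
```

Calling `tag_file` twice on the same file should return the same ID, and the ID survives a rename or move within NTFS volumes because the stream travels with the file.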

To summarize the points given:

Pros:

  • Metadata moves with the file, so no need to recognize it through hashing or filename
  • Works with all FileTypes, even .txt files where it's impossible to store any data in the file itself

Cons:

  • Only works on NTFS which may not be the default file system in future Windows Versions
    • Although it would surprise me if MS doesn't automatically convert them if they ever get WinFS together
  • AltDS remain even if my App is uninstalled
  • Privacy concerns
  • Fragile
    • Most USB Sticks are FAT32. Many private file servers are Linux. Downloading a file from the internet should only transfer the file but not the streams. In short: It's rather easy to lose them.
A: 

If your app can function without that data, for example recreating it as necessary, the data streams are perfectly acceptable.

Given how they are used in windows, I don't think they are going away anytime soon.

Chris Lively
+1  A: 

Bad idea for you, bad idea for MS. I think they were really an attempt to compete with the Mac's data and resource fork file architecture back in the day. If the Mac FS files can have 2 forks, then ours will have unlimited "forks", and maybe we'll eventually figure out how to use them.

GregS
I was thinking about those. As I understand it, Mac OS uses the second fork for metadata like the Application and Type descriptors for files (as it doesn't rely on file extensions), which allows us to say "I want to open this .txt file with that Editor, but this other .txt file with another editor". That works because it's neatly integrated with Finder, whereas Windows Explorer only seems to use AltDS for the annoying "Can't execute this file as it's from the Internet" message.
Michael Stum
+3  A: 

It's hard to say without more information about the kind of data you're storing. You seem to be aware of some of the concerns involving their use, so I'm not sure how much I can help. Here are my general thoughts on alternate data streams, though:

First of all, as you've noted, AD streams only work on NTFS. If there's any chance you'll need to store this metadata on a FAT filesystem, you'll need some kind of fallback mechanism. Modern PCs will probably have NTFS-formatted internal hard drives, but most USB flash drives you encounter are still FAT-formatted. Keep that in mind if your users will be storing data files on flash drives.
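A fallback along those lines might be sketched as follows: try the stream first, and if the filesystem refuses it, record the ID in a separate index file instead. The function name, the stream name `myapp.docid`, and the JSON index format are all assumptions made for this example. Note that it only detects failure through an `OSError`; on non-Windows systems the colon path succeeds as a plain file, so a real application would need a better check there.

```python
import json
import os

STREAM_NAME = "myapp.docid"  # hypothetical stream name


def store_doc_id(file_path, doc_id, index_path):
    """Store doc_id in an alternate stream, or in a JSON index if that fails.

    Returns "stream" or "index" so the caller knows where the ID ended up.
    """
    try:
        with open(f"{file_path}:{STREAM_NAME}", "w", encoding="ascii") as fh:
            fh.write(doc_id)
        return "stream"
    except OSError:
        # The stream could not be written (for example, the volume rejects
        # the 'file:stream' name), so keep the ID in a side index instead.
        index = {}
        if os.path.exists(index_path):
            with open(index_path, "r", encoding="utf-8") as fh:
                index = json.load(fh)
        index[os.path.abspath(file_path)] = doc_id
        with open(index_path, "w", encoding="utf-8") as fh:
            json.dump(index, fh, indent=2)
        return "index"
```

The index is keyed by absolute path, which of course brings back the "file was moved or renamed" problem for anything stored on the fallback path.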

Aside from that, I can't think of any technological reasons to avoid AD streams, but I'd still be wary of using them. People tend to be nervous about applications that "hide" data from them, regardless of the intent. Consider the Sony rootkit fiasco, and so on. I'm not saying your application is anywhere near as bad as that, but people (especially the less tech-savvy) may not make out the distinction. Still, I will allow that they might have a valid use for your application. The problem of leaving the AD streams behind after uninstallation is still very real, of course. You might want to consider giving people running the uninstaller the option of running a program to search their drive(s) and clean up any remaining streams.
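Such a cleanup tool could be as simple as the sketch below, which walks a directory tree and deletes one known stream from every file it finds. It assumes the application only ever used a single stream name (the hypothetical `myapp.docid`) and that deleting the `file:stream` path removes just the stream and leaves the file itself alone, which is how I understand NTFS to behave; test it on scratch data first.

```python
import os

STREAM_NAME = "myapp.docid"  # hypothetical stream name used by the app


def remove_streams(root, stream=STREAM_NAME):
    """Delete the named stream from every file under root.

    Returns the number of streams that were removed.
    """
    removed = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            target = f"{os.path.join(dirpath, name)}:{stream}"
            try:
                os.remove(target)
                removed += 1
            except OSError:
                # No such stream, unsupported filesystem, or access denied:
                # leave the file untouched and move on.
                continue
    return removed
```

An uninstaller could call `remove_streams` on the folders the user picked and report the returned count.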

Also, remember the KISS principle. Is the use of AD streams really the simplest way to effectively solve your application's metadata storage problem? If so, maybe AD streams are a good idea, but, if not, I'd seriously consider taking another approach.

bcat
Thanks. The data is Metadata for a document management system. It's mainly something for myself and in early planning, but I came across AltDS and thought I'd gather some opinions. The "hiding data" part is actually a good point. Only very few users are actually trying to protect themselves from Malware (pretty much everyone is willing to install crap if it has a cute mascot), but most people are quick to chime in once someone calls out an App as being Malware, even if it's unfounded.
Michael Stum
+2  A: 

I can think of one good reason not to use them, and that's this little tidbit from their "how to use" guide:

Alternate data streams are strictly a feature of the NTFS file system and may not be supported in future file systems. However, NTFS will be supported in future versions of Windows NT.

Now... the way this is worded, I guess, technically you're safe. But if Microsoft ever decides to supersede/deprecate NTFS - and they did come pretty close at one point - then you're going to have to scramble to upgrade your software so it runs on newer machines.

As unlikely as that possibility may seem now, I think it's less unlikely than suddenly finding yourself unable to wire up a SQLCE database or XML file stored in the user's AppData.

Having said that, I'm sure that there are some scenarios that justify the use of ADS. In my opinion it's just one of those cases where, if you aren't absolutely sure that it's the right tool, then it's probably the wrong one.

Attaching metadata to files in general is a dangerous game. Just look at the unholy mess that is ID3 and the embarrassing results of people leaving the EXIF data in images.

P.S. Registry cleaners aren't used anymore? Why didn't anybody tell me!?

Aaronaught
Good point. Yes, WinFS was supposed to supersede NTFS, but as I understood it, WinFS still supports Metadata (as it's essentially an OS-embedded SQL Server). As for ID3 and EXIF: They are stored in the file itself, but it's a good point. I haven't checked what happens if I download a file that is served from an NTFS Server through IIS to an NTFS File system. I would hope that it kills the AltDS, but haven't checked. My usage would be similar to EXIF, it's for a document management system where I want to have the Data (just a GUID actually) move when the file is moved.
Michael Stum
+1  A: 

Adding an AltDS to a file as a way to tie an application-specific string around it has the problem you cite: no cleanup. And if the file gets copied, your stuff follows it around. For this case, keeping a separate database is probably more virtuous.

If the file, on the other hand, is very much under your own control, then if AltDS is an efficient way to do the job, go ahead.

bmargulies
+3  A: 

Another sticking point: backup software. Some programs ignore alternate data streams, some back them up but don't restore them, and some support them but don't do anything with them unless you tell them to.

Goyuix
A: 

Alternate data streams are essential to NTFS and will always be supported. When the file they are attached to gets deleted, they get deleted as well, so no worries about them "sticking around".

As all the others have said, there are issues with backup, copy to other filesystem and paranoia regarding ADS.

Dominik Weber