views:

78

answers:

4

Hi all,

I've run into a really weird problem while working on a large project. I write a number of same-size files to a partition (I've tried both RAM disks and virtual disks created via diskmgmt.msc). When there is not enough free space to fit another file (as reported by GetDiskFreeSpaceExW), I delete one (only one) of the previously created files and write the new one. Then I delete another old file and write a new one, and so on ad infinitum (so you may think of the partition as a ring buffer of equally sized files). After a series of writes and deletes (from a few hundred to a few thousand), I run into a "no free space" error while writing a new file (just before which GetDiskFreeSpaceExW reports enough space). I asked a few colleagues of mine to try to reproduce the problem on their hardware, but the problem did not resurface.

To clarify things a bit, here's the exact algorithm:

  1. Choose file size (say, S bytes)
  2. Check free space with GetDiskFreeSpaceExW
  3. If free_space > S: write new file of size S and goto 2
  4. Else: Delete one file and goto 2

It is important to note that I write data to the files in blocks of 4096 bytes (the problem may or may not resurface depending on the block size). The file size is 5 MB, the NTFS partition size is 21 MiB, and the cluster size is 512 B (again, changing these parameters affects the results). With these parameters, the failure occurs during creation of the 684th file. It doesn't depend on whether I use a RAM disk or a virtual disk (hence, it is not a problem with a particular implementation).

I analyzed the resulting disk image dumps after the failure and found that the files were heavily fragmented. Chkdsk reports no problems either before or after the experiment, and no errors were found in the system logs.

Possibly relevant parameters of my netbook (Dell Inspiron 1110):

  • Pentium SU4100, Relatively slow dual-core x64 CULV CPU (1.3 GHz)
  • Windows 7 Ultimate x64 edition
  • 2 GB RAM

Does anyone have any idea what's going on and how to debug it? Where can I look for additional info? I'm out of ideas, and I need to solve this issue as soon as possible...

UPD: the problem occurs when I'm writing file data (i.e. write() fails), not when I create the file. So it doesn't look like I'm running out of MFT entries.

UPD2: answering a few of the questions that were asked

  • The partition is a freshly formatted one, hence, no specific attributes on files, no directory structure, nothing
  • Permissions are default
  • No .lnk's, no hardlinks - _only_ the files I write
  • All files are written to the root dir, no more directories are created
  • Filenames are simply the ordinal numbers of files (i.e. 1, 2, 3, ...)
  • No alternate data streams, files are created using `fopen()`, written to with `fwrite()` and closed with `fclose()`
  • $Txf gets created, indeed
  • No bad clusters, this is a virtual (or a RAM) disk
+1  A: 

The FS has its own overhead which you don't account for. That overhead is not constant, so by deleting and writing files you may be causing fragmentation. In other words, "5 MB of free space" doesn't imply you can write 5 MB to the disk.

tenfour
Sure, but the number of files is really small. It just cannot be a problem of sudden MFT expansion. After all, why don't file-IO-intensive servers fail with the same problem?
Roman D
A: 

Assuming your implementation is correct, and given that your colleagues were not able to reproduce the problem, it might be that your MFT is running out of space.

By default, Windows XP reserves 12.5 percent of each NTFS volume (an area called the MFT zone) for exclusive use of the MFT. So if you plan to store tons of small files (under 8K, say) on your volume, your MFT may run out of space before your volume's free space does, and the result will be MFT fragmentation.

From Technet

First, the MFT doesn't shrink even when you delete files and directories from the volume; instead, the MFT marks the FRSs to reflect the deletion. Second, NTFS stores very small files within the MFT FRSs that refer to the files. Although this setup provides a performance benefit for these files, it can cause the MFT to grow excessively when the volume contains many such files.

Lieven
There's a constant number of files. The partition is 21 MiB and the file size is 5 MB, so only a couple of files can be present on the partition simultaneously. Hence, it is not a case of "tons of small files" :)
Roman D
If memory serves me well, the MFT **does not** get *cleaned up* after every delete.
Lieven
You might want to read http://www.dslreports.com/forum/r19572516-Removing-names-of-deleted-files-from-MFT and http://stackoverflow.com/questions/764304/master-file-table-cleanup-utility for further information on this.
Lieven
Hm... Good point. Have you got any links to official info on that? Is there any way to work around it? I actually considered this, but Windows initially allocates and marks up an MFT of 256 KiB, which amounts to 64 MFT entries of 4 KiB each. Isn't that enough for writing and deleting 2-3 files?
Roman D
Oh, and I also see in the post-mortem disk dump that there are plenty of untouched MFT entries in the table (I mean, ones that were never taken in the first place).
Roman D
The best information I've read about this is http://technet.microsoft.com/en-us/library/cc767961.aspx but I'm doubtful myself that this is the actual cause, given that you say there's enough space left in the MFT.
Lieven
Thank you! Maybe it will provide me with some insight.
Roman D
Whoops... I forgot to mention one more important fact. The problem occurs when I'm writing file data (i.e. write() fails), not when I create the file. So it doesn't look like I'm running out of MFT entries.
Roman D
Sorry, this was the best I could come up with. I'd suggest posting your question on ServerFault.com.
Lieven
I'm not sure this question is suitable for SF either :( I need some deep insight from an engineering point of view; I can't settle for recommendations like "Screw it, just don't write such files on such partitions". This problem is just the tip of the iceberg I'm dealing with. I guess I'll have to ask at the Microsoft forums as well...
Roman D
+1  A: 

Good NTFS question, and not all the information is here. What is the directory structure? Are there any .lnk files on the volume? Are you using compression on the drive? Volume shadow copies?

You do not run out of MFT space, since there is a constant number of files/directories. That means the MFT is static. Also, the MFT reserve space will be used in low-disk-space scenarios; I've used up every cluster on an NTFS volume.

There are several explanations as to what is happening:

1) The $LogFile might have grown. This is a roll-back (journaling) log.

2) The $SII index (security info for files) might have grown if there are non-uniform permissions on the drive.

3) If some of the files on that volume have .lnk/shortcut files pointing to them, the system puts a GUID for each target into an index. (You get .lnk files if you double-click a file in Explorer; they end up in Recent Documents!)

4) If the directory structure is NOT static (or the file names are not uniform in length), the index buffers of directories might grow in size.

5) If you have a System Volume Information directory on that drive, you might have volume shadow copies and other OS-specific data.

6) Alternate data streams are not shown in a file's size. Are there any?

7) TxF: under Vista and higher there might be a transactional layer that takes up variable space.

8) Bad clusters? Clusters can go bad (though chkdsk should note this).

9) The files become fragmented, and the list of fragments, along with the other metadata, is too big to fit into an MFT record (unlikely, since your files are small and you don't have massively long file names).

10) The use of hard links also puts more data on the drive.

I've listed all of these as a reference for other people!

Final note: sometimes you can create and write a small file even when there are 0 bytes free, since NTFS-resident files only take up an MFT record (they reuse a deleted, free one).

Dominik Weber
Thank you for the extensive reply! I tried to mention all of these points in the question.
Roman D
@RomanD - one way to find out why is to look at the volume with forensic software and see exactly what is happening.
Dominik Weber
I already did a lot of examination. Yesterday I tried disabling the TxF log, and the problem disappeared! So I guess, case solved? I'll accept this answer, since it lists a whole lot of possible pitfalls, including the one I ran into.
Roman D
Thanks for the answer. BTW, how did you disable the TxF log? TxF is my least familiar area of NTFS.
Dominik Weber
You can use fsutil to turn it off. On the command line: `fsutil resource stop X:\`. It doesn't turn it off permanently, though; TxF will return after a remount of the volume :)
Roman D
Thank you Roman - eventually I'm going to look in detail at the inner structures of TxF (when I get time for that)
Dominik Weber
I've found no info on that. Well, frankly, I didn't search too much :)
Roman D
A: 

While it's not terribly surprising that disabling TxF fixed this (TxF is just one component of the file system using space on the volume), there are two things to consider. The first, more of an aside really, is that you might want to be careful about disabling it: other components may depend upon it. (Windows Update is one such component, though it ought to care only about the system volume.)

The other thing to consider is that this is generally and practically a fragile pattern. The file system has some assumptions about what it can consume. Indexes, for example, will grow (in predefined increments, subject to change), and they may not shrink in ways that you find predictable. Additionally, the security descriptor index is likely to continue to grow.

The other note above about the shadow copies is something to always keep in mind.

FWIW, the $LogFile won't grow automatically.

jrtipton
The partition where I turn the TxF log off is a temporary, short-lived virtual disk, and I think nothing should go terribly wrong there :) That's also why I'm not really concerned about security indices. You're right about the $LogFile; it doesn't cause any trouble. But it looks like the TxF log may (and does) grow and shrink on its own.
Roman D
Of course, I wouldn't advise turning TxF off in most other cases.
Roman D