I'd like to create a sparse file such that all-zero blocks don't take up actual disk space until I write data to them. Is it possible?
As in other Unixes, it's a feature of the filesystem. Either the filesystem supports it for ALL files or it doesn't. Unlike Win32, you don't have to do anything special to make it happen. Also unlike Win32, there is no performance penalty for using a sparse file.
On MacOS, the default filesystem is HFS+ which does not support sparse files. You can optionally format a volume with UFS which does support sparse files. HFS+ is the default filesystem on MacOS because it supports the archaic "resource fork" stuff which a few programs still use.
hdiutil can handle sparse images and files but unfortunately the framework it links against is private.
You could try defining external symbols as defined by the DiskImages framework below but this is most likely not acceptable for production code, plus since the framework is private you'd have to reverse engineer its use cases.
cristi:~ diciu$ otool -L /usr/bin/hdiutil
/usr/bin/hdiutil: /System/Library/PrivateFrameworks/DiskImages.framework/Versions/A/DiskImages (compatibility version 1.0.8, current version 194.0.0) [..]
cristi:~ diciu$ nm /System/Library/PrivateFrameworks/DiskImages.framework/Versions/A/DiskImages | awk -F' ' '{print $3}' | c++filt | grep -i sparse
[..]
CSparseFile::sector2Band(long long)
CSparseFile::addIndexNode()
CSparseFile::readIndexNode(long long, SparseFileIndexNode*)
CSparseFile::readHeaderNode(CBackingStore*, SparseFileHeaderNode*, unsigned long)
[... cut for brevity]
Later Edit
You could use hdiutil as an external process and have it create an sparse disk image for you. From the C process you would then create a file in the (mounted) sparse disk image.
If you want portability, the last resort is to write your own access function so that you manage an index and a set of blocks.
In essence you manage a single file as the OS manages the disk keeping the chain of the blocks that are part of the file, the bitmap of allocated/free blocks etc.
Of course this will lead to a non optimized and slower access, I would reccomend this apprach only if the requirement to save space is absolutely critical and you have enough time to write a robust set of access functions.
And even in that case, I would first investigate if your problem is in need of a different solution. Probably you should store your data differently?
If you seek (fseek, ftruncate, ...) to past the end, the file size will be increased without allocating blocks until you write to the holes. But there's no way to create a magic file that automatically converts blocks of zeroes to holes. You have to do it yourself.
This may be helpful to look at (the OpenBSD cp command inserts holes instead of writing zeroes). patch
There seems to be some confusion as to whether the default Mac OS X filesystem (HFS+) supports holes in files. The following program demonstrates that this is not the case.
#include <stdio.h>
#include <string.h>
#include <fcntl.h>
#include <unistd.h>
void create_file_with_hole(void)
{
int fd = open("file.hole", O_WRONLY|O_TRUNC|O_CREAT, 0600);
write(fd, "Hello", 5);
lseek(fd, 99988, SEEK_CUR); // Make a hole
write(fd, "Goodbye", 7);
close(fd);
}
void create_file_without_hole(void)
{
int fd = open("file.nohole", O_WRONLY|O_TRUNC|O_CREAT, 0600);
write(fd, "Hello", 5);
char buf[99988];
memset(buf, 'a', 99988);
write(fd, buf, 99988); // Write lots of bytes
write(fd, "Goodbye", 7);
close(fd);
}
int main()
{
create_file_with_hole();
create_file_without_hole();
return 0;
}
The program creates two files, each 100,000 bytes in length, one of which has a hole of 99,988 bytes.
On Mac OS X 10.5 on an HFS+ partition, both files take up the same number of disk blocks (200):
$ ls -ls
total 400
200 -rw------- 1 user staff 100000 Oct 10 13:48 file.hole
200 -rw------- 1 user staff 100000 Oct 10 13:48 file.nohole
Whereas on CentOS 5, the file without holes consumes 88 more disk blocks than the other:
$ ls -ls
total 136
24 -rw------- 1 user nobody 100000 Oct 10 13:46 file.hole
112 -rw------- 1 user nobody 100000 Oct 10 13:46 file.nohole
Hello, This thread becomes a comprehensive source of info about the sparse files. Here is the missing part for Win32:
Tool that estimates if it makes sense to make file as sparse
Regards