This may be a stupid question, but Google and the MATLAB documentation have failed me. I have a rather large binary file (>10 GB), and I need to delete the last forty million bytes or so from it. Is there a way to do this without reading the entire file into memory in chunks and writing it out to a new file? It took 6 hours to generate the file, so I'm cringing at the thought of re-reading the whole thing.

EDIT:

The file is 14,440,000,000 bytes in size. I need to chop it to 14,400,000,000.

A: 

I don't know if MATLAB supports this, but see ftruncate() and truncate().

KennyTM
+2  A: 

Since you don't want to read the file into MATLAB (understandably), you are dealing with system-level commands. MATLAB can call system commands using the "system" function:

system

So now your problem is reduced to finding the shell command in your OS that will do it for you. Alternatively, you can write a small program using truncate() (Unix; see KennyTM's answer) or SetEndOfFile (Windows).

Marc
+3  A: 

I found Perl to be much quicker than MATLAB for this.

Here are two examples from Perl Cookbook:

truncate(HANDLE, $length)
    or die "Couldn't truncate: $!\n";

truncate("/tmp/$$.pid", $length)
    or die "Couldn't truncate: $!\n";

You can run a Perl script from MATLAB with the perl function.

yuk
This sounds like the perfect solution --- but I haven't tested it.
Jacob
I actually like Andrew's solution better. More natural to MATLAB.
yuk
...and now I've decided to learn Perl. Seems pretty useful.
Doresoom
I just found that truncate does not work on files over 4 GB (WinXP) if you pass a file name (rather than a file handle) as the argument. Hmm, interesting.
yuk
+4  A: 

There is no ftruncate() in Matlab, but you've got access to the full Java standard library in the JVM embedded in Matlab, and can use java.io.RandomAccessFile or the Java NIO classes to truncate a file.

Here's a Matlab function that calls into Java to lop the last n bytes off a file. It should have minimal I/O cost.

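function remove_last_n_bytes_from_file(file, n)
% Truncate FILE in place by removing its last N bytes.

jFile = java.io.RandomAccessFile(file, 'rw');
currentLength = jFile.length();
wantLength = currentLength - n;
if wantLength < 0
    % Refuse to truncate past the start of the file
    jFile.close();
    error('Cannot remove %d bytes from a %d-byte file', n, currentLength);
end
fprintf('Truncating file %s: Resizing to %d to remove %d bytes\n', file, wantLength, n);
jFile.setLength(wantLength);  % shrink the file to the new length
jFile.close();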

You could also do it as a one-liner. Note that here n is the new file length, not the number of bytes to remove.

java.io.RandomAccessFile('/path/to/my/file.bin', 'rw').setLength(n);
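For comparison, here is a rough sketch of the Java NIO route mentioned above, written as plain Java rather than MATLAB. The class name TruncateTail and the helper removeLastBytes are made up for illustration; only FileChannel.open, size() and truncate() are standard library calls. It assumes the file already exists and is writable.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

public class TruncateTail {
    // Hypothetical helper: drops the last n bytes of the file in place
    // and returns the resulting length. Sizes are longs, so files well
    // beyond 4 GB are representable.
    public static long removeLastBytes(Path file, long n) throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.WRITE)) {
            long newLength = ch.size() - n;
            if (n < 0 || newLength < 0) {
                throw new IllegalArgumentException(
                    "cannot remove " + n + " bytes from a " + ch.size() + "-byte file");
            }
            ch.truncate(newLength);  // shrinks the file; no effect if already shorter
            return ch.size();
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: java TruncateTail <file> <bytesToRemove>");
            return;
        }
        long newLength = removeLastBytes(Paths.get(args[0]), Long.parseLong(args[1]));
        System.out.println("New length: " + newLength);
    }
}
```

For the file in the question, this would be run with 40000000 as the second argument. From MATLAB, the same FileChannel calls should be reachable through the embedded JVM, though I have not tried that variant there.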
Andrew Janke
Wow, works great! I tested just the last one-liner on >4GB file.
yuk
+1 - This works great, but I just got yuk's solution working about 5 minutes before you posted. Thanks anyway!
Doresoom