ansaurus

Question

How do I extract Word documents from data recovered from a USB device?

Answer 1

+2 A:

The Apache POI project has a library for reading and writing all kinds of MS Office docs. If the files are in the newer XML-based OOXML format (.docx), you'll be looking for the start of a zip file, since the XML is stored compressed inside a zip container.
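To make "looking for the start of a zip file" concrete, here is a minimal Python sketch. It assumes the recovered data fits in memory as one bytes object, and the function names are made up for illustration. It only lists candidate offsets of the zip local file header signature (`PK\x03\x04`); it does not check that a hit is really a .docx rather than some other zip, or that the archive is intact.

```python
ZIP_LOCAL_HEADER = b"PK\x03\x04"  # signature that starts each entry in a zip file


def find_all(blob: bytes, needle: bytes) -> list[int]:
    """Return every offset at which needle occurs in blob."""
    offsets = []
    pos = blob.find(needle)
    while pos != -1:
        offsets.append(pos)
        pos = blob.find(needle, pos + 1)
    return offsets


def find_zip_candidates(blob: bytes) -> list[int]:
    """Offsets that might be the start of a zip entry (and so of a .docx)."""
    return find_all(blob, ZIP_LOCAL_HEADER)
```

A zip has one such header per member file, so a single .docx produces several hits; the first hit of a cluster is the likely start of the document.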

sblundy 2008-12-10 04:46:15

I have had trouble reading .docx files as zip files, so don't count too much on that. OTOH I was having lots of other problems there too, so, 64mg NaCl.

BCS 2008-12-10 07:17:53

Answer 2

+4 A:

Two approaches:

You can mount files as volumes in Linux. Provided your binary blob isn't too corrupted, you'll probably be able to walk the filesystem to find out where your files are located. Is (was) it a FAT partition or NTFS?

If that doesn't work, I'd look for this string of bytes:

D0 CF 11 E0 A1 B1 1A E1

These are the "magic bytes" that begin legacy binary Office documents (.doc, .xls, .ppt). They might occur randomly in other data, but it's a start. You're going to run into MAJOR issues if the files are fragmented.
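A rough Python sketch of that signature search, for a blob too large to load at once. The function name, the chunk size, and the idea of passing in an open binary stream are my own choices for illustration. It reads the data in chunks and carries the last few bytes over, so a signature that straddles a chunk boundary is still found.

```python
import io

OLE2_SIGNATURE = bytes.fromhex("D0CF11E0A1B11AE1")


def scan_stream(stream, signature=OLE2_SIGNATURE, chunk_size=1 << 20):
    """Yield the absolute offset of every occurrence of signature in stream."""
    overlap = len(signature) - 1  # bytes carried over between chunks
    tail = b""
    base = 0  # absolute offset of the first byte of tail
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        window = tail + chunk
        pos = window.find(signature)
        while pos != -1:
            yield base + pos
            pos = window.find(signature, pos + 1)
        # Keep only the bytes that could begin a match finishing in the next chunk.
        keep = window[-overlap:] if overlap else b""
        base += len(window) - len(keep)
        tail = keep
```

For a real image you would open it with `open(path, "rb")` and iterate over `scan_stream(f)`. Files on FAT or NTFS normally start on a sector boundary, so hits at offsets that are a multiple of 512 are probably the more believable ones, though that is a heuristic and not a guarantee.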

Also, try to recreate pieces of the document(s) in Word as-is, save that to a file, and extract chunks to search for in the blob (using a binary grep or whatever). Provided you have info from all parts of the file, you should be able to work out WHERE in the blob they are. Piecing it back into a working DOC binary seems far-fetched, but recovering the rest of the text shouldn't be impossible.
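If you remember a phrase from the document, a simpler variant of the same idea is to search for that phrase directly. The sketch below assumes the text sits in the blob either as 8-bit Windows-1252 or as UTF-16LE, which as far as I know are the two ways the binary .doc format stores text; the function name is hypothetical and the blob is again assumed to fit in memory.

```python
def find_phrase(blob: bytes, phrase: str) -> dict[str, list[int]]:
    """Map each encoding to the offsets where phrase occurs in blob."""
    hits = {}
    for encoding in ("cp1252", "utf-16-le"):
        try:
            needle = phrase.encode(encoding)
        except UnicodeEncodeError:
            continue  # phrase has characters this encoding cannot represent
        offsets = []
        pos = blob.find(needle)
        while pos != -1:
            offsets.append(pos)
            pos = blob.find(needle, pos + 1)
        hits[encoding] = offsets
    return hits
```

Short or common phrases will produce false positives, so pick something distinctive. The offsets tell you roughly where the text of a document lives, which also helps when the file is fragmented.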

Stefan Mai 2008-12-10 04:52:28
