How do I open a password-protected Microsoft Word (.doc, .docx) file in Java, assuming the password is known?

A: 

Use a suitable library. A good starting point is the OpenOffice API.

Thorbjørn Ravn Andersen
Downvoted? Wonder why....
Thorbjørn Ravn Andersen
@Thorb Maybe because if the OP knew what "a suitable library" was, they wouldn't need to ask. Now all they get out of your answer is a bing for "OpenOffice API". And http://api.openoffice.org/SDK/example_collection.html is obviously lacking in content. :)
bzlm
Given the amount of detail in the original question, it is hard to answer more precisely. Thanks for the link.
Thorbjørn Ravn Andersen
A: 

In our projects, we use Aspose to manage Office documents. We do not deal with password-protected documents ourselves, but I imagine that this library handles such cases...

romaintaz
+1  A: 

You can try it with com4j.

http://msdn.microsoft.com/en-us/library/microsoft.office.interop.word.documents.open2000.aspx

Since the Open method has a parameter called "PasswordDocument", I think it is possible to open a password-protected file.

Hope this is what you were searching for ;)

Edit: I recorded this macro in Word:

Documents.Open FileName:="test.doc", ConfirmConversions:= _
    False, ReadOnly:=False, AddToRecentFiles:=False, PasswordDocument:= _
    "hallo", PasswordTemplate:="", Revert:=False, WritePasswordDocument:= _
    "hallo", WritePasswordTemplate:="", Format:=wdOpenFormatAuto

So the open method in com4j should look something like this (the password is "hallo"):

     _Document document = app.documents().open2000(doc, false, false, false, "hallo", "", false, "hallo", "", WdOpenFormat.wdOpenFormatAuto, false, true);
Tronje182
A: 

A good starting point would be the Apache POI project, which supports the Office 97-2003 and OOXML (2007-2010) formats. If you are mainly interested in extracting text from those files, you should also look at the Apache Tika project, which has some useful code such as OfficeParser.java

You will want to substitute in your known password(s) around line 220 in the parse() method:

    if (!d.verifyPassword(Decryptor.DEFAULT_PASSWORD)) {
        throw new TikaException("Unable to process: document is encrypted");
    }

-- the default password is set to the mostly useless password "VelvetSweatshop" (!)
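
If you would rather call POI directly than patch Tika, the same verifyPassword check can be used with your own password. Below is a minimal sketch for .docx only, assuming the poi and poi-ooxml jars are on the classpath. The class name EncryptedDocxReader and the method readText are made up for illustration and are not part of POI. Legacy .doc files go through POI's HWPF classes instead, which this sketch does not cover.

```java
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.security.GeneralSecurityException;

import org.apache.poi.poifs.crypt.Decryptor;
import org.apache.poi.poifs.crypt.EncryptionInfo;
import org.apache.poi.poifs.filesystem.POIFSFileSystem;
import org.apache.poi.xwpf.usermodel.XWPFDocument;
import org.apache.poi.xwpf.usermodel.XWPFParagraph;

// Hypothetical helper class, named for this example only.
public class EncryptedDocxReader {

    /** Returns the body paragraph text of a password-protected .docx file. */
    public static String readText(File file, String password)
            throws IOException, GeneralSecurityException {
        // A password-protected .docx is wrapped in an OLE2 container,
        // so it is opened with POIFS rather than as a ZIP package.
        try (POIFSFileSystem fs = new POIFSFileSystem(file, true)) {
            EncryptionInfo info = new EncryptionInfo(fs);
            Decryptor decryptor = Decryptor.getInstance(info);

            // Same check as in the Tika snippet, but with the known password.
            if (!decryptor.verifyPassword(password)) {
                throw new GeneralSecurityException("Wrong password for " + file);
            }

            // The decrypted stream is an ordinary OOXML package.
            try (InputStream in = decryptor.getDataStream(fs);
                 XWPFDocument doc = new XWPFDocument(in)) {
                StringBuilder text = new StringBuilder();
                // Only top-level body paragraphs; tables, headers and
                // footers are skipped to keep the sketch short.
                for (XWPFParagraph paragraph : doc.getParagraphs()) {
                    text.append(paragraph.getText()).append('\n');
                }
                return text.toString();
            }
        }
    }

    public static void main(String[] args) throws Exception {
        // Usage: java EncryptedDocxReader <file.docx> <password>
        System.out.println(readText(new File(args[0]), args[1]));
    }
}
```

Throwing GeneralSecurityException on a wrong password is just one choice here; you may prefer your own exception type.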

bluprintz
VelvetSweatshop is not just a curious string, in this instance - it is the default Excel password used when no password is set, but the workbook is "protected"...
Stobor