views:

197

answers:

5

I have a pdf generated by 3rd party system. Using pdf editor or els software I have modified it. Is it possible to detect if pdf file was modified, without original file? 0 vote down check

I will add some more details.

There is no encryption and no signature features.

Document is created by IT system. User receives document and modifies it.

Is it possible to track that change somehow?

I thought that all these applications leaves some data in PDF header or somewhere encoded inside file and it is possible to check it. However properties showed by windows explorer shows nothing... so I was interested if there is something smarter than viewing properties/header in explorer.

+1  A: 

You could always check the md5sum of the pdf file. I'm not sure what environment you are using but that should help get you started.

Bartek
+1  A: 

It's going to be rough without the original file unless there were security features like encryption or digital signatures applied to it, which it doesn't sound like there was. Do you have access to any information at all about the original file? A file size, creation date, any of the metadata, etc.?

yu-chen-pdfonline-com
A: 

If the tool used to modify the PDF is working according to the PDF spec then in the Info dictionary it should update ModDate but leave CreationDate alone. You may also see some non-zero generation numbers on the objects although it is just as possible that all the objects have been regenerated and will therefore be generation 0. The trial version of CosEdit will allow you to look at these 2 items.

If however the tool has been used to intentionally modify the PDF without leaving a trace then they would be spoofing those bits of data so they won't help you.

danio
A: 

Are the users modifying the PDF using Acrobat? If so then what Danio mentioned above should work. Strictly speaking, modifying the PDF should change its ModDate or xmp:ModifyDate without changing its CreationDate. However not all tools adhere to this; quite a few simply leave all metadata untouched, so this method of checking isn't 100% reliable unless you know what PDF editor your users employ.

If the editor your users use does change ModDate or xmp:ModifyDate, then you should be able to see it in two places. One is when you open the document in Acrobat and hit Ctrl-D to view Document Properties. The Creation field and Modified field should have different timestamps. There may also be APIs that can be used to programmatically retrieve this metadata. The other way you can visualize it is to simply open the PDF in Notepad and search for the properties. Most of the document won't be human readable but these timestamps should be. If they do get changed appropriately, you can always parse for them in your application. Good luck!

yu-chen-pdfonline-com
A: 

Thanks guys. I have checked metadata, but application used for edition do not modified MODdate field :-(

stan