tags:

views:

59

answers:

2

Hi, can someone point me to Python PDF package that can do metadata writing? I was surprised that I couldn't find any. I mean I found Python XMP Toolkit, but building Exempi on cygwin is nightmare I want to avoid.

Thanks

A: 
  • pdfrw seems to have some support for this (not a lot of documentation, but looking at the alter.py example, it seems to be possible). I haven't got any experience with pdfrw, so I can't tell you how mature it is...
  • not quite the same thing, but this SO question gives a solution with the better known pyPdf library: http://stackoverflow.com/questions/2574676 (makes a new pdf - with the desired metadata - and copies pages from the old pdf to the new one. This will lose any information not contained in pages I guess)

Note: both examples above work with pdf docinfo metadata, not xmp metadata (which may be possible too, I don't know)

You could of course always wrap a command line application like pdftk.

I would also suppose that there must be some other libraries that are able to do this, but this is all I can come up with myself right now ;-)

Steven
Thanks for your answer :) Tiny pdfrw package is perfect solution to my metadata problem
romor
A: 

You might want to get a command line to modify metadata. Some tools modify both native and XMP metadata.

I wrote a blog on metadata editing a while ago: http://www.barcodeschool.com/2010/09/publishers-fix-the-metadata-in-the-pdf-file/

Sherwood Hu