Add text to Existing PDF using Python

tags:

pdf
python

+7 Q:

Hi,
I need to add some extra text to an existing PDF using Python. What is the best way to go about this, and what extra modules will I need to install?

Note: Ideally I would like to be able to run this on both Windows and Linux, but at a push Linux only will do.

Thanks in advance.
Richard.

Edit: pyPDF and ReportLab look good, but neither one will allow me to edit an existing PDF. Are there any other options?

You may have better luck breaking the problem down: convert the PDF into an editable format, write your changes, then convert it back into PDF. I don't know of a library that lets you edit a PDF directly, but there are plenty of converters between DOC and PDF, for example.

Wahnfrieden 2009-07-24 21:03:21

Problem is that I only have the source in PDF (from a 3rd party) and PDF -> DOC -> PDF will lose a lot in the conversion. Also I need this to run on Linux so DOC may not be the best choice.

Frozenskys 2009-07-24 21:08:21

I believe Adobe keeps PDF editing capability pretty closed and proprietary so that they can sell licenses for their better versions of Acrobat. Maybe you can find a way to automate the usage of Acrobat Pro to edit it, using some kind of macro interface.

Wahnfrieden 2009-07-24 21:14:45

If the parts you want to write to are form fields, there are XML interfaces to editing them - otherwise I can't find anything.

Wahnfrieden 2009-07-24 21:15:57

No, I just wanted to add a few lines of text to each page.

Frozenskys 2009-07-24 21:25:33

+1 A:

Have you tried pyPdf?

Sorry, it doesn’t have the ability to modify a page’s content.

2009-07-24 21:13:14

Looks like that might work. Has anyone used it? What's the memory usage like?

Frozenskys 2009-07-24 21:17:37

It does have the ability to add a text watermark and if it was formatted properly it might work.

Frozenskys 2009-07-24 21:24:41

If you're on Windows, this might work:

PDF Creator Pilot

There's also a white paper on a PDF creation and editing framework in Python. It's a little dated, but it may give you some useful info:

Using Python as PDF Editing and Processing Framework

thedz 2009-07-24 21:14:54

The white paper looks good but is a little light on code, and I don't really have the resources to implement a whole PDF framework myself! ;)

Frozenskys 2009-07-24 21:22:08

+7 A:

I know this is an older post, but I spent a long time trying to find a solution. I came across a decent one using only ReportLab and PyPDF so I thought I'd share:

1. Read your PDF using PdfFileReader(); we'll call this input.
2. Create a new PDF containing the text to add using ReportLab, and save it as a string object.
3. Read the string object using PdfFileReader(); we'll call this text.
4. Create a new PDF object using PdfFileWriter(); we'll call this output.
5. Iterate through input and apply .mergePage(text.getPage(0)) to each page you want the text added to, then use output.addPage() to add the modified pages to the new document.

This works well for simple text additions. See PyPDF's sample for watermarking a document.
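
The steps above can be sketched roughly as follows. This is only an illustration under a few assumptions: it uses the current pypdf package, where the calls named in the answer are spelled PdfReader, PdfWriter, merge_page and add_page (rather than PdfFileReader, PdfFileWriter, mergePage and addPage); the function name stamp_pages and its file-like arguments are made up for the example; and the overlay PDF holding the text to add (the "text" document) is assumed to exist already.

```python
def stamp_pages(source, overlay, output):
    """Merge the first page of `overlay` onto every page of `source`.

    All three arguments are binary file-like objects (hypothetical
    signature, chosen for this sketch). Returns the number of pages written.
    """
    # Third-party dependency (pip install pypdf), imported here so it is
    # only required when the function is actually called.
    from pypdf import PdfReader, PdfWriter

    reader = PdfReader(source)            # the answer's "input"
    stamp = PdfReader(overlay).pages[0]   # first page of the answer's "text"
    writer = PdfWriter()                  # the answer's "output"

    for page in reader.pages:
        page.merge_page(stamp)   # draw the overlay on top of the existing page
        writer.add_page(page)    # collect the modified page in the new document

    writer.write(output)
    return len(reader.pages)
```

It would be called with the existing PDF and the overlay opened in binary read mode and the destination opened in binary write mode. Since the overlay is drawn on top of each page, an overlay that paints an opaque background would hide the original content.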

dwelch 2010-02-01 23:28:31

"create a new pdf containing your text to add using ReportLab, save this as a string object" How do you do that? It's a canvas instance.

Lakshman Prasad 2010-04-16 08:23:45
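
One possible way to do this: ReportLab's Canvas accepts a file-like object in place of a filename, so the canvas can write into an in-memory buffer that then stands in for the "string object". The sketch below assumes Python 3 and io.BytesIO (on the Python 2 of this thread's era, StringIO.StringIO would play the same role); the function name make_text_overlay, the default coordinates and the letter page size are illustrative choices, not anything from the answer.

```python
import io


def make_text_overlay(text, x=72, y=72):
    """Return an in-memory, single-page PDF with `text` drawn at (x, y).

    Hypothetical helper for this sketch; x and y are in points, measured
    from the bottom-left corner of the page.
    """
    # Third-party dependency (pip install reportlab), imported here so it is
    # only required when the function is actually called.
    from reportlab.lib.pagesizes import letter
    from reportlab.pdfgen import canvas

    buffer = io.BytesIO()
    c = canvas.Canvas(buffer, pagesize=letter)  # buffer instead of a filename
    c.drawString(x, y, text)                    # put the text on the page
    c.save()                                    # write the finished PDF into the buffer
    buffer.seek(0)                              # rewind so a reader starts at byte 0
    return buffer
```

The returned buffer can then be handed to PdfFileReader() exactly like an open file, which gives the "text" document used in the merge step.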
