How can I get the number of pages in a PDF document? The document may contain images as well as text in different font sizes, and the solution should work across different PDF versions.

The answer can be in any scripting language; I will port it to Ruby later.

A: 

I can think of a band-aid solution that might just work. I am going to assume that you are developing a web application or web page that needs this information. In that case, let the Adobe Reader browser plugin load the PDF document, then use the plugin to attach and execute some 'JavaScript for PDF' against the loaded document that returns the number of pages. The object model for that function call is documented here:

http://www.adobe.com/devnet/acrobat/pdfs/js%5Fapi%5Freference.pdf

You will also need to collect that value and pass it back to your page. You may find this guide helpful too:

http://www.adobe.com/devnet/acrobat/pdfs/Acro6JSGuide.pdf

Crimson
+1  A: 

Using pyPdf:

from pyPdf import PdfFileReader

# Open in binary mode, since PdfFileReader expects a binary stream.
with open("document.pdf", "rb") as f:
    pdf = PdfFileReader(f)
    print(pdf.getNumPages())

I think there must be a Ruby library with similar functionality.
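If installing a library is not an option, a rough dependency-free fallback is to count the page objects in the raw file. This is only a sketch of a heuristic, not what pyPdf does internally: the function names below are made up for illustration, it needs Python 3, and it will undercount files that store their page objects inside compressed object streams (possible from PDF 1.5 onward) and may overcount files that keep stale pages from incremental updates. Treat the result as an estimate.

```python
import re

# Matches a page object's "/Type /Page" entry, but not "/Type /Pages",
# which marks a page-tree node rather than an individual page.
PAGE_OBJECT = re.compile(rb"/Type\s*/Page(?![A-Za-z])")


def count_pages_heuristic(pdf_bytes):
    """Estimate the page count by counting page objects in raw PDF bytes.

    Hypothetical helper: only sees page objects stored uncompressed.
    """
    return len(PAGE_OBJECT.findall(pdf_bytes))


def count_pages_in_file(path):
    """Read the file at `path` in binary mode and apply the heuristic."""
    with open(path, "rb") as f:
        return count_pages_heuristic(f.read())
```

Calling count_pages_in_file("document.pdf") returns the estimate for a file on disk. Because the approach is just a byte scan with a regular expression, it should be straightforward to port to Ruby.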

lost-theory