pdf

iTextSharp - Outputting a segment of HTML into PDF (into a Table Cell)

I'm using iTextSharp to generate PDFs for an ASP.NET application. The PDF generation seems to work fine, though I'm finding iTextSharp a little unintuitive to use, but that's a different story. I'm putting data into a table inside my PDF, and now I want to place HTML content into a table cell and have it maintain the formatting/styling. I'v...

How to extract text from DjVu and other ebook formats (possibly in Python)

I have a collection of ebooks in DjVu, PDF, and CHM formats, and I am looking for a way to search for a keyword in their content. I have been researching and found a couple of suggestions for parsing PDF content, but there seems to be no way to convert the content of a DjVu file into text. By any chance, does anyone know a way to decode DjVu content into tex...

Create PDF from Merge document via ASP C#

Okay, here is what I want to do. A user will run a query and return 3 pieces of data (really more, but for this example let's say 3, such as name, city, state). The user will then be prompted with a list of one or more documents that are basically Word merge documents. They do not have to be Word but need to be something standard. Once ...

How to parse a < sign with htmlparser.Parse(sr)?

Hi, I am trying to export HTML to PDF using iTextSharp. In the HTML table, one value appears like this: <34. While parsing, this gives an error (closing tag, i.e. >, is required ...). Please tell me how to get past this. Thanks in advance, rsd ...

HttpModule: Opening a PDF file raises several PreRequestHandlerExecute events

I'm writing a small ASP.NET program to log information whenever certain PDF files are accessed. I use an HttpModule to achieve that. But the problem is that if the PDF file is big (>1M or so), more than one PreRequestHandlerExecute event will be raised (if I download the file, only one event will be raised). These PDF files belong to another we...

Which third-party .NET library can properly determine whether a PDF is corrupted?

Hi, is there a third-party .NET library that can properly determine whether a PDF file is corrupted? We've been using ABCPDF, but it has problems determining whether some files are corrupted. Are there other third-party libraries that are better than ABCPDF at this? Thanks in advan...

Moving Items on a PDF Document

I have been charged with working on a feature for certificates my company prints out for clients when they complete a specific training course. Currently we provide them with a basic PDF file that looks like the average Joe award, but many of our customers want to be able to add and reposition content on the actual PDF and then print them...

Downloading multiple pdf files using wget fails (403 error)

I'm trying to download multiple PDF files from a web page (I'm using Mac OS X 10.6.1). Here is an example of what I'm getting (www.website.org is just for example): ~> wget -r -A.pdf http://www.website.org/web/ --2009-10-09 19:04:53-- http://www.website.org/web/ Resolving www.website.org... 208.43.98.107 Connecting to www.website.org|208....

Search & replace strings in PDF with Perl/Ruby/PHP

Hi! I'm looking for a way to script replacing strings in PDF documents. I can use either Perl, Ruby, or PHP. If possible, regex support would be a blast... Thank you! ...

Extract text from PDF in JavaScript

Hi, I wonder if it is possible to get the text inside a PDF file using only JavaScript. If yes, what libraries should I use? I know there are some server-side Java, C#, etc. libraries, but I would prefer not to use a server. Thanks ...

Extract images from PDF using .NET C#

I want to extract images from a PDF. I have tried many solutions but still haven't found one that works. Please help. Thanks in advance ...

Need a .NET PDF library to edit PDF info

Hello there, I need a 100% .NET library to edit PDF info like Author, Title, Creator, Subject and Keywords. All the PDF libraries I tried are unable to do this without completely resaving the whole PDF document, so for huge files (>35MB) it takes too much time. I need only to update several text fields (see above) and I don't need to resa...

Programmatically extracting Adobe PDF package files

We have a bunch of documents in our organization that were inadvertently saved as Adobe PDF packages (also known as PDF 1.7 "collections"). We would like to convert these to normal PDFs (most of these "packages" contain one bog-standard PDF file), but given the number of files, it's not feasible to do manually. Does any Adobe expert know whether: ...

.NET component to view and annotate documents and images

I need a component that I can install on my server and use in an ASP.NET Web Application to: display Word documents, PDF documents, and images in the browser; allow these documents to be annotated online and have those annotations saved for redisplay later. Thanks in advance. ...

Using EPiServer to publish a PDF document

We have an EPiServer site. We would like to publish the contents of a PDF document as part of a published page. The published page should have the right menu and the top menu of the site, but the contents should come from the PDF document. Is there a PDF viewer or another way to do this? Thanks, Shiraz ...

How to remove a column from an iTextSharp table

In my C# app, I have a function that generates a PDF document using iTextSharp, which includes a table of figures. The table (a PdfPTable, specifically) is populated and then inserted into the document. After it has been populated, under certain conditions, I would like to remove one of the columns. Does anyone know how to do this? I know I...

Batch OCRing PDFs that haven't already been OCR'd.

If I have 10,000 PDFs, some of which have been OCRed and some of which have only 1 page OCRed while the rest of the pages have not, how can I go through all the PDFs and OCR only the pages that haven't already been done? ...

Problem linking to PDF files in a Flash projector on the Mac

I'm a designer, not a developer, so please forgive any obvious oversights on my part! I am building a CD with a Flash interface that launches on loading. Inside, it links to several PDFs. It linked correctly on my local machine, a Mac, and in a test environment. However, after burning the projector files to the CD, it does not link co...

How to show a thumbnail of a PDF in a ListView in C#.NET?

ImageList imageList = new ImageList(); if (folder != null && System.IO.Directory.Exists(folder)) { try { string[] arrImageName=new string[1000]; int Count = 0; string CutName; ...

How to get DPI, width and length of an image in PDF in PHP

Hi, suppose a single image is saved as a PDF. How can I get the DPI, width and length of the image in the PDF file? How can I do it in PHP? Basically I want to retrieve the following information. On a particular private website I uploaded my PDF and got the following information: Size of input file: 285.81 KB Import time: 0 sec Sourc...