Find Tables in PDFs

+1 Q:

Hello,

Are there any tools or tricks for automatically extracting tables from PDFs? Are there any C# libraries that can do that? Or do you know of other ways this could be handled?

Thank you very much

You can use the iTextSharp library to deal with PDFs: http://sourceforge.net/projects/itextsharp/

I've only used it to generate PDFs programmatically, but I'm fairly certain you can use it to pull them apart.

There's a tutorial here: http://itextsharp.sourceforge.net/tutorial/index.html

ThePaddedCell 2010-04-23 14:47:53

Please don't recommend products unless you know whether or not they can actually do what you're recommending them for. It just adds noise.

Rowan 2010-04-25 19:46:31

Is it not better to suggest something than to suggest nothing at all?

ThePaddedCell 2010-04-26 13:07:17

+1 A:

PDF files generally do not contain table structures; several tools will try to 'guess' them from the layout of the text.
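To illustrate what such 'guessing' can look like, here is a minimal sketch in Python. It is not how any particular tool works. It assumes you have already extracted text fragments with their page coordinates as (x, y, text) tuples using some PDF library, and the function names and tolerance values are made up for this example. Fragments with a similar y are treated as one row, and similar x start positions are treated as one column.

```python
def group_rows(fragments, y_tol=2.0):
    """Group (x, y, text) fragments into rows of roughly equal y.

    PDF y coordinates usually grow upwards, so we sort by descending y
    to get rows in reading order (an assumption about the input).
    """
    rows = []
    for x, y, text in sorted(fragments, key=lambda f: (-f[1], f[0])):
        if rows and abs(rows[-1]["y"] - y) <= y_tol:
            rows[-1]["cells"].append((x, text))
        else:
            rows.append({"y": y, "cells": [(x, text)]})
    # Within each row, order the cells left to right.
    return [sorted(r["cells"]) for r in rows]


def guess_columns(rows, x_tol=5.0):
    """Cluster the x start positions of all cells into column anchors."""
    columns = []
    for x in sorted(x for row in rows for x, _ in row):
        if not columns or x - columns[-1] > x_tol:
            columns.append(x)
    return columns


def to_table(fragments, y_tol=2.0, x_tol=5.0):
    """Return a list of rows, each a list of cell strings ('' if empty)."""
    rows = group_rows(fragments, y_tol)
    columns = guess_columns(rows, x_tol)
    table = []
    for row in rows:
        cells = [""] * len(columns)
        for x, text in row:
            # Assign each fragment to the nearest column anchor.
            idx = min(range(len(columns)), key=lambda i: abs(columns[i] - x))
            cells[idx] = (cells[idx] + " " + text).strip()
        table.append(cells)
    return table


# Hypothetical fragments, as a PDF text extractor might report them.
fragments = [
    (50.0, 700.0, "Name"), (200.0, 700.5, "Qty"), (300.0, 700.0, "Price"),
    (50.0, 680.0, "Widget"), (201.0, 680.0, "4"), (300.5, 679.5, "9.99"),
    (50.0, 660.0, "Gadget"), (300.0, 660.0, "19.50"),
]

for row in to_table(fragments):
    print(row)
```

A heuristic like this breaks down on merged cells, wrapped text, and right-aligned or centred columns, which is why the results of such tools are only a guess.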

mark stephens 2010-04-23 20:51:28

+2 A:

I found an interesting site and a master's thesis on this topic:

Information Extraction - Utilizing Table Patterns

http://ieg.ifs.tuwien.ac.at/projects/pdf2table/

If anybody finds more information, please keep posting...

nWorx 2010-04-29 16:58:55
