I want to extract the text from a PDF file in Perl without first converting the PDF into another format. Is that possible?
A:
Yes, you can.
Take a look at the CAM::PDF package.
You can use this module to pull the text out.
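use CAM::PDF;
use CAM::PDF::PageText;
my $pdf = CAM::PDF->new($filename);              # $filename is the path to your PDF
my $pageone_tree = $pdf->getPageContentTree(1);  # parsed content of page 1
print CAM::PDF::PageText->render($pageone_tree); # prints that page's text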
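The snippet above only covers page one. If you need the whole document, a minimal sketch along these lines should work, assuming CAM::PDF is installed from CPAN. The sub name extract_pdf_text is made up for illustration and is not part of the module; numPages and getPageText are CAM::PDF methods, and getPageText is, as far as I know, a shortcut for the getPageContentTree plus render steps shown above.

```perl
use strict;
use warnings;
use CAM::PDF;

# Hypothetical helper (not part of CAM::PDF): returns the text of
# every page in the given PDF, joined with newlines.
sub extract_pdf_text {
    my ($filename) = @_;

    # new() returns undef on failure and sets $CAM::PDF::errstr
    my $pdf = CAM::PDF->new($filename)
        or die "Cannot open $filename: $CAM::PDF::errstr\n";

    my @pages;
    for my $pagenum (1 .. $pdf->numPages()) {
        # getPageText can return undef for a page with no content
        my $text = $pdf->getPageText($pagenum);
        push @pages, defined $text ? $text : '';
    }
    return join "\n", @pages;
}

# Print the text of the PDF named on the command line, if one was given.
my $file = shift @ARGV;
print extract_pdf_text($file) if defined $file;
```

Save it as a script and pass the PDF path as the first argument. How clean the output is will depend on how the PDF was produced, so treat this as a starting point.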
Byron Whitlock
2010-10-29 06:32:12
Deleted mine, yours is the better package.
Powertieke
2010-10-29 12:42:38