Hi all,
I have a PDF file with an XML file attached, and I need to parse the XML file. Does anyone know how I can do that? I'm using C#.
Thanks in advance.
PDF files can have a metadata information object. Is that what you mean, or is it an XML file embedded as an object?
I believe this blog post describing how to read from a PDF file using C# is what you want.
This is the example he gives of grabbing text from the PDF:
using System;
using org.pdfbox.pdmodel;
using org.pdfbox.util;

namespace PDFReader
{
    class Program
    {
        static void Main(string[] args)
        {
            // Load the PDF from disk.
            PDDocument doc = PDDocument.load("lopreacamasa.pdf");

            // Extract all of the document's text and print it to the console.
            PDFTextStripper pdfStripper = new PDFTextStripper();
            Console.Write(pdfStripper.getText(doc));
        }
    }
}
Here is what looks like an exhaustive and highly organized list of how to read PDFs with C#.
If what you need is some form of embedded metadata, as Mark suggested, I'm sure it's also possible to fetch it using the tools I've linked to.
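Once you have the attachment's bytes out of the PDF, the parsing itself is the easy part. Here is a minimal sketch of that last step only, assuming the XML is already in a byte array. How you get those bytes depends on the PDF library you pick, so that step is not shown. The class name, the ParseXml helper, and the sample invoice XML are all made up for illustration.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml.Linq;

class AttachmentXmlDemo
{
    // Hypothetical helper: parse XML from raw bytes.
    // Loading from a stream lets the parser honor the encoding
    // declared in the XML prolog instead of guessing one.
    static XDocument ParseXml(byte[] xmlBytes)
    {
        using (var stream = new MemoryStream(xmlBytes))
        {
            return XDocument.Load(stream);
        }
    }

    static void Main()
    {
        // Stand-in for the bytes extracted from the PDF attachment
        // (made-up content, not from any real file).
        byte[] xmlBytes = Encoding.UTF8.GetBytes(
            "<?xml version=\"1.0\" encoding=\"utf-8\"?>" +
            "<invoice><total>42</total></invoice>");

        XDocument xml = ParseXml(xmlBytes);

        Console.WriteLine(xml.Root.Name);                   // prints "invoice"
        Console.WriteLine(xml.Root.Element("total").Value); // prints "42"
    }
}
```

XDocument lives in System.Xml.Linq. If you are on an older framework without LINQ to XML, XmlDocument.Load(stream) from System.Xml should do the same job.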