I'm using pdftk to fill in a PDF form with an XFDF file. However, for this project I do not know in advance what fields will be present, so I need to analyse the PDF itself to see what fields need to be filled in, present an interface to the user accordingly, and then generate an XFDF file from that to fill in the PDF form.

How do I get the field names? Preferably command-line, .NET or PHP solutions.

+1  A: 

I can get my client to export the XFDF file (which contains field names) using Acrobat along with the PDF, which avoids this problem completely.
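If you go this route, the field names can be read back out of the exported XFDF with any XML parser. Here is a rough Python sketch; the function name `xfdf_field_names` is made up for illustration, and it assumes the export uses the standard XFDF namespace and that nested `<field>` elements represent dotted hierarchical names such as `address.city`.

```python
import xml.etree.ElementTree as ET

# Namespace used by XFDF documents (assumed to be present in the export).
XFDF_NS = "{http://ns.adobe.com/xfdf/}"


def xfdf_field_names(xfdf_text):
    """Return the fully qualified names of all leaf fields in an XFDF document."""
    root = ET.fromstring(xfdf_text)
    names = []

    def walk(element, prefix):
        # Visit direct <field> children; recurse when a field contains sub-fields.
        for child in element.findall(XFDF_NS + "field"):
            full_name = prefix + child.get("name", "")
            if child.find(XFDF_NS + "field") is None:
                names.append(full_name)
            else:
                walk(child, full_name + ".")

    fields = root.find(XFDF_NS + "fields")
    if fields is not None:
        walk(fields, "")
    return names
```

Calling `xfdf_field_names(open("form.xfdf", encoding="utf-8").read())` (the file name is a placeholder) should give a flat list of names to build the user interface from.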

Christopher Done
+1  A: 

I used the following code, using ABCpdf from WebSupergoo, but I imagine most libraries have comparable classes:

    protected void Button1_Click(object sender, EventArgs e)
    {
        Doc thedoc = new Doc();
        string saveFile = "~/docs/f1_filled.pdf";
        System.Text.StringBuilder sb = new System.Text.StringBuilder();
        thedoc.Read(Server.MapPath("~/docs/F1_2010.pdf"));

        foreach (Field fld in thedoc.Form.Fields)
        {
            // Some fields are not placed on any page, so guard against a null Page.
            string page = fld.Page != null ? fld.Page.PageNumber.ToString() : "None";

            // One inventory line per field: name, type, page and position.
            sb.AppendFormat("Field: {0}, Type: {1}, page: {2}, x: {3}, y: {4}\n",
                fld.Name, fld.FieldType, page, fld.Rect.Left, fld.Rect.Top);

            // Label every text field with its own name in the output file.
            if (fld.FieldType == FieldType.Text)
            {
                fld.Value = fld.Name;
            }
        }

        // Show the inventory, then save and serve the labelled copy.
        this.TextBox1.Text = sb.ToString();
        this.TextBox1.Visible = true;
        thedoc.Save(Server.MapPath(saveFile));
        Response.Redirect(saveFile);
    }

This does two things: 1) It populates a textbox with an inventory of all form fields, showing each field's name, type, page number and position on the page (0,0 is the lower left, by the way). 2) It sets the value of every text field to the field's own name and saves the result to an output file. Print that file and all of your text fields will be labelled.

Eric Flamm
A: 

Easy! You are already using pdftk:

# pdftk input.pdf dump_data_fields

It outputs each field's name, its type, some of its properties (such as the options of a drop-down list, or the text alignment) and even the tooltip text, which I found extremely useful.

The only thing I'm missing is field coordinates...
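To turn that dump into something a program can work with, a small parser is enough. The following is a minimal Python sketch, not anything shipped with pdftk: the function names are made up for illustration, and it assumes the usual `dump_data_fields` layout of `Key: value` lines grouped into records separated by `---` lines, with `FieldStateOption` repeated once per option. Values that span several lines are not handled.

```python
import subprocess


def parse_dump_data_fields(text):
    """Parse pdftk dump_data_fields output into a list of dicts, one per field."""
    fields = []
    current = None
    for line in text.splitlines():
        if line.strip() == "---":
            # A separator line starts a new record.
            if current:
                fields.append(current)
            current = {}
            continue
        if current is None or ":" not in line:
            # Skip anything before the first separator and lines without a key.
            continue
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if key == "FieldStateOption":
            # Choice and button fields list one option per line; collect them.
            current.setdefault(key, []).append(value)
        else:
            current[key] = value
    if current:
        fields.append(current)
    return fields


def dump_fields(pdf_path):
    """Run pdftk on pdf_path (pdftk must be on PATH) and parse its field dump."""
    result = subprocess.run(
        ["pdftk", pdf_path, "dump_data_fields"],
        check=True, capture_output=True, text=True,
    )
    return parse_dump_data_fields(result.stdout)
```

Each dict then carries keys such as `FieldName`, `FieldType` and `FieldNameAlt` (the tooltip), which should be enough to build the input form and later write the XFDF.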

TEHEK