Here's the situation. I have a PDF with automatically generated PDF form field names. The problem is that these names are not very user-friendly. They look something like: topmostSubform[0].Page1[0].Website_Address[0]

I want to change them to something like WebsiteAddress. I have access to ABCpdf and experience with iTextSharp, and I have tried using both APIs to access the form fields and rename them, but it does not seem to be possible.

Does anybody have any experience doing this via an API of some sort (preferably open source)? The code is also .NET.

+2  A: 

The good news: you can change field names in iTextSharp.

You can't edit a PDF in place, though. Instead, you read in an existing PDF, update the field names in memory, and then write out the revised PDF. To change a field name, call the AcroFields.RenameField method.

Here's a snippet:

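// Requires: using System.IO; using iTextSharp.text.pdf;
PdfReader reader = new PdfReader(PDF_PATH);  // PDF_PATH: path to the source PDF
using (FileStream fs = new FileStream("Test Out.pdf", FileMode.Create)) {
    PdfStamper stamper = new PdfStamper(reader, fs);
    AcroFields fields = stamper.AcroFields;
    // Rename the field in memory; the source file itself is not modified
    fields.RenameField("oldFieldName", "newFieldName");
    stamper.Close();  // writes the revised PDF to fs
}
reader.Close();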

Now the bad news: there appear to be limitations in the characters you can use in the renamed fields.

I tested the snippet above with your example field name and it didn't work. Remove the periods, though, and it does work. I'm not sure if there's a workaround, but this may be a problem for you.
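A possible explanation for the period problem (an assumption based on iText's documentation for renameField, not something confirmed in this thread) is that only the last part of a dotted name can be renamed, so the new name has to keep the same dotted prefix as the old one. Under that assumption, a minimal sketch of a helper that builds such a name could look like this. FieldNames and WithCleanLastPart are hypothetical names made up for illustration, not part of iTextSharp.

```csharp
using System;
using System.Text;

// Hypothetical helper, not part of iTextSharp.
static class FieldNames
{
    // Keeps the dotted prefix of oldName and replaces only the last part
    // with a cleaned-up version (no "[0]" index, letters and digits only).
    public static string WithCleanLastPart(string oldName)
    {
        int cut = oldName.LastIndexOf('.') + 1;   // 0 when there is no dot
        string prefix = oldName.Substring(0, cut);
        string last = oldName.Substring(cut);

        // Drop a trailing index such as "[0]"
        int bracket = last.IndexOf('[');
        if (bracket > 0) last = last.Substring(0, bracket);

        // Keep letters and digits only
        var sb = new StringBuilder();
        foreach (char c in last)
            if (char.IsLetterOrDigit(c)) sb.Append(c);

        return prefix + sb.ToString();
    }
}

static class Demo
{
    static void Main()
    {
        string oldName = "topmostSubform[0].Page1[0].Website_Address[0]";
        // prints: topmostSubform[0].Page1[0].WebsiteAddress
        Console.WriteLine(FieldNames.WithCleanLastPart(oldName));
    }
}
```

The result could then be passed as the second argument to fields.RenameField(oldName, newName). RenameField returns a bool, so it is worth checking that value rather than assuming the rename was applied.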

Jay Riggs
A: 

Yes, it's possible to rename form fields. I don't have any experience with an open source API that will help you with this, but my company's PDF SDK can do it, and from a little bit of searching it appears that iText will indeed let you rename form fields.

Rowan
+1  A: 

The full name of an AcroForm field isn't explicitly stored within the field. It's derived from a hierarchy of fields, with a dot-delimited list of ancestors appearing on the left.

Simply renaming a field from 'topmostSubform[0].Page1[0].Website_Address[0]' to 'WebsiteAddress' is therefore unlikely to produce a correct result.

You'll find section 8.6.2 'Field Dictionaries' of the PDF reference provides a good explanation of how field naming works ;-)

Basically, each field in an AcroForm is defined by a dictionary, which may contain certain optional entries pertaining to a field's name.

  • Key '/T' specifies the partial name. In your question, 'topmostSubform[0]', 'Page1[0]', and 'Website_Address[0]' all represent partial names.

  • Key '/TU' specifies an alternative 'user-friendly' name for fields, which can be used in place of the actual field name for identifying fields in a user interface.
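To make the derivation of the full name concrete, here is a minimal sketch that walks from a field up through its ancestors and joins the partial names with dots. FieldNode is a hypothetical stand-in for a field dictionary that models only the /T and /Parent entries; it is not an ABCpdf or iTextSharp type.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a field dictionary: only /T and /Parent are modeled.
class FieldNode
{
    public string PartialName;   // value of /T, may be null
    public FieldNode Parent;     // value of /Parent, null for a root field

    public FieldNode(string partialName, FieldNode parent)
    {
        PartialName = partialName;
        Parent = parent;
    }

    // Collects partial names from this field up to the root,
    // then joins them root-first with dots.
    public string FullName()
    {
        var parts = new List<string>();
        for (FieldNode n = this; n != null; n = n.Parent)
            if (n.PartialName != null) parts.Add(n.PartialName);
        parts.Reverse();
        return string.Join(".", parts.ToArray());
    }
}

static class Demo
{
    static void Main()
    {
        var root = new FieldNode("topmostSubform[0]", null);
        var page = new FieldNode("Page1[0]", root);
        var field = new FieldNode("Website_Address[0]", page);
        // prints: topmostSubform[0].Page1[0].Website_Address[0]
        Console.WriteLine(field.FullName());
    }
}
```

Changing only the /T entry of the last node changes only the final segment of the full name, which is why a flat name like WebsiteAddress cannot be produced by editing a single field in the hierarchy.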

Instead of renaming the field in question, think about adding a /TU entry!

The example below uses ABCpdf to iterate through all the fields in an AcroForm and insert an alternate name into a field based on its partial name.

VBScript:

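' Open the form with ABCpdf
Set theDoc = CreateObject("ABCpdf7.Doc")
theDoc.Read "myForm.pdf"

' Get the object IDs of all fields as a comma-separated string
Dim theFieldIDs, theList, fieldID, thePartialName
theFieldIDs = theDoc.GetInfo(theDoc.Root, "Field IDs")
theList = Split(theFieldIDs, ",")

' Copy each field's partial name (/T) into its alternate name (/TU)
For Each fieldID In theList
    thePartialName = theDoc.GetInfo(fieldID, "/T:text")
    theDoc.SetInfo fieldID, "/TU:text", thePartialName
Next

theDoc.Save "output.pdf"
theDoc.Clear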

Changing "/TU:text" to "/T:text" will set a field's partial name.

Examples of these functions written in C# and VB.NET can be found in the ABCpdf documentation for Doc.GetInfo and Doc.SetInfo. See also the documentation on Object Paths.

Alberto Rossini