I have colleagues working on a .NET 1.1 project in which they receive XML files from an external party and programmatically instruct iTextSharp to generate PDF content from the XML data.
The tricky part is that the XML contains segments of arbitrary HTML content. This is HTML that users copied and pasted from their Office applications. It still looks fine in a web browser, but when it is fed into iTextSharp's HTMLWorker object to be parsed and converted into PDF objects, the formatting and alignment end up all over the place in the generated PDF document. For example:
<span id="mceBoundaryType" class="portrait"></span>
<table border="0" cellspacing="0" cellpadding="0" width="636" class="MsoNormalTable"
style="margin: auto auto auto 4.65pt; width: 477pt; border-collapse: collapse">
<tbody>
<tr style="height: 15.75pt">
<td width="468" valign="bottom" style="padding-right: 5.4pt; padding-left: 5.4pt;
padding-bottom: 0in; width: 351pt; padding-top: 0in; height: 15.75pt; background-color: transparent;
border: #ece9d8">
<p style="margin: 0in 0in 0pt" class="MsoNormal">
<font face="Times New Roman"> </font></p>
</td>
<td colspan="3" width="168" valign="bottom" style="padding-right: 5.4pt; padding-left: 5.4pt;
padding-bottom: 0in; width: 1.75in; padding-top: 0in; height: 15.75pt; background-color: transparent;
border: #ece9d8">
<p style="margin: 0in 0in 0pt; text-align: center" class="MsoNormal" align="center">
<u><font face="Times New Roman">Group</font></u></p>
</td>
</tr>
The tags are full of style attributes, and iTextSharp does not support CSS, so it does not interpret that attribute. What alternatives have other iTextSharp users tried to work around this, or are there other feasible HTML-to-PDF components?
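To illustrate how much of the layout lives in those style attributes, here is a rough sketch that lists the inline declarations a parser would have to understand. It is written in Python using only the standard library, purely for illustration; it is not part of our .NET project, and the names `parse_style`, `StyleCollector` and `collect_styles` are made up for this example.

```python
from html.parser import HTMLParser


def parse_style(style):
    """Split an inline style string into a dict of property -> value."""
    declarations = {}
    for part in style.split(";"):
        if ":" not in part:
            continue  # skip empty or malformed fragments
        name, value = part.split(":", 1)
        # Normalise the property name and collapse whitespace/newlines in the value.
        declarations[name.strip().lower()] = " ".join(value.split())
    return declarations


class StyleCollector(HTMLParser):
    """Records (tag, declarations) for every start tag carrying a style attribute."""

    def __init__(self):
        super().__init__()
        self.found = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name == "style" and value:
                self.found.append((tag, parse_style(value)))


def collect_styles(html):
    """Return the inline style declarations found in an HTML fragment."""
    collector = StyleCollector()
    collector.feed(html)
    collector.close()
    return collector.found


# One paragraph tag taken from the sample above.
sample = (
    '<p style="margin: 0in 0in 0pt; text-align: center" '
    'class="MsoNormal" align="center">Group</p>'
)
for tag, decls in collect_styles(sample):
    print(tag, decls)
```

Running this over the Office-generated fragments shows that widths, padding, margins and alignment are almost all expressed as CSS declarations, which is exactly the information that gets lost when the style attribute is ignored.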