I am trying to use ReportLab with Unicode characters, but it is not working. I tried tracing through the code till I reached the following line:
    class TTFont:
        # ...
        def splitString(self, text, doc, encoding='utf-8'):
            # ...
            cur.append(n & 0xFF)  # <-- here is the problem!
            # ...
(This code can be found in ReportLab's repository, in the file pdfbase/ttfonts.py. The line in question is line 1059.)
Why is n's value being manipulated?!
In the line shown above, n contains the code point of the character being processed (e.g. 65 for 'A', 97 for 'a', or 1588 for Arabic sheen 'ش'). cur is a list that is being filled with the characters to be sent to the final output (as far as I understand). Up to that line, everything was (apparently) working fine, but on this line the value of n is masked with 0xFF, apparently reducing it to the extended ASCII range!
This causes non-ASCII, Unicode characters to lose their value. I cannot understand how this statement is useful, or why it is necessary!
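To show what I mean, here is a minimal standalone sketch of what the mask does to the three values mentioned above. It is plain Python, not ReportLab code, and the helper name low_byte is only made up for this illustration:

```python
# Standalone illustration; low_byte is a hypothetical helper,
# not a ReportLab function.
def low_byte(n):
    """Keep only the lowest 8 bits of n, which is what `n & 0xFF` does."""
    return n & 0xFF

# 65 is 'A', 97 is 'a', 1588 (0x634) is Arabic sheen.
for n in (65, 97, 1588):
    print(n, hex(n), "->", low_byte(n))
# 65 0x41 -> 65
# 97 0x61 -> 97
# 1588 0x634 -> 52
```

So 65 and 97 pass through unchanged, while 1588 comes out as 52, which is the code of the ASCII digit '4'.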
So my question is, why is n's value being manipulated here, and how should I go about fixing this issue?
Edit:
In response to the comment regarding my code snippet, here is an example that causes this error:
    my_doctemplate.build([Paragraph(bulletText = None, encoding = 'utf8',
        caseSensitive = 1, debug = 0,
        text = '\xd8\xa3\xd8\xa8\xd8\xb1\xd8\xa7\xd8\xac',
        frags = [ParaFrag(fontName = 'DejaVuSansMono-BoldOblique',
            text = '\xd8\xa3\xd8\xa8\xd8\xb1\xd8\xa7\xd8\xac',
            sub = 0, rise = 0, greek = 0, link = None, italic = 0, strike = 0,
            fontSize = 12.0, textColor = Color(0,0,0), super = 0, underline = 0,
            bold = 0)])])
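For reference, the text in this example is a UTF-8 encoded byte string. The following quick check is separate from ReportLab and written for Python 3, with variable names chosen only for illustration. It shows the code points the string decodes to and what would remain of each one if only the low byte were kept, assuming n really is the code point as I described above:

```python
# Python 3 sketch: the byte string from the example, as a bytes literal.
raw = b'\xd8\xa3\xd8\xa8\xd8\xb1\xd8\xa7\xd8\xac'

decoded = raw.decode('utf-8')               # five Arabic letters
code_points = [ord(ch) for ch in decoded]   # Unicode code points
masked = [cp & 0xFF for cp in code_points]  # same mask as in splitString

print(code_points)  # [1571, 1576, 1585, 1575, 1580]
print(masked)       # [35, 40, 49, 39, 44]
```

All five code points are above 255, so none of them would survive the mask intact.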
In PDFTextObject._textOut, _formatText is called, which identifies the font as _dynamicFont and accordingly calls font.splitString, which is causing the error described above.