views:

148

answers:

1

I'm using jython 2.2.1, and jdbc 1.2 and connecting to a mssql 2000 database, writing the contents of an email to it. When I get to the body of the email which can be quite large sometimes I need to truncate the data at 5000 chars. Except mssql & jdbc gang up on me like school yard bullies, when i check the database loads of my data is missing, every time, with max chars = 256 chars.

I have checked the size of the field and it is set to 5000. what gives?

I am pretty sure it is related to jdbc, as the previous version used .... vb6 & odbc, without a hitch.

here is some code:

BODY_FIELD_DATABASE=5000

def _execute_insert(self):
    try:
        self._stmt=self._con.prepareStatement(\
            "INSERT INTO EmailHdr (EntryID, MailSubject, MailFrom, MailTo, MailReceive, MailSent, AttachNo, MailBody)\
             VALUES (?, ?, ?, ?, ?, ?, ?, cast(? as varchar (" + str(BODY_FIELD_DATABASE) + ")))")
        self._stmt.setString(1,self._emailEntryId)
        self._stmt.setString(2,self._subject)
        self._stmt.setString(3,self._fromWho)
        self._stmt.setString(4,self._toWho)
        self._stmt.setString(5,self._emailRecv)
        self._stmt.setString(6,self._emailSent)
        self._stmt.setString(7,str(int(self._attachmentCount) + 1))
        self._stmt.setString(8,self._format_email_body()) 
        self._stmt.execute()
        self._prepare_inserting_attachment_data()
        self._insert_attachment_data()
    except:
        raise

def _format_email_body(self):
    if not self._emailBody:
        return " "
    if len(self._emailBody) > BODY_FIELD_DATABASE:
        return self._clean_body(self._emailBody[:BODY_FIELD_DATABASE])
    else:
        return self._clean_body(self._emailBody)

def _clean_body(self,dirty):
    '''used to clean =20 occurrence in email body that contains chinese characters
       http://en.wikipedia.org/wiki/Quoted-printable'''
    dirty=str(dirty)
    return dirty.replace(r"=20","")
A: 

Deleted my answer - it was totally wrong. Keeping it here though so comments & conversation hang around.

EDIT:

As you can read in the comments, here's what happened:

The data was being put into the database fine, but the MSSQL Query Manager could not display the Chinese characters.

Harley
no there are still spaces in the text and it is well formatted, just minus the =20s.
Setori
=20 is related to this http://stackoverflow.com/questions/320166/emailretr-retrieves-strange-20-characters-when-the-email-body-has-chinese-chara it is solved this issue
Setori
also there are no =20 when it is an ascii only email, but still im sitting at 256 chars in the db.
Setori
I know what =20 is, but the fact remains that if your email was full of them, after you remove them it'll be smaller. =20 is not really specific to chinese chars either.
Harley
no you right it isnt specific to chinese characters but the probability is higher when there are chinese characters. Im not sure of a correct or optimal way of delivering an output without the =20, so i wack em this way. the occurrence is not so great to reduce 5000 chars to 256 chars in the db
Setori
Hmm I seem to have misread the question. Anyway, have you tried printing the body before committing it to the database? Would be interesting to see if that is the proper length.
Harley
sure, and i get the full length, no problem, well formatted (no blasted =20s) but as it goes through jdbc it falls over.
Setori
Can you access the database in another way and add a heap of text to that field?
Harley
let me try: good idea
Setori
oh my word: the problem was in the actual Query manager. It couldnt display chinese characters. the software packed up as soon as it hit a character. all the data was being written in correctly. False alarm
Setori