Here is the problem:

for your reference: http://www.freeimagehosting.net/uploads/b443e7a1fe.jpg

Database entries 1, 2 and 3 were made by Jython 2.2.1 through JDBC 1.2. Database entry 4 was made through ODBC by the old VB program that is about to be replaced.

We have found that if I copy and paste both the Jython and the VB MailBody entries into WordPad directly from SQL Server Enterprise Manager, the text comes out perfectly formatted with correct line returns. If I compare the bytes of each file with a hex editor or KDiff3, they are byte-for-byte identical.

There is a third-party program that consumes this data. Sadly, when it reads entries 1 to 3 it displays the text without line returns, whereas it formats entry 4 correctly. As further proof, the picture shows that the data is displayed differently in the database itself. Somehow the line returns are preserved for the VB entry but are ignored for the Jython entries. If I click on the 'MailBody' field of entry 4 and press the down arrow, I can see the rest of the email, whereas the Jython data is displayed on a single row.

What gives, what am I missing, and how do I handle this? Below is a snippet of the code where I actually send the data to the database.

EDIT: FYI: please disregard the discrepancies in the 'Processed' column; they are irrelevant.

EDIT: What I want is for the Jython program to insert the data the same way the VB program does, so that the third-party program displays it correctly. Every entry in 'MailBody' should then display "This is a testing only!" followed on the next line by "etc etc", so that in a screen dump all entries would resemble database entry 4.

SOLVED

Add _force_CRLF to the mix, so that the body is normalized to CRLF line endings before it is inserted:

# needs "import re" at module level
def _force_CRLF(self, data):
    '''Make sure data uses CRLF for line termination.
    Nicked the regex from smtplib.quotedata. '''
    print data
    newdata = re.sub(r'(?:\r\n|\n|\r(?!\n))', "\r\n", data)
    print newdata
    return newdata

def _execute_insert(self):
    try:
        self._stmt=self._con.prepareStatement(\
            "INSERT INTO EmailHdr (EntryID, MailSubject, MailFrom, MailTo, MailReceive, MailSent, AttachNo, MailBody)\
             VALUES (?, ?, ?, ?, ?, ?, ?, cast(? as varchar (" + str(BODY_FIELD_DATABASE) + ")))")
        self._stmt.setString(1,self._emailEntryId)
        self._stmt.setString(2,self._subject)
        self._stmt.setString(3,self._fromWho)
        self._stmt.setString(4,self._toWho)
        self._stmt.setString(5,self._format_date(self._emailRecv))
        self._stmt.setString(6,self._format_date(self._emailSent))
        self._stmt.setString(7,str(self._attachmentCount))
        self._stmt.setString(8,self._force_CRLF(self._format_email_body()))
        self._stmt.execute()
        self._prepare_inserting_attachment_data()
        self._insert_attachment_data()
    except:
        raise

def _format_email_body(self):
    if not self._emailBody:
        return "could not extract email body"
    if len(self._emailBody) > BODY_TRUNCATE_LENGTH:
        return self._clean_body(self._emailBody[:BODY_TRUNCATE_LENGTH])
    else:
        return self._clean_body(self._emailBody)

def _clean_body(self, dirty):
    '''Delete every occurrence of '=20' that plagues my output.
    After much testing, this is not related to the line return issue:
    even if I comment it out I still have the problem.'''
    dirty = str(dirty)
    # '=20' is a quoted-printable encoded space; it is dropped here, not decoded
    dirty = dirty.replace("=20", "")
    return dirty
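For readers who want to try the fix in isolation, here is a minimal standalone sketch of what the regex in _force_CRLF does. It is written for a current Python interpreter rather than Jython 2.2.1, and the names force_crlf and sample are made up for this illustration; they are not part of the program above.

```python
import re

# Same pattern as in _force_CRLF: match an existing CRLF, a bare LF,
# or a bare CR (a CR that is not followed by LF).
_LINE_END = re.compile(r'(?:\r\n|\n|\r(?!\n))')

def force_crlf(data):
    """Return data with every line ending rewritten as CRLF."""
    return _LINE_END.sub("\r\n", data)

# Hypothetical sample that mixes all three line-ending styles.
sample = "line one\nline two\r\nline three\rline four"
print(repr(force_crlf(sample)))
```

Because an existing CRLF is matched as one unit and replaced by itself, running the function on text that is already normalized leaves it unchanged.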
+1  A: 

I suggest adding debug output to your program that dumps the character codes just before insertion into the DB. There is a chance that Jython replaces the CRLF pair with a single character and doesn't restore it when the data is written to the DB.
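A dump of that kind could look like the sketch below. It assumes a current Python interpreter (Jython 2.2.1 would need an explicit loop instead of the generator expression), and dump_codes is a hypothetical helper name, not something from the question's code.

```python
def dump_codes(text):
    """Return the character codes of text as space-separated hex values."""
    return " ".join("%02x" % ord(ch) for ch in text)

# A CRLF pair shows up as "0d 0a"; a bare LF shows up as "0a" alone.
print(dump_codes("a\r\nb"))
print(dump_codes("a\nb"))
```

Calling it on the body right before setString would show whether the line endings reaching the database are CRLF pairs or single characters.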

Dmitry Khalatov
When I output the debug info, the formatting is correct. Please note this: http://mail.python.org/pipermail/spambayes/2003-April/004477.html Any connection?
Setori
correct man! nice!
Setori
It seems that marking right answers and voting are disabled! Anyway, let me update the question with the debug info I gathered.
Setori
+1  A: 

You should look at the quopri module (and the other email-related modules) so that you don't have to resort to dirty tricks such as _clean_body.
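As a rough sketch of that suggestion, and assuming the body really is quoted-printable encoded (the stray '=20' sequences point that way, but the question does not confirm it), the standard-library call quopri.decodestring can decode it. The example targets a current Python interpreter, where the function takes and returns bytes; decode_body is a hypothetical name.

```python
import quopri

def decode_body(raw):
    """Decode a quoted-printable encoded body given as bytes."""
    return quopri.decodestring(raw)

# '=20' is the quoted-printable encoding of a space character.
print(decode_body(b"This is a test=20only!"))
```

Note the difference from _clean_body: decoding turns '=20' back into a space, whereas _clean_body deletes the sequence entirely.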

rapto
cool man thanks for the tip
Setori