Hello,

I am learning Perl and working on a home-made project for my family (a subscription project). The Perl application uses Net::POP3 to connect to my mailbox and save all my emails to a file (Mail.txt). When I open this file I see a lot of junk, as shown below. What can I do to remove it? Thanks.

Return-Path: 
Received: from [unix socket] by embro.tpn.terra.com (LMTP); Sun, 11 Oct 2009 04:09:50
    +0000 (UTC)
X-Abaca-Spam: 153
X-Terra-Karma: -2%
X-Terra-Hash: 2c7d32f717e807b11af5c0871edb9e93
Received-SPF: pass (embro.tpn.terra.com: domain of linuxquestions.org designates
    208.101.3.244 as permitted sender) client-ip=208.101.3.244;
    [email protected]; helo=sql02.linuxquestions.org;
Received: from sql02.linuxquestions.org (smtp.linuxquestions.org [208.101.3.244])
    by embro.tpn.terra.com (Postfix) with ESMTP id 14EA1580000A2
    for ; Sun, 11 Oct 2009 04:09:49 +0000 (UTC)
Received: from web02.linuxquestions.org (web02-be.linuxquestions.org [10.13.156.4])
    by sql02.linuxquestions.org (8.13.8/8.13.8) with ESMTP id n9B49mXe005694
    for ; Sun, 11 Oct 2009 00:09:48 -0400
DomainKey-Signature: a=rsa-sha1; s=smtp; d=linuxquestions.org; c=simple; q=dns;
    b=Le/RccpkHMfH426hLwlLkIbCujr0LiWKM32ryuZ1fWwYU6VjCTocd30N/JAg+w77d
    54VJkNnpA18iQxJ/yfKyQ==
Received: from web02.linuxquestions.org (localhost.localdomain [127.0.0.1])
    by web02.linuxquestions.org (8.13.8/8.13.8) with ESMTP id n9B49m2f027957
    for ; Sun, 11 Oct 2009 00:09:48 -0400
Received: (from nobody@localhost)
    by web02.linuxquestions.org (8.13.8/8.13.8/Submit) id n9B49mNn027956;
    Sun, 11 Oct 2009 00:09:48 -0400
Date: Sun, 11 Oct 2009 00:09:48 -0400
To: [email protected]
Subject: "What programs would you like to see ported to Linux?" update
From: "LinuxQuestions.org" 
Auto-Submitted: auto-generated
Message-ID: 
X-Priority: 3
X-Mailer: LQ Mailer
Content-Type: text/plain; charset="ISO-8859-1"
Content-Transfer-Encoding: 8bit
MIME-Version: 1.0
Status: O

Dear nathanpc,
+5  A: 

It's not junk; those are the email headers. Use, for example, Mail::Message to parse them. Something like this:

    use Mail::Message;
    my $msg_obj = Mail::Message->read($rawdata);
    my $body    = $msg_obj->body;   # message body, without the headers
gonzo
Thanks very much!
Nathan Campos
Kids these days, spoiled by graphical email viewers that hide all headers... :)
Ether
+1  A: 

You know, I did recommend Mail::POP3Client, which abstracts away the details:

Body( MESSAGE_NUMBER )

Get the body of the specified message, either as an array of lines or as a string, depending on context.

BodyToFile( FILE_HANDLE, MESSAGE_NUMBER )

Get the body of the specified message and write it to the given file handle.
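
As a rough sketch of how these two methods might be used together: `save_bodies` below is a hypothetical helper, not part of the module, and the host, user name and password in the usage comment are placeholders. It assumes `$pop` is an already connected Mail::POP3Client object.

```perl
use strict;
use warnings;

# Hypothetical helper: write the body (no headers) of every message
# on the server to $out_fh. Returns the number of messages written.
sub save_bodies {
    my ($pop, $out_fh) = @_;
    my $count = $pop->Count;
    return 0 if !defined $count || $count < 1;   # nothing to fetch
    for my $n (1 .. $count) {
        $pop->BodyToFile($out_fh, $n);
    }
    return $count;
}

# Possible usage (placeholder credentials, needs Mail::POP3Client installed):
#   use Mail::POP3Client;
#   my $pop = Mail::POP3Client->new(
#       HOST => 'pop.example.com', USER => 'me', PASSWORD => 'secret');
#   open my $out, '>', 'Mail.txt' or die "Cannot open Mail.txt: $!";
#   save_bodies($pop, $out);
#   close $out;
#   $pop->Close;
```

Passing the client object in, rather than connecting inside the helper, keeps the credentials out of the function and makes it easy to try with a stand-in object.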

Sinan Ünür
He says he's learning; maybe he wants to do it the hard way for pedagogical reasons, just like using the definition of the derivative to calculate derivatives of polynomial functions. The shortcut is WAY easier, but learning to apply the underlying method is valuable, too. http://web.mit.edu/wwmath/calculus/differentiation/polynomials.html
daotoad
@daotoad: First, I know how to take a derivative. Second, if one were to apply the same logic here, one would have started with reading the applicable RFC and one would have known what an email header is. And, how is using `Mail::Message` to parse the message (see accepted answer) different than what I recommend in principle?
Sinan Ünür
A: 

Email headers consist of all text up to the first completely blank line. So, if you actually do want to throw them away (rather than using a good module to parse them as the earlier examples suggested), just throw away everything up to and including the first blank line.
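
A minimal sketch of that approach for a single message held in a string, using only core Perl. The name `strip_headers` and the sample message are made up for illustration; the pattern accepts both LF and CRLF line endings.

```perl
use strict;
use warnings;

# Return only the body of one raw message: everything after the
# first completely blank line. Returns '' if there is no blank line.
sub strip_headers {
    my ($raw) = @_;
    my (undef, $body) = split /\r?\n\r?\n/, $raw, 2;
    return defined $body ? $body : '';
}

# Made-up sample message for illustration.
my $raw_message = join "\n",
    'Date: Sun, 11 Oct 2009 00:09:48 -0400',
    'Subject: update',
    '',
    'Dear nathanpc,',
    'This is the body.',
    '';

print strip_headers($raw_message);
```

The limit of 2 passed to `split` matters: without it, blank lines inside the body would split the message further.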

If you're looking at an mbox-format mailbox file containing multiple messages, you can identify the start of the next message's headers by looking for a line which starts with the five characters "From " (note the trailing space - this is what distinguishes it from a "From:" header).
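
The "From " rule can be sketched like this, again with core Perl only. `split_mbox` and the sample mailbox are hypothetical, and the sketch deliberately ignores mbox variants: it does not check that the separator is preceded by a blank line, and it does not unquote ">From " lines in bodies.

```perl
use strict;
use warnings;

# Split mbox-format text into individual messages. A new message starts
# at every line beginning with the five characters "From ".
sub split_mbox {
    my ($text) = @_;
    my @messages;
    for my $line (split /^/m, $text) {      # keep line endings
        if ($line =~ /^From /) {
            push @messages, '';             # start a new message
            next;                           # drop the separator line itself
        }
        next unless @messages;              # skip anything before the first "From "
        $messages[-1] .= $line;
    }
    return @messages;
}

# Made-up two-message mailbox; note the "From:" header in the second one.
my $mbox_text = join "\n",
    'From a@example.com Sun Oct 11 00:09:48 2009',
    'Subject: one',
    '',
    'first body',
    '',
    'From b@example.com Sun Oct 11 00:10:00 2009',
    'Subject: two',
    'From: someone',
    '',
    'second body',
    '';

my @mbox_messages = split_mbox($mbox_text);
print scalar(@mbox_messages), " messages found\n";
```

Each element still contains its own headers, so the blank-line rule above can then be applied per message.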

Dave Sherohman
There is a good example at http://perldoc.perl.org/perlop.html#Range-Operators (scroll down to the examples).
Sinan Ünür
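
For reference, here is a sketch in the spirit of that perlop example, using the range (flip-flop) operator to skip header lines while reading a single message from a filehandle. `body_lines` and the sample text are hypothetical; the constant `1` on the left is compared against the current input line number `$.`, so this assumes the message starts at line 1 of the handle.

```perl
use strict;
use warnings;

# Return the body lines of one message read from $fh, skipping
# everything from line 1 up to and including the first blank line.
sub body_lines {
    my ($fh) = @_;
    my @lines;
    while (my $line = <$fh>) {
        # True from input line 1 until a blank line has been seen.
        next if 1 .. $line =~ /^\r?$/;
        push @lines, $line;
    }
    return @lines;
}

# Made-up sample message, read through an in-memory filehandle.
my $sample = "Subject: update\nStatus: O\n\nDear nathanpc,\n\nMore text.\n";
open my $sample_fh, '<', \$sample or die "Cannot open in-memory handle: $!";
my @sample_body = body_lines($sample_fh);
close $sample_fh;

print @sample_body;
```

Blank lines inside the body are kept, because once the range has ended it can only restart at input line 1.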