Hi, I have this string:

--0-1946616131-1282798399=:21360 Content-Type: text/plain; charset=us-ascii --------------
------ do not change ---------------------------- Ticket ID : #987336 --------------------
------------------------------------------- Hello, This is my problem try to solve this 
thank u --0-1946616131-1282798399=:21360 Content-Type: text/html; charset=us-ascii"

Now I want to remove the

--0-1946616131-1282798399=:21360 Content-Type: text/plain; charset=us-ascii

and

--0-1946616131-1282798399=:21360 Content-Type: text/html; charset=us-ascii

sections from it. In other words, I want to clean up the text.

How can I do that?

A: 

You could use a regular expression, or you could try a couple of splits. Here's the second option:

//the original string
$string = "--0-1946616131-1282798399=:21360 Content-Type: text/plain; charset=us-ascii -------------------- do not change ---------------------------- Ticket ID : #987336 --------------------------------------------------------------- Hello, This is my problem try to solve this thank u --0-1946616131-1282798399=:21360 Content-Type: text/html; charset=us-ascii";
//split the string into lines separated by --0-
$splitstring = explode("--0-",$string);
print "<pre>";
print_r($splitstring);
print "</pre>";
//create an array that will be our final clean strings
$cleanstrings = array();
//go through each of our lines
foreach($splitstring as $line){
    //if it has content
    if (strlen($line)>0) {
        //then split it again to get rid of the junk sections
        $splitline = explode("charset=us-ascii",$line);
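        //if there is a second part and it has content (isset avoids an undefined-offset notice
        //when "charset=us-ascii" does not occur in the line)
        if (isset($splitline[1]) && strlen(trim($splitline[1])) > 0) {
            //then add it, without surrounding whitespace, to our list of clean strings
            $cleanstrings[] = trim($splitline[1]);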
        }
    }
}
print "<pre>";
print_r($cleanstrings);
print "</pre>";
Bob Baddeley
I posted this option because I suspect the part to strip out won't always be the same, so the asker will either have to write a complicated replace function that covers all the possibilities, or do something like this.
Bob Baddeley
A: 

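Use this simple one-liner (where $text is the input text). Passing an array as the search argument removes both the text/plain and the text/html header in one call:

$newtext = str_replace(array('--0-1946616131-1282798399=:21360 Content-Type: text/plain; charset=us-ascii', '--0-1946616131-1282798399=:21360 Content-Type: text/html; charset=us-ascii'), '', $text);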
shamittomar
Actually, this text is part of a multipart email message that came from Yahoo.
Emrul Hasan
A: 

Please clarify: does this string vary (and if so, how), or is it always the same?

Also it seems that you are doing something wrong in the first place to get this string. Or do you have no control over the incoming string?

Functions to look at: str_replace, preg_replace and explode

Thomas
Actually, I was reading POP3 emails, and this is the body of a message that came from Yahoo Mail. Bodies from Gmail and other providers are fine; the problem is only with Yahoo. I want to save the messages in the DB.
Emrul Hasan
A: 

This seems like part of a MIME multipart message. If that's the case, the parts you are looking to remove are unpredictable.

The break between different parts should be specified in the message header like so:

Content-Type: multipart/mixed; boundary="frontier"

boundary="frontier" means that every new part of the message will be introduced by something like this:

--frontier
Content-Type: text/plain

Since the sender of the message is completely free to choose any text he likes for the boundaries, they're unpredictable without looking at the message header. Unless you have a really specific case of very specific boundaries, it's almost impossible to reliably remove the boundary text after the fact. It needs to be "cleaned up" while the message is being parsed.
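
As a rough illustration of the point above, here is a minimal sketch that reads the boundary from a Content-Type header value and uses it to split a raw multipart body. The helper names `extractBoundary` and `splitParts` and the sample header and body are made up for this example. It ignores nested multiparts and other edge cases, so treat it as a way to see the mechanism rather than a replacement for a real MIME parser.

```php
<?php
// Hypothetical helper: pull the boundary parameter out of a Content-Type header value.
function extractBoundary(string $contentType): ?string
{
    // Accepts both boundary="frontier" and boundary=frontier.
    if (preg_match('/boundary\s*=\s*(?:"([^"]+)"|([^\s;]+))/i', $contentType, $m)) {
        return $m[1] !== '' ? $m[1] : $m[2];
    }
    return null; // not a multipart header, or no boundary given
}

// Hypothetical helper: split a raw multipart body into its parts (headers + content).
function splitParts(string $body, string $boundary): array
{
    $chunks = explode('--' . $boundary, $body);
    array_shift($chunks); // everything before the first boundary is preamble

    $parts = array();
    foreach ($chunks as $chunk) {
        // "--boundary--" closes the message, so a chunk starting with "--" ends the loop.
        if (strncmp($chunk, '--', 2) === 0) {
            break;
        }
        $parts[] = trim($chunk);
    }
    return $parts;
}

// Sample data, invented for this sketch.
$header = 'multipart/mixed; boundary="frontier"';
$body   = "--frontier\r\nContent-Type: text/plain\r\n\r\nHello\r\n--frontier--\r\n";

$boundary = extractBoundary($header);
$parts    = $boundary === null ? array() : splitParts($body, $boundary);
print_r($parts);
```

Each returned part still carries its own headers (such as Content-Type: text/plain), which would need to be separated from the content at the first empty line.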

If you are dealing with a very limited, predictable set of boundaries, you should specify their format and try to remove them with a regular expression.
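
For that limited case, a sketch with `preg_replace` could look like the following. The pattern is an assumption derived only from the single sample in the question (a `--0-<digits>-<digits>=:<digits>` boundary followed by a text/plain or text/html Content-Type); the function name `stripKnownBoundaries` is hypothetical, and boundaries of any other shape will not match.

```php
<?php
// Hypothetical helper: remove boundary + Content-Type markers shaped like the sample.
function stripKnownBoundaries(string $text): string
{
    // Assumed format: --0-<digits>-<digits>=:<digits> Content-Type: text/(plain|html); charset=<name>
    $pattern = '/--0-\d+-\d+=:\d+\s+Content-Type:\s*text\/(?:plain|html);\s*charset=[\w-]+/i';
    return trim(preg_replace($pattern, '', $text));
}

// Shortened version of the string from the question.
$text = '--0-1946616131-1282798399=:21360 Content-Type: text/plain; charset=us-ascii '
      . 'Hello, This is my problem try to solve this thank u '
      . '--0-1946616131-1282798399=:21360 Content-Type: text/html; charset=us-ascii';

$clean = stripKnownBoundaries($text);
echo $clean, "\n";
```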

deceze
I am parsing emails using the IMAP functions and saving them in the DB (from, date, subject and body). That mail came from Yahoo; mails from Gmail are fine. Is there a good way to do this (reading POP3 emails and saving them in the DB)?
Emrul Hasan
@Emrul Yes, you need to parse the mails properly. There are many different ways in which mails can be encoded; just because the ones you get from Gmail consist only of plain text doesn't mean that's all there is to it. You need to anticipate multipart messages too, in which case you need to take a look at functions like `imap_fetchstructure` and `imap_fetchbody`. You also need to anticipate the different transfer encodings and text encodings mails could be in.
deceze
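
To make the last comment a little more concrete, here is a minimal sketch of how the object returned by `imap_fetchstructure` might be walked to find the text/plain part, whose section number can then be passed to `imap_fetchbody`. The helper names `findPlainTextPart` and `decodePart` are hypothetical, the structure objects below are hand-built stand-ins so the example runs without a mail server, and embedded message/rfc822 parts and charset conversion are not handled.

```php
<?php
// Hypothetical helper: find the first text/plain part in an imap_fetchstructure() result.
// Returns array('section' => '1.2', 'encoding' => 4) or null if there is none.
function findPlainTextPart(object $structure, string $prefix = ''): ?array
{
    if (empty($structure->parts)) {
        // type 0 means "text"; a non-multipart message body is section "1".
        $isPlain = ($structure->type ?? null) === 0
            && strtoupper($structure->subtype ?? '') === 'PLAIN';
        if (!$isPlain) {
            return null;
        }
        return array(
            'section'  => $prefix === '' ? '1' : $prefix,
            'encoding' => $structure->encoding ?? 0,
        );
    }
    foreach ($structure->parts as $i => $part) {
        // Sub-parts are numbered from 1 and nested with dots, e.g. "1.2".
        $section = ($prefix === '' ? '' : $prefix . '.') . ($i + 1);
        $found = findPlainTextPart($part, $section);
        if ($found !== null) {
            return $found;
        }
    }
    return null;
}

// Hypothetical helper: undo the transfer encoding reported in the structure.
function decodePart(string $raw, int $encoding): string
{
    if ($encoding === 3) { // 3 = base64
        $decoded = base64_decode($raw);
        return $decoded === false ? $raw : $decoded;
    }
    if ($encoding === 4) { // 4 = quoted-printable
        return quoted_printable_decode($raw);
    }
    return $raw; // 7bit, 8bit, binary: nothing to decode
}

// Hand-built stand-in for a multipart/alternative message (HTML first, plain second).
$html  = (object) array('type' => 0, 'subtype' => 'HTML',  'encoding' => 4);
$plain = (object) array('type' => 0, 'subtype' => 'PLAIN', 'encoding' => 4);
$multi = (object) array('type' => 1, 'subtype' => 'ALTERNATIVE', 'parts' => array($html, $plain));

$part = findPlainTextPart($multi);
print_r($part);

// With a real connection (requires the IMAP extension) this would continue roughly as:
// $structure = imap_fetchstructure($imap, $msgNo);
// $part      = findPlainTextPart($structure);
// $body      = decodePart(imap_fetchbody($imap, $msgNo, $part['section']), $part['encoding']);
```

Fetching only the text/plain section this way means the boundary lines never end up in the stored body, so no cleanup step is needed afterwards.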