I have some strings I need to scrape data from. I need a simple way of telling PHP to look in the string and delete data before and after the part I need. An example is:

When: Sat 19 Sep 2009 22:00 to Sun 20 Sep 2009 03:00 

I want to delete the "When: " and then remove the & and everything after it. Is this a job for regex? I haven't really used them before.

+1  A: 

I would not use regular expressions for this.

$data = substr($input, 6, strpos($input, '&') - 6);
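
A slightly more defensive sketch of the same idea, assuming the string always begins with the prefix "When: ". The function name `extractWhen` and the trailing `&nbsp;` in the sample input are made up for illustration. Since `strpos` returns `false` when there is no & at all, the sketch falls back to the end of the string in that case.

```php
<?php
// Hypothetical helper, not part of the original answer.
// Assumes $input begins with $prefix.
function extractWhen(string $input, string $prefix = 'When: '): string
{
    $start = strlen($prefix);
    // strpos() returns false when '&' is absent, so compare with ===.
    $end = strpos($input, '&', $start);
    if ($end === false) {
        $end = strlen($input);
    }
    // Cut out the middle part and drop surrounding whitespace.
    return trim(substr($input, $start, $end - $start));
}

// Sample input; the "&nbsp;" suffix is an assumption about what follows the &.
echo extractWhen('When: Sat 19 Sep 2009 22:00 to Sun 20 Sep 2009 03:00&nbsp;'), "\n";
```

This only works when the prefix sits at the very start of the string; for text with arbitrary content before it, a regex is probably the easier route.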
Alex Barrett
+1  A: 

Yes, regex can do this kind of thing in its sleep.

$result = preg_replace('/When: (.*)&.*/', '$1', $text);
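
Because `.*` is greedy, the pattern above runs up to the last & in the string. Here is a rough sketch of a variant that stops at the first & instead, using `preg_match` so that a non-match can be detected (`preg_replace` simply returns the input unchanged when nothing matches). The function name `whenBetween` and the sample string are made up for illustration.

```php
<?php
// Hypothetical helper: returns the text between "When: " and the first '&',
// or null when the pattern does not match.
function whenBetween(string $text): ?string
{
    // [^&]* matches any run of characters other than '&', so the capture
    // ends at the first ampersand rather than the last one.
    if (preg_match('/When: ([^&]*)&/', $text, $m) === 1) {
        return trim($m[1]);
    }
    return null;
}

// Sample input with text on both sides; "&nbsp;" is an assumed suffix.
var_dump(whenBetween('foo When: Sat 19 Sep 2009 22:00 to Sun 20 Sep 2009 03:00&nbsp;bar'));
```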

UPDATE: If you want to find only the date range, in the middle of a lot of other text, here is a crude regex that will match the one in the question...

if (preg_match('/[a-z]{3} [0-9]{2} [a-z]{3} [0-9]{4} [0-9]{2}:[0-9]{2} to [a-z]{3} [0-9]{2} [a-z]{3} [0-9]{4} [0-9]{2}:[0-9]{2}/i', $text, $regs)) {
    $result = $regs[0];
} else {
    $result = "";
}
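
To get at the two halves of the range separately, the same pattern can be wrapped in capture groups. The sketch below uses named groups; the function name `findDateRange` and the group names `start` and `end` are made up for illustration, and the date pattern itself is the crude one from above.

```php
<?php
// Hypothetical helper: finds the first
// "Day DD Mon YYYY HH:MM to Day DD Mon YYYY HH:MM" range in $text and
// returns its two halves, or null if no range is found.
function findDateRange(string $text): ?array
{
    // One timestamp, e.g. "Sat 19 Sep 2009 22:00" (same crude pattern as above).
    $stamp = '[a-z]{3} [0-9]{2} [a-z]{3} [0-9]{4} [0-9]{2}:[0-9]{2}';
    // Named groups make the matches available as $m['start'] and $m['end'].
    $pattern = '/(?<start>' . $stamp . ') to (?<end>' . $stamp . ')/i';
    if (preg_match($pattern, $text, $m) !== 1) {
        return null;
    }
    return ['start' => $m['start'], 'end' => $m['end']];
}

print_r(findDateRange('blah When: Sat 19 Sep 2009 22:00 to Sun 20 Sep 2009 03:00 blah'));
```

Note that `[0-9]{2}` requires a two-digit day, so a date written as "5 Sep" would not match without loosening the pattern.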
rikh
This would be better: `$result = preg_replace('/When: ([^`
Jeremy Stein
Sorry, I should have mentioned that there could be plenty of text before or after those areas. I need the regex to remove those parts and everything preceding and following them.
Plasticated
Ah, you didn't say that. I have updated the answer to include a regex that will find the date range in the middle of a general body of text. It is a bit crude, but by adding brackets around each part you will be able to access all the parts of the dates separately which might be useful.
rikh
A: 

So you would want to keep "Sat 19 Sep 2009 22:00 to Sun 20 Sep 2009 03:00"

Well, you can go for a regexp all right. I don't know much about regexps in PHP, but in Perl you could do something like

/^When: (.*)\ $/ .

The (.*) group captures the part you want to keep. In Perl, you would read it from the $1 variable.

Or you could do something like

/^When: (.*)\&.*$/ if the content after the & is variable.

Also, watch out: if the string you want to keep contains an &, it might be a little more tricky.
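
To illustrate the point in PHP terms: a greedy `(.*)` runs to the last & in the string, while a lazy `(.*?)` stops at the first one. The sample string below is made up, with an & inside the part to keep and an assumed `&nbsp;` suffix.

```php
<?php
// Made-up sample in which the wanted text itself contains an ampersand.
$text = 'When: Sat & Sun 22:00&nbsp;';

// Greedy: (.*) runs as far as it can, so the match ends at the LAST '&'.
preg_match('/^When: (.*)&/', $text, $greedy);

// Lazy: (.*?) stops at the FIRST '&', cutting the wanted text short.
preg_match('/^When: (.*?)&/', $text, $lazy);

echo $greedy[1], "\n"; // Sat & Sun 22:00
echo $lazy[1], "\n";   // "Sat " (with a trailing space)
```

Which behaviour is right depends on the data: greedy works when the only & after the wanted text is the final one, lazy when the wanted text never contains an &.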

But regular expressions are usually the way to go for this type of work.

David Brunelle