ansaurus

Question

PHP - REGEX - use string for pattern but exclude it from being removed!

Answer 1

+1 A:

You want to remove everything between a semicolon and either a colon or the end of the line, right? So use that as your expression. You're overcomplicating things.

preg_replace('/(?:;.+?:)|(?:;.+?$)/m','',$data);

It's a pretty simple expression. Either match (?:;.+?:) or (?:;.+?$), which differ only by their terminator (the first one matches up to a colon, the second one matches up to the end of the line).

Each is a non-capturing group that starts with a semicolon, reluctantly reads in all characters, then stops at the terminator. Everything matched by this is removable according to your description.
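
One thing to watch for: the first alternative includes the colon in the match, so the colon is removed along with the parameters. If the colon has to stay, a lookahead is one possible adjustment. The snippet below is only a sketch. The sample $data is borrowed from the lines quoted elsewhere in this thread, and the lookahead variant is a suggested alternative, not part of the expression as posted.

```php
<?php
// Sample input (assumed for illustration; taken from lines quoted in this thread).
$data = "DTSTART;TZID=\"America/Chicago\":20030819T000000\n"
      . "DTSTART;VALUE=DATE";

// The expression as posted: the first alternative consumes the colon too.
$asPosted = preg_replace('/(?:;.+?:)|(?:;.+?$)/m', '', $data);

// Hypothetical variant: the lookahead (?=:|$) stops the match just before
// a colon or the end of the line without consuming it, so the colon is kept.
$keepColon = preg_replace('/;.+?(?=:|$)/m', '', $data);

echo $asPosted, PHP_EOL;   // DTSTART20030819T000000 / DTSTART
echo PHP_EOL;
echo $keepColon, PHP_EOL;  // DTSTART:20030819T000000 / DTSTART
```

Which of the two you want depends on whether the value after the colon still needs its separator.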

Welbog 2010-04-28 14:51:05

Answer 2

+1 A:

If you want to retain part of the matched pattern in a substitution, put parentheses around that part and refer to it as $1 in the replacement (or $2, $3 and so on for later groups).

For example:

s/^(this is a sentence) to edit/$1/

gives "this is a sentence"
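
The s/// line above is Perl/sed notation. In PHP the same idea would probably look something like the sketch below, using preg_replace with '$1' in the replacement string. The second call applies it to a property line; that sample line and the pattern are assumptions for illustration, not taken from the question.

```php
<?php
// PHP equivalent of the s/// example: keep group 1, drop the rest of the match.
$sentence = preg_replace(
    '/^(this is a sentence) to edit/',
    '$1',
    'this is a sentence to edit'
);

// Same idea on a property line (hypothetical sample): capture the leading
// name, match the parameters up to (not including) the colon, and put back
// only the captured name.
$line = preg_replace(
    '/^([A-Z]+);[^:\n]*/m',
    '$1',
    'DTSTART;TZID=US/Pacific:20030819T000000'
);

echo $sentence, PHP_EOL; // this is a sentence
echo $line, PHP_EOL;     // DTSTART:20030819T000000
```

Note the single quotes around '$1', so PHP does not try to interpolate it as a variable.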

dnagirl 2010-04-28 14:52:11

Answer 3

+1 A:

nik 2010-04-28 14:54:26

Answer 4

+1 A:

You could use a relatively simple regex like the following.

$subject = 'DTSTART;TZID="America/Chicago":20030819T000000
DTEND;TZID="America/Chicago":20030819T010000
DTSTART;TZID=US/Pacific
DTSTART;VALUE=DATE';

echo preg_replace('/^[A-Z]+\K[^:\n]*/m', '', $subject) . PHP_EOL;

It looks for a series of capital letters at the start of a line, resets the match starting point (that's what \K does) to the end of those and matches anything not a colon or newline (i.e. the parts you want to remove). Those matched parts are then replaced with an empty string.

The output from the above would be

DTSTART:20030819T000000
DTEND:20030819T010000
DTSTART
DTSTART

If the lines that you are interested in will only ever start with DTSTART or DTEND then we could be more precise about what to match (e.g. ^DT(?:START|END) in place of ^[A-Z]+), but [A-Z]+ already covers both of those.
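
As a rough sketch of that stricter variant: the SUMMARY line below is a made-up extra line, added only to show that other properties are left alone.

```php
<?php
// Sample input; the SUMMARY line is hypothetical and only there to show
// that lines not starting with DTSTART or DTEND are left untouched.
$subject = "DTSTART;TZID=US/Pacific:20030819T000000\n"
         . "DTEND;VALUE=DATE\n"
         . "SUMMARY;LANGUAGE=en:Lunch";

// Same \K technique as above, but anchored to DTSTART or DTEND only.
$result = preg_replace('/^DT(?:START|END)\K[^:\n]*/m', '', $subject);

echo $result, PHP_EOL;
// DTSTART:20030819T000000
// DTEND
// SUMMARY;LANGUAGE=en:Lunch
```

The trade-off is that any other property carrying parameters would need its own pattern, or the more general [A-Z]+ version.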

salathe 2010-04-28 15:16:00
