ansaurus

Question

Preg_split() help

Answer 1

+2 A:

First of all: use a parser to modify XML (something like SimpleXML or DOM could suit you fine, depending on the actions taken next).
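A minimal sketch of the parser route with DOM, assuming the input is the sample string from the regex call in this answer and that the DOM extension is enabled. The `<root>` wrapper and the `$parts` variable are made up for illustration; a fragment with several top-level elements is not a well-formed document on its own, so it has to be wrapped before loading.

```php
<?php
// Sample string taken from the preg_split() call in this answer.
$xml = '<word>test</word><word>test2</word>';

// Wrap the fragment in a hypothetical <root> element so it parses as one document.
$doc = new DOMDocument();
$doc->loadXML('<root>' . $xml . '</root>');

$parts = array();
foreach ($doc->documentElement->childNodes as $node) {
    // saveXML($node) serializes a single node, tags included.
    $parts[] = $doc->saveXML($node);
}

print_r($parts); // two entries: <word>test</word> and <word>test2</word>
```

From here each `$node` can be inspected or modified through the DOM API instead of by string surgery.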

However, for the sake of argument:

preg_split(":(</?word>):",
    "<word>test</word><word>test2</word>",
    0,
    PREG_SPLIT_NO_EMPTY|PREG_SPLIT_DELIM_CAPTURE);
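To see what each of the two flags contributes, here is a small sketch that runs the same pattern on the same string with different flag combinations. The variable names are made up for illustration.

```php
<?php
$pattern = ':(</?word>):';
$subject = '<word>test</word><word>test2</word>';

// No flags: the tags are consumed as delimiters and empty pieces are kept.
$plain = preg_split($pattern, $subject);
// array('', 'test', '', 'test2', '')

// PREG_SPLIT_NO_EMPTY alone drops the empty pieces but still loses the tags.
$noEmpty = preg_split($pattern, $subject, -1, PREG_SPLIT_NO_EMPTY);
// array('test', 'test2')

// Both flags: the parenthesized delimiter is returned as its own piece.
$both = preg_split(
    $pattern,
    $subject,
    -1,
    PREG_SPLIT_NO_EMPTY | PREG_SPLIT_DELIM_CAPTURE
);
// array('<word>', 'test', '</word>', '<word>', 'test2', '</word>')

var_dump($plain, $noEmpty, $both);
```

Note that PREG_SPLIT_DELIM_CAPTURE only returns what the parentheses in the pattern capture, so the group around `</?word>` is required.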

Wrikken 2010-08-05 19:21:23

What's with the `is` modifiers; I'd give a vote if they weren't just copy/pasted from the question.

salathe 2010-08-05 19:36:37

Ah, yes, wholly unnecessary indeed. I'll edit them out. (I do remember when starting out with regexes years ago I typed `/six` almost per default :), at this moment I was just lazy c/p-ing of course... :P )

Wrikken 2010-08-05 19:39:10

And here is your upvote, thank you for indulging a persnickety regex-author. :-)

salathe 2010-08-05 19:41:46

And right you are to point it out ;)

Wrikken 2010-08-05 20:02:26

Answer 2

A:

First off, NEVER USE REGEX TO PARSE HTML.

But to solve your problem, look at the flags for preg_split():

$words = preg_split(
    ":(</?word>):is",
    $html,
    -1,
    PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY
);

Now, it'll split them and give you this:

array(7) {
  [0]=>
  string(6) "<word>"
  [1]=>
  string(4) "test"
  [2]=>
  string(7) "</word>"
  [3]=>
  string(2) ", "
  [4]=>
  string(6) "<word>"
  [5]=>
  string(5) "test2"
  [6]=>
  string(7) "</word>"
}

Still no good. But what we can do is loop over the array, holding each <word> in a buffer so it is prepended to the next piece, and appending each </word> to the previous piece:

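$buffer = '';
$newWords = array();
foreach ($words as $word) {
    if (strcasecmp($word, '<word>') === 0) {
        // Hold the opening tag until the text that follows it
        $buffer .= $word;
    } elseif (strcasecmp($word, '</word>') === 0) {
        // Find the last buffer and append the closing tag to it
        $last = end($newWords);
        if ($last === false) {
            // Nothing collected yet, so start a new entry
            $newWords[] = $buffer . $word;
        } else {
            $newWords[key($newWords)] = $last . $buffer . $word;
        }
        $buffer = '';
    } else {
        // Plain text: prefix it with any pending opening tag
        $newWords[] = $buffer . $word;
        $buffer = '';
    }
}
// Keep a trailing opening tag that was never followed by text
if ($buffer !== '') {
    $newWords[] = $buffer;
}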

Which would give you:

array(3) {
  [0]=>
  string(17) "<word>test</word>"
  [1]=>
  string(2) ", "
  [2]=>
  string(18) "<word>test2</word>"
}
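For comparison, a shorter route that skips the loop: capture the whole element in the delimiter instead of the individual tags. This is a different pattern from the one above, sketched under the assumptions that `<word>` elements are never nested and that `$html` holds the sample string implied by the dumps.

```php
<?php
// Sample input assumed from the var_dump() output above.
$html = '<word>test</word>, <word>test2</word>';

// The lazy .*? stops at the first closing tag, so each element is one delimiter.
$parts = preg_split(
    ':(<word>.*?</word>):is',
    $html,
    -1,
    PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY
);

var_dump($parts); // "<word>test</word>", ", ", "<word>test2</word>"
```

With nested or malformed tags the lazy match would pair the wrong tags, which is one more reason to prefer a parser.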

ircmaxell 2010-08-05 19:27:33
