ansaurus

Question

Extract everything between <object></object>

Answer 1

+4 A:

See "Can you provide some examples of why it is hard to parse XML and HTML with a regex?" for why this is probably the wrong thing to do.

That said, you might be able to get away with something like /(<object>.*?<\/object>)/s. This matches the string "<object>" followed by as few characters as possible up to the first "</object>". The s modifier on the end tells . to match newlines as well (it normally doesn't).
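A minimal PHP sketch of how the pattern above might be applied. The variable $html and its sample contents are made up for illustration; real input would come from wherever the markup lives.

```php
<?php
// Hypothetical sample input, not taken from the question.
$html = "<p>before</p>\n<object>\n<param name=\"a\" value=\"1\">\n</object>\n<p>after</p>";

// The s modifier lets . match newlines; .*? is non-greedy, so each
// match stops at the first </object> it reaches.
preg_match_all('/(<object>.*?<\/object>)/s', $html, $matches);

// $matches[1] holds the text captured by the first group of every match.
foreach ($matches[1] as $block) {
    echo $block, "\n";
}
```

Note that this only matches an opening tag written exactly as <object>, with no attributes.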

Chas. Owens 2009-04-04 14:33:43

+1 for the first paragraph.

strager 2009-04-04 17:13:46

Answer 2

+6 A:

This is partially in response to Owens (because I can't put code in a comment very well). That regex might not work for the object tag, because the opening <object> tag usually has attributes in it. Try this one instead:

/(<object[^>]*>)(.*?)(<\/object>)/si

It's case-insensitive and split into three groups (opening tag, contents, closing tag) for easy reference. It's not 100% perfect, but it should help.
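As a rough sketch of how the three groups could be read back in PHP (the sample markup in $html is invented for the example):

```php
<?php
// Hypothetical sample input with attributes and an upper-case tag name.
$html = '<div><OBJECT id="demo" width="10"><param name="a" value="1"></OBJECT></div>';

if (preg_match('/(<object[^>]*>)(.*?)(<\/object>)/si', $html, $m)) {
    echo "open:  ", $m[1], "\n"; // opening tag, including its attributes
    echo "inner: ", $m[2], "\n"; // everything between the two tags
    echo "close: ", $m[3], "\n"; // closing tag
}
```

preg_match() stops at the first match; preg_match_all() would be the call to use if the page holds several objects.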

St. John Johnson 2009-04-04 14:44:38

> is legal in an attribute value, IIRC.

strager 2009-04-04 17:12:39

Also, this does not handle <object> nesting.

strager 2009-04-04 17:13:11

Which is why it is hard to parse HTML with a Regex. But this will work for his attempt.

St. John Johnson 2009-04-04 18:12:33

Yeah, these are the dangers of trying to use a regex, which is why I used a half-hearted match-what-he-showed approach. Any time spent attempting to bulletproof the regex is time that should have been spent learning how to use a parser.

Chas. Owens 2009-04-04 19:10:13
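To illustrate the comment about > inside an attribute value, here is a small PHP sketch using the pattern from this answer. The sample markup is made up; the point is only that [^>]* ends the "opening tag" at the first >, wherever it appears.

```php
<?php
// Hypothetical input: the data attribute contains a literal > character.
$html = '<object data="a>b">inner</object>';

preg_match('/(<object[^>]*>)(.*?)(<\/object>)/si', $html, $m);

// Group 1 is cut off in the middle of the attribute value, and the
// remainder of the tag leaks into group 2.
echo "open:  ", $m[1], "\n";
echo "inner: ", $m[2], "\n";
```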

Answer 3

A:

This regex will match across any line breaks between the opening and closing tags and capture the entire element in one group:

/(<object[^>]*?>(?:[\s\S]*?)<\/object>)/gi

Scott Evernden 2009-04-04 17:08:22

This will fail if objects are nested.

porneL 2009-04-04 17:42:32

Right, but I don't think I've ever seen objects nested inside objects.

Scott Evernden 2009-04-04 17:43:54

It's completely legal. I've seen it. You can have an image object inside a video object inside a flash object, for example.

strager 2009-04-04 18:18:50
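A short PHP sketch of the nesting failure discussed in these comments, using the pattern from this answer. The nested markup is invented for the example, and the g flag is dropped because PHP has no such modifier; preg_match_all() already finds every match.

```php
<?php
// Hypothetical nested markup: an inner <object> used as fallback content.
$html = '<object id="outer"><object id="inner"></object><p>fallback</p></object>';

preg_match_all('/(<object[^>]*?>(?:[\s\S]*?)<\/object>)/i', $html, $matches);

// The non-greedy match ends at the first </object>, so the outer element
// is truncated and its real closing tag is never part of any match.
echo $matches[1][0], "\n";
```

A DOM-based approach does not have this problem, because the parser tracks which closing tag belongs to which element.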

Answer 4

+3 A:

Using SimpleXML:

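// Parse the document; $xml must be well-formed XML
$sxe = new SimpleXMLElement($xml);
// Select the <object> element by its id attribute
$objects = $sxe->xpath('//object[@id="object701207571"]');
$object = $objects[0];

// Collect the <param> children of that object
$params = $object->xpath('param');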

foreach($params as $param)
{
    $attrs = $param->attributes();
    echo $attrs['name'] . ' = ' . $attrs['value'] . "\n";
}

// Get plain XML:
echo $object->asXML();

strager 2009-04-04 17:30:01

Answer 5

+1 A:

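// Call loadHTML() on an instance; calling it statically fails in PHP 8
$doc = new DOMDocument();
$doc->loadHTML($html);
foreach($doc->getElementsByTagName('object') as $object)
{
   // Serialize each <object> element together with its children
   echo $doc->saveXML($object);
}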

porneL 2009-04-04 17:45:03

Answer 6

A:

@St. John Johnson: Good sir, it's working. Thanks.

kuldeep singh 2010-01-04 06:41:14
