ansaurus

Question

PHP PREG Question

Answer 1

+1 A:

Step one: Remove regular expressions from your toolbox when dealing with HTML. You need a parser.

Step two: Download simple_html_dom for PHP.

Step three: Parse

$html = str_get_html('<SPAN class=placeholder title="" jQuery1262031390171="46">[[[SOMETEXT]]]</SPAN>');
// find() indexes are zero-based, so 0 selects the first span.
$spanText = $html->find('span', 0)->innertext;

Step four: Profit!

Edit

$html->find('span.placeholder', 0)->innertext will return what you want. The selector matches only spans with class=placeholder, whatever their other attributes are.
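
If adding simple_html_dom is not an option, the same idea can be sketched with PHP's bundled DOM extension. This is a minimal sketch rather than the code from the answer: placeholder_texts() is a hypothetical helper name, and it assumes the dom extension is enabled. The parser accepts both class=placeholder and class="placeholder", and the title and jQuery... attributes never need to be known in advance.

```php
<?php
// Hypothetical helper: returns the text of every span whose class list
// contains "placeholder". Assumes PHP's dom extension is available.
function placeholder_texts(string $html): array
{
    $doc = new DOMDocument();
    // Collect parser warnings (e.g. about non-standard attributes) quietly.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    // Match "placeholder" as a whole token inside the class attribute.
    $query = '//span[contains(concat(" ", normalize-space(@class), " "), " placeholder ")]';

    $texts = [];
    foreach ($xpath->query($query) as $span) {
        $texts[] = $span->textContent;
    }
    return $texts;
}

$html = '<SPAN class=placeholder title="" jQuery1262031390171="46">[[[SOMETEXT]]]</SPAN>';
print_r(placeholder_texts($html));
```

Spans without the placeholder class are skipped, so other markup in the source is left out of the result.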

Byron Whitlock 2009-12-28 21:00:02

Byron - I don't know ahead of time the title or the jQuery###="#" piece. Is there any way to use wildcards on those?

OneNerd 2009-12-28 21:06:15

You said you want to strip the span, not keep the attributes?

LiraNuna 2009-12-28 21:16:33

just want the piece [[[SOMETEXT]]] to remain, everything else can go.

OneNerd 2009-12-28 21:18:52

I'm also guessing there will be other, non-placeholder spans in the source. So you'll need to select only the spans with the placeholder class and get their inner text.

pygorex1 2009-12-28 21:21:22

yes, although sometimes the class is set like this: class=placeholder (no quotes), and sometimes with quotes.

OneNerd 2009-12-28 21:24:13

Answer 2

+1 A:

I think this should solve your problem:

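function strip_placeholder_spans( $in_text ) {
    // Capture the text between the first ">" and the following "</".
    preg_match( "/>(.*?)<\//", $in_text, $result );
    // Fall back to the input when nothing matches.
    return isset( $result[1] ) ? $result[1] : $in_text;
}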

marvin 2009-12-28 21:00:47

hmm - not an expert, but wouldn't that strip out all tags?

OneNerd 2009-12-28 21:05:21

Oh yes, sorry, I misunderstood the question. If you only want to strip the span, you can use:

function strip_placeholder_spans( $in_text ) {
    // The i flag lets the pattern match <SPAN> as well as <span>.
    preg_match("/<span(.*?)>(.*?)<\/span>/i", $in_text, $result);
    return $result[2];
}

I'm not sure I understood it right this time either; I'm a bit confused about what you wanted.

marvin 2009-12-28 21:29:35

Answer 3

+1 A:

Use an HTML parser; that is the most robust solution. If you stick with a regular expression, the following code will work for the two code examples you posted:

$s= <<<STR
<span style="" class="placeholder" title="">[[[SOMETEXT]]]</span>
Some Other text &amp; <b>Html</b>
<SPAN class=placeholder title="" jQuery1262031390171="46">[[[SOMETEXT]]]</SPAN>
STR;

preg_match_all('/\<span[^>]+?class="*placeholder"*[^>]+?>([^<]+)?<\/span>/isU', $s, $m);
var_dump($m);

A regular expression gives you very narrowly targeted code: this example only handles this specific, well-formed HTML. For instance, it won't match <span class="placeholder">some text < more text</span>, because the capture group stops at the stray <. If you have control over the source HTML, this may be good enough.
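
The limitation can be checked directly. The sketch below reuses the pattern from this answer unchanged; the variable names and the two test strings are only there for illustration.

```php
<?php
// Pattern copied verbatim from the answer.
$pattern = '/\<span[^>]+?class="*placeholder"*[^>]+?>([^<]+)?<\/span>/isU';

// One of the inputs from the question, and the input with a stray "<".
$ok  = '<SPAN class=placeholder title="" jQuery1262031390171="46">[[[SOMETEXT]]]</SPAN>';
$bad = '<span class="placeholder">some text < more text</span>';

$okCount  = preg_match_all($pattern, $ok, $okMatches);
$badCount = preg_match_all($pattern, $bad, $badMatches);

var_dump($okCount);   // int(1)
var_dump($badCount);  // int(0)
```

The second input produces no match at all, so the span would be left in place silently rather than raising an error.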

pygorex1 2009-12-28 21:20:17

I converted your preg_match_all to a preg_replace, and it appears to do what I need. Thanks -

OneNerd 2009-12-29 16:26:51
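
The preg_replace conversion mentioned in the last comment was not posted. A minimal sketch of what it might look like follows; unwrap_placeholder_spans() is a hypothetical name and the pattern is a simplified variant, not the one OneNerd actually used. It replaces each placeholder span with its inner text and leaves the rest of the markup untouched.

```php
<?php
// Hypothetical helper: replace every placeholder span with its inner text.
function unwrap_placeholder_spans(string $html): string
{
    // Optional quotes around the class value; any other attributes allowed.
    $pattern = '/<span\b[^>]*\bclass=["\']?placeholder["\']?[^>]*>([^<]*)<\/span>/i';
    // preg_replace() returns null on a regex error; keep the input in that case.
    return preg_replace($pattern, '$1', $html) ?? $html;
}

$s = <<<STR
<span style="" class="placeholder" title="">[[[SOMETEXT]]]</span>
Some Other text &amp; <b>Html</b>
<SPAN class=placeholder title="" jQuery1262031390171="46">[[[SOMETEXT]]]</SPAN>
STR;

echo unwrap_placeholder_spans($s), "\n";
```

Like the pattern it is based on, this only copes with spans whose content has no nested tags or stray < characters.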
