Hi all,

I have a problem. I want to detect every number in HTML content except the numbers inside tag attributes, and replace each matched number with other characters. The regex should match only numbers that are not in HTML tag attributes.

Example:

Hi 3456; <a href="?id=4456">your code: 345</a> 

Matched: 3456, 345. Not matched: 4456.

Thanks to all.

+1  A: 

You would be better off using a parser such as PHP Simple HTML DOM Parser. The reasons are outlined in this blog post.

Reinis I.
Suggested third party alternatives to [SimpleHtmlDom](http://simplehtmldom.sourceforge.net/) that actually use [DOM](http://php.net/manual/en/book.dom.php) instead of String Parsing: [phpQuery](http://code.google.com/p/phpquery/), [Zend_Dom](http://framework.zend.com/manual/en/zend.dom.html), [QueryPath](http://querypath.org/) and [FluentDom](http://www.fluentdom.org).
Gordon
A: 

Here's a quick and dirty way that works for simple samples and for valid HTML, but will probably cause problems with invalid HTML:

<?php
$html = 'Hi 3456; <a href="?id=4456">your code: 345</a> another 234';

// Each pattern expects exactly one run of digits in the text segment it matches.
// Number in the text between a tag and the next closing tag
$html = preg_replace('|(>[^<\d]*)(\d+)([^<\d]*</)|', '$1{NUM_WAS_HERE}$3', $html);
// Number in the text before the first tag
$html = preg_replace('|^([^<\d]*)(\d+)([^<\d]*<)|', '$1{NUM_WAS_HERE}$3', $html);
// Number in the text after the last tag
$html = preg_replace('|(>[^<\d]*)(\d+)([^<\d]*)$|', '$1{NUM_WAS_HERE}$3', $html);

echo $html, "\n";
// Outputs: Hi {NUM_WAS_HERE}; <a href="?id=4456">your code: {NUM_WAS_HERE}</a> another {NUM_WAS_HERE}
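
The three patterns above only handle a text segment that contains a single number. A possible single-pattern variant is sketched below: it matches a run of digits only when no `>` follows before the next `<`, which is a rough test for "not inside a tag". The function name `maskNumbersOutsideTags` is made up for this illustration, and the approach is still regex on HTML, so a literal `>` in the text, comments, or script blocks can fool it.

```php
<?php
// Hypothetical helper for illustration; not part of the original answer.
function maskNumbersOutsideTags(string $html, string $placeholder = '{NUM_WAS_HERE}'): string
{
    // \d+          a run of digits
    // (?![^<>]*>)  negative lookahead: reject the match if a '>' is reached
    //              before any '<', i.e. the digits sit inside a tag
    // preg_replace returns null on a PCRE error; fall back to the input then.
    return preg_replace('/\d+(?![^<>]*>)/', $placeholder, $html) ?? $html;
}

$sample = 'Hi 3456; <a href="?id=4456">your code: 345</a> another 234';
echo maskNumbersOutsideTags($sample), "\n";
// Prints: Hi {NUM_WAS_HERE}; <a href="?id=4456">your code: {NUM_WAS_HERE}</a> another {NUM_WAS_HERE}
```

Because the lookahead is checked for every candidate match, several numbers in the same text segment are all replaced, which the three separate patterns do not do.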

As @Reinis recommended, using an HTML parser is the safer, more robust way to achieve this.
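
For comparison, here is a rough sketch of the parser route using PHP's bundled DOM extension (`DOMDocument` and `DOMXPath`) instead of a third-party library. The function name `maskNumbersInTextNodes` and the wrapper id `mask-root` are assumptions made up for this example, and the `<?xml encoding="UTF-8">` prefix is a common workaround to make `loadHTML` read the input as UTF-8. Only text nodes are visited, so attribute values such as `href` are left alone.

```php
<?php
// Hypothetical helper for illustration; names are not from the original thread.
function maskNumbersInTextNodes(string $html, string $placeholder = '{NUM_WAS_HERE}'): string
{
    $previous = libxml_use_internal_errors(true); // keep sloppy markup from raising warnings
    $doc = new DOMDocument();
    // Wrap the fragment in a known container so it can be read back afterwards.
    $doc->loadHTML('<?xml encoding="UTF-8"><div id="mask-root">' . $html . '</div>');
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $root = $xpath->query('//div[@id="mask-root"]')->item(0);

    // Select text nodes only, skipping script and style contents.
    $textNodes = $xpath->query('.//text()[not(ancestor::script) and not(ancestor::style)]', $root);
    foreach ($textNodes as $textNode) {
        $textNode->nodeValue = preg_replace('/\d+/', $placeholder, $textNode->nodeValue);
    }

    // Serialize the children of the wrapper, not the wrapper itself.
    $out = '';
    foreach ($root->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}

echo maskNumbersInTextNodes('Hi 3456; <a href="?id=4456">your code: 345</a> another 234'), "\n";
```

Note that the parser may normalize the markup it serializes (for example quoting or entity encoding), so the output is not guaranteed to be byte-for-byte identical to the input outside the replaced numbers.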

aularon