ansaurus

Question

PHP problem with preg_replace

Answer 1

+1 A:

Did you look at PHP's built in strip_tags() function?
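For reference, a minimal sketch of what strip_tags() does on a small input. The sample markup and variable names are made up for illustration and are not from the question.

```php
<?php
// Illustrative input; not taken from the original question.
$html = '<p>Hello <b>world</b>!</p>';

// Remove every tag.
$plain = strip_tags($html);

// Remove every tag except <b> (second argument lists allowed tags).
$keepBold = strip_tags($html, '<b>');

echo $plain, "\n";     // Hello world!
echo $keepBold, "\n";  // Hello <b>world</b>!
```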

Otherwise, we have no idea what your code is actually doing, so it's very hard to identify why it isn't working as you want.

Mark Baker 2010-06-22 13:10:06

strip_tags has a known limitation of 1024 characters. Everything above that remains unstripped.

bisko 2010-06-22 13:49:43

@bisko that limit is per-tag, not the entire input.

Matt 2010-06-22 13:51:53

In my tests it applies to the entire input. When I pass it a long, valid HTML document, it strips tags only within the first 1024 characters.

bisko 2010-06-22 13:58:58

@bisko you might want to check your test again, 1024 characters is [no problem](http://codepad.org/Fd7N9cXu).

Matt 2010-06-22 15:41:09

Hm, it seems I really need to recheck the results I got in testing. It seems you are right, Matt. Thanks for pointing this out!

bisko 2010-06-23 09:25:49

Answer 2

A:

When you have long HTML files, the preg family of functions can fail (preg_replace returns NULL, preg_match returns false) because of the backtrack limit in PHP, controlled by the pcre.backtrack_limit setting (check here: http://bugs.php.net/bug.php?id=40846).
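To tell this failure apart from a pattern that simply matched nothing, the return value can be checked together with preg_last_error(). A minimal sketch, assuming PHP 7 or later; the helper name stripWithCheck is hypothetical:

```php
<?php
// Hypothetical helper: strips tags with a regex and reports PCRE failures
// instead of silently returning NULL.
function stripWithCheck(string $html): string
{
    $result = preg_replace('/<.*?>/s', '', $html);

    if ($result === null) {
        // preg_replace returns NULL on error; ask PCRE what went wrong.
        $reason = preg_last_error() === PREG_BACKTRACK_LIMIT_ERROR
            ? 'backtrack limit reached (see pcre.backtrack_limit)'
            : 'PCRE error code ' . preg_last_error();
        throw new RuntimeException('preg_replace failed: ' . $reason);
    }

    return $result;
}

echo stripWithCheck('<p>Hello <b>world</b>!</p>'), "\n"; // Hello world!
```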

You could try to work on smaller portions of the files and concatenate them after stripping the tags.
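One way to sketch the chunking idea, assuming PHP 7 or later and assuming no literal > appears inside attribute values: each chunk is extended to the next > so that no tag is split across two chunks. The function name stripTagsInChunks and the default chunk size are illustrative choices, not something from the thread.

```php
<?php
// Hypothetical helper: strips tags chunk by chunk and concatenates the results.
function stripTagsInChunks(string $html, int $chunkSize = 8192): string
{
    $out = '';
    $offset = 0;
    $length = strlen($html);

    while ($offset < $length) {
        $end = min($offset + $chunkSize, $length);

        // Extend the chunk to just past the next '>' so a tag is never cut in half.
        if ($end < $length) {
            $close = strpos($html, '>', $end);
            $end = ($close === false) ? $length : $close + 1;
        }

        $chunk = substr($html, $offset, $end - $offset);
        $stripped = preg_replace('/<[^>]+>/', '', $chunk);
        if ($stripped === null) {
            throw new RuntimeException('preg_replace failed, PCRE error code ' . preg_last_error());
        }

        $out .= $stripped;
        $offset = $end;
    }

    return $out;
}

// A tiny chunk size is used here only to force several chunks.
echo stripTagsInChunks('<p>Hello <b>world</b>!</p>', 4), "\n"; // Hello world!
```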

You could also rewrite your regular expressions to backtrack less, especially if you rely heavily on .* . For example

/<.*?>/

Could be optimized as

/<[^>]+>/

And so on.
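A small comparison of the two patterns, using made-up sample markup. Note that they are not strictly equivalent: without the s modifier, . does not match a newline, so the lazy pattern misses a tag that spans lines, and [^>]+ requires at least one character, so it does not match an empty <>.

```php
<?php
// Illustrative input; not taken from the original question.
$html = '<div class="a">One</div><br/><span>Two</span>';

$lazy    = preg_replace('/<.*?>/', '', $html);
$negated = preg_replace('/<[^>]+>/', '', $html);

var_dump($lazy === $negated); // bool(true)
echo $negated, "\n";          // OneTwo

// A tag broken across two lines: only the negated class removes it.
$multiline = "<a\nhref='x'>link</a>";
$lazyMulti    = preg_replace('/<.*?>/', '', $multiline);
$negatedMulti = preg_replace('/<[^>]+>/', '', $multiline);

var_dump($lazyMulti === "<a\nhref='x'>link"); // bool(true)
var_dump($negatedMulti === 'link');           // bool(true)
```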

bisko 2010-06-22 13:48:49
