I've got a section of code on a b2evo PHP site that does the following:

$content = preg_replace_callback(
    '/[\x80-\xff]/',
    create_function( '$j', 'return "&#".ord($j[0]).";";' ),
    $content);

What does this section of code do? My guess is that it strips out ASCII characters between 128 and 255, but I can't be sure.

Also, as it stands, every time this bit of code is called from within a page, PHP allocates and then does not free up to 2K of memory. If the function is called 1000+ times on a page (this can happen), then the page uses an extra 2MB of memory.

This is causing problems with my web application. Why am I losing memory, and how do I rewrite this so I don't get a memory leak?

+4  A: 

It's create_function that's leaking your memory: every call compiles a brand-new function, and that function is not freed until the script ends. Just use a normal function instead and you'll be fine.

The function itself is replacing the characters with numeric HTML entities (&#xxx;)
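As a rough sketch of that fix, assuming PHP 5.3 or later (where closures are available), the callback can be written inline without create_function. The wrapper name encode_high_bytes is made up for this example and is not part of b2evo:

```php
<?php
// Hypothetical helper (the name is an assumption, not b2evo code).
// Replaces every byte in the range 0x80-0xFF with a numeric HTML entity.
function encode_high_bytes($content)
{
    return preg_replace_callback(
        '/[\x80-\xff]/',
        // The closure is compiled once with the script, so repeated calls
        // do not register a new function each time as create_function does.
        function ($matches) {
            return '&#' . ord($matches[0]) . ';';
        },
        $content
    );
}

echo encode_high_bytes("caf\xE9"), "\n"; // byte 0xE9 becomes &#233;
```

Note that the pattern works byte by byte, so on UTF-8 input each byte of a multi-byte character gets its own entity.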

Greg
I wish I could tick both answers. Thanks.
seanyboy
+3  A: 

Not really stripping: it replaces high-ASCII characters with their numeric entities.

See preg_replace_callback.
create_function is used to make an anonymous function, but you can use a plain function instead:

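$content = 'Çà ! Nœm dé fîçhïèr tôrdù, @ pöür têstër... ? ~ Œ[€]';
// Callback version: each byte in 0x80-0xFF becomes a numeric entity.
$econtent = preg_replace_callback('/[\x80-\xff]/', 'CB_CharToEntity', $content);
echo $econtent . '<br>';
// Built-in alternatives, applied to the original string for comparison.
echo htmlspecialchars($content) . '<br>';
echo htmlentities($content) . '<br>';
echo htmlentities($content, ENT_NOQUOTES, 'cp1252') . '<br>';

// Receives the match array from preg_replace_callback; $matches[0] is one byte.
function CB_CharToEntity($matches)
{
    return '&#' . ord($matches[0]) . ';';
}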

[EDIT] Found a cleaner, probably faster way to do the job! ^_^ Just use htmlentities with options fitting your needs.
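A minimal sketch of the htmlentities route, assuming the content is single-byte ISO-8859-1 (Latin-1) text; the sample string is made up for illustration. Unlike the callback, this produces named entities and also escapes <, > and &, so it is only a drop-in replacement if that is acceptable:

```php
<?php
// Assumption: $content is ISO-8859-1 encoded, so "\xE9" is the letter é.
$content = "caf\xE9 <b>";

// ENT_NOQUOTES leaves both single and double quotes untouched.
$encoded = htmlentities($content, ENT_NOQUOTES, 'ISO-8859-1');

echo $encoded, "\n"; // caf&eacute; &lt;b&gt;
```

The third argument has to match the real encoding of the input, otherwise the entities will not correspond to the intended characters.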

PhiLho
A: 

It's a lot simpler to use preg_replace with the /e flag in your case (on PHP versions that still support it; /e was deprecated in PHP 5.5 and removed in PHP 7.0):

$content = preg_replace(
    '/[\x80-\xff]/e',
    '"&#".ord($0).";"',
    $content);
newacct