Hi!

I have a problem that I haven't seen before. The same regex is producing two different results on two different servers.

This is the code:

preg_replace('#[^\pL0-9_@-]#iu', '', '!%&abc123_æøå');

Result on server A (PHP 5.2.6, Server API: Apache 2.0 Handler):

abc123_æøå

Result on server B (PHP 5.2.5, Server API: CGI/FastCGI):

123_

Does anyone have an idea why the results differ?

+2  A: 

This must be because of

  • Locale settings
  • PHP multibyte strings support on/off
  • PHP mbstring.func_overload (overloading of some string functions for multibyte support)
culebrón
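A minimal sketch for checking the three settings listed above at runtime, which helps when they are hard to find in php.ini. The helper name `collect_string_settings` is hypothetical and not from the thread. Run the same script on both servers and compare the output.

```php
<?php
// Hypothetical helper (not from the thread): gathers the locale and
// mbstring-related settings so two servers can be compared side by side.
function collect_string_settings()
{
    $hasMbstring = extension_loaded('mbstring');

    return array(
        'php_version'          => PHP_VERSION,
        'sapi'                 => PHP_SAPI,
        // Passing "0" returns the current locale without changing it.
        'lc_ctype'             => setlocale(LC_CTYPE, '0'),
        'mbstring_loaded'      => $hasMbstring,
        // ini_get() returns false if the directive is unknown to this build.
        'func_overload'        => ini_get('mbstring.func_overload'),
        'mb_internal_encoding' => $hasMbstring ? mb_internal_encoding() : null,
    );
}

var_dump(collect_string_settings());
```

Any key whose value differs between the two servers is a candidate for closer inspection. Identical output would suggest the cause lies elsewhere.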
Thanks for the reply :) I couldn't locate any of this stuff in php.ini, but I tried running setlocale(LC_ALL, 'no_NB'); before the regex, without any change. If I run the regex on just 'abc123_' it still produces '123_', so maybe it's not a multibyte issue?
Tommy
+1  A: 

You could try the mb_eregi_replace function instead.

mb_eregi_replace('[^\pL0-9_@-]', '', '!%&abc123_æøå');

Should work consistently across all servers that support multibyte strings, and should eliminate problems that you might get due to different file encodings. (Theoretically, at least.)

Atli
Thanks for the reply :) I tried this, and it actually produces the same result as the code in my original post. Maybe that is an indicator of some sort.
Tommy
Interesting. Try printing the return value of `mb_regex_encoding()` on both servers and see if they differ. You could also just hard-code the encoding with `mb_regex_encoding('UTF-8');`, or whatever encoding you want to use.
Atli
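The check suggested in the comment above could look roughly like this. The helper name `regex_encoding_before_after` is made up for illustration, and the guard assumes that mbstring's regex functions may be missing on some builds.

```php
<?php
// Hypothetical helper: reports the encoding used by the mb_ereg* functions
// before and after forcing it to UTF-8. Returns null if the mbstring regex
// functions are not available.
function regex_encoding_before_after()
{
    if (!function_exists('mb_regex_encoding')) {
        return null;
    }

    $before = mb_regex_encoding();   // no argument: returns the current encoding
    mb_regex_encoding('UTF-8');      // sets the encoding for mb_ereg* functions
    $after = mb_regex_encoding();

    return array('before' => $before, 'after' => $after);
}

print_r(regex_encoding_before_after());
```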
Thanks for the suggestion! The encoding was ISO-8859-1, but even with UTF-8 both `mb_eregi_replace` and `preg_replace` perform the same. Weird stuff.
Tommy
Yeah, this is behaving very strangely indeed. The only other thing I can think of is that the files themselves could be saved in different encodings *(ISO vs. UTF-8)*, so the hard-coded strings might be represented by different sequences of bytes *(PHP strings are just byte arrays, after all)*. Perhaps re-encoding both files using something like Notepad++ might change something...
Atli
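To see whether the string literal really consists of different bytes on each server, the raw bytes can be dumped directly. This is only a sketch: `describe_bytes` is a hypothetical helper, and the two sample strings are written as byte escapes so the result does not depend on how the script file itself is saved.

```php
<?php
// Hypothetical helper: shows the raw bytes of a string and whether PCRE
// accepts them as valid UTF-8.
function describe_bytes($s)
{
    return array(
        'length'     => strlen($s),   // byte count, not character count
        'hex'        => bin2hex($s),
        // With the /u modifier, preg_match() returns false on invalid UTF-8.
        'valid_utf8' => @preg_match('//u', $s) === 1,
    );
}

$utf8   = "\xC3\xA6\xC3\xB8\xC3\xA5"; // "æøå" encoded as UTF-8 (6 bytes)
$latin1 = "\xE6\xF8\xE5";             // "æøå" encoded as ISO-8859-1 (3 bytes)

print_r(describe_bytes($utf8));
print_r(describe_bytes($latin1));
```

Calling `describe_bytes()` on the literal from the question on both servers would show whether the input bytes are identical. If they are, file encoding can be ruled out.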
No effect now either. Thanks for the suggestions so far, though. I've had this code on about 10 different servers, this is the first time I'm seeing this issue.
Tommy
A: 

Well, it's finally sorted out. The server was upgraded from PHP 5.2.5 to 5.2.11 (still running as CGI, though), and the problems went away with the old version.

Thanks for everyone's feedback and suggestions!

Tommy
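The thread never establishes a root cause. Since a PHP upgrade fixed it and plain ASCII letters were being stripped too, one plausible but unconfirmed explanation is that `\pL` behaved differently in the regex library of the older build. A small sketch for testing that assumption on any server follows. The helper name `check_unicode_letter_class` is hypothetical.

```php
<?php
// Hypothetical helper: reports the PCRE version PHP was built against and
// whether \pL matches letters under the /u modifier. This tests an
// assumption only; it is not a confirmed diagnosis of the original problem.
function check_unicode_letter_class()
{
    return array(
        'pcre_version'  => defined('PCRE_VERSION') ? PCRE_VERSION : null,
        // preg_match() returns 1 on a match, 0 on no match, false on error.
        'ascii_letters' => @preg_match('/^\pL+$/u', 'abc'),
        'utf8_letters'  => @preg_match('/^\pL+$/u', "\xC3\xA6\xC3\xB8\xC3\xA5"),
    );
}

var_dump(check_unicode_letter_class());
```

A result other than 1 for `ascii_letters` would be consistent with the symptom in the question, where 'abc' was removed from the input.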