I have a PHP script which accesses an MSSQL 2005 database, reads some data from it and sends the results in a mail.

There are special characters both in some column names (I know, it's terrible) and in the field values themselves.

When I access the script through my browser (the web server is IIS), the query is executed correctly and the contents of the mail are encoded correctly for my audience. However, when I run PHP from the console, the query fails because of the special characters in the column names. If I replace the special characters in the query with calls to chr() and the character code in latin-1, the query runs, but the results are then also encoded in latin-1 and therefore not displayed correctly in the mail. Why is PHP/the MSSQL driver/… using a different encoding in the two scenarios? Is there a way around it?

In case you're wondering, I need the console because I want to schedule the script using SQLAgent (or taskmanager or whatever). Thank you!

+1  A: 

PHP's poor support for the non-English world is well known. I've never used a database with characters outside the basic ASCII realm, but obviously you already have a workaround, and it seems you just have to live with it.

If you wanted to take it a step further, you could:

1. Write an array that contains all the special chars and their chr() equivalents
2. foreach over the array and str_replace on the query
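The two steps above could be sketched roughly like this. It is only an illustration: the map, the function name query_to_latin1 and the column/table names are made up, and it assumes the script file is saved as UTF-8 while the driver expects latin-1 bytes.

```php
<?php
// Hypothetical map: special characters as they appear in a UTF-8 source file
// (written as byte escapes here) => their single-byte latin-1 form via chr().
$latin1Map = array(
    "\xC3\xA4" => chr(0xE4), // a with umlaut
    "\xC3\xB6" => chr(0xF6), // o with umlaut
    "\xC3\xBC" => chr(0xFC), // u with umlaut
    "\xC3\x9F" => chr(0xDF), // sharp s
);

// Replace every mapped character in the query text.
function query_to_latin1($query, array $map)
{
    foreach ($map as $special => $latin1) {
        $query = str_replace($special, $latin1, $query);
    }
    return $query;
}

// Made-up column and table names, for illustration only.
$query = query_to_latin1("SELECT [Gr\xC3\xB6\xC3\x9Fe] FROM [Kunden]", $latin1Map);
echo bin2hex($query), "\n"; // print the bytes as hex to check the conversion
```

Since str_replace also accepts arrays, the loop could be shortened to str_replace(array_keys($map), array_values($map), $query).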

But if the query is hardcoded, I guess what you have is fine. Also, make sure you are using the latest PHP, at least 4.4.x. There's always a chance this was fixed, but I skimmed the 4.x.x release notes and didn't see anything that relates to your problem.

TravisO
+1  A: 

The thing to remember about PHP strings is that they are streams of bytes. If you want to get the data in the correct character set (for whatever you are doing), you have to do this explicitly through some kind of function or filter. It's all pretty low-level.

Depending on your setup, you may need to know the internal character set of the strings in the database, but at the very least you need to know what character set the database is sending to PHP (because, remember, to PHP it's just a stream of bytes).

Then you have to know the target character set (and possibly specify it, which you really should anyway). For example, say that you are getting utf-8 from the database, but wish to send a latin-1 mail (and therefore need base64 or q-printable as the 'Content-transfer-encoding'):

$send_string = base64_encode(utf8_decode($database_string));

Of course in this case, you'd have to know that all the utf-8 characters exist in the latin-1 character set, and you probably wouldn't really want base64 (older PHP versions unfortunately do not have a good q-printable encoding function, though curiously, they do for decoding; quoted_printable_encode() only arrived in PHP 5.3). If you aren't talking about utf-8 <=> latin-1, you'll want to whip out the mbstring functions instead.
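For what it's worth, here is a minimal sketch of the same conversion done with mbstring, together with headers that tell the mail client what it is getting. It assumes the database really delivers utf-8 and that the mbstring extension is loaded; encode_mail_body and the sample text are made up for illustration.

```php
<?php
// Assumption: $utf8Text holds utf-8 bytes and the mail body should be latin-1.
function encode_mail_body($utf8Text)
{
    // Convert utf-8 to ISO-8859-1 (latin-1). Characters that have no latin-1
    // equivalent cannot survive this step and are substituted.
    $latin1 = mb_convert_encoding($utf8Text, 'ISO-8859-1', 'UTF-8');

    // base64 keeps the 8-bit bytes intact in transit; chunk_split wraps the
    // output into 76-character lines.
    return chunk_split(base64_encode($latin1));
}

// Headers matching the body produced above.
$headers = "MIME-Version: 1.0\r\n"
         . "Content-Type: text/plain; charset=ISO-8859-1\r\n"
         . "Content-Transfer-Encoding: base64\r\n";

$body = encode_mail_body("Gr\xC3\xBC\xC3\x9Fe"); // sample text as utf-8 bytes

// mail($to, $subject, $body, $headers); // $to and $subject are placeholders
```

On PHP 5.3 and later, quoted_printable_encode() could be used in place of base64_encode(), with the Content-Transfer-Encoding header changed to quoted-printable.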

As for the console, you'd have to know what PHP is getting when you type special characters there, which probably depends on the shell and/or PHP settings. But remember that PHP only understands strings as byte byte byte and you should be able to work it out.
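One way to see the "byte byte byte" for yourself is to dump a string as hex in both environments and compare. This is just a sketch; dump_bytes is a made-up helper, not something PHP ships with.

```php
<?php
// Render a string as its raw bytes, e.g. "c3 a4", so the encoding becomes visible.
function dump_bytes($s)
{
    return strlen($s) . ' byte(s): ' . implode(' ', str_split(bin2hex($s), 2));
}

// The same letter (a with umlaut) in two encodings gives different bytes.
echo dump_bytes("\xC3\xA4"), "\n"; // utf-8
echo dump_bytes("\xE4"), "\n";     // latin-1

// On the console, a command-line argument could be inspected the same way.
if (PHP_SAPI === 'cli' && isset($argv[1])) {
    echo dump_bytes($argv[1]), "\n";
}
```

Running the same dump on a column name or field value under IIS and from the console should show where the two setups start to differ.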

ruquay
Isn't it more that latin-1 characters exist in UTF-8? The first 128 UTF-8 characters are the same as ASCII, and you can also find all of latin-1, but UTF-8 is a variable-length character encoding that can represent any character in the Unicode standard, so more than 100,000.
lpfavreau
+2  A: 

Depending on the type of characters you have in your database, it might be a console limitation, I guess. If you type chcp in the console, you'll see the active code page, which might be something like CP437, also known as Extended ASCII. If you have characters outside this code page, like in UTF8, you might run into problems. You can change the active code page by typing chcp 65001 to switch to UTF8.

You might also want to change the default Raster font to Lucida Console depending on the required characters as not all fonts support extended characters (right click on command prompt window's title, properties, font).

As already said, PHP's unicode support is not ideal, but you can manage it in PHP5 with a few well-placed calls to utf8_decode. The secret of character encoding is to understand well what the current encoding is in each of the tools you are using: database, database connection, current bytes in your PHP variable, your output to the console screen, your email's body encoding, your email client, and so on...

For anything that has special characters, something like UTF8 is usually recommended these days. Make sure everything along the way is set to UTF8 and convert only where necessary.
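"Convert only where necessary" might look something like the sketch below. It assumes the mbstring extension is available and that anything which is not valid UTF8 is latin-1, which is a guess you'd have to confirm for your own setup; ensure_utf8 is a made-up name.

```php
<?php
// Return $bytes as UTF8. If the input is not already valid UTF8, it is
// assumed (hypothetically) to be in $assumedSource and converted from that.
function ensure_utf8($bytes, $assumedSource = 'ISO-8859-1')
{
    if (mb_check_encoding($bytes, 'UTF-8')) {
        return $bytes; // already valid UTF8, leave untouched
    }
    return mb_convert_encoding($bytes, 'UTF-8', $assumedSource);
}

echo bin2hex(ensure_utf8("\xE4")), "\n";     // latin-1 input gets converted
echo bin2hex(ensure_utf8("\xC3\xA4")), "\n"; // UTF8 input passes through
```

Keep in mind this is only a heuristic: some latin-1 byte sequences happen to be valid UTF8 too, so knowing the real encoding at each step is still the safer route.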

lpfavreau
A: 

I need to connect PHP 5.x to Microsoft SQL Server 2005. I'm using IIS on XP. Can anyone help me, please?

Thank you.

Arun