Dear programmers: As most of you probably know, we have the following encodings for files:

  • ANSI
  • UTF-8

UTF-8 is recognized by three characters added at the beginning of the file (the BOM), but those characters cause trouble in PHP, as you know, so we use:

  • UTF-8 Without BOM (Instead of UTF-8)

Here is my question: how can we write a new file (using PHP) encoded as UTF-8 without BOM, either with fwrite() or with any other function (it doesn't matter which)?

(I'm not asking about editor settings. I'm asking about creating a file with PHP functions.) Hope you can help me :)

Thanks in advance

A: 

If it is "without BOM", then it is just normal bytes in the file.

They are all just numbers, from 0 to 255 (0x00 to 0xFF), stored as a sequence of bytes in the file. How they are interpreted is up to you or the program reading them.
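To make that concrete, here is a minimal sketch, assuming a writable temporary directory; the file name is hypothetical. It writes a UTF-8 string with fwrite() and then checks that the file does not start with the BOM bytes EF BB BF. PHP writes exactly the bytes it is given, so no BOM appears unless you put one there yourself.

```php
<?php
// Hypothetical path, for illustration only.
$path = sys_get_temp_dir() . '/utf8_no_bom_example.txt';

// A PHP string is a byte sequence. "\xC3\xA9" is the UTF-8 encoding of "é",
// spelled out as escapes so the example does not depend on how this
// source file itself is saved.
$text = "H\xC3\xA9llo\n";

$handle = fopen($path, 'wb');   // 'b' avoids newline translation on Windows
if ($handle === false) {
    throw new RuntimeException("Cannot open $path for writing");
}
fwrite($handle, $text);         // writes the bytes as-is; no BOM is added
fclose($handle);

// Read the first three bytes back and compare them with the UTF-8 BOM.
$firstBytes = file_get_contents($path, false, null, 0, 3);
$hasBom = ($firstBytes === "\xEF\xBB\xBF");
var_dump($hasBom);
```

file_put_contents($path, $text) should behave the same way if you don't need the file handle.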

動靜能量
+1  A: 

I'm afraid you have misrepresented both UTF-8 and ANSI in your question.

UTF-8 is not required to have a BOM at its start. There's no such encoding as "UTF-8 without BOM" encoding. There's just "UTF-8". I've processed millions (well, certainly hundreds of thousands) of UTF-8 files and never once come across a BOM at their start.

According to the Unicode standard, a BOM is neither required nor recommended in UTF-8:

2.6 Encoding Schemes

Use of a BOM is neither required nor recommended for UTF-8, but may be encountered in contexts where UTF-8 data is converted from other encoding forms that use a BOM or where the BOM is used as a UTF-8 signature. See the "Byte Order Mark" subsection in Section 16.8, Specials, for more information.
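If some data does arrive carrying that signature, it can be dropped before further processing. This is only a sketch: stripUtf8Bom is a hypothetical helper name, not a PHP built-in, and it assumes the input is already a UTF-8 byte string.

```php
<?php
// Hypothetical helper; not part of PHP itself.
// Removes a leading UTF-8 BOM (bytes EF BB BF) if one is present.
function stripUtf8Bom(string $data): string
{
    $bom = "\xEF\xBB\xBF";
    if (strncmp($data, $bom, 3) === 0) {
        return substr($data, 3);
    }
    return $data;   // no BOM: return the bytes unchanged
}
```

Data without a BOM passes through untouched, so the helper is safe to apply to any UTF-8 input.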

Also, there is no such encoding as "ANSI"!

The closest things that IANA provides to "ANSI" as a character set name are "ANSI_X3.4-1968" and "ANSI_X3.4-1986", which are both just legacy aliases for "US-ASCII" (the preferred MIME name), a 7-bit encoding of 128 code points. No other official charset name contains "ANSI".

I'm not sure what environment you're operating under, but it seems to have led you into some very non-standard naming, expectations, and standards.

Could it perhaps be… Windows™? ☹

EDIT: Just found this answer about the source of this misonymy.

tchrist