I am trying to create a text file in VB.NET with UTF-8 encoding and without a BOM. Can anybody help me with how to do this?
I can write the file with UTF-8 encoding, but how do I remove the byte order mark from it?
Thanks in advance.

Edit 1: I have tried code like this:

    Dim utf8 As New UTF8Encoding()
    Dim utf8EmitBOM As New UTF8Encoding(True)
    Dim strW As New StreamWriter("c:\temp\bom\1.html", True, utf8EmitBOM)
    strW.Write(utf8EmitBOM.GetPreamble())
    strW.WriteLine("hi there")
    strW.Close()

    Dim strw2 As New StreamWriter("c:\temp\bom\2.html", True, utf8)
    strw2.Write(utf8.GetPreamble())
    strw2.WriteLine("hi there")
    strw2.Close()

1.html gets created with UTF-8 encoding only, and 2.html gets created in ANSI encoding format.

+2  A: 

There seems to be a way of omitting the byte order mark (BOM) by passing False to the UTF8Encoding constructor (link to MSDN reference page).

That is, use your own instance of UTF8Encoding instead of the default System.Text.Encoding.UTF8:

Dim utf8WithoutBom As New System.Text.UTF8Encoding(False)
                                                  '^^^^^'

Using sink As New StreamWriter("Foobar.txt", False, utf8WithoutBom)
    sink.WriteLine("...")
End Using

(Note that omitting the BOM is only permissible for UTF-8, not for UTF-16.)
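Since text editors may hide or reinterpret the BOM, the result is easier to confirm by looking at the first three bytes of the file (EF BB BF for UTF-8). The following is only a minimal sketch of such a check: `HasUtf8Bom`, the module name, and the temp file name `bom-demo.txt` are made up for illustration and are not part of the original code.

```vbnet
Imports System
Imports System.IO
Imports System.Linq
Imports System.Text

Module BomCheck
    ' Hypothetical helper: True if the file starts with the UTF-8 BOM (EF BB BF).
    Function HasUtf8Bom(fileName As String) As Boolean
        Dim bom As Byte() = New UTF8Encoding(True).GetPreamble()
        Dim head(bom.Length - 1) As Byte
        Using fs As New FileStream(fileName, FileMode.Open, FileAccess.Read)
            ' A file shorter than the BOM cannot contain one.
            If fs.Read(head, 0, head.Length) < bom.Length Then Return False
        End Using
        Return head.SequenceEqual(bom)
    End Function

    Sub Main()
        ' Illustrative file location, chosen only for this sketch.
        Dim fileName As String = Path.Combine(Path.GetTempPath(), "bom-demo.txt")

        ' False: the encoding does not emit a BOM.
        Using sink As New StreamWriter(fileName, False, New UTF8Encoding(False))
            sink.WriteLine("hi there")
        End Using
        Console.WriteLine(HasUtf8Bom(fileName)) ' prints False

        ' True: the encoding emits a BOM at the start of a new file.
        Using sink As New StreamWriter(fileName, False, New UTF8Encoding(True))
            sink.WriteLine("hi there")
        End Using
        Console.WriteLine(HasUtf8Bom(fileName)) ' prints True
    End Sub
End Module
```

Both writers pass False as the second StreamWriter argument so the file is overwritten each time; when appending to a file that already has content, the writer does not add a BOM at that point.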

stakx
Yes, but this is not working. If I open a file created in this manner, it only shows UTF8 and not "without BOM". I used something like this: `Dim utf8 As Encoding = Encoding.UTF8` and `Dim utfEnc As UTF8Encoding = New UTF8Encoding(False)`
Vijay Balkawade
Many editors add the BOM for consistency's sake. When you say you open the file, where are you opening it? Try opening it on a Linux machine.
Robert C. Barth
I use Notepad++. When I convert the file with a utility that is available on the net, it shows me the proper encoding. But when I try to do it from code, it shows only UTF8.
Vijay Balkawade
**Update:** I made a mistake in the code: You need to pass in `False` instead of `True` to the constructor. With that change, the code will work as expected. _Make sure to use a hex editor to verify the generated byte stream_, since text editors might transform the text (i.e. hide the BOM from the user) before you get the chance to look at it!
stakx
I have one question. After creating a file as UTF-8 without BOM, if we open it using Notepad++, is there any chance that the editor converts it back to UTF-8 with BOM?
Vijay Balkawade
AFAIK Notepad++ has a menu **Encoding** where you can switch to **UTF-8 without BOM**.
stakx
@Vijay Notepad++ does not set anything unless you specify it (use the Encoding menu), but if you open that text file with a hex editor (for example FrHed) you can actually see the BOM (the first 3 bytes of the file) if the file has it.
balexandre
A: 

It might be that your input text contains a byte order mark. In that case, you should remove it before writing.
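If the text is already in a String, a decoded BOM shows up as the single character U+FEFF at the start. As a rough sketch of removing it before writing (the helper name `StripLeadingBom`, the module name, and the output file name are hypothetical):

```vbnet
Imports System
Imports System.IO
Imports System.Text

Module BomStrip
    ' Hypothetical helper: removes one leading U+FEFF character, if present.
    Function StripLeadingBom(text As String) As String
        ' Compare the Char directly; a culture-sensitive String.StartsWith
        ' can treat U+FEFF as ignorable and give a misleading result.
        If text.Length > 0 AndAlso text(0) = ChrW(&HFEFF) Then
            Return text.Substring(1)
        End If
        Return text
    End Function

    Sub Main()
        ' Sample input with a BOM character in front, built only for this sketch.
        Dim raw As String = ChrW(&HFEFF) & "hi there"
        Dim cleaned As String = StripLeadingBom(raw)
        Console.WriteLine(cleaned.Length) ' prints 8

        ' Write the cleaned text with an encoding that does not emit a BOM.
        Dim fileName As String = Path.Combine(Path.GetTempPath(), "no-bom-demo.txt")
        File.WriteAllText(fileName, cleaned, New UTF8Encoding(False))
    End Sub
End Module
```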

jdv
Please assist me. How do I remove it before writing?
Vijay Balkawade
+1  A: 

Try this:

Encoding outputEnc = new UTF8Encoding(false); // create encoding with no BOM
TextWriter file = new StreamWriter(filePath, false, outputEnc); // open file with encoding
// write data here
file.Close(); // save and close it
Roman Nikitin