




I have an HTTPHandler that reads in a set of CSS files, combines them, and then GZips them. However, some of the CSS files contain a Byte Order Mark (due to a bug in TFS 2005 auto merge), and in Firefox the BOM is being read as part of the actual content, so it's screwing up my class names etc. How can I strip out the BOM characters? Is there an easy way to do this without manually going through the byte array looking for "ï»¿"?


Is the BOM appearing in the actual text itself, or just at the very start? I'd be surprised to see it anywhere other than at the start of the data - in which case simply ignoring the first 3 bytes (assuming UTF-8) should do the trick.

Jon Skeet

Expanding on Jon's answer with a sample.

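// Requires: using System.IO; using System.Linq;
var name = GetFileName(); // GetFileName() is a placeholder for however you obtain the path
var bytes = File.ReadAllBytes(name);
// Only strip when the file really starts with the UTF-8 BOM (EF BB BF).
if (bytes.Length >= 3 && bytes[0] == 0xEF && bytes[1] == 0xBB && bytes[2] == 0xBF)
    File.WriteAllBytes(name, bytes.Skip(3).ToArray());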
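For the scenario in the question (a handler that concatenates CSS files before gzipping them), the files may not need to be rewritten on disk at all: the BOM can be skipped per file while combining. The following is only a minimal sketch under that assumption. `CssCombiner`, `Utf8BomLength`, and `Combine` are hypothetical names that do not appear in the original posts.

```csharp
using System.Collections.Generic;
using System.IO;

// Hypothetical helper, not from the original posts.
public static class CssCombiner
{
    // Returns the offset of the first content byte: 3 if the buffer
    // starts with the UTF-8 BOM (EF BB BF), otherwise 0.
    public static int Utf8BomLength(byte[] bytes)
    {
        bool hasBom = bytes.Length >= 3
            && bytes[0] == 0xEF && bytes[1] == 0xBB && bytes[2] == 0xBF;
        return hasBom ? 3 : 0;
    }

    // Concatenates the given buffers, skipping a leading BOM in each one.
    public static byte[] Combine(IEnumerable<byte[]> files)
    {
        using (var output = new MemoryStream())
        {
            foreach (var bytes in files)
            {
                int offset = Utf8BomLength(bytes);
                output.Write(bytes, offset, bytes.Length - offset);
            }
            return output.ToArray();
        }
    }
}
```

In the handler, each file would be loaded with `File.ReadAllBytes` and the result of `Combine` handed to the GZip step, so the source files stay untouched.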

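Another way, assuming the content is plain ASCII: read the file as UTF-8 (which drops the BOM) and write it back as ASCII. Any non-ASCII character would be written as '?'.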

File.WriteAllText(filename, File.ReadAllText(filename, Encoding.UTF8), Encoding.ASCII);
Tim Bailey

FWIW, you could open the files in Notepad++ and save them without the Byte Order Mark. It's what I had to do in this question.

George Stocker
var text = File.ReadAllText(args.SourceFileName);
// UTF8Encoding(false) means "do not emit a BOM".
using (var streamWriter = new StreamWriter(args.DestFileName, args.Append, new UTF8Encoding(false)))
    streamWriter.Write(text);
Looking at this code, it should work, but I am surprised that it saves the file in ANSI format.
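One likely explanation, offered as an assumption since the post does not say how the format was checked: a UTF-8 file without a BOM that contains only ASCII characters is byte-for-byte identical to its ANSI form, so an editor that guesses the encoding will typically report ANSI. The sketch below illustrates this. `BomlessUtf8Demo`, `LooksLikeAscii`, and `PreambleLength` are hypothetical names introduced for illustration.

```csharp
using System.Linq;
using System.Text;

// Hypothetical demo class, not from the original posts.
public static class BomlessUtf8Demo
{
    // True when UTF-8 without a BOM yields the same bytes as ASCII for this text.
    public static bool LooksLikeAscii(string text)
    {
        var utf8NoBom = new UTF8Encoding(false);
        byte[] utf8Bytes = utf8NoBom.GetBytes(text);
        byte[] asciiBytes = Encoding.ASCII.GetBytes(text);
        return utf8Bytes.SequenceEqual(asciiBytes);
    }

    // Number of BOM bytes a writer using this encoding would emit at the start of a file.
    public static int PreambleLength(bool emitBom)
    {
        return new UTF8Encoding(emitBom).GetPreamble().Length;
    }
}
```

For typical CSS such as `body { color: red; }`, `LooksLikeAscii` returns true, and `PreambleLength(false)` is 0, so nothing in the output file marks it as UTF-8.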
Vijay Balkawade

Expanding on JaredPar's sample to recurse over subdirectories:

using System.IO;
using System.Linq;

namespace BomRemover
{
    /// <summary>
    /// Remove the UTF-8 BOM (EF BB BF) from all *.php files in the current directory and its subdirectories.
    /// </summary>
    class Program
    {
        private static void removeBoms(string filePattern, string directory)
        {
            foreach (string filename in Directory.GetFiles(directory, filePattern))
            {
                var bytes = System.IO.File.ReadAllBytes(filename);
                // Rewrite the file only if it starts with the UTF-8 BOM.
                if (bytes.Length > 2 && bytes[0] == 0xEF && bytes[1] == 0xBB && bytes[2] == 0xBF)
                {
                    System.IO.File.WriteAllBytes(filename, bytes.Skip(3).ToArray());
                }
            }
            foreach (string subDirectory in Directory.GetDirectories(directory))
            {
                removeBoms(filePattern, subDirectory);
            }
        }

        static void Main(string[] args)
        {
            string filePattern = "*.php";
            string startDirectory = Directory.GetCurrentDirectory();
            removeBoms(filePattern, startDirectory);
        }
    }
}

I needed this piece of C# after discovering that a UTF-8 BOM corrupts the file when you do a basic file download through PHP.

Olivier de Rivoyre