encoding

Detecting encoding conversion problems

The majority of content on my company's website starts life as a Word document (Windows-1252 encoded) and is eventually copied-and-pasted into our UTF-8-encoded content management system. The conversion usually chokes on a few characters (special break characters, smart quotes, scientific notations) which have to be cleaned up manually, ...
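
One way to spot the troublesome characters before they reach the CMS, sketched in Python. It assumes the pasted text is available as raw Windows-1252 bytes; the function name `flag_non_ascii` is hypothetical. It lists every non-ASCII character with its position and Unicode name so that someone can review smart quotes, dashes and similar characters up front.

```python
import unicodedata

def flag_non_ascii(raw: bytes, source_encoding: str = "cp1252"):
    """Decode raw bytes and list each non-ASCII character for review."""
    text = raw.decode(source_encoding)
    return [
        (index, char, unicodedata.name(char, "UNKNOWN"))
        for index, char in enumerate(text)
        if ord(char) > 127
    ]

# Sample bytes: curly quotes (0x93, 0x94) and an em dash (0x97) in Windows-1252.
sample = b"\x93smart quotes\x94 and a dash \x97"
for index, char, name in flag_non_ascii(sample):
    print(index, repr(char), name)
```

The report only shows where the risky characters are; deciding whether to keep, replace or escape each one is still a manual or policy decision.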

"The parameter is incorrect" when setting Unicode as console encoding

I get the following error: Unhandled Exception: System.IO.IOException: The parameter is incorrect. at System.IO.__Error.WinIOError(Int32 errorCode, String maybeFullPath) at System.IO.__Error.WinIOError() at System.Console.set_OutputEncoding(Encoding value) at (my program) when I run the following line of code: Console.OutputEnco...

Japanese email subject encoding

Apparently, encoding Japanese emails is somewhat challenging, which I am slowly discovering myself. In case there are any experts (even those with limited experience will do), can I please have some guidelines as to how to do it, how to test it and how to verify it? Bear in mind that I've never set foot anywhere near Japan, it is simply ...
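
A minimal round-trip sketch in Python, assuming the subject should be sent as an RFC 2047 encoded word in the `iso-2022-jp` charset (a common choice for Japanese mail, but an assumption here). The helper names `encode_subject` and `decode_subject` are hypothetical. Encoding and then decoding the same string is one simple way to test and verify the result without being able to read Japanese.

```python
from email.header import Header, decode_header, make_header

def encode_subject(subject: str, charset: str = "iso-2022-jp") -> str:
    """Return the subject as RFC 2047 encoded-word text for a mail header."""
    return Header(subject, charset).encode()

def decode_subject(raw_header: str) -> str:
    """Reverse the encoding, roughly as a receiving mail client would."""
    return str(make_header(decode_header(raw_header)))

original = "日本語のテスト"
encoded = encode_subject(original)
print(encoded)                              # ASCII-only header value
print(decode_subject(encoded) == original)  # round-trip check
```

If the round trip returns the original string, the header itself is well formed; whether a particular mail client displays it correctly still has to be checked in that client.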

Why isn't the Byte Order Mark emitted from UTF8Encoding.GetBytes?

The snippet says it all :-) UTF8Encoding enc = new UTF8Encoding(true/*include Byte Order Mark*/); byte[] data = enc.GetBytes("a"); // data has length 1. // I expected the BOM to be included. What's up? ...
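
For comparison, the same distinction sketched in Python rather than C#. As far as I know, `GetBytes` encodes only the characters it is given, and the BOM is exposed separately through `GetPreamble()`. Python makes a similar split: the plain `utf-8` codec never writes a BOM, while the `utf-8-sig` codec prepends one.

```python
import codecs

text = "a"
without_bom = text.encode("utf-8")      # encodes the characters only
with_bom = text.encode("utf-8-sig")     # same bytes, prefixed with the BOM
manual = codecs.BOM_UTF8 + without_bom  # adding the BOM by hand

print(len(without_bom), len(with_bom))
print(with_bom == manual)
```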

Convert asp.net project pages from Windows-1251 to Utf-8

I can do that file-by-file with Save As Encoding in Visual Studio, but I'd like to do this in one click. Is it possible? ...
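
A batch conversion can also be scripted outside Visual Studio. The sketch below is in Python and rests on several assumptions: every matching file really is Windows-1251, the file patterns shown are the ones that matter, and the project is backed up or under version control. The function name `convert_tree` is hypothetical.

```python
from pathlib import Path

def convert_tree(root, patterns=("*.aspx", "*.ascx", "*.master"),
                 source="cp1251", target="utf-8"):
    """Re-encode every matching file under root; return the paths converted."""
    converted = []
    for pattern in patterns:
        for path in sorted(Path(root).rglob(pattern)):
            # Work on bytes so line endings are left exactly as they were.
            text = path.read_bytes().decode(source)
            path.write_bytes(text.encode(target))
            converted.append(path)
    return converted
```

A call such as `convert_tree("path/to/project")` rewrites the files in place. Running it twice on the same tree would corrupt non-ASCII text, because the second pass would decode UTF-8 bytes as Windows-1251, and any charset declared inside the pages or in configuration still has to be updated separately.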

How to "HTML encode" Em Dash in Visual Basic.NET

Hi all, I am generating some text to be shown on a web-site, and use HttpUtility.HtmlEncode to ensure it will look correct. However, this method does not appear to encode the Em Dash (it should convert it to "&#8212;"). I have come up with a solution, but I'm sure there is a better way of doing it - some library function or something. sWebs...

How to put an encoding attribute to xml other than utf-16 with XmlWriter?

I've got a function creating some XmlDocument: public string CreateOutputXmlString(ICollection<Field> fields) { XmlWriterSettings settings = new XmlWriterSettings(); settings.Indent = true; settings.Encoding = Encoding.GetEncoding("windows-1250"); StringBuilder builder = new Strin...

Read Csv file encoding error

I am using the following method for reading Csv file content: /// <summary> /// Reads data from a CSV file to a datatable /// </summary> /// <param name="filePath">Path to the CSV file</param> /// <returns>Datatable filled with data read from the CSV file</returns> public DataTable ReadCsv(string filePath) { ...

ASP.net Conversion of BlogML to MovableType Import Format, Encoding, Line Termination question

This could be a bit long, I'm not really sure what the problem is. I'm currently using an implementation of BlogEngine.NET as my blogging platform. I'm wanting to switch to MovableType to take advantage of their Community and Social blogging apparatus. The major "IF" in the equation is whether or not I'll be able to import my old posts t...

Python: Is there a way to determine the encoding of text file?

I know there is something buried in here. But I was just wondering if there is an actual way built into Python to determine text file encoding? Thanks for your help :) Edit: As a side question, it can be ignored if you want but why is the type of encoding not put into the file so it could be detected easier? ...
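
As far as I know, the standard library has no reliable detector, because most encodings leave no marker in the file; third-party packages such as chardet make statistical guesses instead. A minimal heuristic sketch, assuming the file can only be in one of a short list of candidate encodings (the function name `guess_encoding` and the default candidates are hypothetical choices):

```python
import codecs

# UTF-32 BOMs are checked first because the UTF-32 LE BOM begins with
# the same two bytes as the UTF-16 LE BOM.
BOMS = (
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
)

def guess_encoding(data: bytes, candidates=("utf-8", "cp1252")):
    """Return the first plausible encoding name, or None if nothing fits."""
    for bom, name in BOMS:
        if data.startswith(bom):
            return name
    for name in candidates:
        try:
            data.decode(name)
        except UnicodeDecodeError:
            continue
        return name
    return None

print(guess_encoding("café".encode("utf-8")))
print(guess_encoding(b"caf\xe9"))
```

A successful decode only means the bytes are valid in that encoding, not that the guess is right; single-byte encodings such as Latin-1 accept any byte sequence, so candidate order matters.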

Best way to encode text data for XML in Java?

Very similar to this question, except for Java. What is the recommended way of encoding strings for an XML output in Java? The strings might contain characters like "&", "<", etc. ...

Producing valid XML with Java and UTF-8 encoding

I am using JAXP to generate and parse an XML document from which some fields are loaded from a database. Code to serialize the XML: DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder(); Document doc = builder.newDocument(); Element root = doc.createElement("test"); root.setAttribute("version", text); doc....

multiple description coding

How can I encode an h264 video to be MDC (multiple description coding) ...

PHP Mail Encodes Subject Line

When I try to send a HTML encoded email from PHP, if the subject line contains special chars like "Here's the information you requested", PHP encodes it to read "Here&#039;s the information you requested." How do I fix this? ...

Japanese Encoding subject in Outlook 07

I am trying to read and process Japanese emails. I have set my Regional and Language Options to East Asian, and the language for non-Unicode programs, in the XP Control Panel. I have to process pst files and preserve the true metadata and I am having trouble with the subject line and sometimes the to: and cc: fields. I get my message body to show Jap...

Strange Encoding Problem

I have a table of data encoded in the latin5 charset, and all the columns in the table are also latin5. When I enter "SET NAMES 'latin5'" in the MySQL console and query the table, the results are OK. When I delete or insert/update, the encoding of all the new data is perfect. But when I try to insert ISO-8859 data (also verify this with mb_dete...

How do I know what encoding scheme to use when converting a string to a byte array?

From my database I am getting a very long string which is basically xml. I need to change it to a byte array. I can't get my head around the potential encoding issues. What do I need to be careful of when doing this conversion? public static byte[] StringToByteArray1(string str) { return Encoding.ASCII.GetBytes(str); ...
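
The main risk with an ASCII conversion is silent data loss: as far as I know, `Encoding.ASCII` substitutes `?` for any character outside the ASCII range. The Python sketch below illustrates the same effect on a made-up XML string (the sample content is hypothetical) and shows that an encoding covering all of Unicode, such as UTF-8, round-trips cleanly.

```python
xml = '<?xml version="1.0" encoding="utf-8"?><name>Zoë</name>'

# ASCII cannot represent "ë", so it is replaced and the original is lost.
lossy = xml.encode("ascii", errors="replace")

# UTF-8 can represent every character, so decoding restores the string.
exact = xml.encode("utf-8")

print(lossy.decode("ascii") == xml)
print(exact.decode("utf-8") == xml)
```

Whichever encoding is chosen should also agree with the encoding named in the XML declaration, otherwise a parser reading the bytes may misinterpret them.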

System.Net.Mail and =?utf-8?B?XXXXX.... Headers

I'm trying to use the code below to send messages via System.Net.Mail and am sometimes getting subjects like '=?utf-8?B?W3AxM25dIEZpbGV...' (trimmed). This is the code that's called: MailMessage message = new MailMessage() { From = new MailAddress("[email protected]", "Service"), BodyEncoding = Encoding.UTF8, Body = body...
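
Values of the form `=?utf-8?B?...?=` are RFC 2047 encoded words: the charset, then `B` for Base64, then the encoded text. A mail client that shows them literally has not decoded them. The Python sketch below decodes one such value; the sample header is a hypothetical one made up for illustration, not the trimmed subject from the question, and the function name `decode_encoded_words` is likewise hypothetical.

```python
from email.header import decode_header

def decode_encoded_words(raw_header: str) -> str:
    """Decode any RFC 2047 encoded words in a header value."""
    parts = []
    for value, charset in decode_header(raw_header):
        if isinstance(value, bytes):
            value = value.decode(charset or "ascii")
        parts.append(value)
    return "".join(parts)

print(decode_encoded_words("=?utf-8?B?SGVsbG8=?="))
```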

XmlDocument dropping encoded characters

My C# application loads XML documents using the following code: XmlDocument doc = new XmlDocument(); doc.Load(path); Some of these documents contain encoded characters, for example: <xsl:text>&#10;</xsl:text> I notice that when these documents are loaded, &#10; gets dropped. My question: How can I preserve <xsl:text>&#10;</xsl:tex...

How to explicitly tell SVN to treat a file as text, not binary

I have a number of files that I checked into SVN without having set up their MIME types correctly. SVN initially classified them as binary. I've since set their MIME type in SVN via propset to "text/plain; charset=UTF-8" and I've made sure that all the files are UTF-8 signed. When I do 'svn blame filename', svn says that the file is b...