character-encoding

Character Encoding Trouble - Java

Hi all, I've written a little application that does some text manipulation and writes the output to a file (html, csv, docx, xml), and this all appears to work fine on Mac OS X. On Windows, however, I seem to get character encoding problems: a lot of '"' characters seem to disappear and be replaced with some weird stuff. Usually the closing '"' o...

.NET: Why isn't base 64 in Encoding.GetEncodings()?

I have a function that can decode an array of bytes into a string of characters using a specified encoding. Example: Function Decode(ByVal bytes() As Byte, ByVal codePage As String) As String Dim enc As Text.Encoding = Text.Encoding.GetEncoding(codePage) Return enc.GetString(bytes) End Function If I want to include base64 in ...

Zend Studio for eclipse - Switch character encoding for all files in a project

I'm using Zend Studio for Eclipse on Mac, and it seems to keep setting all files to have an encoding of 'Mac Roman'. This becomes problematic when I save the files, as they all need to be UTF-8. I know how to change the encoding to UTF-8 on a file-by-file basis, but I was wondering if I could set this project-wide? ...

PHP/MySQL: Insert data into database character set problem.

Hi, I'm building a website that fetches text from another page and inserts it into the database. The problem is that all the special characters are saved in the database using the HTML encoding, so I then need to convert the output using: <meta http-equiv="content-type" content="text/html; charset=ISO-8859-1" /> I mean, what I have rig...

Java mail problem with Turkish characters

I have a problem displaying Turkish characters in mail sent from Java code. The characters are shown as question marks (?) in the mail. Message msg = new MimeMessage(mailSession); msg.setHeader("Content-Encoding","ISO-8859-9"); msg.setFrom(new InternetAddress(from)); InternetAddress[] address = {new InternetAddress(to)}; msg.setRecipients(Mes...

Character Encoding Detection Algorithm

I'm looking for a way to detect character sets within documents. I've been reading the Mozilla character set detection implementation here: Universal Charset Detection. I've also found a Java implementation of this called jCharDet: JCharDet. Both of these are based on research carried out using a set of static data. What I'm wonderin...

asp.net converting iso-8859 file to utf-8

Hi, I need to convert a CSV file from ISO-8859 to UTF-8 to keep the accents in the database. French accents (é, è, ê, and the like) are not kept when I try to translate them to UTF-8; they are changed to "?". I'm stumped. I use the following function for the translation: public static string iso8859ToUnicode(string src) { Encoding...
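
The excerpt is cut off before the function body, so the actual cause is not visible here. One common way this symptom arises is that the file's bytes are decoded with the wrong encoding before any conversion takes place. As a rough illustration in Python rather than the asker's C#, the sketch below decodes the raw ISO-8859-1 bytes first and only then encodes the text as UTF-8. The function name `latin1_bytes_to_utf8` and the sample bytes are hypothetical and chosen only for this example.

```python
def latin1_bytes_to_utf8(raw: bytes) -> bytes:
    """Re-encode ISO-8859-1 bytes as UTF-8 (hypothetical helper)."""
    # Step 1: interpret the raw bytes using their real source encoding.
    text = raw.decode("iso-8859-1")
    # Step 2: serialise the resulting text in the target encoding.
    return text.encode("utf-8")


# Sample input standing in for the CSV contents (an assumption).
sample = "é,è,ê".encode("iso-8859-1")
converted = latin1_bytes_to_utf8(sample)
print(converted.decode("utf-8"))  # prints: é,è,ê
```

The key point of the sketch is that the conversion works on bytes whose encoding is known; converting a string that was already decoded incorrectly cannot recover the lost accents.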

Can I make git recognize a UTF-16 file as text?

I'm tracking a Virtual PC virtual machine file (*.vmc) in git, and after making a change git identified the file as binary and wouldn't diff it for me. I discovered that the file was encoded in UTF-16. Can git be taught to recognize that this file is text and handle it appropriately? I'm using git under Cygwin, with core.autocrlf set ...

How do I get foreign characters in a select/dropdown list to display properly in IE 7?

I have tested in IE6, Firefox 3.0.5 and Chrome and they all work. In IE7 it displays as boxes. For example: <select name="selectact" id="selectact"> <option value="page" selected="selected">网 页</option> <option value="news">新 闻</option> <option value="trade">行 业</option> <option value="area">区 域</option> <option value="web">网 站</op...

How do I get SAS encoding option programmatically?

How do I find out the SAS global encoding option programmatically? I can run proc options, and it will give me the answer, but I need to do it from code. I am hoping for an answer along the lines of "look at the macro symbol &sysencoding", but this might be too much to hope for. I would prefer to avoid fragile things like writing to an ext...

"svnlook changed" encoding

Hello, When I execute the following command: svnlook changed {path} -r {rev} where {path} is the repository path and {rev} is the revision number, I get the following output: U trunk/this/is/a/path/Mon fichier avec un nom accentu,.txt The output should actually be: U trunk/this/is/a/path/Mon fichier avec un nom accentué.txt Th...

Removal of ¶ (pilcrow) from pasted text

Users are pasting text from Lotus Notes into my VBA application. This is then being stored in Access. Sometimes the pasted text includes what I assume is a carriage return which, when pasted into a single-line form control, is displayed in the application's forms as ¶. However, as this won't paste into the VBE, I am unable to add thi...

ASP.NET and diacritics

Hello all, I intend to create asp.net pages using Visual Studio 2008. Preferably, the pages should be fully compliant with the XHTML standard. How should I include diacritics in the page content (no need to use diacritics in URLs)? Should I use character references (the ones with "&"), or just write them directly from the keyboard? ...

HTTP URL - allowed characters in parameter names

Is there any formal restriction as to which characters are allowed in URL parameter names? I've been reading RFC3986 ("Uniform Resource Identifier (URI): Generic Syntax") but came to no definitive conclusion. I know there are practical limitations, but would it actually be forbidden to do something like: param with\funny<chars>=some_v...

Wrong Encoding in JAX-WS Dispatch Response

I'm trying to access a web service with JAX-WS using: Dispatch<Source> sourceDispatch = null; sourceDispatch = service.createDispatch(portQName, Source.class, Service.Mode.PAYLOAD); Source result = sourceDispatch.invoke(new StreamSource(new StringReader(req))); System.out.println(sourceToXMLString(result)); where: private static Str...

PHP character encoding problems

Hello guys. I need help with a character encoding problem that I want to sort out once and for all. Here is an example of some content which I pull from an XML feed, insert into my database and then pull out. http://pastebin.com/d78d24f33 As you can see, a lot of special HTML characters get corrupted/broken. How can I once and for all sto...

How can I delete a character from a string in PHP?

How can I delete a character from a string in PHP? $s = "waseem"; Is there a function like delChar($s, 2), where 2 is the index of the character? I searched but didn't find anything. Any ideas? ...

How to convert large UTF-8 strings into ASCII?

I need to convert large UTF-8 strings into ASCII. It should be reversible, and ideally a quick/lightweight algorithm. How can I do this? I need the source code (using loops) or the JavaScript code. (It should not depend on any platform/framework/library.) Edit: I understand that the ASCII representation will not look correct and wou...
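
As one possible illustration of a reversible, loop-based scheme, the sketch below escapes every non-ASCII byte of the UTF-8 form (and the escape character itself) as `%XX`. It is written in Python rather than the JavaScript the question asks for, and the percent-escape format and the function names `to_ascii` and `from_ascii` are assumptions made for this example, not something the question specifies. It relies on Python's built-in UTF-8 codec for the byte conversion step.

```python
def to_ascii(text: str) -> str:
    """Escape a string into pure ASCII using %XX for each non-ASCII UTF-8 byte."""
    out = []
    for byte in text.encode("utf-8"):
        ch = chr(byte)
        # ASCII bytes pass through, except '%', which must be escaped
        # so that decoding stays unambiguous.
        if byte < 0x80 and ch != "%":
            out.append(ch)
        else:
            out.append("%{:02X}".format(byte))
    return "".join(out)


def from_ascii(encoded: str) -> str:
    """Reverse to_ascii: turn %XX escapes back into bytes, then decode UTF-8."""
    raw = bytearray()
    i = 0
    while i < len(encoded):
        if encoded[i] == "%":
            # The two characters after '%' are the hex value of one byte.
            raw.append(int(encoded[i + 1:i + 3], 16))
            i += 3
        else:
            raw.append(ord(encoded[i]))
            i += 1
    return raw.decode("utf-8")


original = "naïve 100% ✓"
escaped = to_ascii(original)
assert from_ascii(escaped) == original  # the round trip restores the input
```

Each non-ASCII byte grows to three characters, so text that is mostly non-ASCII expands considerably; a denser scheme would trade this simplicity for size.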

Why Encoding.ASCII != ASCIIEncoding.Default in C#?

Why does Encoding.ASCII != ASCIIEncoding.Default in C#? ...

HTML encoding: eastern european languages

My program is fetching messages from a database, which contains English, German and several Eastern European languages. My Python script sets the encoding via: <meta http-equiv="Content-Type" content="text/html; charset=utf-8"/> and uses the values fetched correctly from the database (if I check within my logs). Unfortunately all br...