character-encoding

UTF-8 to GSM-7 conversion in PHP

How do I convert UTF-8 to GSM-7 in PHP? I found this link: http://mobiletidings.com/2009/07/06/gsm-7-encoding-gnu-libiconv/ but I don't know how to do this in PHP. ...

Problem with European characters in img src

Hey. I recently encountered a strange problem on my website. Images with æ, ø and å in their names (Western European characters) won't display. The character encoding on all sites is "ISO-8859-1". I can print æ, ø and å on the page without problems. If I right-click the "broken image" and choose Properties, it displays the filename with the europea...

Character sets offered in the Eclipse properties

I've just been handed a pile of Java source that, I suspect, is in ISO-8859-8. Eclipse's menu of charsets, here on my Mac, does not include that, or any of a wide variety of other encodings supported by the JDK. Is there a recipe for expanding the list of encodings that show up in the menu? ...

Encoding a JMS TextMessage

Hello, I'm receiving messages from a JMS MQ queue which are supposedly UTF-8 encoded. However, on reading them out using msgText = ((TextMessage)msg).getText(); I get question marks where non-standard characters were present. It seems possible to specify the encoding when using a BytesMessage, but I can't find a way to specify encoding while r...

Setting encoding in Grails controller's render method

Hello, I'm trying to build an RSS feed using Grails and Rome. In my controller's rss action, my last command is : render(text: getFeed("rss_2.0"), contentType:"application/rss+xml", encoding:"ISO-8859-1 ") However, when I navigate to my feed's URL, the header is : <?xml version="1.0" encoding="UTF-8"?> <rss xmlns:dc="http://pu...

What are the valid URL characters that can be used in a query variable?

What are the valid characters that can be used in a URL query variable? I'm asking because I would like to create GUIDs of minimal string length by using the largest character set so long as they can be passed as a URL query variable (www.StackOverflow.com?query=guiddaf09834fasnv) Edit If you want to encode a UUID/GUID or any other in...
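
One way to approach the GUID part of this question, sketched in Python: RFC 3986 lists the "unreserved" characters (A-Z, a-z, 0-9, "-", ".", "_", "~") as safe in any URL component without percent-encoding, and URL-safe Base64 uses only letters, digits, "-" and "_". The helper names `uuid_to_token` and `token_to_uuid` are made up for this illustration; this is a minimal sketch, not the only possible encoding.

```python
import base64
import uuid


def uuid_to_token(u: uuid.UUID) -> str:
    """Encode the 16 raw UUID bytes as URL-safe Base64 without '=' padding."""
    return base64.urlsafe_b64encode(u.bytes).rstrip(b"=").decode("ascii")


def token_to_uuid(token: str) -> uuid.UUID:
    """Restore the padding that was stripped and decode back to a UUID."""
    padded = token + "=" * (-len(token) % 4)
    return uuid.UUID(bytes=base64.urlsafe_b64decode(padded))


example = uuid.UUID("12345678-1234-5678-1234-567812345678")
token = uuid_to_token(example)
print(token, len(token))  # the token is 22 characters long instead of 36
```

Sixteen bytes always produce 24 Base64 characters, two of which are padding, so the token is 22 characters long and round-trips losslessly.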

Dealing with wacky encodings in Python

I have a Python script that pulls in data from many sources (databases, files, etc.). Supposedly, all the strings are unicode, but what I end up getting is any variation on the following theme (as returned by repr()): u'D\\xc3\\xa9cor' u'D\xc3\xa9cor' 'D\\xc3\\xa9cor' 'D\xc3\xa9cor' Is there a reliable way to take any four of the abov...
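
A heuristic sketch of how the four variants listed above could be collapsed into one string. It is written for Python 3 (the question's `u''` literals suggest Python 2), it assumes the intended text is UTF-8, and the function name `normalize` is hypothetical. It will also "repair" a string that legitimately contains text such as `Ã©`, so treat it as a starting point rather than a reliable fix.

```python
def normalize(value):
    """Best-effort repair of text that may be escaped and/or mis-decoded UTF-8."""
    if isinstance(value, str):
        try:
            # Map each code point <= U+00FF back to the single byte it came from.
            raw = value.encode("latin-1")
        except UnicodeEncodeError:
            # Code points above U+00FF cannot be stray bytes; assume the text is fine.
            return value
    else:
        raw = value

    if b"\\x" in raw:
        # Turn literal backslash escapes such as \xc3 into the bytes they spell.
        raw = raw.decode("unicode_escape").encode("latin-1")

    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Not valid UTF-8 after all; fall back to a plain Latin-1 reading.
        return raw.decode("latin-1")


variants = ["D\\xc3\\xa9cor", "D\xc3\xa9cor", b"D\\xc3\\xa9cor", b"D\xc3\xa9cor"]
print([normalize(v) for v in variants])
```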

JavaScript replacing HTML char code with actual character

I have an HTML text input whose value is populated from a related div. My problem is that the div contains character references like &amp;, which display correctly as the '&' sign in the div, but when copied to the text box the literal text '&amp;' is displayed. How can I convert &amp; to &, '&lt;' to '<', and '&nbsp;' to ' '? ...

Is it possible to set two encodings for one HTML file?

Is there a way to mark a certain part of an HTML file as being in another encoding? The default encoding for the (generated) HTML is UTF-8. However, some of the included data to be inserted in the HTML is in another encoding. It's something like: <div> the normal html in utf-8 </div> <div> <%= raw_data_in_another_encoding %> </di...

Serializing chinese characters with Xerces 2.6

I have a Xerces (2.6) DOMNode object encoded UTF-8. I use to read its TEXT element like this: CBuffer DomNodeExtended::getText( const DOMNode* node ) const { char* p = XMLString::transcode( node->getNodeValue( ) ); CBuffer xNodeText( p ); delete p; return xNodeText; } Where CBuffer is, well, just a buffer object which is latel...

What is a binary character set?

I'm wondering what a binary character set is and how it differs from, say, the ISO/IEC 8859-1 (Latin-1) character set. ...

Strange characters/text when installing phpBB3 forum

Hi all, I am trying to install a phpBB3 forum, and I get strange characters/text after the install on certain pages. Everything seems to install correctly, though, with no errors from the installer. :( Originally it only appeared on the "new topic" or "post a reply" pages, but now it is appearing in various different places! Any help woul...

Word mail merge called from Access 2007

Hi, I am working with a French version of Access and I absolutely need characters with accents. I am calling Word from Access to do a mail merge. I used to output the result of a query to an RTF file and merge it with a .dot file. With 2003, the whole process went OK. With 2007, the accented characters go wrong. I tried UTF-8 enc...

OpenJPA & MySQL persist wrongly encoded characters

Hi all, my MySQL DB has character encoding utf8. In QueryBrowser I can see that special characters are correct. In the application using OpenJPA I can also see the same values correctly. But when I persist an object into the DB, I have correct values in the application but incorrect ones in the DB! When I restart the application, those special characters in the application ar...

Would it be possible to have a UTF-8-like encoding limited to 3 bytes per character?

UTF-8 requires 4 bytes to represent characters outside the BMP. That's not bad; it's no worse than UTF-16 or UTF-32. But it's not optimal (in terms of storage space). There are 13 bytes (C0-C1 and F5-FF) that are never used. And multi-byte sequences that are not used such as the ones corresponding to "overlong" encodings. If these h...
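
The count of 13 unused byte values can be checked by brute force. The sketch below encodes every Unicode scalar value (all code points except the surrogates, which UTF-8 cannot represent) and records which byte values ever appear; the function name `bytes_used_by_utf8` is made up for this example, and the loop takes a moment because it covers about 1.1 million code points.

```python
def bytes_used_by_utf8():
    """Collect every byte value that occurs in the UTF-8 form of some code point."""
    used = set()
    for code_point in range(0x110000):
        if 0xD800 <= code_point <= 0xDFFF:
            continue  # surrogates are not encodable in UTF-8
        used.update(chr(code_point).encode("utf-8"))
    return used


unused = sorted(set(range(256)) - bytes_used_by_utf8())
print(len(unused), [f"{b:02X}" for b in unused])
```

The result lists C0, C1 and F5 through FF, matching the figure quoted in the question.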

How to store characters like ♥☆ in a DB?

Previous issue: I was not able to store non-English characters: http://stackoverflow.com/questions/3008918/how-to-store-non-english-characters That was fixed by using UTF8. But I realized today that symbols like ♥☆ are not stored correctly. They get converted to characters like ♥☆. How can this be fixed? ...

What is a good resource for HTML character codes -> glyph and...

Hi, I've already found a good site to convert HTML character codes to their respective glyphs: http://www.public.asu.edu/~rjansen/glyph_encoding.html However, I need a bit more information. Does anyone know of a site like the one above that also provides information on what type of character code it is? Meaning, is it a special charac...

Character set problem while inserting into MySQL database from Java application

Hi all, I have written an application that parses the HTML code of some web pages. My problem is with inserting that data into my MySQL database. For example, I want to insert ľščťžýáíé, and when I look into the table I get ?š??žýáíé. I guess the problem could be that the HTML pages I'm downloading are encoded in cp1250, but the databas...
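
The excerpt is cut off before it names the database character set, so the following is only a guess at the cause: the exact pattern of question marks is what you get when the text is forced into cp1252, the code page MySQL's `latin1` is based on. A small Python sketch of that assumption:

```python
text = "ľščťžýáíé"

# Assumption: the column or connection charset is cp1252-like (e.g. MySQL latin1).
# Characters with no cp1252 byte are replaced with '?'.
lossy = text.encode("cp1252", errors="replace").decode("cp1252")

# cp1250 (Central European) and UTF-8 can both represent every character.
via_cp1250 = text.encode("cp1250").decode("cp1250")
via_utf8 = text.encode("utf-8").decode("utf-8")

print(lossy)
print(via_cp1250 == text, via_utf8 == text)
```

Here š and ž survive because cp1252 happens to include them, while ľ, č and ť do not exist in cp1252 and are lost. If that matches the real setup, the usual direction is to make the table and the connection both use a character set that covers these letters, such as UTF-8.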

What does Unicode character &#10; represent?

The character reference is &#10; and it's being used in an XML document. ...
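
`&#10;` is a decimal numeric character reference, so it stands for code point 10 (U+000A), the line feed character. A short Python check using only the standard library; the `<note>` element is a made-up sample document:

```python
import html
import xml.etree.ElementTree as ET

# Resolve the numeric character reference on its own.
decoded = html.unescape("&#10;")
print(ord(decoded), repr(decoded))

# The same reference inside XML element content parses to a newline.
element_text = ET.fromstring("<note>line one&#10;line two</note>").text
print(repr(element_text))
```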

How to convert char * to uchar16 in JNI C++

Hello, here's what I am trying to do: typedef uint16_t uchar16_t; uchar16_t buf[32]; // buf will contain timezone information like GMT-6, Eastern Daylight Time, etc char * str = "Test"; for (int i = 0; i <= strlen(str); i++) buf[i] = str[i]; I guess that's not correct since uchar16_t would contain 2 bytes and str contains 1 ...