questions about encoding | ansaurus

encoding

Read mixed encoding string

I am reading a string encoded as windows-1256, but the numbers in that string are encoded as UTF-8. As a result, all of the text is read correctly except the numbers, which display as (?). That is acceptable for now, but I want to know how I can read the complete text without problems. How can I know when to switch between encodings to read co...

Broken (changed) Japanese Encodings from .NET 1.0 to 3.5(4.0?)

Hi All, I was migrating an app from 1.0 to 3.5 and testing in 4.0. It seems the following behavior has changed when it comes to the iso-2022-jp and SHIFT_JIS encoding routine. If you run the following routine on 1.0 you end up with "true", however, if you run it on 3.5, you end up with "false". This is a real show stopper for me, beca...

What encoding format is this?

I'm trying to edit a (slightly) proprietary format and within one of the files it will encode a connection string. I have a way to encode my own data with it, so I can reverse engineer it a bit. ABC123"/3 will encode to rijcmlqXxEeLA4tSspHg5XfWJiq4w== and AB120";2 encodes to rijcmlqiF3LjnFJnYfEi2WvcSoPSg== Is this a know...

language-agnostic

Formatting string for xml attribute in php

I have some strings that are valid in my database but when I include them in an attribute of a UTF-8 XML output they give me the following error: XML Parsing Error: not well-formed My current code (simplified): header('Content-Type: text/xml'); echo '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'; echo '<root attribute=...
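
The excerpt is cut off before the attribute value, so the actual cause is unknown. One common reason for a "not well-formed" error is unescaped markup characters inside an attribute. The question is about PHP; the sketch below shows the same idea in Python, and `xml_attr` and the sample value are hypothetical names chosen for illustration.

```python
from xml.sax.saxutils import escape

def xml_attr(value: str) -> str:
    """Escape &, <, > and double quotes so the value is safe inside "...".

    Hypothetical helper for illustration; it does not repair bytes that
    are not valid UTF-8, which is another possible cause of the error.
    """
    return escape(value, {'"': "&quot;"})

# Made-up sample value containing characters that break an attribute.
sample = 'Tom & "Jerry" <3'
print('<root attribute="%s"/>' % xml_attr(sample))
```

If the strings come from a database that stores a different charset, they would also need converting to UTF-8 before output.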

=?ISO-8859-1 in mail subject

I'm fetching the unread mails in my Gmail account through PHP and its function *imap_open*. When I get the subjects through *imap_fetch_overview*, I get some subjects like this: =?ISO-8859-1?Q?Informaci=F3n_Apartamento_a_la_Venta?= =?ISO-8859-1?Q?_en_Benasque(Demandas:_0442_______)?= It's unreadable, I think because of...
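
Subjects of the form `=?charset?Q?...?=` are MIME encoded-words (RFC 2047). The question uses PHP; as a rough sketch of what decoding involves, here is the same step with Python's standard `email.header` module. `decode_subject` is a hypothetical helper name.

```python
from email.header import decode_header, make_header

def decode_subject(raw_subject: str) -> str:
    """Decode RFC 2047 encoded-words in a header value into plain text."""
    # decode_header splits the value into (bytes, charset) chunks;
    # make_header reassembles them into a Header that str() can render.
    return str(make_header(decode_header(raw_subject)))

# First encoded-word from the question's example subject.
subject = "=?ISO-8859-1?Q?Informaci=F3n_Apartamento_a_la_Venta?="
print(decode_subject(subject))
```

In the Q encoding, `_` stands for a space and `=F3` is the byte 0xF3, which is "ó" in ISO-8859-1.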

ASCII-characters instead of Swedish chars?

Hi everyone, I have tested PHP's IMAP library to fetch emails from a Gmail account, but I just can't get my head around making the characters display correctly. At first, I was close to pulling my hair out when I realized that I had accidentally fetched the attachments instead of the message body. Not good, but now when that is so...

Streaming h264 encoded frame over vlc

Hello, I have integrated a TI library for H.264 encoding on a DaVinci board with a DM6446 processor. I could verify the encoded bit stream by saving it to the hard disk and using Elecard Stream Analyzer, but I could not stream it over RTSP and view it in the VLC player. VLC would switch to TCP/IP and then stop, showing a message that there was nothing to play. On fu...

= sign in gmail message body

I'm retrieving the messages of my Gmail account, and I'm finding '=' signs in the body, with some hex codes. This is an example: Na pagina= para anunciar poderia ter as op=C3=A7=C3=B5es de Estados , Cidades e Bair= ros . Ex : S=C3=A3o paulo Diadema , Sant= I have highlighted in bold these parts. Of course, in Gmail thes...
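
The `=C3=A7` sequences and the trailing `=` at line ends look like the quoted-printable transfer encoding: `=XX` is one byte in hex, and a `=` at the end of a line is a soft line break. A minimal sketch with Python's `quopri` module, assuming the decoded bytes are UTF-8 (an assumption, although `C3 A7` is the UTF-8 encoding of "ç"); `decode_quoted_printable` is a hypothetical helper name.

```python
import quopri

def decode_quoted_printable(raw: bytes, charset: str = "utf-8") -> str:
    """Undo quoted-printable, then decode the bytes with the given charset."""
    return quopri.decodestring(raw).decode(charset)

# Fragment adapted from the question; "=\n" is a soft line break.
sample = b"as op=C3=A7=C3=B5es de Estados , Cidades e Bair=\nros . Ex : S=C3=A3o paulo"
print(decode_quoted_printable(sample))
```

In a real message, the charset should come from the part's Content-Type header rather than being assumed.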

AS3: Can ByteArray return its content as a string with two bytes per unicode character?

var bytes:ByteArray = new ByteArray; bytes.writeInt(0); trace(bytes.length); // prints 4 trace(bytes.toString().length); // prints 4 When I run the above code the output suggests that every character in the string returned by toString contains one byte from the ByteArray. This is of course great if you want to display the content of t...

Python csv: UnicodeDecodeError

I'm reading in a file with Python's csv module, and have Yet Another Encoding Question (sorry, there are so many on here). In the CSV file, there are £ signs. After reading the row in and printing it, they have become \xa3. Trying to encode them as Unicode produces a UnicodeDecodeError: row = [unicode(x.strip()) for x in row] Unicod...
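
The byte `\xa3` is the pound sign in ISO-8859-1 (and Windows-1252), so one likely reading is that the file is in one of those encodings while the conversion defaults to ASCII. The question is Python 2; the sketch below shows the idea in Python 3 by decoding with an explicit encoding before the `csv` module sees the data. The sample bytes and `read_rows` are made up for illustration, and `latin-1` is an assumption about the file.

```python
import csv
import io

# Hypothetical stand-in for the CSV file: 0xA3 is "£" in Latin-1.
raw = b"item,price\r\nwidget,\xa310\r\n"

def read_rows(data: bytes, encoding: str = "latin-1") -> list:
    """Decode the bytes with an explicit encoding, then parse them as CSV."""
    text = io.TextIOWrapper(io.BytesIO(data), encoding=encoding, newline="")
    return [[cell.strip() for cell in row] for row in csv.reader(text)]

rows = read_rows(raw)
print(rows[1][1])
```

With a real file, the equivalent is `open(path, encoding="latin-1", newline="")`.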

Problem with encode decode. Python. Django. BeautifulSoup

In this code: soup=BeautifulSoup(program.Description.encode('utf-8')) name=soup.find('div',{'class':'head'}) print name.string.decode('utf-8') an error occurs when I try to print or save to the database. It doesn't matter what I do: print name.string.encode('utf-8') or just print name.string Traceback (most recent ca...

How to let the SAX parser determine the encoding from the XML declaration?

Hi, I'm trying to parse XML files from different sources (over which I have little control). Most of them are encoded in UTF-8 and don't cause any problems using the following snippet: SAXParserFactory factory = SAXParserFactory.newInstance(); SAXParser parser = factory.newSAXParser(); FeedHandler handler = new FeedHandler(); Input...

C++: Files, encodings and datatypes

---- PLEASE CLOSE ---- ------ Edit --------- I found where the problem is. I'm going to start a new question for the real problem .... ---------------------- Hi, My Situation: Linux (Ubuntu 10.04) gcc But it has to be platform independent I have a text file (UTF-8) with special characters like ¥ © ® Ỳ È Ð. I have a std:...

C++: std::string problem

Hi again, I have this simple code: #include <iostream> #include <fstream> using namespace std; int main(void) { ifstream in("file.txt"); string line; while (getline(in, line)) { cout << line << " starts with char: " << line.at(0) << " " << (int) line.at(0) << endl; } in.close(); return 0; } wh...

"Invalid multibyte char (US-ASCII)" error for ä, ü, ö, ß which are Ascii!

My application needs to handle some international characters, namely ä, ü, ö and ß, which are still ascii. When I tested the behavior of ruby when dealing with these chars, I got this error: test.rb:1: invalid multibyte char (US-ASCII) test.rb:1: invalid multibyte char (US-ASCII) for this code: puts "i like my chars: ä, ü, ö and ß!"...
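
For context, ASCII only covers code points 0 to 127, and ä, ü, ö and ß all lie above that range, which is consistent with the interpreter's complaint. The question is about Ruby; this small Python sketch just makes the byte-level picture visible. `non_ascii_chars` is a hypothetical helper name.

```python
def non_ascii_chars(text: str) -> list:
    """Return the characters whose code point is outside the ASCII range."""
    return [ch for ch in text if ord(ch) > 127]

line = "i like my chars: ä, ü, ö and ß!"
for ch in non_ascii_chars(line):
    # Each of these is one byte in ISO-8859-1 but two bytes in UTF-8.
    print(ch, hex(ord(ch)), ch.encode("latin-1"), ch.encode("utf-8"))
```

So the source file's encoding has to be declared to the interpreter; it cannot be treated as US-ASCII.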

What is this encoding? Need to decode some strings. Unknown coding.

Hello, I need to decode some strings I found in some code I'm maintaining. What do you think this is encoded as? How can I decode it? Thanks in advance. foo=97 8B 8B 8F C5 D0 D0 88 88 88 D1 96 92 9E 98 96 91 9E 8B 96 90 91 D1 9E 8B D0 99 93 9E 8D 9A D0 99 93 9E 8D 9A D1 8F 97 8F bar=10 9F 6B 37 02 DA B5 E9 18 3B E1 23 1B 61 ...
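
One hypothesis worth testing on `foo`: every byte has its high bit set, which is what you would see if each ASCII character had been bitwise-complemented (XOR with 0xFF). A minimal sketch of that check follows; `decode_complement` is a hypothetical helper name, and the truncated `bar` value is not covered because it does not follow the same pattern.

```python
def decode_complement(hex_bytes: str) -> str:
    """Invert every bit of each byte and interpret the result as ASCII."""
    data = bytes.fromhex(hex_bytes)  # fromhex skips the spaces between pairs
    return bytes(b ^ 0xFF for b in data).decode("ascii")

foo = ("97 8B 8B 8F C5 D0 D0 88 88 88 D1 96 92 9E 98 96 91 9E 8B 96 90 91 "
       "D1 9E 8B D0 99 93 9E 8D 9A D0 99 93 9E 8D 9A D1 8F 97 8F")
print(decode_complement(foo))
```

Applied to `foo`, this yields printable ASCII that reads as a URL, which supports the hypothesis for that value.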

Working with HTML encoding in PHP (intelligent way to decode)

Hi guys, from a PHP script I'm downloading an RSS feed like: $fp = fopen('http://news.google.es/news?cf=all&ned=es_ve&hl=es&output=rss','r') or die('Error reading RSS data.'); The feed is a Spanish news feed. After downloading the file, I parsed all the info into one variable that has only the content of the tag <descriptio...

Transforming a url into a unique 32 character token

I am writing an affiliate system, and I want to generate a unique 32 character wide token, from the url. The problem is that a URL can be up to 128 chars long (IIRC). Is there a way that I can create a unique 32 char wide key/token from a given URL, without any 'collisions'? I am not sure if this is an encoding, encryption or hashing p...
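
This reads as a hashing problem. A fixed-width token cannot be guaranteed unique for arbitrary-length input, since there are more possible URLs than 32-character tokens, but a 128-bit digest written as hex is exactly 32 characters and accidental collisions are extremely unlikely. A minimal sketch using MD5, chosen here only because of its output width; `url_token` is a hypothetical helper name.

```python
import hashlib

def url_token(url: str) -> str:
    """Return a 32-character hex token derived from the URL (MD5 digest)."""
    return hashlib.md5(url.encode("utf-8")).hexdigest()

token = url_token("http://example.com/landing?ref=42")
print(token, len(token))
```

MD5 is not collision-resistant against deliberate attacks, so if tokens must withstand malicious input, storing a URL-to-token mapping or truncating a stronger hash may be a better fit.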

Working out file encoding: I know the string, know the character, what is the encoding?

I'm adding data from a csv file into a database. If I open the CSV file, some of the entries contain bullet points - I can see them. file says it is encoded as ISO-8859. $ file data_clean.csv data_clean.csv: ISO-8859 English text, with very long lines, with CRLF, LF line terminators I read it in as follows and convert it from ISO-885...

character-encoding

Getting special characters with gettext

Hi. I am using gettext to translate my site. In Norwegian I need to use the characters æøå, but they show up blank. I have set $encoding = 'iso-8859-1'; according to Wikipedia, æøå should be available in it, but as I said they show up blank. In the Poedit settings and my po/mo file I have set the encoding to iso-8859-1. PS: I want to support a...

character-encoding
