Is there a common method to encode and decode arbitrary data so the encoded end result consists of numbers only - like base64_encode but without the letters?
Fictitious example:
$encoded = numbers_encode("Mary had a little lamb");
echo $encoded; // outputs e.g. 12238433742239423742322 (fictitious result)
$decoded = numbers_decode("12...
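A minimal sketch of one way such a digits-only encoding could work, written in Python rather than PHP. It treats the bytes as one large integer and prints that integer in decimal. The names `numbers_encode` and `numbers_decode` are borrowed from the fictitious example above, and the `0x01` sentinel byte is an assumption of this sketch, added so that leading zero bytes survive the round trip.

```python
def numbers_encode(data: bytes) -> str:
    """Encode arbitrary bytes as a string of decimal digits."""
    # Prepend a 0x01 sentinel byte (an assumption of this sketch) so that
    # leading zero bytes in the input are not lost in the integer conversion.
    return str(int.from_bytes(b"\x01" + data, "big"))


def numbers_decode(digits: str) -> bytes:
    """Reverse numbers_encode: decimal digits back to the original bytes."""
    n = int(digits)
    raw = n.to_bytes((n.bit_length() + 7) // 8, "big")
    # Drop the sentinel byte added by numbers_encode.
    return raw[1:]


encoded = numbers_encode("Mary had a little lamb".encode("utf-8"))
decoded = numbers_decode(encoded).decode("utf-8")
print(encoded.isdigit(), decoded)
```

The output is longer than the input (roughly 2.4 digits per byte), and very large inputs may run into Python's default limit on int-to-string conversion, so this is probably only suitable for short values.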
I'm writing a crawler in Ruby (1.9) that consumes lots of HTML from many random sites.
When trying to extract links, I decided to just use .scan(/href="(.*?)"/i) instead of nokogiri/hpricot (major speedup). The problem is that I now receive a lot of "invalid byte sequence in UTF-8" errors.
From what I understood, the net/http library...
I use NetBeans as my development IDE and run the application from cmd, but I have problems displaying ISO 8859-1 characters like åäö correctly, both in the cmd window and when I run the application from NetBeans.
Question: What is the best practice for setting this up?
Right now I do
@output.puts indent + "V" + 132.chr + "lkommen till Ruby Camping!"
to get...
I have a Perl script that prints some information to the console in Russian. The script will be executed on several OSes, so the console encoding can be cp866, koi8-r, utf-8, or some other. Is there a portable way to detect the console encoding so I can set up STDOUT accordingly and the text is printed correctly?
...
I have a Python script that pulls in data from many sources (databases, files, etc.). Supposedly, all the strings are unicode, but what I end up getting is any variation on the following theme (as returned by repr()):
u'D\\xc3\\xa9cor'
u'D\xc3\xa9cor'
'D\\xc3\\xa9cor'
'D\xc3\xa9cor'
Is there a reliable way to take any four of the abov...
So, I have built on this system for quite some time, and it currently outputs Latin1 (ISO-8859-1) to the web browser. These are the components:
MySQL - all data is stored with the Latin1 character set
PHP - All PHP text files are stored on disk with Latin1 encoding
HTML - The output has the http-equiv="content-type" content="...
I have ascii strings which contain the character "\x80" to represent the euro symbol:
>>> print "\x80"
€
When inserting string data containing this character into my database, I get:
psycopg2.DataError: invalid byte sequence for encoding "UTF8": 0x80
HINT: This error can also happen if the byte sequence does not match the encoding ...
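A small sketch of the likely cause, in Python 3 rather than the Python 2 shown above. It assumes the `\x80` byte comes from Windows-1252 (cp1252) text, where that byte is the euro sign. A lone `0x80` is not valid UTF-8, so the bytes would need to be decoded with the source encoding before being stored as UTF-8. The helper name `cp1252_to_utf8` is hypothetical.

```python
def cp1252_to_utf8(raw: bytes) -> bytes:
    """Re-encode bytes assumed to be Windows-1252 as UTF-8."""
    # Assumption: the input really is cp1252. Bytes that cp1252 leaves
    # undefined (for example 0x81) raise UnicodeDecodeError here.
    text = raw.decode("cp1252")
    return text.encode("utf-8")


euro = b"\x80".decode("cp1252")     # the euro sign as a text string
print(euro == "\u20ac")             # True: U+20AC EURO SIGN
print(cp1252_to_utf8(b"\x80"))      # the three-byte UTF-8 form of the euro sign
```

If the data actually comes from a different single-byte encoding, the codec name would have to change accordingly.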
I am currently HTML-encoding all user-entered text before inserting/updating a DB table record. The problem is that on any subsequent update, the previously encoded string is re-encoded. This endless loop is starting to eat up a lot of column space in my tables. I am using parameterized queries for all SQL statements but am wondering wou...
Hi. I'm having trouble deserializing a Ruby class that I wrote to YAML.
Where I want to be
I want to be able to pass one object around as a full 'question' which includes the question text, some possible answers (for multiple choice) and the correct answer. One module (the encoder) takes input, builds a 'question' class out of it and app...
I have a JavaScript string, coming from an untrusted source, embedded inside an onclick attribute, and I'm not sure what the correct way of encoding this string is. Here is a simplification of the HTML:
<input type="button" onclick="alert([ENCODED STRING HERE]);"
value="Click me" />
I use the Microsoft AntiXss library which c...
I am using .NET to create a video uploading application. Although it's
communicating with YouTube and uploading the file, the processing of
that file fails. YouTube gives me the error message, "Upload failed
(unable to convert video file)." This supposedly means that "your
video is in a format that our converters don't recognize..."
I h...
Given this XML...
<?xml version="1.0" encoding="UTF-8"?>
<root>
<item>
<this>
<that>one</that>
</this>
</item>
<item>
<this>
<that>two</that>
</this>
</item>
<item>
<this>
<that>three</that>
</this>
</item>
</root>
I want to make copies of the items into a new for...
Hi all!
I have a very strange problem when retrieving data with PHP from a MySQL table. Basically, two PHP files with the EXACT same content receive data in different encodings, and I don't know why.
Here's the code:
$dbhost = 'localhost';
$dbuser = 'myuser';
$dbpass = 'mypass';
$conn = mysql_connect($dbhost, $dbuser, $dbpass) or die ('Er...
Hi,
I have some code that fetches data from the database; the database codepage is UTF8. When I run the code on a Linux box, some characters come out as question marks (?), but when I run the same code on a Windows server, all characters appear correctly.
When I do:
$> $LANG
the following is returned:
en_SG.UTF-8
en_SG is something that d...
When I run maven install on my multi-module Maven project, I always get the following output:
[WARNING] File encoding has not been set, using platform encoding UTF-8, i.e. build is platform dependent!
So, I googled around a bit, but all I can find is that I have to add
<properties>
<project.build.sourceEncoding>UTF-8</project.buil...
Trying to develop a text editor, I've got two textboxes, and a button below each one.
When the button below textbox1 is pressed, it is supposed to convert the Unicode text (intended to be Japanese) to Shift-JIS.
The reason I am doing this is that the software VOCALOID2 only allows ANSI and Shift-JIS encoded text to be pasted in...
When I use the Visual SourceSafe (2005) Explorer to get the latest version of a file to my client (Win 7) machine, and then diff my newly gotten local copy with the one in the repository, VSS tells me that the files have different character encodings.
What gives?
...
I have a downloader program that downloads pages from the internet.
The encoding of each page is different; some are in UTF-8 and some are Unicode.
For example: a, which displays as the character 'a'; some pages are full of such characters. We need to convert these encodings to normal text.
I used the UnicodeEncoding class in C#, but it does not help me ...
I have a string that I would like represented uniquely as an integer.
For example: A3FJEI = 34950140
How would I go about writing an EncodeAsInteger(string) method? I understand that the number of characters in the string will make the integer grow quickly, forcing the value to become a long, not an int.
Since I need the value to ...
Hi,
While working with an old application whose existing database is in MS Access, I found some strangely encoded data, such as 48001700030E0F465075465A56525E1100121D04121B565A58 stored as an email address.
What kind of data encoding is this? I tried base64, but it doesn't seem to be that. Can anybody with previous experience with MS Access co...