I have to create a text file that contains numbers and Hebrew letters encoded as ASCII.

This is the file-creation method, which is triggered by a button click:

protected void ToFile(object sender, EventArgs e)
{
    filename = Transactions.generateDateYMDHMS();
    string path = string.Format("{0}{1}.001", Server.MapPath("~/transactions/"), filename);
    // The using block closes the writer even if a write throws.
    using (StreamWriter sw = new StreamWriter(path, false, Encoding.ASCII))
    {
        sw.WriteLine("hello");
        sw.WriteLine(Transactions.convertUTF8ASCII("שלום"));
        sw.WriteLine("bye");
    }
}

As you can see, I use the static method Transactions.convertUTF8ASCII() to convert what is presumably a Unicode .NET string to its ASCII representation. When I apply it to the Hebrew word 'shalom', I get back '????' instead of the result I need.

Here is the method.

public static string convertUTF8ASCII(string initialString)
{
    byte[] unicodeBytes = Encoding.Unicode.GetBytes(initialString);
    byte[] asciiBytes = Encoding.Convert(Encoding.Unicode, Encoding.ASCII, unicodeBytes);
    return Encoding.ASCII.GetString(asciiBytes);
}

Instead of the original word encoded as ASCII, I get '????' in the file I create. I see the same result when I step through the code in the debugger.

What am I doing wrong?

+2  A: 

You can't simply translate arbitrary Unicode characters to ASCII. The best the encoder can do is replace each unsupported character with '?', hence ????. Obviously the basic 7-bit characters will work, but not much else. I'm curious what the expected result is.
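
To make the '?' substitution visible, here is a minimal sketch. The class and method names (AsciiFallbackDemo, LossyAscii, TryEncodeAscii) are made up for illustration and are not part of the question's code. It shows the default lossy behaviour of Encoding.ASCII next to a strict encoder that fails instead of writing '?'.

```csharp
using System;
using System.Text;

// Hypothetical helper class, for illustration only.
public static class AsciiFallbackDemo
{
    // The Hebrew word from the question, written with escapes so the
    // encoding of the source file itself cannot interfere.
    public const string Shalom = "\u05E9\u05DC\u05D5\u05DD";

    // Default behaviour: every character outside 7-bit ASCII is
    // replaced with '?' (0x3F) when the string is encoded.
    public static string LossyAscii(string text)
    {
        byte[] bytes = Encoding.ASCII.GetBytes(text);
        return Encoding.ASCII.GetString(bytes);
    }

    // Strict variant: reports failure instead of silently writing '?'.
    public static bool TryEncodeAscii(string text, out byte[] bytes)
    {
        Encoding strict = Encoding.GetEncoding(
            "us-ascii",
            EncoderFallback.ExceptionFallback,
            DecoderFallback.ExceptionFallback);
        try
        {
            bytes = strict.GetBytes(text);
            return true;
        }
        catch (EncoderFallbackException)
        {
            bytes = new byte[0];
            return false;
        }
    }
}
```

With this sketch, AsciiFallbackDemo.LossyAscii(AsciiFallbackDemo.Shalom) gives back "????", which matches what the question reports, while TryEncodeAscii returns false for the same input and true for "hello".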

If you need this for transfer (rather than representation) you might consider base-64 encoding of the underlying UTF8 bytes.
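
A rough sketch of the base-64 idea, assuming both the writer and the reader of the file agree on the convention. Base64Transfer and its two methods are hypothetical names. The Hebrew text is turned into UTF-8 bytes, and those bytes are written as base-64, which uses only ASCII characters.

```csharp
using System;
using System.Text;

// Hypothetical helper class, for illustration only.
public static class Base64Transfer
{
    // Unicode text -> UTF-8 bytes -> base-64 string (ASCII characters only).
    public static string Encode(string text)
    {
        byte[] utf8Bytes = Encoding.UTF8.GetBytes(text);
        return Convert.ToBase64String(utf8Bytes);
    }

    // Reverse direction: base-64 string -> UTF-8 bytes -> original text.
    public static string Decode(string base64)
    {
        byte[] utf8Bytes = Convert.FromBase64String(base64);
        return Encoding.UTF8.GetString(utf8Bytes);
    }
}
```

The encoded line is not human-readable Hebrew, so this only helps when the file is a transport format and the consumer decodes it again with the equivalent of Base64Transfer.Decode.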

Marc Gravell
Thanks, Marc. My example file contains characters like 'Œ€‹‰'. They don't spell 'shalom', but they should give you an idea of what kind of encoding it is. I didn't understand what you meant by 'transfer' and base-64.
eugeneK
@eugeneK - it still isn't obvious to me what the translation is there. I suspect I'd need to see the exact byte sequence and character code-points that are supposed to map to each other for it to "click".
Marc Gravell
The requirement I was given contained a mistake, which is what led me to ASCII in the first place. Thanks for the info anyway.
eugeneK
A: 

Do you perhaps mean ANSI, not ASCII?

ASCII doesn't define any Hebrew characters. There are, however, some ANSI code pages that do, such as "windows-1255".

In that case, you may want to look at: http://msdn.microsoft.com/en-us/library/system.text.encoding.aspx

In short, where you have:

Encoding.ASCII

You would replace it with:

Encoding.GetEncoding(1255)
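
Put together, the replacement might look like the sketch below. HebrewFileWriter and its methods are hypothetical names, and the RegisterProvider line is an assumption about the runtime: on .NET Core and .NET 5+ the legacy code pages are opt-in, while on .NET Framework code page 1255 is available without it (and CodePagesEncodingProvider would need the System.Text.Encoding.CodePages package there).

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical helper class, for illustration only.
public static class HebrewFileWriter
{
    public static Encoding GetHebrewEncoding()
    {
        // Assumption: running on .NET Core / .NET 5+, where legacy code
        // pages must be registered first. On .NET Framework this line
        // can be dropped.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
        return Encoding.GetEncoding(1255); // windows-1255 (Hebrew)
    }

    // Writes each line using code page 1255 instead of ASCII.
    public static void WriteLines(string path, params string[] lines)
    {
        using (StreamWriter sw = new StreamWriter(path, false, GetHebrewEncoding()))
        {
            foreach (string line in lines)
            {
                sw.WriteLine(line);
            }
        }
    }
}
```

With this approach the Hebrew string is passed to WriteLine directly, with no conversion helper in between, and whoever reads the file has to open it with the same code page.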
userx
You're probably right. I have no experience with encodings at all, so I never knew that ASCII doesn't contain Hebrew characters.
eugeneK
@eugeneK ASCII is pretty much just the English alphabet, the digits 0-9, basic punctuation and some control characters. Ref: http://en.wikipedia.org/wiki/ASCII
userx
No, the requirement I was given contained a mistake, which is what led me to ASCII in the first place. Thanks for the info anyway.
eugeneK
+1  A: 

If you really are talking about ASCII, are you perhaps asking about transliteration (as in "Romanization") rather than encoding conversion?

peSHIr
No, the requirement I was given contained a mistake, which is what led me to ASCII in the first place. Thanks for the info anyway.
eugeneK