Is it possible to convert a Cyrillic string to English (Latin) in C#? For example, I need to convert "Петролеум" to "Petroleum". I also forgot to mention that if I have a Cyrillic string it needs to stay like that, so can I somehow check for that?

+3  A: 

You can of course map the letters to their Latin transcription, but in most cases you won't get an English word out of it. E.g. Российская Федерация transcribes to Rossiyskaya Federatsiya. Wikipedia offers an overview of the mapping. You are probably looking for a translation service; Google probably offers an API for that.

Femaref
+4  A: 

I'm not familiar with Cyrillic, but if it's just a 1-to-1 mapping of Cyrillic characters to Latin characters that you're after, you can use a dictionary of character pairs and map each character individually:

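using System.Collections.Generic;
using System.Linq;

// Partial table; values are strings because some Cyrillic letters need several Latin letters.
var map = new Dictionary<char, string>
{
    { 'П', "P" },
    { 'е', "e" },
    { 'т', "t" },
    { 'р', "r" },
    ...
};

// Note: map[c] throws KeyNotFoundException for any character missing from the table.
var result = string.Concat("Петролеум".Select(c => map[c]));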
dtb
I was trying to avoid that, but thanks :) I was wondering whether there was a cleaner way built into .NET or C#.
Pece
@Pece: I'm not aware of a built-in method that does this... BTW, if performance is a concern, use a char[] or StringBuilder instead of LINQ.
dtb
It is not a char-to-char mapping. You need multiple Latin characters for some Cyrillic characters.
PauliL
@PauliL: Fixed. :)
dtb
It's the longest but simplest solution I've found so far, so thanks :)
Pece
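The points raised in the comments above (a StringBuilder instead of LINQ, several Latin letters for one Cyrillic letter) can be sketched roughly as follows. This is only an illustration: the class name Translit, its method names, and the tiny table are assumptions made for the example, not a standard romanization. Characters missing from the table are passed through unchanged, and ContainsCyrillic uses the .NET regex block name \p{IsCyrillic} to check whether a string has any Cyrillic letters at all.

```csharp
using System;
using System.Collections.Generic;
using System.Text;
using System.Text.RegularExpressions;

// Hypothetical helper; the names and the table are illustrative assumptions.
public static class Translit
{
    // Partial table only: extend it with the romanization scheme you actually need.
    private static readonly Dictionary<char, string> Map = new Dictionary<char, string>
    {
        { 'П', "P" }, { 'е', "e" }, { 'т', "t" }, { 'р', "r" },
        { 'о', "o" }, { 'л', "l" }, { 'у', "u" }, { 'м', "m" },
        // Some letters need more than one Latin character.
        { 'я', "ya" }, { 'ц', "ts" }, { 'щ', "shch" },
    };

    // True if the string contains at least one character from the Cyrillic block (U+0400..U+04FF).
    public static bool ContainsCyrillic(string s)
    {
        return Regex.IsMatch(s, @"\p{IsCyrillic}");
    }

    // Replaces every mapped character; anything not in the table is copied as-is.
    public static string ToLatin(string s)
    {
        var sb = new StringBuilder(s.Length);
        foreach (char c in s)
        {
            string latin;
            if (Map.TryGetValue(c, out latin))
                sb.Append(latin);
            else
                sb.Append(c);
        }
        return sb.ToString();
    }
}
```

With this table, Translit.ToLatin("Петролеум") returns "Petroleum", while a string that is already Latin, such as "Petroleum", comes back unchanged because none of its characters are keys in the table.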
A: 

Use a Dictionary with Russian and English words as a lookup table. It'll be a lot of typing to build it, but it's foolproof.

kirk.burleson
Not really. If Google can't produce a foolproof dictionary, he can't either.
Femaref
A: 

Why do you want to do this? Changing characters one-for-one generally doesn't even produce a reasonable transliteration, much less a translation. You may find this post to be of interest.

Jeanne Pindar
A: 

Are you looking for a way to transliterate Russian words written in Cyrillic (in whatever encoding, e.g. even one with "Latin" in its name, since ISO 8859-5, the Latin/Cyrillic part of ISO 8859, is for Cyrillic) into the Latin alphabet (with accents)?

I don't know whether .NET has something for transliteration, but I dare say it (like many other good frameworks) doesn't. This Wikipedia link could give you some ideas for implementing transliteration, but it is not the only way. Remember that the Cyrillic writing system is not used by Russian only, and the way you apply transliteration may vary with the language that uses it. E.g. see the same for Bulgarian. Maybe this link (also from Wikipedia) will be interesting too if you want to write the transliterator yourself.

ShinTakezou
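To illustrate why the table depends on the language, here is a minimal sketch assuming one shared table plus per-language overrides. The class LanguageTranslit, the method Romanize, and the language codes "ru" and "bg" are hypothetical names chosen for the example. The one override shown reflects common practice: щ is usually romanized as "shch" for Russian and as "sht" for Bulgarian.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical helper; names, codes, and tables are illustrative assumptions.
public static class LanguageTranslit
{
    // Letters romanized the same way in both languages (tiny subset for the example).
    private static readonly Dictionary<char, string> Common = new Dictionary<char, string>
    {
        { 'а', "a" }, { 'и', "i" }, { 'к', "k" }, { 'т', "t" },
    };

    // Letters whose romanization differs per language.
    private static readonly Dictionary<string, Dictionary<char, string>> Overrides =
        new Dictionary<string, Dictionary<char, string>>
        {
            { "ru", new Dictionary<char, string> { { 'щ', "shch" } } },
            { "bg", new Dictionary<char, string> { { 'щ', "sht" } } },
        };

    // Looks up the language-specific table first, then the shared one;
    // characters found in neither are copied unchanged.
    public static string Romanize(string text, string language)
    {
        Dictionary<char, string> overrides;
        Overrides.TryGetValue(language, out overrides); // null for unknown languages

        var sb = new StringBuilder(text.Length);
        foreach (char c in text)
        {
            string latin;
            if (overrides != null && overrides.TryGetValue(c, out latin))
                sb.Append(latin);
            else if (Common.TryGetValue(c, out latin))
                sb.Append(latin);
            else
                sb.Append(c);
        }
        return sb.ToString();
    }
}
```

The word "щит" exists in both languages: Romanize("щит", "ru") gives "shchit", while Romanize("щит", "bg") gives "shtit".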
+1  A: 

http://code.google.com/apis/ajaxlanguage/documentation/#Transliteration

Google offers this AJAX-based transliteration service. This way you can avoid computing transliterations yourself and let Google do them on the fly. The client side would make the request to Google, so your app would need some kind of web-based output for this solution to work.

kobrien
+2  A: 

If you're using Windows 7, you can take advantage of the new ELS (Extended Linguistic Services) API, which provides transliteration functionality for you. Have a look at the Windows 7 API Code Pack: it's a set of managed wrappers on top of many of the new APIs in Windows 7 (such as the new Taskbar). Look in the Samples folder for the Transliterator example; you'll find it's exactly what you're looking for.

hmemcpy