I have this piece of code, but it's not working.

text = text.Replace("\xEF\xBF\xBD", "?");

Does anyone know how to replace the text "\xEF\xBF\xBD" with "?" in a C# string?

+14  A: 

You have to escape the backslashes.

text = text.Replace("\\xEF\\xBF\\xBD", "?");

Alternatively, you can turn off escape processing for the whole string by making it a verbatim string literal with the @ prefix:

text = text.Replace(@"\xEF\xBF\xBD", "?");
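A minimal sketch to check that the two spellings produce the same 12-character string, assuming the input really contains a literal backslash followed by "xEF" and so on. The class name EscapeDemo and the helper ReplaceLiteralEscapeText are made up for this illustration.

```csharp
using System;

public static class EscapeDemo
{
    // Hypothetical helper: replaces the literal 12-character text \xEF\xBF\xBD.
    public static string ReplaceLiteralEscapeText(string text)
    {
        return text.Replace(@"\xEF\xBF\xBD", "?");
    }

    public static void Main()
    {
        string escaped = "\\xEF\\xBF\\xBD";  // regular literal, backslashes doubled
        string verbatim = @"\xEF\xBF\xBD";   // verbatim literal, no escape processing

        Console.WriteLine(escaped == verbatim);  // True: both hold the same text
        Console.WriteLine(verbatim.Length);      // 12
        Console.WriteLine(ReplaceLiteralEscapeText(@"a\xEF\xBF\xBDb"));  // a?b
    }
}
```

Note that this only matches text where the backslashes are actually present as characters in the string.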
David
Combined both answers. Give rep to David, as he needs it.
chris
Thanks Chris! I didn't even know you could combine answers. Very cool.
David
Had to edit your post and delete mine.
chris
This answer may be technically correct if you take Priyank's question verbatim, but I don't think he really wants to replace the characters "�"...
oefe
I wasn't asking how to escape a backslash; I wanted to replace the Unicode sequence.
Priyank Bolia
+5  A: 

Short answer (guessing a bit):

text = text.Replace("\xFFFD", "?");

And learn about Unicode and character encodings, especially UTF-8.

Long answer:

Well, do you mean "\xEF\xBF\xBD" literally? That is, a string consisting of these characters:

backslash, lowercase Latin letter x, uppercase Latin letter E, uppercase Latin letter F, backslash, lowercase Latin letter x, uppercase Latin letter B, uppercase Latin letter F, backslash, lowercase Latin letter x, uppercase Latin letter B, uppercase Latin letter D

Then, the answer would be:

text = text.Replace(@"\xEF\xBF\xBD", "?");

Or do you mean the sequence of characters described by the C# escape sequence "\xEF\xBF\xBD", namely:

LATIN SMALL LETTER I WITH DIAERESIS, INVERTED QUESTION MARK, VULGAR FRACTION ONE HALF

(which would be displayed as "ï¿½")? Then your code would be correct:

text = text.Replace("\xEF\xBF\xBD", "?");
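To see which characters that escape sequence actually produces, here is a small sketch. It also shows one plausible way such a string can end up in your data: the bytes EF BF BD decoded as ISO-8859-1 instead of UTF-8. That origin is only an assumption about the input, and Latin1Demo and DecodeAsLatin1 are names invented for this example.

```csharp
using System;
using System.Text;

public static class Latin1Demo
{
    // Hypothetical helper: decodes bytes as ISO-8859-1, where each byte becomes one character.
    public static string DecodeAsLatin1(byte[] bytes)
    {
        return Encoding.GetEncoding("iso-8859-1").GetString(bytes);
    }

    public static void Main()
    {
        string escaped = "\xEF\xBF\xBD";  // three characters, not three bytes

        foreach (char c in escaped)
        {
            Console.WriteLine("U+{0:X4}", (int)c);  // U+00EF, U+00BF, U+00BD
        }

        // The same three characters appear when the bytes are decoded with the wrong encoding.
        string misdecoded = DecodeAsLatin1(new byte[] { 0xEF, 0xBF, 0xBD });
        Console.WriteLine(misdecoded == escaped);  // True
    }
}
```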

Or do you want to replace the byte sequence

EF BF BD

(which could actually be the UTF-8 representation of the Unicode replacement character, U+FFFD, which is often displayed as "�")?

This is just a wild guess, but my intuition says you actually want to achieve the latter. Now, a .NET string contains characters, not bytes, but assuming that you have read these bytes, e.g. from a file, as UTF-8, the answer would be:

text = text.Replace("\xFFFD", "?");
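A short sketch of why the byte values do not work directly: once the bytes EF BF BD have been decoded as UTF-8, the string holds the single character U+FFFD, so that is what Replace has to look for. The class and helper names here are hypothetical, and the sketch spells the character as "\uFFFD", which is the same character as "\xFFFD" but always takes exactly four hex digits, whereas \x accepts one to four.

```csharp
using System;
using System.Text;

public static class ReplacementCharDemo
{
    // Hypothetical helper: swaps the replacement character U+FFFD for a question mark.
    public static string ReplaceReplacementChar(string text)
    {
        return text.Replace("\uFFFD", "?");
    }

    public static void Main()
    {
        byte[] bytes = { 0x61, 0xEF, 0xBF, 0xBD, 0x62 };  // 'a', EF BF BD, 'b'
        string text = Encoding.UTF8.GetString(bytes);

        Console.WriteLine(text.Length);                   // 3: three bytes became one character
        Console.WriteLine(text[1] == '\uFFFD');           // True
        Console.WriteLine(ReplaceReplacementChar(text));  // a?b
    }
}
```

If the bytes were decoded with some other encoding, the string will contain different characters and this replacement will not match.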
oefe
I was talking about the byte sequence, not just plain character values, and your answer worked well. But can you please elaborate on why I can't just replace the byte sequence by giving the byte values as I did in the question, and why I have to write \xFFFD?
Priyank Bolia
Well, I found the answer to my comment on Wikipedia: http://en.wikipedia.org/wiki/Unicode_Specials
Priyank Bolia