I am writing code to scrub user input on my ASP.NET site. I need to remove all occurrences of ASCII characters 145, 146, 147, and 148, which occasionally arrive from my Mac users when they copy and paste content written in a word processor.
My issue is that the following three strings, which I was led to believe should output the same text, do not.
string test1 = Convert.ToChar(147).ToString();
string test2 = String.Format("'{0}'", Convert.ToChar(147));
char[] characters = System.Text.Encoding.ASCII.GetChars(new byte[] { 147 });
string test3 = new string(characters);
Yet when I set the Text of an ASP.NET TextBox as follows:
txtShowValues.Text = test1 + "*" + test2 + "*" + test3;
I get a blank value for test1, test2 works correctly, and test3 outputs a '?'.
Can someone explain what is happening differently in each case? I am hoping this will help me understand how .NET handles ASCII values for characters over 128 so that I can write a good scrubbing routine.
EDIT
The values I mentioned (145-148) are curly quotes: single left, single right, double left, and double right.
By "works correctly" I mean it outputs a curly quote to my browser.
SECOND EDIT
The following code (mentioned in an answer) outputs the curly quotes as well, so maybe the problem was using ASCII in test3.
char[] characters2 = System.Text.Encoding.Default.GetChars(new byte[] { 147 });
string test4 = new string(characters2);
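One way to see why these strings differ is to print the code point each one actually contains instead of relying on how the browser renders it. The following is a minimal sketch, assuming a plain console program; the class name `CodePointCheck` and the helper `Describe` are hypothetical names introduced here for illustration. What `Encoding.Default` returns depends on the runtime and the system code page, so no output is claimed for test4.

```csharp
using System;
using System.Text;

public static class CodePointCheck
{
    // Hypothetical helper: formats each UTF-16 code unit in s as U+XXXX.
    public static string Describe(string s)
    {
        var sb = new StringBuilder();
        foreach (char c in s)
        {
            if (sb.Length > 0) sb.Append(' ');
            sb.Append("U+").Append(((int)c).ToString("X4"));
        }
        return sb.ToString();
    }

    public static void Main()
    {
        // Convert.ToChar(147) yields the char whose numeric value is 147 (U+0093),
        // which is a control character, not a curly quote.
        string test1 = Convert.ToChar(147).ToString();

        // ASCII only defines bytes 0-127, so byte 147 cannot be decoded as ASCII.
        string test3 = new string(Encoding.ASCII.GetChars(new byte[] { 147 }));

        // Encoding.Default varies by runtime and system settings.
        string test4 = new string(Encoding.Default.GetChars(new byte[] { 147 }));

        Console.WriteLine("test1: " + Describe(test1));
        Console.WriteLine("test3: " + Describe(test3));
        Console.WriteLine("test4: " + Describe(test4));
    }
}
```

If test4 prints U+201C, the default encoding on that machine maps byte 147 to the left double quotation mark, which would match the curly quote seen in the browser.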
THIRD EDIT
I found a Mac that I could borrow and was able to reproduce the problem. When I copy text containing quote symbols from Word and paste it into my web app on the Mac, it pastes curly quotes (147 and 148). When I hit save, the curly quotes are saved to the database, so I will use the code you all helped me with to scrub that content.
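A scrubbing step along these lines could be sketched as follows. This is only an illustration under stated assumptions: the class `QuoteScrubber` and the method `Scrub` are hypothetical names, the sketch replaces curly quotes with straight quotes rather than deleting them, and it treats both the Unicode curly quotes (U+2018, U+2019, U+201C, U+201D, which are bytes 145-148 in Windows-1252) and the raw values 145-148 (U+0091 to U+0094) as input to clean.

```csharp
using System;
using System.Text;

public static class QuoteScrubber
{
    // Hypothetical helper: replaces curly quotes with straight ASCII quotes.
    public static string Scrub(string input)
    {
        if (string.IsNullOrEmpty(input)) return input;

        var sb = new StringBuilder(input.Length);
        foreach (char c in input)
        {
            switch (c)
            {
                case '\u2018': // left single quote (byte 145 in Windows-1252)
                case '\u2019': // right single quote (byte 146 in Windows-1252)
                case '\u0091': // raw value 145 cast directly to char
                case '\u0092': // raw value 146 cast directly to char
                    sb.Append('\'');
                    break;
                case '\u201C': // left double quote (byte 147 in Windows-1252)
                case '\u201D': // right double quote (byte 148 in Windows-1252)
                case '\u0093': // raw value 147 cast directly to char
                case '\u0094': // raw value 148 cast directly to char
                    sb.Append('"');
                    break;
                default:
                    sb.Append(c);
                    break;
            }
        }
        return sb.ToString();
    }

    public static void Main()
    {
        // Sample input with curly double quotes and a curly apostrophe.
        string pasted = "\u201CHello\u201D, it\u2019s me";
        Console.WriteLine(Scrub(pasted));
    }
}
```

Replacing rather than removing keeps the pasted text readable; if the quotes really should be dropped, the two `Append` calls in the quote cases can simply be omitted.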
FOURTH EDIT
I spent some time writing more sample code based on the responses here and noticed that the problem has something to do with MultiLine TextBoxes in ASP.NET. There was good info here, so I decided to just start a new question: http://stackoverflow.com/questions/2215547/asp-net-multiline-textbox-allowing-input-above-utf-8