ansaurus

Question

Is there a Unicode character that looks like an ASCII one (but isn't equal)?

Answer 1

+2 A:

Try a Cyrillic character such as 'а' (U+0430) or 'ѕ' (U+0455). Take a look: http://jrgraphix.net/research/unicode_blocks.php?block=8

Good idea, by the way, but I wouldn't do a method-overloading answer. I'd use a switch-case iterating over a string. That way there's no tip-off that something is wrong, and you can easily pick out the candidates who really know their stuff.
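As a minimal sketch of the switch-case idea (the class and method names are made up for illustration, not taken from the asker's code), the loop below counts Latin 'a' characters. The Cyrillic letter is written as a Unicode escape so the difference is visible in the source; in the interview version it would be pasted in literally.

```java
class HomoglyphSwitchDemo {
    // Cyrillic small letter a (U+0430). Hypothetical constant, escaped so it
    // cannot be mistaken for Latin 'a' (U+0061) when reading this example.
    static final char CYRILLIC_A = '\u0430';

    // Counts the characters that match the Latin 'a' case label.
    static int countLatinA(String text) {
        int count = 0;
        for (char c : text.toCharArray()) {
            switch (c) {
                case 'a':
                    count++;
                    break;
                default:
                    break;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        String latin = "banana";
        // Same appearance in most fonts, but two of the letters are Cyrillic.
        String mixed = "b" + CYRILLIC_A + "n" + CYRILLIC_A + "na";

        System.out.println(latin.equals(mixed));   // false
        System.out.println(countLatinA(latin));    // 3
        System.out.println(countLatinA(mixed));    // 1
    }
}
```

Both strings have six characters and render almost identically, yet they are unequal and the switch only matches the genuine Latin letters.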

Borealid 2010-08-05 04:51:29

I've added my code in the question...

Stephen 2010-08-05 23:13:29

No I haven't - I don't want a google search to turn up the code. Perhaps I will after this round of interviews...

Stephen 2010-08-05 23:15:57

Answer 2

+4 A:

You could do something with the same feeling but a slightly less obscure case:

System.out.println(100l);
System.out.println(1001);

Depending on the font used, these two statements can look very similar indeed. (If that's the case with the font you're using, the first number is 100L.)
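A small sketch of why the two literals differ (class and method names are illustrative only): the first ends in a lowercase L, which is the `long` suffix, while the second ends in the digit one.

```java
class LongSuffixDemo {
    // Lowercase-L suffix: this is the long value one hundred.
    static long withSuffix() {
        return 100l;
    }

    // Trailing digit one: this is the int value one thousand and one.
    static int digitsOnly() {
        return 1001;
    }

    public static void main(String[] args) {
        System.out.println(withSuffix());   // 100
        System.out.println(digitsOnly());   // 1001
        // The uppercase suffix denotes the same value and is much easier to read.
        System.out.println(100L == 100l);   // true
    }
}
```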

Jon Skeet 2010-08-05 05:25:41

So similar that Visual Studio will even warn you not to do this.

Matt Greer 2010-08-05 17:40:01

@Matt: Indeed. I've considered logging a feature request for this to be determined by which font you're using. I like the idea of a compiler switch to specify the source font :)

Jon Skeet 2010-08-05 17:44:06

Hmmm. Nice enough, but I think that one will be too hard to hide: it would stand out both on a web page and in an editor. However, it won't have the issues with character encoding (I got a compile error on my first try when I compiled it at the command line, because I needed to specify the encoding).

Stephen 2010-08-05 21:44:16

Answer 3

+1 A:

What you need is the Bible.

Use it to secure your own job, not to lower the chances of a newcomer.

Anurag 2010-08-05 06:07:08

Or the Obfuscation Table ;) http://forums.thedailywtf.com/forums/p/19103/230329.aspx#230329

Baju 2010-08-05 22:43:15

Answer 4

+1 A:

An en dash (U+2013) or em dash (U+2014) looks similar to the minus sign.
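One way this could bite, sketched with illustrative names: `Integer.parseInt` accepts the ASCII hyphen-minus as a sign but rejects an en dash, so a number that looks negative fails to parse.

```java
class DashParseDemo {
    // En dash (U+2013), escaped so it is distinguishable from '-' (U+002D) here.
    static final String EN_DASH = "\u2013";

    // Returns true if the text is accepted by Integer.parseInt.
    static boolean parses(String text) {
        try {
            Integer.parseInt(text);
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(parses("-5"));           // true
        System.out.println(parses(EN_DASH + "5"));  // false
        System.out.println("-".equals(EN_DASH));    // false
    }
}
```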

Mark Bannister 2010-08-05 16:52:11

Answer 5

+2 A:

There are lots of possibilities - here are just a couple that I found with Windows Character Map. Be aware though that not all fonts will have these characters, so your candidate might not see what you intend.

ǃ U+01C3: Latin Letter Retroflex Click
Κ U+039A: Greek Capital Letter Kappa
‚ U+201A: Single Low-9 Quotation Mark
′ U+2032: Prime
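To check what a suspicious character really is, something like the following sketch can help. The pairing of each lookalike with an ASCII character is my own assumption about which glyphs they resemble; `Character.getName` is the standard JDK lookup for Unicode character names.

```java
class LookalikeInspector {
    // Formats a code point as "U+XXXX NAME" using the JDK's Unicode name table.
    static String describe(int codePoint) {
        return String.format("U+%04X %s", codePoint, Character.getName(codePoint));
    }

    // True if the code point is in the 7-bit ASCII range.
    static boolean isAscii(int codePoint) {
        return codePoint >= 0 && codePoint < 0x80;
    }

    public static void main(String[] args) {
        // Each row: { lookalike code point, ASCII character it resembles (assumed) }.
        int[][] pairs = {
            {0x01C3, '!'},
            {0x039A, 'K'},
            {0x201A, ','},
            {0x2032, '\''},
        };
        for (int[] pair : pairs) {
            System.out.println(describe(pair[0]) + " vs " + describe(pair[1])
                    + " equal=" + (pair[0] == pair[1]));
        }
    }
}
```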

Mark Ransom 2010-08-05 22:36:09

Answer 6

+1 A:

I have actually found something that will work in both UTF-8 and cp1252 encoding (so that it will pass through most, if not all, text editors): the non-breaking space!

Registered at position 160 (00A0, 10100000) in cp1252 and apparently UTF-8 (Wikipedia notes it in the range of "Second, third, or fourth byte of a multi-byte sequence"), it provides a character that will "just work".

Note: This has been tested to work on Windows when copied out of a text file or Skype into a code editor. A WordPress web page did not fare so well (but then it probably changed the character anyhow). Thankfully, our organisation did not pursue the "problems" pre-interview tactic, so I have not tested this definitively on a web page.

Stephen 2010-08-06 01:14:34

A lone 0xA0 byte is not valid UTF-8, and of course non-breaking space in UTF-8 is not represented as a lone 0xA0 byte.

R.. 2010-08-11 08:16:58

That's good to know (+1) - I thought as much. However, for the purpose of this question, it appears to work well (the code compiles and runs correctly - or wrongly)

Stephen 2010-08-11 21:37:46

In UTF-8 this would be 0xC2 0xA0, see http://www.fileformat.info/info/unicode/char/a0/index.htm. For a web page you might use `&nbsp;` instead, but in any case it could be translated to a true space by the browser.
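The byte values can be checked with a short sketch like this one (names are illustrative). It uses ISO-8859-1 as a stand-in for cp1252, on the assumption that a windows-1252 charset may not be installed everywhere; the two encodings agree at position 0xA0.

```java
import java.nio.charset.StandardCharsets;

class NbspBytesDemo {
    // Non-breaking space (U+00A0), escaped so it is visible in the source.
    static final String NBSP = "\u00A0";

    // Renders bytes as space-separated uppercase hex pairs.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(hex(NBSP.getBytes(StandardCharsets.UTF_8)));       // C2 A0
        System.out.println(hex(NBSP.getBytes(StandardCharsets.ISO_8859_1)));  // A0
        // Looks like an ordinary space, but the strings are not equal.
        System.out.println("a b".equals("a" + NBSP + "b"));                   // false
    }
}
```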

Mark Ransom 2010-08-16 20:49:58
