I'm interested in generating short codes (up to 6 characters) which are unambiguous for human readability:

For example, 2Z8B5S would be a very bad code because B looks a lot like 8, 2 looks a lot like Z, and so on.

A good code would be something like: AE37HT, say.

Obviously, I could try to figure it out myself, but I was looking to see if there were any studies by people like NASA or whatever.

If you also have any references about how readability is affected by color, typeface, size and viewing distance (I'm looking at something potentially about an inch high, viewed from about 6 feet away), that would be helpful as well. This could be on a monitor or possibly in print, too.

I found this set of guidelines, but it doesn't have any empirical results which I could turn into a table to generate the codes:

http://www.usabilitysciences.com/usability-of-codes-passwords-numbers-and-letters/

+2  A: 

G'day,

You might like to look at air traffic control agencies, maybe Eurocontrol can help because this is the sort of thing they work with?

Edit: Are we talking about visual confusion or spoken?

Edit: You might like to look at the chapters on naming in Steve McConnell's "Code Complete"; he has a few rules there to help create unambiguous names, IIRC.

HTH

cheers.

Rob Wells
For this case, visual, but I guess it's possible to have someone give out the code over the phone, so aural issues could be important; I hadn't considered that use.
Cade Roux
+4  A: 

Frankly, I think the most important factor here is choosing the correct font.

If your goal is purely legibility, it will be a matter of picking a font that's preferably:

1) Fixed width. For picking out random numbers/letters, fixed width helps tremendously, since the kerning isn't changing as you move across the font.

2) Use a font with visibly different 0 and O glyphs - those definitely mess people up. Look for other letter/number combinations that are similar. Potentially, leave 0/O out of the mix just for this reason.

3) Choose a font with subtle serifs and weight changes.

For some guidelines on legibility, see this page.

With the right font, I think you could pick any letter/number combination and have it be understandable clearly (other than potentially 0 and O). I believe the 8/B, 5/S and other samples would be clear in the correct font.

The other thing you could consider would be to use one color for letters and a second for numbers - this would give clues to the potentially ambiguous number/letter combinations. I'd make this a subtle cue, though, as a drastic color change will draw attention to either the letters or the numbers, which will hurt the overall readability.


Edit after reading your comment to another answer:

"I only need a few thousand codes, so I'm not terribly worried about the size of the domain"

If this is the case, I would recommend keeping the entire number set, and just selectively adding in letters that have no visual (or aural, if you're reading these out loud) similarity to numbers. With 6 characters, even using only numbers, you have more possible codes than you need. Selectively adding in letters to help differentiate will be easier than trying to selectively remove some. I would probably stick to 1-9, A, Z, R, W, and other letters that don't match up with numbers.
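
As a rough sketch of that suggestion, assuming the alphabet is just the digits 1-9 plus the four letters named above (the exact letter set is an assumption and should be checked against your font), the size of the code space and a random batch of codes could be computed like this. The names ALPHABET, CODE_LENGTH and make_codes are made up for illustration.

```python
import random

# Assumed alphabet: digits 1-9 plus the letters mentioned above.
# Which letters are really safe depends on the font; adjust as needed.
ALPHABET = "123456789AZRW"
CODE_LENGTH = 6


def make_codes(count, rng=None):
    """Return `count` distinct random codes drawn from ALPHABET, sorted."""
    rng = rng or random.Random()
    total = len(ALPHABET) ** CODE_LENGTH
    if count > total:
        raise ValueError("not enough distinct codes in this alphabet")
    codes = set()
    # Keep drawing until enough distinct codes exist; duplicates are
    # dropped by the set.
    while len(codes) < count:
        codes.add("".join(rng.choice(ALPHABET) for _ in range(CODE_LENGTH)))
    return sorted(codes)


if __name__ == "__main__":
    print(9 ** CODE_LENGTH)              # digits 1-9 only: 531441
    print(len(ALPHABET) ** CODE_LENGTH)  # with the four letters: 4826809
    print(make_codes(5))
```

Either way, the space is far larger than a few thousand codes, so there is room to drop any character that turns out to be confusing.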

Reed Copsey
+1  A: 

If you have the choice, you might be able to influence this by changing fonts. For example, many programmer-oriented fonts deliberately slash the zero, use different shapes for I, l, and 1, and so on. As I remember it, serif fonts are usually used for this purpose as well. I am guessing - I can't back this up - that it was also the reason that many older books are typeset with "text figures", numerals of varying heights which flow better with the page and (supposedly) increase readability. (See http://en.wikipedia.org/wiki/Text_figures - yes, I am almost quoting them verbatim.)

jprete
I would have complete control of the font. I'm thinking of striking many letters and numbers. Another possibility is arranging it so it's always letter-number-letter-number etc. So there is a certain amount of patterning to help users.
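
A minimal sketch of the letter-number-letter-number idea, assuming two hand-picked character sets. LETTERS, DIGITS and alternating_code are hypothetical names, and the sets themselves are placeholders to be tuned for the chosen font.

```python
import random

# Assumed character sets, for illustration only; pick the real ones
# after checking them in the chosen font.
LETTERS = "ACEFHJKMNPRTWXY"
DIGITS = "34679"


def alternating_code(length=6, rng=None):
    """Return a code with letters at even positions and digits at odd ones."""
    rng = rng or random.Random()
    return "".join(
        rng.choice(LETTERS if i % 2 == 0 else DIGITS) for i in range(length)
    )


if __name__ == "__main__":
    print(alternating_code())
```

Because the reader always knows whether a position holds a letter or a number, pairs like 8/B can no longer be mistaken for each other, at the cost of a smaller code space.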
Cade Roux
+2  A: 

I agree with Reed that your best solution will lie with font.

If you try to eliminate ambiguous numbers, you lose 1 (looks like lowercase l), 8 (capital B) and 0 (upper or lowercase O), which is 30% of the available numeric characters. That's a lot. You might even have issues with 6 and capital G.

So, eliminating similar-looking letters and numbers is really going to limit your choices.

Sure, even with a font, there are some similarities -- zero and capital O are always gonna give you problems.

How about Courier New? Or something similar. Serif. Mono-spaced.

One of my favorite font examples is the name of the state Illinois. Just try to type that in a textbox using Arial. Put three L's in there: Illlinois. Then, try to see that there are 3 L's. And, good luck moving the insertion point to the right spot. SO much easier in a Courier-type font: Illlinois.

There's a reason StackOverflow and other sites use a Courier-like font to display code, and why SO and other websites and software (Apple) use Courier-like fonts for data entry fields (textboxes and textareas like this one).

DOK
Thanks, I only need a few thousand codes, so I'm not terribly worried about the size of the domain. I am also considering just using numbers, so the letters would only increase the space.
Cade Roux
+1  A: 

How about a backtracking approach to generating the codes, where you would invalidate any solution that doesn't fit a set of rules, like not having similar-looking characters next to one another? If you can identify all the unwanted pairs of characters and flag as invalid any solutions that contain them, I guess it's pretty straightforward.
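
A small sketch of what I mean, assuming a made-up alphabet and a made-up list of unwanted pairs (the real list would come from looking at your font). The names ALPHABET, CONFUSABLE, valid_codes and is_valid are just for illustration. The generator builds a code one character at a time and abandons any branch where the new character would sit next to a look-alike.

```python
from itertools import islice

# Assumed alphabet and confusable pairs, for illustration only.
ALPHABET = "0123456789ABGOSZ"
CONFUSABLE = {frozenset(p) for p in ("0O", "8B", "5S", "2Z", "6G")}


def valid_codes(length, prefix=""):
    """Yield codes in which no two adjacent characters are a confusable pair."""
    if len(prefix) == length:
        yield prefix
        return
    for ch in ALPHABET:
        # Prune: skip this branch if ch would sit next to a look-alike.
        if prefix and frozenset((prefix[-1], ch)) in CONFUSABLE:
            continue
        yield from valid_codes(length, prefix + ch)


def is_valid(code):
    """Check an existing code against the same adjacent-pair rule."""
    return all(frozenset(pair) not in CONFUSABLE for pair in zip(code, code[1:]))


if __name__ == "__main__":
    print(list(islice(valid_codes(6), 3)))
    print(is_valid("2Z8B5S"), is_valid("AE37HT"))
```

With these assumed pairs, the bad example from the question (2Z8B5S) is rejected and the good one (AE37HT) passes.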

luvieere