views:

111

answers:

4

I have a bunch of items in my database. Each is assigned a unique ID. I want to shorten this ID and display it on the page, so that if a user needs to contact us (over the phone) regarding a particular item, he can give us the shortened ID rather than a really big number, similar to the SKU on sites like NCIX. So I was thinking about encoding it in base 36. The problem with that, however, is that characters like 1, l and I all look much the same. So I was thinking about eliminating the look-alikes. Is this a good idea, or should I just use a really legible font?

+2  A: 

Use a legible font.

Mark
Also consider using lowercase letters: `io` looks less like `10` than `IO` does.
dan04
@dan04: sure, but `l` looks like `1`. There's no winning. Lowercase `i` is more distinguishable than uppercase `I`, but uppercase `L` is more distinguishable than lowercase `l`.
Mark
+5  A: 

Yes, you should eliminate sources of confusion, because if a mistake can be made, someone will make it. It is very easy to confuse 0 with O, and I with l or 1, so you should not use both members of either group. That's easy to arrange: since you won't use 3 characters (I, L and O), just write the number in base 36 - 3 = 33 and convert. Base 33 uses the digits 0-9 and A-W, so X, Y and Z are free to stand in for the three look-alikes:

SKU.replace('I','X').replace('L','Y').replace('O','Z')

Conversely, when given such a code, and before doing int(SKU, 33), you will have to map X, Y and Z back to the confusing characters. Before that, though, if (as expected) you are given L or I by mistake, replace it with 1, and if given O, replace it with 0. For example, use SKU.translate() with string.maketrans('LIOXYZ','110ILO') (str.maketrans in Python 3), which does both steps in one pass.
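Putting the two directions together, a minimal Python 3 sketch might look like the following. The function names `encode_sku` and `decode_sku` are made up for illustration, and the code assumes the ID is a non-negative integer and that codes are shown in uppercase.

```python
import string

# Digits for int(..., 33): 0-9 followed by A-W (A-Z listed here, only the first 33 are used).
DIGITS = string.digits + string.ascii_uppercase

# Encoding: move the look-alike digits I, L, O onto the unused letters X, Y, Z.
_ENCODE = str.maketrans("ILO", "XYZ")
# Decoding: a mistyped L or I is read as 1, a mistyped O as 0; X, Y, Z map back to I, L, O.
_DECODE = str.maketrans("LIOXYZ", "110ILO")


def encode_sku(n: int) -> str:
    """Return n in base 33 with I, L, O replaced by X, Y, Z (hypothetical helper)."""
    if n < 0:
        raise ValueError("ID must be non-negative")
    if n == 0:
        return "0"
    out = []
    while n:
        n, r = divmod(n, 33)
        out.append(DIGITS[r])
    return "".join(reversed(out)).translate(_ENCODE)


def decode_sku(sku: str) -> int:
    """Inverse of encode_sku, tolerating lowercase input and L/I/O typed by mistake."""
    return int(sku.strip().upper().translate(_DECODE), 33)
```

Because the displayed code never contains I, L or O, any of those characters in user input can safely be treated as a misread 1 or 0.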

Nas Banov
This seems like an entirely reasonable answer. Why the downvote?
Gabe
That's a clever way of doing it. For a second I thought you didn't have a 1:1 mapping, but I guess that's why you subtracted the last 3 letters ;)
Mark
+1  A: 

We had a similar situation in a regular app many years ago, at a company I worked for. There was an ID, base 36 (0-9a-z) that often had to be communicated over the phone. That was an application running on a Unix server and viewed on serial terminals (not relevant, just part of the story :).

Our solution was that whenever the user was on that field and pressed F2, a small window popped up showing the radio code for the field: “a9vg5” would display “alpha niner victor golf five”, which the user would just read aloud.
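For illustration only (this is not the original application's code), the lookup behind such a pop-up could be sketched in Python as below. The letter words follow the NATO spelling alphabet; the digit words, apart from "niner" and "five" which appear in the example above, are an assumption, and `spell_code` is a hypothetical name.

```python
import string

# Letter words from the NATO spelling alphabet, in a-z order.
_LETTER_WORDS = (
    "alpha bravo charlie delta echo foxtrot golf hotel india juliett kilo lima "
    "mike november oscar papa quebec romeo sierra tango uniform victor whiskey "
    "xray yankee zulu"
).split()
# Digit words are an assumption, except "niner" and "five" taken from the example.
_DIGIT_WORDS = "zero one two three four five six seven eight niner".split()

RADIO_WORDS = dict(zip(string.ascii_lowercase, _LETTER_WORDS))
RADIO_WORDS.update(zip(string.digits, _DIGIT_WORDS))


def spell_code(code: str) -> str:
    """Return the words to read aloud for a base 36 ID (hypothetical helper)."""
    try:
        return " ".join(RADIO_WORDS[ch] for ch in code.lower())
    except KeyError as exc:
        raise ValueError(f"character {exc.args[0]!r} is not in 0-9a-z") from None
```

Here `spell_code("a9vg5")` gives "alpha niner victor golf five".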

When the application was developed, I was inclined to display the ID base 64 encoded, with capitals plus dot and slash, and to use different radio-code words for the capitals, but the designated analyst disagreed. You could look up different words in Wikipedia or be creative.

PS a clarification: although it's not clear from the way I wrote it, the analyst disagreed for a good reason, since one has to think about both sides of the communication: the user just reads, but the other side on the phone has to remember or look up that e.g. delta==d and Dalton==D.

ΤΖΩΤΖΙΟΥ
A neat solution... but I'm thinking our users might want to communicate these codes to each other too. Our site has "shipments" which they might want to pair with internal invoices... so they might have to communicate that to their accountant, or... maybe to another party under some circumstances. Giving them a table like this would confuse the heck out of them, I'm sure :)
Mark
+2  A: 

I'm assuming the original ID is numeric. We've had good results from z-base-32 with a similar scenario. We've been using it since April 2009.

I particularly liked the encoding's goals of minimizing transcription errors, through removing confusing letters from the alphabet, and brevity, as shorter identifiers are easier to use.

The encoding orders the alphabet so that the more commonly occurring characters are those that are easier to read, write, speak and remember. Lower case is used as it's easier to read.
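As a rough sketch of how a numeric ID could be shown this way: the code below simply uses the z-base-32 alphabet as the 32 digits of a positional number. That is a simplification for illustration, not our production code and not the full z-base-32 scheme, which is defined over bit strings. The names `encode_id` and `decode_id` are hypothetical.

```python
# The z-base-32 alphabet: lowercase, with l, v, 0 and 2 left out.
ZB32 = "ybndrfg8ejkmcpqxot1uwisza345h769"


def encode_id(n: int) -> str:
    """Write a non-negative integer using ZB32 as base 32 digits (simplified sketch)."""
    if n < 0:
        raise ValueError("ID must be non-negative")
    digits = []
    while True:
        n, r = divmod(n, 32)
        digits.append(ZB32[r])
        if n == 0:
            break
    return "".join(reversed(digits))


def decode_id(code: str) -> int:
    """Inverse of encode_id; raises ValueError on characters outside the alphabet."""
    n = 0
    for ch in code.strip().lower():
        n = n * 32 + ZB32.index(ch)
    return n
```

A six-character code in this form covers IDs up to 32**6 - 1, a little over a billion.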

I asked this similar question before we decided to use z-base-32.

Robin M
Yes, it's a numeric ID. Didn't read through the whole paper, but it sounds promising and I agree with their decisions and goals. Might have to try and implement that later.
Mark