ansaurus

Question

Any statistics out there on commonly mistyped keys?

Answer 1

A:

I don't know of a statistics source, but it seems there would be a big difference between three kinds of errors: (1) the typist hits the wrong key because of poor finger positioning. Most typists immediately backspace and correct these on the fly, so statistics on such events could only be captured in real time, not by tabulating what spelling correctors encounter; (2) the typist hits the right keys but in the wrong order ("naem" instead of "name") because of speed, distraction, or a neural slip; and (3) the typist hits the wrong keys from not knowing how to spell ("maintenence" instead of "maintenance").

For case #1, if the most common letters in English are E, T, A... then there's probably a good chance those are also the most missed keys, in that order, although that doesn't tell you which of the neighbors like "w" and "r" are hit the most instead. A typist trying for an end-of-row key like "a" might actually wrongly hit CAPS LOCK as frequently as wrongly hitting "s".
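
To make the "neighbors" idea concrete, here is a minimal sketch that looks up the same-row neighbors of a letter key. It assumes a standard US QWERTY layout and ignores the row stagger and non-letter keys such as CAPS LOCK; the name `same_row_neighbors` is made up for this illustration.

```python
# Letter rows of a standard US QWERTY layout (an assumption; other layouts differ).
# Real keyboards stagger the rows, so this flat model is only an approximation.
QWERTY_ROWS = ["qwertyuiop", "asdfghjkl", "zxcvbnm"]


def same_row_neighbors(key):
    """Return the letters immediately left and right of `key` on its row."""
    key = key.lower()
    if len(key) != 1:
        return []
    for row in QWERTY_ROWS:
        i = row.find(key)
        if i != -1:
            # Keep only positions that actually exist on this row.
            return [row[j] for j in (i - 1, i + 1) if 0 <= j < len(row)]
    return []  # not a letter key in this model


print(same_row_neighbors("e"))  # ['w', 'r']
print(same_row_neighbors("a"))  # ['s']
```

Counting real mistakes would still need logged keystrokes; this only enumerates which keys are candidates for a slip.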

Personally, it's the non-alphas I usually miss, especially when hunting and pecking for / vs \, { vs [, ' vs ", comma vs period when typing formatted numbers and currency, or missing the shift and getting 8 instead of *. Since non-alpha typing is so prevalent in programming, those cases are probably much more frequent for programmers than for non-programmers.

joe snyder 2010-08-10 04:10:29

Interesting. While I do have trouble with the non-alphas, I would say that among the alphas it's x, c, v that I have the most trouble with rather than e, t, a. I suspect that while these might be the most common letters, typists aren't likely to hit the wrong key when typing them because of their placement and how commonly they are used. Do let me know if you find any reputable statistics on this.

Abe Miessler 2010-08-10 04:56:04

Answer 2

+1 A:

I actually had to look into this a couple of years ago. When I began the project I had no idea where to begin, so hopefully I can save you, and anyone else in the same situation, some time.

Bottom line is that you can take advantage of a large amount of work done in other fields. The most important of these, I found, is the work of the domain name registrars. For instance, DomainTools has a 'Domain Typo Generator', which works by generating a list of 'typo' domain names from a domain name you enter.
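
As a rough illustration of what a typo generator does, the sketch below produces single-slip variants of a name. This is not DomainTools' algorithm, which I haven't seen; the function name `typo_variants` and the three slip types (omitted character, doubled character, swapped adjacent pair) are assumptions chosen for the example.

```python
def typo_variants(name):
    """Return a sorted list of variants of `name` that differ by one slip."""
    variants = set()
    for i in range(len(name)):
        # Omitted character: "name" -> "nme"
        variants.add(name[:i] + name[i + 1:])
        # Doubled character: "name" -> "naame"
        variants.add(name[:i] + name[i] * 2 + name[i + 1:])
        # Swapped adjacent pair: "name" -> "naem"
        if i + 1 < len(name):
            variants.add(name[:i] + name[i + 1] + name[i] + name[i + 2:])
    # Swapping two identical letters reproduces the input, so drop it.
    variants.discard(name)
    return sorted(variants)


print("naem" in typo_variants("name"))  # True
```

For domain names you would apply this to the label before the TLD and then check which variants are registered.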

In addition, I would recommend the remarkably comprehensive 2005 study of this issue by Microsoft Research.

Finally, there's a key concept in computational linguistics derived from the Levenshtein distance, called Damerau-Levenshtein distance, which extends Levenshtein's basic idea of 'edit distance' to the particular problem of humans typing on a keyboard. The principal conclusion from Damerau's 1964 research paper was that more than 80% of all human misspellings can be described by one of just four operations: insertion, deletion, or substitution of a single character, or transposition of two adjacent characters. (The only link I supplied for D-L is the Wikipedia article; I did so because I think it is an excellent and brief introduction, it contains pseudo-code for the D-L algorithm, and it provides links to the primary online sources for D-L.)
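
For anyone who wants to try the four operations right away, here is a minimal sketch of the restricted variant of D-L, usually called optimal string alignment (OSA) distance. It is simpler than the unrestricted algorithm because it never edits a substring again after transposing it; the function name `osa_distance` is my own.

```python
def osa_distance(a, b):
    """Optimal string alignment distance (restricted Damerau-Levenshtein)."""
    rows, cols = len(a) + 1, len(b) + 1
    # d[i][j] = edits needed to turn a[:i] into b[:j]
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete all i characters
    for j in range(cols):
        d[0][j] = j  # insert all j characters
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution (or match)
            )
            # Transposition of two adjacent characters
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]


print(osa_distance("naem", "name"))                # 1 (one transposition)
print(osa_distance("maintenence", "maintenance"))  # 1 (one substitution)
```

Plain Levenshtein would score "naem" vs "name" as 2, which is why the transposition case matters for typing errors. The restricted and unrestricted variants only disagree on unusual inputs such as "ca" vs "abc" (3 here, 2 for full D-L).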

doug 2010-08-12 06:47:27

Awesome info, thanks!

Abe Miessler 2010-08-12 14:08:03
