views:

2127

answers:

18

Background

While at the gym the other day, I was working with my combination lock, and realized something that would be useful to me as a programmer. To wit, my combination is three separate sets of numbers that either sound alike, or have some other relation that makes them easy to remember. For instance, 5-15-25, 7-17-2, 6-24-5. These examples seem easy to remember.

Question

How would I implement something similar for passwords? Yes, they ought to be hard to crack, but they also should be easy for the end user to remember. Combination locks do that with a mix of numbers that have similar sounds, and with numbers that have similar properties (7-17-23: all prime, 17 rolls right off the tongue after 7, and 23, another prime, is the 'hard' one of the set to remember).

Criteria

  • The Password should be easy to remember. Dog!Wolf is easy to remember, but once an attacker knows that your website gives out that kind of combination, it becomes far easier to guess.
  • The words or letters should mostly follow the same sounds.
  • At least 8 letters
  • Not use !@#$%^&*();'{}_+<>?,./ These punctuation marks, while appropriate for 'hard' passwords, do not have an 'easy to remember' sound.

Resources

This question is language-agnostic, but if there's a specific implementation for C#, I'd be glad to hear of it.

Update

A few users have said that 'this is bad password security'. Don't assume that this is for a website. This could just be for me to make an application for myself that generates passwords according to these rules. Here's an example.

The letters A-C-C-L-I-M-O-P 'flow', and they happen to be two regular words put together (Acclimate and Mop). Further, when a user says these letters, or says them as a word, it's an actual word for them. Easy to remember, but hard to crack (dictionary attack, obviously).

This question has a two-part goal:

  1. Construct Passwords from letters that sound similar (using alliteration) or
  2. Construct Passwords that mesh common words similarly to produce a third set of letters that is not in a dictionary.
+8  A: 

You could use Markov chains to generate words that sound like English (or any other language you want) but are not actual words.

The question of easy to remember is really subjective, so I don't think you can write an algorithm like this that will be good for everyone.

And why use short passwords on web sites/computer applications instead of pass phrases? They are easy to remember but hard to crack.

Alex Reitbort
@Alex: Three questions, What are Markov Chains, How would they help me, and three, what methods are there for using pass phrases for Web Applications (specifically, .NET, but any language would do).
George Stocker
@Gortok: Pass phrase is using some phrase like "My name is Alex" as password.
Alex Reitbort
@Gortok: Markov chains let you generate words that are easier to remember than random passwords, but since they are still not real words, they can withstand a dictionary attack.
Alex Reitbort
@Alex: That was a subtle hint for you to include something about Markov chains (more than 'use them') in your answer. If Stack Overflow is to be a repository for useful programming information, then Google should not be a primary resource. Other sites may be, but not Google.
George Stocker
OK kids, re: Markov chains: basically, scan some English text, letter by letter, and gather the probability of each letter following each other letter. Then use those probabilities to generate a random password, letter by letter.
zvolkov
Please be aware that Markov chains by definition can also generate "real" words, including those that were used to build them and those you don't want to appear in production code (slurs, insults, obscenities, etc).
Alan
+17  A: 

First of all make sure the password is long. Consider using a "pass-phrase" instead of a single "pass-word". Breaking pass-phrases like "Dogs and wolves hate each other." is very hard yet they are quite easy to remember.

Some sites may also give you helpful advice, like Strong passwords: How to create and use them (linked from Password checker, which is a useful tool on its own).

Also, instead of trying to create an easy-to-remember password, in some cases a much better alternative is to avoid remembering the password at all by using (and educating your users to use) a good password management utility (see What is your favourite password storage tool?). When doing this, the only part left is to create a hard-to-crack password, which is easy (any long enough random sentence will do).

Suma
How do you deal with case sensitivity, or with how a user forgets how they wrote it; with security questions, I run into that all the time.
George Stocker
I think the plural is "wolves"... :-)
Jimmy J
The other problem is that the longer the password, the less usable it is. Sure, we can type 100+ WPM, but the average user hunts-and-pecks.
Adriano Varoli Piazza
@Adriano Varoli Piazza: Excellent point!
George Stocker
Reminds me of how precisely a phrase can locate one particular document on the web. Often I can't remember a document's title (or context), and only recall a particular phrase - it works surprisingly well. Another trick is a phrase across sentences (end of one sentence + start of the next one).
13ren
@Jimmy J: No guessing-this-password for YOU! :-)
13ren
Unfortunately a lot of websites truncate passwords, commonly to 32 characters but in some extreme cases clip it to 10.
rjh
Another problem is that you need to make sure the pass has at least one upper case, one special symbol, one lowercase and one number, to pass the stupid rules in all those stupid sites that force all these rules. These are the hardest to remember for me.
Daniel Magliola
+1  A: 

Take a look at the gpw tool. The package is also available in Debian/Ubuntu repositories.

rawicki
+18  A: 

You might want to look at:

Can Berk Güder
+2  A: 

I prefer giving users a "hard" password, requiring them to change it on the first use, and giving them guidance on how to construct a good, long pass phrase. I would also couple this with reasonable password complexity requirements (8+ characters, upper/lower case mix, and punctuation or digits). My rationale for this is that people are much more likely to remember something that they choose themselves and less likely to write it down somewhere if they can remember it.

tvanfosson
+3  A: 

When you generate a password for the user and send it by email, the first thing you should do when they first log in is force them to change their password. Passwords created by the system do not need to be easy to remember because they should only be needed once.

Having easy to remember, hard to guess passwords is a useful concept for your users but is not one that the system should in some manner enforce. Suppose you send a password to your user's gmail account and the user doesn't change the password after logging in. If the password to the gmail account is compromised, then the password to your system is compromised.

So generating easy to remember passwords for your users is not helpful if they have to change the password immediately. And if they aren't changing it immediately, you have other problems.

jmucchiello
Great answer, but I should have been more clear. This isn't necessarily for other users to use. This could be making a self password generator. I have about 7 random passwords I use, and I'd like to implement an 'easier to remember' structure. $ and ! are not easy to 'say' (and commit to memory).
George Stocker
+5  A: 

FWIW I quite like jumbling word syllables for an easy but essentially random password. Take "Bongo" for example as a random word. Swap the syllables you get "Gobong". Swap the o's for zeros on top (or some other common substitution) and you've got an essentially random character sequence with some trail that helps you remember it.

Now how you pick out syllables programmatically - that's a whole other question!

Marc
So how do you pick them out programmatically if you were to use this approach you suggest?
George Stocker
Hmmm. I've just done some looking and it appears that algorithmically defining a "syllable" isn't easy; I can't even find a pre-canned dictionary of them to use. So it's probably not an approach suited to a program. Sorry, not really an answer after all :-(
Marc
You could look for a regex pattern at the end of the word. Off the top of my head, the consonant before the last vowel, and don't include "e" if it's the final letter. Works for the first five words that I think of: monKEY, giRAFFE ( "e" excluded ), influenZA, granoLA, polymorpHISM. (ALMOST works, I guess you'd have to include common compound letters, like th, ph, sh etc.) This would be so much simpler in Japanese. Each character represents a full syllable.
Atømix
The great thing about the di- and tri-graph weight-based generators is that they automatically take into account how syllables are constructed.
Joe Koberg
+1  A: 

System generated passwords are a bad idea for anything other than internal service accounts or temporary resets (etc).

You should always use your own "passphrases" that are easy for you to remember but that are almost impossible to guess or brute force. For example, the password for my old university account was:

Here to study again!

That is 20 characters using upper and lower case with punctuation. This is an unbelievably strong password and there is no piece of software that could generate a more secure one that is easier to remember for me.

Fraser
Pretend I want this to generate my own random letters that would make up a password; and not necessarily to hand out as a password to general users.
George Stocker
http://www.codinghorror.com/blog/archives/000342.html
Jarrod Dixon
Jarrod, are you trying to say I copied that site? I like helping people and they are my own words. I agree there is a striking similarity and I would be suspicious too if it were not such a common practice. I'm sorry if that was not your implication but I am not a plagiarist!
Fraser
+2  A: 

My father once told me a story about how, back when computer passwords could only be 8 characters long, the trick was to use a 7-letter word and a space at the end. That way, hackers couldn't crack it.

Long time ago though, especially in ICT land.

I generally just use a combination of upper and lowercase letters, some numbers and a couple of alt characters.

Vordreller
A: 

I would really love to see someone implement passwords with control characters like "<Ctrl>+N" or even combo characters like "A+C" at the same time. Converting this to some binary equivalent would, IMHO, make password requirements much easier to remember, faster to type, and harder to crack (MANY more combinations to check).

Mike
+7  A: 

After many years, I have decided to use the first letter of words in a passphrase. It's impossible to crack, versatile for length and restrictions like "you must have a digit", and hard to make errors.

This works by creating a phrase. A crazy fun vivid topic is useful! "Stack Overflow aliens landed without using rockets or wheels". Take the first letter, your password is "soalwurow"

You can type this quickly and accurately since you're not remembering letter by letter, you're just speaking a sentence inside your head.

I also like having words alternate from the left and right side of the keyboard, it gives you a fractionally faster typing speed and more pleasing rhythm. Notice in my example, your hands alternate left-right-left-right.
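
A minimal Python sketch of the first-letter idea, together with a check for the left/right hand alternation mentioned above. The function names and the hand sets (a standard QWERTY touch-typing layout) are assumptions made for illustration, not part of the original answer.

```python
import re

# Assumed QWERTY touch-typing split; other layouts would need other sets.
LEFT_HAND = set("qwertasdfgzxcvb")
RIGHT_HAND = set("yuiophjklnm")


def initials(phrase):
    """Return the lowercase first character of every word in the phrase."""
    words = re.findall(r"[A-Za-z0-9][A-Za-z0-9']*", phrase)
    return "".join(word[0].lower() for word in words)


def alternates_hands(password):
    """True if consecutive letters are typed with different hands."""
    sides = [ch in LEFT_HAND for ch in password]
    return all(a != b for a, b in zip(sides, sides[1:]))


phrase = "Stack Overflow aliens landed without using rockets or wheels"
password = initials(phrase)
print(password)                    # soalwurow
print(alternates_hands(password))  # True
```

Only the phrase has to be memorized; the password can be re-derived from it at any time.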

SPWorley
The only thing one needs to be careful about is not using single-letter word substitutions, like "u" or "c". Otherwise, "What are you to do if you see flying pigs" could be wru2diucfp instead of waytdiysfp. Argh! The pundits were right, text messaging shortcuts ARE killing the English language. :-)
Atømix
+3  A: 

A spin on the 'passphrase' idea is to take a phrase and write the first letters of each word in the phrase. E.g.

"A specter is haunting Europe - the specter of communism."

Becomes

asihe-tsoc

If the phrase happens to have punctuation, such as !, ?, etc., you might as well shove it in there. The same goes for numbers: substitute them for letters, or add relevant numbers to the end. E.g. Karl Marx (who wrote this line) died in 1883, so why not 'asihe-tsoc83'?

I'm sure a creative brute-force attack could capitalise on the statistical properties of such a password, but it's still orders of magnitude more secure than a password that falls to a dictionary attack.


Another great approach is just to make up ridiculous words, e.g. 'Barangamop'. After using it a few times you will commit it to memory, but it's hard to brute-force. Append some numbers or punctuation for added security, e.g. '386Barangamop!'

rjh
+2  A: 

Here's part 2 of your idea prototyped in a shell script. It takes 4-, 5- and 6-letter words (roughly 50,000) from the Unix dictionary file on your computer, and concatenates those words on the first character.

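#! /bin/bash

# Seed the shell's random number generator with the process ID.
RANDOM=$$
WORDSFILE=./simple-words
DICTFILE=/usr/share/dict/words
# Keep only all-lowercase words of 4 to 6 letters.
grep -ve '[^a-z]' "${DICTFILE}" | grep -Ee '^.{4,6}$' > "${WORDSFILE}"
N_WORDS=$(wc -l < "${WORDSFILE}")
# Print 20 candidate passwords.
for i in $(seq 1 20); do
    password=""
    # Keep prepending words until the result has at least 8 letters
    # and is not itself a dictionary word.
    while [ ! "${#password}" -ge 8 ] || grep -qe "^${password}$" "${DICTFILE}"; do
        # Pick a random starting word.
        while [ -z "${password}" ]; do
            password="$(sed -ne "$(( (150 * $RANDOM) % $N_WORDS + 1))p" "${WORDSFILE}")"
            builtfrom="${password}"
        done
        # Pick a random word that contains the password's first letter
        # somewhere after its own first character.
        word="$(sort -R "${WORDSFILE}" | grep -m 1 -e "^..*${password:0:1}")"
        builtfrom="${word} ${builtfrom}"
        # Cut that word at the last occurrence of the letter and
        # prepend what is left to the password.
        password="${word%${password:0:1}*}${password}"
    done
    echo "${password} (${builtfrom})"
done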

Like most password generators, I cheat by outputting them in sets of twenty. This is often defended in terms of "security" (someone looking over your shoulder), but really it's just a hack to let the user pick the friendliest password.

I found that the 4-to-6 letter words from the dictionary file still contain obscure words.

A better source for words would be a written document. I copied all the words on this page and pasted them into a text document, and then ran the following set of commands to get the actual English words.

perl -pe 's/[^a-z]+/\n/gi' ./624425.txt | tr A-Z a-z | sort -u > ./words
ispell -l ./words | grep -Fvf - ./words > ./simple-words

Then I used these 500 or so very simple words from this page to generate the following passwords with the shell script -- the script parenthetically shows the words that make up a password.

backgroundied (background died)
soundecrazy (sounding decided crazy)
aboupper (about upper)
commusers (community users)
reprogrammer (replacing programmer)
alliterafter (alliteration after)
actualetter (actual letter)
statisticrhythm (statistical crazy rhythm)
othereplacing (other replacing)
enjumbling (enjoying jumbling)
feedbacombination (feedback combination)
rinstead (right instead)
unbelievabut (unbelievably but)
createdogso (created dogs so)
apphours (applications phrase hours)
chainsoftwas (chains software was)
compupper (computer upper)
withomepage (without homepage)
welcomputer (welcome computer)
choosome (choose some)

Some of the results in there are winners.
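
For readers who would rather not parse the Bash, the merging rule the shell script applies can be sketched in Python roughly as follows. The function names, the `max_tries` limit, and the tiny word list in the usage lines are assumptions for illustration; the rule itself (cut the new word at the last occurrence of the password's first letter, then prepend it) mirrors the script.

```python
import secrets

rng = secrets.SystemRandom()


def mesh(word, password):
    """Prepend word to password, overlapping on the password's first letter.

    Returns None when word does not contain that letter after its own
    first character (the same condition as the script's "^..*X" grep).
    """
    cut = word.rfind(password[0])
    if cut < 1:
        return None
    return word[:cut] + password


def generate(words, dictionary, min_length=8, max_tries=1000):
    """Mesh random words until the result is long enough and not a real word."""
    password = rng.choice(words)
    built_from = [password]
    for _ in range(max_tries):
        if len(password) >= min_length and password not in dictionary:
            return password, built_from
        candidates = [w for w in words if mesh(w, password)]
        if not candidates:
            break
        word = rng.choice(candidates)
        password = mesh(word, password)
        built_from.insert(0, word)
    return None, built_from


print(mesh("background", "died"))   # backgroundied
print(mesh("welcome", "computer"))  # welcomputer
words = ["about", "upper", "actual", "letter"]
print(generate(words, set(words)))
```

Unlike the script, this keeps the word list in memory, so producing one password takes a few list scans instead of repeated `sort` and `grep` passes over a file.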

The prototype shows it can probably be done, but the intelligence you require about alliteration or syllable information requires a better data source than just words. You'd need pronunciation information. Also, I've shown you probably want a database of good simple words to choose from, and not all words, to better satisfy your memorable-password requirement.

Generating a single password the first time and every time -- something you need for the Web -- will take both a better data source and more sophistication. Using a better programming language than Bash with text files and using a database could get this to work instantaneously. Using a database system you could use the SOUNDEX algorithm, or some such.

Neat idea. Good luck.

ashawley
+1  A: 

I'm completely with rjh. The advantage of just using the starting letters of a pass-phrase is that it looks random, which makes it damn hard to remember if you don't know the phrase behind it, in case Eve looks over your shoulder as you type the password.
OTOH, if she sees you type about 8 characters, among which 's' twice, and then 'o' and 'r' she may guess it correctly the first time.
Forcing the use of at least one digit doesn't really help; you simply know that it will be "pa55word" or "passw0rd".

Song lyrics are an inexhaustible source of pass-phrases.

"But I should have known this right from the start"

becomes "bishktrfts". Ten letters, even if only lowercase, give you 26^10, or about 1.4 × 10^14, combinations, which is a lot, especially since there's no shortcut for cracking it. (At 1 million combinations a second it takes about 4.5 years to test them all.)
As an extra (in case Eve knows you're a Police fan), you could swap e.g. the 2nd and 3rd letter, or take the second letter of the third word. Endless possibilities.
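
A quick way to check keyspace figures like the one above is to compute them directly. This is a small sketch; the function names are made up for illustration, and the guess rate of one million per second is the one quoted in the answer, not a measurement of any real attacker.

```python
import math


def keyspace(alphabet_size, length):
    """Number of possible passwords of the given length."""
    return alphabet_size ** length


def years_to_exhaust(combinations, guesses_per_second):
    """Years needed to try every combination at a fixed guess rate."""
    seconds_per_year = 365.25 * 24 * 3600
    return combinations / guesses_per_second / seconds_per_year


lowercase_10 = keyspace(26, 10)
print(lowercase_10)                                          # 141167095653376
print(round(years_to_exhaust(lowercase_10, 1_000_000), 1))   # 4.5
print(round(math.log2(lowercase_10), 1))                     # 47.0 (bits)
```

On average an attacker finds the password after searching half the keyspace, so the expected time is half the figure printed.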

stevenvh
+1  A: 

One way to generate passwords that 'sound like' words would be to use a Markov chain. An n-degree Markov chain is basically a large set of n-tuples that appear in your input corpus, along with their frequency. For example, "aardvark", with a 2nd-degree Markov chain, would generate the tuples (a, a, 1), (a, r, 2), (r, d, 1), (d, v, 1), (v, a, 1), (r, k, 1). Optionally, you can also include 'virtual' start-word and end-word tokens.

In order to create a useful Markov chain for your purposes, you would feed in a large corpus of English language data - there are many available, including, for example, Project Gutenberg - to generate a set of records as outlined above. For generating natural language words or sentences that at least mostly follow rules of grammar or composition, a 3rd-degree Markov chain is usually sufficient.

Then, to generate a password, you pick a random 'starting' tuple from the set, weighted by its frequency, and output the first letter. Then, repeatedly select at random (again weighted by frequency) a 'next' tuple - that is, one that starts with the same letters that your current one ends with, and has only one letter different. Using the example above, suppose I start at (a, a, 1), and output 'a'. My next choice is either (a, a, 1) again or (a, r, 2); suppose I pick (a, r, 2), so I output another 'a'. Now, I can choose either (r, d, 1) or (r, k, 1), so I pick one at random based on their frequency of occurrence. Suppose I pick (r, k, 1) - I output 'r'. This process continues until you reach an end-of-word marker, or decide to stop independently (since most Markov chains form a cyclic graph, you can potentially never finish generating if you don't apply an artificial length limitation).

At a word level (eg, each element of the tuple is a word), this technique is used by some 'conversation bots' to generate sensible-seeming nonsense sentences. It's also used by spammers to try and evade spam filters. At a letter level, as outlined above, it can be used to generate nonsense words, in this case for passwords.

One drawback: If your input corpus doesn't contain anything other than letters, nor will your output phrases, so they won't pass most 'secure' password requirements. You may want to apply some post-processing to substitute some characters for numbers or symbols.
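
As a rough illustration of the letter-level, 2nd-degree chain described above, here is a minimal Python sketch. The `^` and `$` start/end tokens, the function names, and the `max_length` cut-off are assumptions chosen for the example; a real generator would be trained on a large corpus rather than a single word.

```python
import random
from collections import Counter, defaultdict

START, END = "^", "$"  # assumed 'virtual' start-word and end-word tokens


def build_chain(words):
    """Count how often each letter follows each other letter."""
    chain = defaultdict(Counter)
    for word in words:
        letters = [START] + list(word.lower()) + [END]
        for current, following in zip(letters, letters[1:]):
            chain[current][following] += 1
    return chain


def generate(chain, max_length=10, rng=None):
    """Walk the chain from START, picking each next letter by frequency."""
    rng = rng or random.SystemRandom()
    output = []
    current = START
    while len(output) < max_length:
        followers = chain[current]
        current = rng.choices(list(followers), weights=list(followers.values()))[0]
        if current == END:
            break
        output.append(current)
    return "".join(output)


chain = build_chain(["aardvark"])
print(dict(chain["a"]))  # {'a': 1, 'r': 2}
print(generate(chain, max_length=8))
```

The `max_length` argument is the artificial length limit mentioned above; without it a cyclic chain could keep producing letters indefinitely.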

Nick Johnson
+4  A: 

I am surprised no one has mentioned the Multics algorithm described at http://www.multicians.org/thvv/gpw.html , which is similar to the FIPS algorithm but based on trigraphs rather than digraphs. It produces output such as

ahmouryleg
thasylecta
tronicatic
terstabble

I have ported the code to python as well: http://pastebin.com/f6a10de7b

Joe Koberg
+4  A: 

I have used the following algorithm a few times:

  1. Put all lowercase vowels (from a-z) into an array Vowels
  2. Put all lowercase consonants (from a-z) into another array Consonants
  3. Create a third array Pairs of two letters in such a way, that you create all possible pairs of letters between Vowels and Consonants ("ab", "ba", "ac", etc...)
  4. Randomly pick 3-5 elements from Pairs and concatenate them together as string Password
  5. Randomly pick true or false
    1. If true, remove the last letter from Password
    2. If false, don't do anything
  6. Substitute 2-4 randomly chosen characters in Password with its uppercase equivalent
  7. Substitute 2-4 randomly chosen characters in Password with a randomly chosen integer 0-9

Voilà - now you should have a password of length between 5 and 10 characters, with upper and lower case alphanumeric characters. Having vowels and consonants take turns frequently makes the passwords semi-pronounceable and thus easier to remember.
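
The steps above can be sketched in Python along these lines. Treating 'y' as a consonant and the function name `pronounceable_password` are assumptions; the answer does not specify either.

```python
import secrets
import string

VOWELS = "aeiou"  # assumption: 'y' is treated as a consonant
CONSONANTS = "".join(c for c in string.ascii_lowercase if c not in VOWELS)
# Step 3: every vowel-consonant and consonant-vowel pair ("ab", "ba", ...).
PAIRS = [v + c for v in VOWELS for c in CONSONANTS] + \
        [c + v for v in VOWELS for c in CONSONANTS]

rng = secrets.SystemRandom()


def pronounceable_password():
    # Step 4: concatenate 3 to 5 random pairs.
    chars = list("".join(rng.choice(PAIRS) for _ in range(rng.randint(3, 5))))
    # Step 5: drop the last letter half of the time.
    if rng.choice([True, False]):
        chars.pop()
    # Step 6: uppercase 2 to 4 randomly chosen positions.
    for i in rng.sample(range(len(chars)), rng.randint(2, 4)):
        chars[i] = chars[i].upper()
    # Step 7: replace 2 to 4 randomly chosen positions with a digit.
    for i in rng.sample(range(len(chars)), rng.randint(2, 4)):
        chars[i] = str(rng.randint(0, 9))
    return "".join(chars)


print(pronounceable_password())
```

Because step 7 may overwrite positions that step 6 uppercased, a result can occasionally contain no uppercase letter at all; a caller who needs one could simply generate again.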

Henrik Paul
A: 

edit: After answering, I realized that this is in no way phonetically memorable. Leaving the answer anyway b/c I find it interesting. /edit

Old thread, I know... but it's worth a shot.

1) I'd probably build the largest dictionary you can amass. Arrange the words into buckets by part of speech.

2) Then, build a grammar that can make several types of sentences. The "type" of sentence is determined by permutations of parts of speech.

3) Randomly (or as close to random as possible), pick a type of sentence. What is returned is a pattern with placeholders for parts of speech (n-v-n would be noun-verb-noun).

4) Pick words at random in each part-of-speech bucket to stand in for the placeholders. Fill them in. (The example above might become something like car-ate-bicycle.)

5) Randomly scan each character, deciding whether or not you want to replace it with either a similar-sounding character (or set of characters), or a look-alike. This is the hardest step of the problem.

6) The resultant password would be something like kaR@tebyCICle

7) Laugh at humorous results like the above that look like "karate bicycle"
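
A toy Python sketch of this pipeline follows. The word buckets, sentence patterns, look-alike table and `rate` parameter are all hypothetical stand-ins; a real version would need the large part-of-speech dictionary described above, and it only does look-alike and case changes, not the harder similar-sounding substitutions.

```python
import secrets

rng = secrets.SystemRandom()

# Hypothetical, tiny part-of-speech buckets; a real list would be far larger.
BUCKETS = {
    "n": ["car", "bicycle", "wolf", "lock"],
    "v": ["ate", "hates", "opens"],
    "adj": ["crazy", "simple"],
}
# Hypothetical sentence types, as patterns of parts of speech.
PATTERNS = [("n", "v", "n"), ("adj", "n", "v"), ("adj", "n", "v", "n")]
# Hypothetical look-alike substitutions.
LOOKALIKES = {"a": "@", "o": "0", "e": "3", "i": "1", "s": "$"}


def random_sentence():
    """Pick a sentence type, then fill each placeholder from its bucket."""
    pattern = rng.choice(PATTERNS)
    return [rng.choice(BUCKETS[pos]) for pos in pattern]


def disguise(text, rate=0.3):
    """Randomly swap characters for look-alikes or uppercase them."""
    out = []
    for ch in text:
        roll = rng.random()
        if roll < rate and ch in LOOKALIKES:
            out.append(LOOKALIKES[ch])
        elif roll > 1 - rate:
            out.append(ch.upper())
        else:
            out.append(ch)
    return "".join(out)


words = random_sentence()
print("-".join(words), "->", disguise("".join(words)))
```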

San Jacinto