simple XKCD-style passwords on the command-line

2013.Apr.12

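We can generate XKCD-style passwords on the command-line without much difficulty. The one-liner below is written for zsh: the (f) expansion flag splits the command's output on newlines, so echo prints the four words on a single line.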

$ echo ${(f)$(LC_ALL=C egrep -ox '[a-zA-Z0-9]+' /usr/share/dict/american-english-insane \
       | shuf --random-source=/dev/urandom | head -n4)}

We need a vocabulary to pick words from. I chose to use the dictionaries at /usr/share/dict.

I chose to restrict the vocabulary to ASCII-only words, to ensure the passwords can be typed on any keyboard.

One might think that the character class [:alnum:] and the bracket expression [0-9A-Za-z] would behave differently: the former matching all alphanumeric characters (in a locale- and Unicode-aware fashion) and the latter matching strictly ASCII alphanumeric characters. This, however, is not the case (presumably for convenience): the ranges in [0-9A-Za-z] are also interpreted according to the current locale, which is why the command sets LC_ALL=C.

The grep manpage explains:

Within a bracket expression, a range expression consists of two characters separated by a hyphen. It matches any single character that sorts between the two characters, inclusive, using the locale's collating sequence and character set. For example, in the default C locale, [a-d] is equivalent to [abcd]. Many locales sort characters in dictionary order, and in these locales [a-d] is typically not equivalent to [abcd]; it might be equivalent to [aBbCcDd], for example. To obtain the traditional interpretation of bracket expressions, you can use the C locale by setting the LC_ALL environment variable to the value C.
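For comparison, here is a minimal Python sketch of the same filter-and-pick pipeline. It is not part of the original one-liner; the names `ascii_words` and `passphrase` are made up for illustration. In a Python 3 `str` pattern without the IGNORECASE flag, a range such as `A-Z` is compared by code point, so the filter does not depend on the locale. `random.SystemRandom` draws from the operating system's random source (`os.urandom`), which plays the role of `--random-source=/dev/urandom`.

```python
import re
from random import SystemRandom

# Explicit ranges are code point ranges here, so this matches ASCII only.
ASCII_WORD = re.compile(r'[A-Za-z0-9]+')


def ascii_words(lines):
    """Keep lines made up entirely of ASCII letters and digits (like egrep -x)."""
    stripped = (line.strip() for line in lines)
    return [w for w in stripped if ASCII_WORD.fullmatch(w)]


def passphrase(words, count=4, rng=None):
    """Pick `count` distinct entries at random, like `shuf | head -n4`."""
    rng = rng or SystemRandom()  # backed by os.urandom
    return ' '.join(rng.sample(words, count))
```

To try it, open the dictionary file (for example with `open(path, encoding='utf-8', errors='replace')`) and print `passphrase(ascii_words(f))`.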

With the above filtering, we get 507655 words from /usr/share/dict/american-english-insane.

$ LC_ALL=C egrep -ox '[a-zA-Z0-9]+'  /usr/share/dict/american-english-insane | wc -l 
507655
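As a rough gauge of what a vocabulary of this size buys, the sketch below estimates the entropy of a four-word passphrase. It assumes the attacker knows the word list and the method, and that the four words are drawn uniformly without replacement (which is what `shuf | head -n4` does). The function name `passphrase_bits` is hypothetical.

```python
from math import log2


def passphrase_bits(vocabulary_size, words=4):
    """Entropy in bits of `words` distinct words drawn uniformly, in order."""
    # Each successive draw has one fewer candidate than the previous one.
    return sum(log2(vocabulary_size - i) for i in range(words))


print(round(passphrase_bits(507655), 1))
```

Under those assumptions the result is roughly 76 bits, a little under 19 bits per word. Many of the words in the "insane" list are obscure, so a smaller, more memorable vocabulary trades some of that entropy for usability.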

More information about this and other word lists can be found on Kevin’s Word List Page. On Ubuntu, you can also install these with:

$ sudo apt-get install wamerican-insane

There is a lot of discussion of the strength of these passwords: e.g., XKCD #936: Short complex password, or long dictionary passphrase?. We can increase the difficulty of guessing these passwords in a number of different ways. Here’s a small Python script that shows some possibilities.

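# scramble_words.py

from random import randint, choice

SYMBOLS = '`~!@#$%^&*()-_+=[{]};:\'",<.>/?'

# keep the character and put a random symbol after it (before) or in front of it (after)
before, after = (lambda x,y: x+y), (lambda x,y: y+x)
insert = lambda c: choice((before, after))(c, choice(SYMBOLS))  # choice() takes a single sequence

# replace the character with a random symbol
scramble = lambda c: choice(SYMBOLS)

# replace the character with a fixed look-alike symbol; anything else passes through
translate = lambda c: {
    'a': '4', 'b': '6', 'c': '(', 'd': ']', 'e': '3', 
    'f': '{', 'g': '9', 'h': '&', 'i': '!', 'j': ':',
    'k': '+', 'l': '|', 'm': '}', 'n': '^', 'o': '0', 
    'p': '%', 'q': '*' ,'r': '<', 's': '$', 't': '7', 
    'u': '~', 'v': '>', 'w': '_', 'x': '#', 'y': '/', 'z': '2' }.get(c.lower(),c)

# invent your own

if __name__ == '__main__':
    from sys import stdin
    for word in (x.strip() for x in stdin):
        if not word:
            continue  # skip blank lines; randint(0, -1) would raise
        idx = randint(0,len(word)-1)
        # change one randomly chosen character; swap in insert or scramble to taste
        print(word[:idx] + translate(word[idx]) + word[idx+1:])

The script reads one word per line from standard input, so it can be appended to the pipeline:

$ echo ${(f)$(LC_ALL=C egrep -ox '[a-zA-Z0-9]+' /usr/share/dict/american-english-insane \
       | shuf --random-source=/dev/urandom | head -n4 | python scramble_words.py)}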
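One possible variation, sketched here under assumptions of my own: insert a symbol instead of translating a letter, and draw the random choices from `random.SystemRandom` rather than the module-level functions. Python's default `random` generator is a Mersenne Twister, which is not intended for security purposes, whereas `SystemRandom` uses the operating system's source. The name `insert_symbol` is hypothetical.

```python
from random import SystemRandom

SYMBOLS = '`~!@#$%^&*()-_+=[{]};:\'",<.>/?'
rng = SystemRandom()  # backed by os.urandom


def insert_symbol(word):
    """Insert one random symbol at a random position, including either end."""
    idx = rng.randint(0, len(word))  # randint is inclusive at both ends
    return word[:idx] + rng.choice(SYMBOLS) + word[idx:]


if __name__ == '__main__':
    for w in ['correct', 'horse', 'battery', 'staple']:
        print(insert_symbol(w))
```

Because the position may equal `len(word)`, this version also copes with an empty string, which the index-based replacement cannot.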

We may wonder if /dev/urandom is a sufficient source of randomness for generating passwords. Thomas Pornin weighs in:

An RNG faster than /dev/random but cryptographically useful?

Is a rand from /dev/urandom secure for a login key?

(Of course, we can always use `pwgen` if we just want to generate a normal, random-character password at the command-line.)