I think the clever thing here is that it starts at the abbreviation and then builds a phrase. This makes it very easy to reason about the entropy while still using arbitrary logic to come up with the phrase to be memorized.

Of course, when you generate 20 and let the user pick one, you lose some entropy.

> Of course, when you generate 20 and let the user pick one, you lose some entropy.

Good point! This could be fixed by presenting different words for the same set of prefixes, rather than presenting different sets of prefixes. For example, if one of the randomly chosen prefixes was `hin`, one presentation could display Hinted and another Hindu.

Of course, that won’t stop the user from just hitting refresh :)
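To make the suggestion concrete, here is a minimal sketch of re-presenting the same prefixes with different words. The word list, the `present` function, and the example prefixes are all hypothetical stand-ins, not taken from the actual generator.

```python
import secrets

# Hypothetical word list: each 3-letter prefix maps to several display words.
WORDS_BY_PREFIX = {
    "hin": ["Hinted", "Hindu", "Hinge"],
    "see": ["Seed", "Seeking", "Seems"],
    "abs": ["Absent", "Absorb", "Abstract"],
}

def present(prefixes):
    """Pick one display word per prefix; the prefixes themselves never change."""
    return [secrets.choice(WORDS_BY_PREFIX[p]) for p in prefixes]

prefixes = ["hin", "see", "abs"]  # the secret part, chosen once
first = present(prefixes)         # one presentation
second = present(prefixes)        # another presentation of the same secret
print(first, second)
```

Because only the display words vary, choosing between presentations would not tell an attacker anything about which prefixes were drawn.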

> Of course, when you generate 20 and let the user pick one, you lose some entropy.

Not of any significant value though. If the default security margin is 50 bits (as per the web implementation linked), then that’s 2^50 possible outputs. If 32 passwords are being generated per browser refresh, then that’s ~0.000000000003% of the full key space.

Go ahead and refresh until you find something you like. You would have to refresh your browser ~17.6 trillion times before you reduced the keyspace by 1 bit, or 50%.

Are you sure that math is right? I think you have to take into account that the user isn’t rejecting just those 31 phrases. They are using a rule to reject a huge portion of the search space. For example, with an over-simplified rejection rule, such as the user always picking the lexicographically first password out of the 32 choices, the distribution is heavily skewed, so guessing the password probably takes a fraction of the time on average.
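The lexicographic-first rule can be worked out exactly on a toy key space. The sketch below assumes the 32 candidates are independent and uniform, that passwords can be ordered as integers, and that the user always picks the smallest one. The 16-bit key space and all names are illustrative assumptions, not part of the generator.

```python
import math

def entropy_after_picking_first(n_outcomes, k):
    """Exact distribution of the minimum of k uniform draws from n_outcomes.

    Returns (shannon_bits, min_entropy_bits) of the chosen password.
    """
    shannon = 0.0
    p_max = 0.0
    for i in range(n_outcomes):
        # P(min = i) = P(all k draws >= i) - P(all k draws >= i + 1)
        p = ((n_outcomes - i) / n_outcomes) ** k \
            - ((n_outcomes - i - 1) / n_outcomes) ** k
        if p > 0:
            shannon -= p * math.log2(p)
            p_max = max(p_max, p)
    return shannon, -math.log2(p_max)

N_BITS = 16  # toy key space (assumption) so the loop stays fast
K = 32       # candidates shown per refresh

shannon, min_entropy = entropy_after_picking_first(2 ** N_BITS, K)
shannon_loss = N_BITS - shannon
min_entropy_loss = N_BITS - min_entropy
print(f"Shannon loss: {shannon_loss:.2f} bits")
print(f"Min-entropy loss: {min_entropy_loss:.2f} bits")
```

Under these assumptions, the loss works out to roughly 3.6 bits of Shannon entropy and close to log2(32) = 5 bits of min-entropy for a single screen of 32. A real user's preferences are fuzzier than a strict ordering, so this is best read as a worst-case style estimate.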

We probably need to clarify what we’re discussing, so there isn’t miscommunication or confusion. First, let me back up my math, then I’ll see if I understand your concern.

The password generator derives each 3-letter prefix from a random number from 0 through 1023, which gives each prefix 10 bits of security. It then uses those prefixes to pick words from a word list, and builds a mnemonic using a very large bigram database.

If five prefixes are generated uniformly at random, then the resulting password indeed has 50 bits of security. The web interface allows you to change how many prefixes you want, resulting in the same number of words for your mnemonic. At every refresh, 32 passwords are generated.
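As a rough sketch of the description above (not the generator's actual code), drawing five independent indices in the range 0 through 1023 gives 5 × 10 = 50 bits. The function names are hypothetical, and the mapping from index to prefix is left out.

```python
import secrets

PREFIX_COUNT = 1024    # indices 0..1023, as described in the comment
BITS_PER_PREFIX = 10   # log2(1024)

def generate_prefix_indices(n_prefixes=5):
    """Draw n_prefixes independent, uniform indices into the prefix list."""
    return [secrets.randbelow(PREFIX_COUNT) for _ in range(n_prefixes)]

def security_bits(n_prefixes):
    """Entropy in bits, assuming uniform and independent prefixes."""
    return n_prefixes * BITS_PER_PREFIX

indices = generate_prefix_indices(5)
print(indices, security_bits(5))
```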

So, that’s 32 passwords out of 2^50 possible = 32/1,125,899,906,842,624 = 0.000000000000028421709430404007434844970703125. 2^49 is half the size of 2^50: 2^49 = 562,949,953,421,312. If your browser is generating 32 passwords per refresh, then you need to refresh your browser 562,949,953,421,312/32 = 17,592,186,044,416 times, or about 17.6 trillion times.

At that point, you have generated as many passwords as half the key space holds, so repeats of already-seen passwords become common.
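The arithmetic can be checked with exact integer math. This is just a verification sketch; the constant names are made up for illustration.

```python
from fractions import Fraction

KEYSPACE = 2 ** 50   # five prefixes at 10 bits each
PER_REFRESH = 32     # passwords shown per refresh

# Share of the key space displayed by a single refresh.
fraction_per_refresh = Fraction(PER_REFRESH, KEYSPACE)
percent_per_refresh = float(fraction_per_refresh) * 100

# Refreshes needed to display as many passwords as half the key space.
refreshes_for_half = (KEYSPACE // 2) // PER_REFRESH

print(f"{float(fraction_per_refresh):.3e}")
print(f"{percent_per_refresh:.3e} %")
print(f"{refreshes_for_half:,}")
```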

> They are using a rule to reject a huge portion of the search space. For example, with an over-simplified rejection rule, such as the user always picking the lexicographically first password out of the 32 choices, the distribution is heavily skewed, so guessing the password probably takes a fraction of the time on average.

If I understand this correctly, you’re assuming the user would mentally pick a fixed-point prefix, such as “The first prefix must be `see`”. If so, then yes, they lose a full 10 bits of security. However, they’ll have a 1-in-1024 chance that `see` is the first prefix of any given password, which is on average 1,024 generated passwords (about 32 browser refreshes at 32 passwords each) before they find it. If they fix two prefixes, such as the first being `see` and the second being `abs`, then they have a 1-in-1024^2, or 1-in-1,048,576, chance of finding it. So, on average, about 1 million generated passwords, or roughly 32,768 refreshes.

So while they have reduced their security margin greatly, they have also increased their workload greatly, and I’m not seeing how that would be worth it. Of course, they could automate it with a script outside of the browser. Finding one prefix in 1,024 isn’t hard, nor is one in a million. Going beyond that might force them to wait it out, though.
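For a sense of scale, the expected wait can be counted per refresh rather than per password. The sketch assumes each refresh shows 32 independent, uniform passwords and that the user wants specific prefixes in the leading positions; `expected_refreshes` is a hypothetical helper.

```python
import math

PREFIX_COUNT = 1024  # possible values per prefix
PER_REFRESH = 32     # passwords shown per refresh

def expected_refreshes(n_fixed_prefixes):
    """Mean number of refreshes until some password matches the fixed prefixes."""
    q = 1 / PREFIX_COUNT ** n_fixed_prefixes  # chance one password matches
    # P(at least one of 32 matches) = 1 - (1 - q)^32, computed stably.
    p_refresh = -math.expm1(PER_REFRESH * math.log1p(-q))
    return 1 / p_refresh

for n in (1, 2, 3):
    print(n, round(expected_refreshes(n), 1))
```

With these assumptions, one fixed prefix takes a little over 32 refreshes on average and two take a little over 32,768, which is easy to script.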

Is this what you’re talking about?

Edit: spelling/grammar