What are “interchangable passphrases” in `vault`, and what does that mean in practice?¶

What are “interchangable passphrases”?¶

The “vault” derivation scheme internally uses PBKDF2-HMAC-SHA1¹ to turn the master passphrase² into a pseudo-random bit sequence, which then drives the actual passphrase derivation. In this context, the master passphrase is passed directly as a key to HMAC-SHA1, and because HMAC-SHA1 requires keys of exactly 64 bytes size, the key is thus subject to the HMAC key mapping procedure. Because the mapping of infinitely many arbitrarily sized keys to 64-byte sized keys cannot be one-to-one, there exist pairs of keys that behave identically when passed into (PBKDF2-)HMAC-SHA1, i.e., the keys (master passphrases) are “interchangable” from the vault scheme’s perspective.

Fundamentally, this is an issue of encoding: the master passphrase is interpreted as an encoding of the HMAC-SHA1 key, and this encoding is not unique, so the effective space of HMAC-SHA1 keys is reduced through the presence of “non-canonical” encodings of keys.

What is the HMAC key mapping procedure?¶

HMAC key mapping procedure

Let MP denote the master passphrase, and let K denote the HMAC key candidate. Let B denote the block size of HMAC-SHA1 in bytes, i.e., 64. At the beginning, K = MP.

If K (= MP) is larger than B, set K to SHA1(K). This updates K for all further steps below.
If K is smaller than B, append as many NUL bytes as necessary to extend K to size B.
Use K as the HMAC key.

What effect does the HMAC key mapping procedure have on key security?¶

The key space shrinks to 99.6% of its original size. But since it started out as astronomically large (2⁵¹²), it still is astronomically large.

Mathematical details: key space

variant	key space size	fraction of total size
64-byte keys only	256⁶⁴ = 13407807929942597099574024998205846127479365820592393377723561443721764030073546976801874298166903427690031858186486050853753882811946569946433649006084096	99.609375%
full key size up to 64 bytes	(256⁶⁵ – 1) / (256 – 1) = 13460387568883548460748825096238025916214579019888834136067575410167731732152266768867764001296969715641757473316629133406121545097483615318772604492382465	100%

The key space sizes can be calculated using the following formulas. Let q = 256 denote the alphabet size of binary strings, and let n denote the string length. The total count of all strings of size n is qⁿ, and the total count of all strings up to (and including) size n is (q^n + 1 – 1) / (q – 1) per the formula for geometric series.

Verification:

$ # using GNU bc 1.07.1
$ 
$ BC_LINE_LENGTH=0 bc <<'HERE'
> # The total count of size 64 byte strings.
> 256^64
> # The total count of byte strings of size 64 or less.
> (256^65 - 1) / (256 - 1)
> # The fraction of the former within the latter.
> scale = 8
> (256^64) / ((256^65 - 1) / (256 - 1))
> HERE
13407807929942597099574024998205846127479365820592393377723561443721764030073546976801874298166903427690031858186486050853753882811946569946433649006084096
13460387568883548460748825096238025916214579019888834136067575410167731732152266768867764001296969715641757473316629133406121545097483615318772604492382465
.99609375
$ 
$ BC_LINE_LENGTH=0 bc <<'HERE'
> # The fraction of unusable keys.
> scale = 160
> (256^64 - 1) / (256^65 - 1)
> 1 / 256
> HERE
.0039062499999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999997097
.0039062500000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000

In particular, assuming a sufficiently secure master passphrase, this mapping procedure is still cryptographically secure against attackers without possession of the master passphrase if the hashing function (here: SHA1) is secure against preimage attacks:

The attacker can attempt to guess a NUL-extended version of the passphrase if it is shorter than or equal in length to 64 bytes. This has the same computational cost as guessing the master passphrase directly, which is cryptographically secure by assumption.
The attacker can attempt to guess a hashed and NUL-extended version of the passphrase if it is larger than 64 bytes. This amounts to carrying out a preimage attack against the SHA1 digest of the master passphrase, which is also cryptographically secure by assumption.

What effect does the HMAC key mapping procedure have on `derivepassphrase`?¶

derivepassphrase vault does not check for interchangable passphrases, and will happily accept any (non-empty) passphrase it is given. The derivepassphrase.vault.Vault class does not check for interchangable passphrases either, and will happily accept any passphrase it is given, even empty ones.

Most interchangable variations of a master passphrase contain binary characters such as NUL, or even arbitrary byte sequences, which may be hard to type in or impossible to express in certain storage formats. As such, it is unlikely—but otherwise supported—that the user would want to enter or store a different, interchangable version of their master passphrase in the first place.

PBKDF2 is a key derivation function, published in RFC 2898. It uses a pseudo-random function such as HMAC-SHA1 (hashed message authentication code, specified in RFC 2104 and using SHA1 as the underlying hash function) when processing its input. PBKDF2 passes the key on to its pseudo-random function, and otherwise only depends on the output of the pseudo-random function, not on the key. ↩
If you use a master SSH key, it is first converted to an “equivalent master passphrase”. ↩

What are “interchangable passphrases” in vault, and what does that mean in practice?¶