Thursday, October 7, 2021

I encrypted with AES-GCM and you won't believe what happened next...

Now that you've fallen for the clickbait, let me give you a quick summary of what this post is about. We will talk about AES-GCM's documented but largely unknown limitations. We won't get into the cryptographic details of the algorithm, so no need to worry about that. I'll also propose some workarounds for the limitations. There is some basic math involved, but you can skip it if you are not into that :)


Introduction

It is often said that whenever you solve a problem with encryption, you create a key management problem. An important part of key management is rotation. What is not well understood is when or how often you should rotate. Is it based on the number of encryptions you perform? The amount of data? Or is it based on time? Another important question is: why is rotation needed at all?


There are best practices and compliance requirements that are based on the security properties and specification of the algorithm, the key storage solution, who has access, and other factors.

NIST has a special publication, NIST Special Publication 800-57 Part 1, Revision 5, named “Recommendation for Key Management: Part 1 – General”. These are the factors it lists that affect the cryptoperiod:




Cryptoperiod vs Key life

Let’s not confuse the cryptoperiod with the maximum number of encryptions you can perform without rotating the key. Cryptoperiods are usually suggested or enforced through standards like NIST's and regulations like PCI.


This blog post is about how to extend the key life, which may or may not extend the Cryptoperiod for a particular regulation. This means we will focus mostly on “7. Limitations for algorithm usage (e.g., the maximum number of invocations to avoid nonce reuse)”  for AES-GCM.



GCM vs CBC

Before AES-GCM became the default standard for symmetric encryption, the most common mode was AES-CBC. AES-CBC does not have any documented limit on the number of encryptions it can perform, or at least I could not find one anywhere. It is usually suggested not to exceed 2^64 encryptions, mostly to prevent IV reuse, and even if that happens it's not the end of the world: it would just leak some information about the plaintext. AES-CBC is one of, if not the most, resilient modes of AES when IVs/nonces are repeated, and for that to happen you need to perform a lot of encryptions. For example, if a service encrypts 10,000 records per second, you will have a 50% chance of IV collision after 58 million years.
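(The arithmetic: with random 128-bit IVs, the birthday bound puts the 50% collision point at around 2^64 encryptions, and 2^64 ÷ 10,000 per second ≈ 1.8 × 10^15 seconds ≈ 58 million years.)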


AES-GCM offers increased security compared to AES-CBC in the sense that it is an AEAD algorithm, which means it's authenticated. In other words, if you tamper with some bits, it will be detected when decrypting. AES-CBC, on the other hand, doesn't come with built-in authentication/integrity checks; they need to be implemented separately. AES-CBC can be implemented securely, but the way it's designed makes it easy to make mistakes in real-life implementations. That's what happened in TLS implementations, with attacks like padding oracle attacks, BEAST, and Lucky13, hence why its use is discouraged.


Let's see what Microsoft has to say about it:


“...Microsoft believes that it's no longer safe to decrypt data encrypted with the Cipher-Block-Chaining (CBC) mode of symmetric encryption when verifiable padding has been applied without first ensuring the integrity of the ciphertext, except for very specific circumstances. This judgement is based on currently known cryptographic research…

...In summary, to use padded CBC block ciphers safely, you must combine them with an HMAC (or another data integrity check) that you validate using a constant time comparison before trying to decrypt the data. Since all altered messages take the same amount of time to produce a response, the attack is prevented…

...Be certain that each usage at each layer of a symmetric block cipher algorithm, such as AES and 3DES, in CBC mode incorporate the use of a secret-keyed data integrity check (an asymmetric signature, an HMAC, or to change the cipher mode to an authenticated encryption (AE) mode such as GCM or CCM).


In summary, it can be implemented safely (they even provide code samples for it), but they argue AES-GCM is the way forward.
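
To make that advice concrete, here is a minimal Go sketch of the encrypt-then-MAC pattern Microsoft describes: AES-CBC encryption, an HMAC-SHA256 over the IV and ciphertext, and a constant-time MAC check before any decryption or padding inspection. The function names, the two independent 32-byte keys, and the simple PKCS#7 padding are my own assumptions for the example, not Microsoft's sample code.

    package cbcmac

    import (
        "bytes"
        "crypto/aes"
        "crypto/cipher"
        "crypto/hmac"
        "crypto/rand"
        "crypto/sha256"
        "errors"
    )

    // encryptThenMAC returns IV || ciphertext || HMAC-SHA256(IV || ciphertext).
    // encKey and macKey are assumed to be independent 32-byte keys.
    func encryptThenMAC(encKey, macKey, plaintext []byte) ([]byte, error) {
        block, err := aes.NewCipher(encKey)
        if err != nil {
            return nil, err
        }
        // PKCS#7 padding so the input is a whole number of blocks.
        pad := aes.BlockSize - len(plaintext)%aes.BlockSize
        padded := append(append([]byte{}, plaintext...), bytes.Repeat([]byte{byte(pad)}, pad)...)

        iv := make([]byte, aes.BlockSize)
        if _, err := rand.Read(iv); err != nil {
            return nil, err
        }
        ct := make([]byte, len(padded))
        cipher.NewCBCEncrypter(block, iv).CryptBlocks(ct, padded)

        mac := hmac.New(sha256.New, macKey)
        mac.Write(iv)
        mac.Write(ct)
        return append(append(iv, ct...), mac.Sum(nil)...), nil
    }

    // decryptThenMAC verifies the MAC in constant time before touching the padding.
    func decryptThenMAC(encKey, macKey, data []byte) ([]byte, error) {
        if len(data) < 2*aes.BlockSize+sha256.Size {
            return nil, errors.New("ciphertext too short")
        }
        body, tag := data[:len(data)-sha256.Size], data[len(data)-sha256.Size:]
        mac := hmac.New(sha256.New, macKey)
        mac.Write(body)
        if !hmac.Equal(mac.Sum(nil), tag) { // constant-time comparison
            return nil, errors.New("MAC verification failed")
        }
        iv, ct := body[:aes.BlockSize], body[aes.BlockSize:]
        if len(ct)%aes.BlockSize != 0 {
            return nil, errors.New("ciphertext is not a multiple of the block size")
        }
        block, err := aes.NewCipher(encKey)
        if err != nil {
            return nil, err
        }
        pt := make([]byte, len(ct))
        cipher.NewCBCDecrypter(block, iv).CryptBlocks(pt, ct)
        pad := int(pt[len(pt)-1]) // only inspected after the MAC check has passed
        if pad < 1 || pad > aes.BlockSize {
            return nil, errors.New("invalid padding")
        }
        return pt[:len(pt)-pad], nil
    }

Getting all of those details right (key separation, MAC-before-decrypt, constant-time comparison, padding checks) is exactly the burden AES-GCM removes.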


Let’s check what Cloudflare has to say:

“...There are still valid uses of CBC mode (such as for encrypting static content), but in the context of TLS, MAC-then-encrypt with CBC has had too many issues to be ignored…

...AEAD takes a stream cipher and mixes in the authentication step along the way rather than computing the MAC at the end. CloudFlare implements two such cipher modes, AES-GCM and ChaCha20-Poly1305.


So basically Microsoft and Cloudflare are suggesting AES-GCM as the algorithm to use. ChaCha20 is not a NIST-approved algorithm, though it can be used through libsodium/NaCl, and if you don't need NIST compliance (and can likely forget about FIPS 140-2 and FedRAMP compliance too) those libraries are strongly recommended; just make sure you are using XChaCha20-Poly1305-IETF for symmetric encryption. And it's not just Microsoft and Cloudflare: if you Google around about which encryption algorithm you should use, most reputable results will suggest AES-GCM.
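
The appeal of XChaCha20-Poly1305 here is its 192-bit nonce, which is large enough that picking nonces at random is safe. As a rough sketch of what that looks like, here it is in Go using the golang.org/x/crypto/chacha20poly1305 package (my choice for the example; a libsodium binding in your language will look different):

    package xchacha

    import (
        "crypto/rand"

        "golang.org/x/crypto/chacha20poly1305"
    )

    // encrypt seals plaintext with XChaCha20-Poly1305 under a random 24-byte
    // (192-bit) nonce and prepends the nonce to the ciphertext.
    func encrypt(key, plaintext []byte) ([]byte, error) {
        aead, err := chacha20poly1305.NewX(key) // key must be 32 bytes
        if err != nil {
            return nil, err
        }
        nonce := make([]byte, chacha20poly1305.NonceSizeX) // 24 bytes: random is fine at this size
        if _, err := rand.Read(nonce); err != nil {
            return nil, err
        }
        return aead.Seal(nonce, nonce, plaintext, nil), nil
    }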


You will find some experts warning about AES-GCM though.

Latacora:

Nonces are particularly important for AES-GCM, which is the most popular mode of encryption. Unfortunately, it’s particularly tricky with AES-GCM, where it’s just-barely-but-maybe-not-quite on the border of safe to use random nonces.


This brings us to the next section.


AES-GCM: what’s wrong with it?

Note: “nonce” and “IV” are used interchangeably in the literature (including by NIST) and in this blog post.


Unlike AES-CBC, which was standardized with 128-bit IVs, AES-GCM was mostly standardized around 96-bit nonces.


Let’s see what NIST has to say about GCM nonces.


NIST SP 800-38D (Recommendation for Block Cipher Modes of Operation: Galois/Counter Mode (GCM) and GMAC) specifies:

...The IVs in GCM must fulfill the following “uniqueness” requirement:

The probability that the authenticated encryption function ever will be invoked with the same IV and the same key on two (or more) distinct sets of input data shall be no greater than 2^-32...


2^-32 equals 2.3 × 10^-10. This means we can't encrypt more than 2^32 times with a given key when using random 96-bit nonces, or we risk repeating a nonce. We will see how this math works in the next section.

But the spec also states:


...The total number of invocations of the authenticated encryption function shall not exceed 2^32, including all IV lengths and all instances of the authenticated encryption function with the given key...


2^32 might seem big, but if you encrypt 500 records per second you will hit that number after about 100 days. Yes, days, not years.
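(The arithmetic: 2^32 ≈ 4.29 × 10^9 encryptions ÷ 500 per second ≈ 8.6 × 10^6 seconds ≈ 99 days.)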


Note: AES-GCM can be used to encrypt messages up to 2^36 − 32 bytes long (the spec states it in bits: 2^39 − 256 bits) under a given (key, nonce) pair. We won't focus on this limitation since that's around 64 GB, which is usually enough.


You may now be asking yourself why the industry replaced AES-CBC, which supports a practically “unlimited” number of encryptions and is still used by cryptographers to implement secure protocols like Signal [Source Code], with AES-GCM.


Why didn't they standardize AES-CBC as an AEAD? I'm not sure about the latter, but I have an idea about the former. When standards organizations looked for a replacement for AES-CBC, they were looking to replace it in the context of TLS (as in HTTPS), and TLS is used for transport, not for encrypting some value and storing it. The other constraint they had was that it needed to be fast, space efficient, and performant on existing TLS acceleration hardware. The TLS protocol doesn't rely on random nonces for GCM; it uses counters, and since the protocol itself already keeps track of the encrypted traffic sent back and forth, it's easy to implement a key renegotiation after a certain time and/or a certain amount of traffic. As for why 96-bit nonces: it's because of efficiency. From the NIST GCM spec:


An initialization vector IV, that can have any number of bits between 1 and 2^64. For a fixed value of the key, each IV value must be distinct, but need not have equal lengths. 96-bit IV values can be processed more efficiently, so that length is recommended for situations in which efficiency is critical.



What happens if a nonce is repeated?

We won't focus on the actual cryptographic implications; we just need to trust that it's bad, though it's not as if you lose confidentiality instantaneously. As a matter of fact, if a nonce repeats, nothing visibly happens: the algorithm still works, and it's likely you will never know about it.

For someone to take advantage of it, they must obtain two ciphertexts of different plaintexts encrypted with the same nonce. They would likely need to either retrieve a big database where you have been encrypting and persisting data for a while, sniff traffic, or, if you provide a service that encrypts (for example, if you encrypt cookies), start gathering those ciphertexts until they find a collision.


Statistically speaking, after 2^48 encryptions the chance of a collision is around 50%. If your ciphertexts with tags (16 bytes long, and likely optional for the attacker) and nonces are 60 bytes long, an attacker needs to capture and store, on average, 15 petabytes of information to have a 50% chance of finding a collision. So if you are encrypting non-sensitive stuff you probably shouldn't care about this, but if you are encrypting sensitive PII, PHI, or PCI-related data, you might want to go the extra mile to protect it.
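(The arithmetic: 2^48 ciphertexts × 60 bytes each ≈ 1.7 × 10^16 bytes, i.e. roughly 15 pebibytes.)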


Understanding AES-GCM limitations

Let's understand where this 2^32 limit on the number of encryptions comes from. It's based on the birthday paradox. Explained in cryptographic layman's terms: if you don't want any two random numbers to repeat, you must limit the number of random numbers you generate. That limit will depend on how strongly you want to prevent a repeat and on how big the random numbers can be. If your numbers go from 1 to 10, it's extremely likely that if you generate 10 random numbers, two will repeat, and the likelihood is still very high even if you generate only 5 random numbers. Go ahead and click “Again!” here several times and see how often two numbers repeat.


The rule of thumb is that if you divide the random space's size exponent by 2 (so 2^96 gives 2^48), the chance of any two random numbers repeating after generating 2^48 of them is around 50%. 50% is too high, so NIST adjusted it down to what they deemed acceptable: 2^32 encryptions.


2^32 equals 4.29 × 10^9. In the table below, in the row for 96 bits, the nearest number, opting for the safest option, is 4.0 × 10^8. That means NIST deemed it safe for the probability of collision to be no higher than 10^-12. Whatever solution we devise, we need to make sure we stay within safe margins.


https://en.wikipedia.org/wiki/Birthday_problem#Probability_table
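
To see the numbers behind this, here is a quick sketch in Go using the standard birthday approximation p ≈ n²/(2d), where n is the number of random nonces generated and d is the size of the 96-bit nonce space:

    package main

    import (
        "fmt"
        "math"
    )

    func main() {
        d := math.Pow(2, 96) // size of the 96-bit nonce space

        // NIST's limit of 2^32 encryptions.
        n := math.Pow(2, 32)
        fmt.Printf("p after 2^32 encryptions: %.2g\n", n*n/(2*d)) // ~1.2e-10, below 2^-32 (~2.3e-10)

        // The "safest" nearby row in the table for 96 bits: 4.0e8 encryptions.
        n = 4.0e8
        fmt.Printf("p after 4.0e8 encryptions: %.2g\n", n*n/(2*d)) // ~1.0e-12
    }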



You might be thinking you are safe because you are already using AES-GCM with a well-known cryptography library, or the one that comes bundled with the standard library of the programming language you use, but that won't help you. Most libraries (I checked libsodium and Google Tink) and languages (I checked Go: “Never use more than 2^32 random nonces with a given key because of the risk of a repeat.”) stick to the AES-GCM spec and expect the implementer to keep the number of encryptions under 2^32.


Random salt solution

Adam Langley gives us a hint of the first solution:

...there are situations where ensuring nonce uniqueness is not trivial, generally where multiple machines are independently encrypting with the same key. A system for distributing nonces is complex and hard to have confidence in. Generating nonces randomly and depending on statistical uniqueness is reasonable, but only if the space of nonces is large enough.


Note: I strongly suggest reading that blog post in its entirety. It's actually about another solution, using synthetic IVs, which I won't discuss here since it's not widely supported and implementing it on your own is a recipe for disaster. You might find an implementation for the language you use today, but keep in mind that someday you might want to decrypt or encrypt from another service written in another language, and you will be stuck.


How can we make the nonce effectively larger without deviating from the NIST-standardized AES-GCM implementation? By adding some uniqueness to each encryption, which can be done by deriving a fresh encryption key each time you encrypt.


Nonces can't repeat under the same encryption key, so if we derive a new key for each encryption, we reduce the chance of a (key, nonce) pair repeating. For key derivation we will use the standard HMAC-SHA256(key, salt). The size of the salt determines how many keys we can derive; for example, if the salt can only be 0 or 1, we can only generate 2 keys.


Your first guess might be that you can simply multiply the number of keys by the number of safe nonces: if you want to encrypt safely up to 2^64 times, you would need 32-bit salts, since multiplying the salt space of 2^32 by the safe number of encryptions per key, 2^32, gives you 2^64. But this is wrong! We need to account for the birthday paradox in the key space too.


To be able to perform 2^64 (roughly 10^19) encryptions while maintaining a collision probability of 10^-12, we go back to the table and see that even 128 bits of total random space is not enough (2.6 × 10^13 is much lower than 10^19), so we need to move to 192 bits (1.1 × 10^23 is much bigger than 10^19).


Another way to quickly estimate how many bits are enough is to use the same rule NIST applied at the beginning: the 96-bit nonce size divided by 3 gives 32 (which, as the exponent in 2^32, is the number of safe encryptions). So if we need another factor of 2^32 to get to 2^64, we need to add 3 × 32 = 96 bits on top of the original 96-bit nonce, for 192 bits in total; the added salt is therefore 96 bits, or 12 bytes.
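As a sanity check with the same approximation p ≈ n²/(2d): for n = 2^64 in a 128-bit space, p ≈ 2^128 / 2^129 = 0.5, hopelessly high; in a 192-bit space, p ≈ 2^128 / 2^193 = 2^-65 ≈ 2.7 × 10^-20, comfortably below 10^-12.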


In summary, salts need to be random values 12 bytes long.


Keep in mind that you need to persist these salts (in the clear, no need to protect them) along with each encrypted record. The nice thing about 12 bytes is that if you want (or need) to encode them in base64, no “=” will be appended at the end, since 96 is divisible by 6 (2^6 is 64, which is where the 64 in base64 comes from).


This solution is nice, and the size of the salt can be adjusted to your needs. The only drawback is that for each encryption you need to generate a random number (it doesn't necessarily need to come from a SecureRandom, which is a bit slower, but that is strongly recommended; if you can't use a SecureRandom you need an algorithm that provides uniformly distributed random numbers), compute an HMAC, and store the extra 12 bytes. If you can't live with that, then this solution might not be the best one. The next solution reduces the 12 bytes to 4 bytes, with an even lower collision chance.
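
To make the idea concrete, here is a minimal Go sketch of the random-salt approach (the function names are mine and error handling is kept short): derive a per-record key with HMAC-SHA256(masterKey, salt), encrypt with AES-GCM under a random nonce, and store salt, nonce, and ciphertext together.

    package saltgcm

    import (
        "crypto/aes"
        "crypto/cipher"
        "crypto/hmac"
        "crypto/rand"
        "crypto/sha256"
        "errors"
    )

    const saltSize = 12 // 96-bit random salt, as derived above

    // deriveKey returns a 32-byte AES-256 key: HMAC-SHA256(masterKey, salt).
    func deriveKey(masterKey, salt []byte) []byte {
        mac := hmac.New(sha256.New, masterKey)
        mac.Write(salt)
        return mac.Sum(nil)
    }

    // encrypt returns salt || nonce || ciphertext (the GCM tag is appended to the
    // ciphertext by Seal). Salt and nonce are stored in the clear.
    func encrypt(masterKey, plaintext []byte) ([]byte, error) {
        salt := make([]byte, saltSize)
        if _, err := rand.Read(salt); err != nil {
            return nil, err
        }
        block, err := aes.NewCipher(deriveKey(masterKey, salt))
        if err != nil {
            return nil, err
        }
        gcm, err := cipher.NewGCM(block)
        if err != nil {
            return nil, err
        }
        nonce := make([]byte, gcm.NonceSize()) // 12 bytes by default
        if _, err := rand.Read(nonce); err != nil {
            return nil, err
        }
        out := append(salt, nonce...)
        return gcm.Seal(out, nonce, plaintext, nil), nil
    }

    // decrypt reverses encrypt: re-derive the key from the stored salt, then open.
    func decrypt(masterKey, data []byte) ([]byte, error) {
        if len(data) < saltSize+12 {
            return nil, errors.New("ciphertext too short")
        }
        salt, rest := data[:saltSize], data[saltSize:]
        block, err := aes.NewCipher(deriveKey(masterKey, salt))
        if err != nil {
            return nil, err
        }
        gcm, err := cipher.NewGCM(block)
        if err != nil {
            return nil, err
        }
        nonce, ct := rest[:gcm.NonceSize()], rest[gcm.NonceSize():]
        return gcm.Open(nil, nonce, ct, nil)
    }

Decryption simply re-derives the per-record key from the stored salt, so the only long-lived secret is still the master key.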


Time-based solution

So now we know that AES-GCM works well as long as you don't repeat nonces, and that if you use counters (like 1, 2, 3...) you are safe. So why not reuse the same scheme proposed in the previous solution, but instead of random salts use salts based on time, for example the 32-bit Unix time? In that case we have 4 bytes of salt instead of 12. Here the birthday paradox doesn't apply to the salt, so we can multiply 2^32 by 2^32, giving 2^64 encryptions. As long as you don't perform more than 2^32 encryptions in a single second (if you did, you would already have a team of cryptographers and be working for the NSA or Google), you should be fine with this solution “forever”. You don't need to worry about time corrections or time shifts between different servers; as long as the clock is not stuck, these side effects end up cancelling each other out.
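
Relative to the previous sketch, the only part that changes is how the salt is produced; a rough illustration in Go (names again my own):

    package timegcm

    import (
        "encoding/binary"
        "time"
    )

    // timeSalt returns a 4-byte salt derived from the current Unix time.
    // Everything else (HMAC-SHA256 key derivation, AES-GCM with a random nonce,
    // storing salt || nonce || ciphertext) stays the same as in the random-salt sketch.
    func timeSalt() []byte {
        salt := make([]byte, 4)
        binary.BigEndian.PutUint32(salt, uint32(time.Now().Unix()))
        return salt
    }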


There are 2 things to consider with this solution:

  1. You will be storing the date when the data was encrypted; if you believe this can give your adversary an advantage, then go with the previous solution. It can also be useful if you want to re-encrypt data that was encrypted a long time ago, since you have the timestamp, but don't confuse the encryption date with the creation date if you have implemented re-encryption.

  2. 32-bit Unix timestamps are signed and will overflow in the year 2038 (I recommend checking how the language you use will handle this event if you plan on keeping the same encryption scheme until 2038; decryption won't be affected). So the “forever” is not really forever. The overflow won't really affect you, since the counter will still be unique for another 68 years after 2038, but if you were relying on the timestamp you might need to perform some transformations, since the dates could be interpreted as dates before 1970. Alternatively, you can use 64-bit Unix timestamps, but that increases the salt size to 8 bytes.


Summary

I hope this blog post gives you all the information you need to understand the problem, and everything you need to implement either of the two proposed solutions. If you have any questions or comments, please leave them and I'll try to address them.