Length Extension Attack

Length Extension Attack

A case of hash algorithm usage is to check the authenticity of the expeditor. A way to do that is to hash a secret with the data we want to share. Then our hash function will do H(secret || data), that way we can check if it is sent by the good person since a malicious user would have the date but not the secret the hash will not fit.

The main goal of a length extension attack is to be able to create a legit hash without knowing the secret. It is an attack possible on hash algorithm based on the Merkle-Damgard construction, algorithms like blake2 are immune to length extension.

Length extension attack and demonstration

What is a length extension attack?

A case of hash algorithm usage is to check the authenticity of the expeditor. A way to do that is to hash a secret within the data we want to share. Then our hash function will do H(secret || data), that way we can check if it is sent by the good person since a malicious user would have the date but not the secret, so the hash will not be the right one.

The main goal of a length extension attack is to be able to create a legit hash without knowing the secret. It is an attack possible on hash algorithm based on the Merkle-Damgard construction, algorithms like blake2 are immune to length extension.

How does it work?

Length extension attacks exploit the mechanism of some hash function such as md5 sha1 and even most versions of sha2. On those functions when you hash a file, they are operating on sequential blocks of input (you can’t load it at once or if the file is bigger than the RAM it would crash).

These hash functions work with an internal state which is modified throughout the algorithm, at the beginning it is the string we want to hash, at the end our resulting hash. The problem is that you can take an existing hash and say that it is the current state of the function and carry on from it to append additional data. So, the idea (which is not how it really work here as there are some problems that we will discuss later) is that if you have a secret and a message, for example secret=PASS123! and message=100, when we will proceed the hash function the result will be H (password +message) = H (PASS123! + 100).

But since we can restart from an existing hash, nothing prevents you from restarting from this hash in order to modify the end of the message. Doing so the message will still be valid because the secret has not changed and has already been hashed: we calculate H(H(secret+message)+ ‘0’). Then the person receiving the message will validate its authenticity since the secret’s hash has not changed even if the message was hijacked, but the receiver will not know it.

Problems

But it is not that simple, indeed when those hash functions are running, they are not only hashing the message but the message + its padding + its length with a padding depending on the function you are using. So, the hash that we want to hijacked will looks like this: H (H (secret + message +padding + length) +’0’+padding+length) and it will not fit the hash H(secret+message+’0’+padding+length). In consequence we can hijack a hash with a length extension attack, but we cannot create every message, only specific ones so the damage are limited.

Demonstration

Okay now let’s make a demonstration. In order to do a length extension attack we will use a hash_extender algorithm, made by iagox86 available on this page https://github.com/iagox86/hash_extender . Let’s suppose the original message was the combination of a secret and some data. To be original we will take here secret = secret, data = data and md5 as algorithm.

Our original hash is :
enter image description here

Let’s now use the hash_extender tool in order to add something (here we will add “append” ) at the end of the message and still have a valid hash without knowing the secret, so the tool will calculate H (H (secret + message +padding + length) +’0’+padding+length). The function needs several user inputs, the data, the size of the secret (if you don’t know it you can make a loop in order to try as many sizes as you want), the string you want to add, the initial hash and finally the algorithm used.

With our example: enter image description here

We now have our valid signature without knowing the secret, which is a security problem, therefore when you use a hash algorithm you need to check which one you choose according to the needs you have.