Why SHA-2 migration is important
The SHA-1 algorithm, one of the first widely used methods of protecting electronic information, has reached the end of its useful life, according to security experts at the National Institute of Standards and Technology (NIST). The agency now recommends that IT professionals replace SHA-1, in the limited situations where it is still used, with newer and more secure algorithms. To understand why, let us first look at what SHA is and why SHA-2 is an improvement over SHA-1.
What is a cryptographic hash function?
The term "hash" refers to a mathematical algorithm that transforms any file (or, more generally, any sequence of bits) into a string of characters of fixed, predetermined size, called a digest.
Regardless of the incoming file size and its nature – text, image, audio, video etc. – the digest obtained by applying the hash function to the file will always have the same length.
Hashes have many applications, but the most common is computing the digest of an electronic document to verify the integrity of its content.
Each electronic document corresponds to a specific digest that, for practical purposes, identifies the form and content of that document. If the original document is modified, even minimally, the digest of the modified document will be completely different.
For example, applying a cryptographic hash to the phrase “HCLTech”, the resulting digest is:
6f9b3313ba6e45a233b491a1f3208bea75a88f995ffe2d2fe868bf026d5a8889
By calculating the digest of the sentence “HCL Tech” we obtain instead:
6be97fa6117a14a96e497e251329122c889ca98594ea4a5d625dcbc5e11f0803
It was sufficient to insert a space after “HCL” to get a completely different alphanumeric string.
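The experiment above can be reproduced with a short Python sketch. It assumes the digests shown were produced with SHA-256 (their 64-character length is consistent with that, but the text does not name the algorithm) and that the phrases are UTF-8 encoded with no trailing newline. The helper name `sha256_hex` is made up for this illustration.

```python
import hashlib

def sha256_hex(text: str) -> str:
    """Return the SHA-256 digest of a UTF-8 string as hexadecimal."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

digest_a = sha256_hex("HCLTech")
digest_b = sha256_hex("HCL Tech")

# Both digests are 64 hex characters (256 bits), whatever the input length.
print(len(digest_a), digest_a)
print(len(digest_b), digest_b)

# Count how many of the 64 hex positions differ between the two digests.
differing = sum(a != b for a, b in zip(digest_a, digest_b))
print(f"{differing} of {len(digest_a)} hex characters differ")
```

The exact digests printed depend on the encoding and on any invisible characters in the input, so they may not match the strings quoted above byte for byte.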
Applying the same process to our files, we could, for example, check that the digest of a contract emailed to a customer matches the digest of the contract the customer actually received. If the two digests do not match, the file has been altered, whether by tampering or by corruption in transit. If they do match, that is strong evidence that the file and its content are intact.
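A minimal sketch of such a check follows, assuming SHA-256 as the hash and assuming the sender's digest reaches the recipient over a channel the attacker cannot alter. The function names `file_sha256` and `matches_expected` and the sample contract text are hypothetical, chosen only for this example.

```python
import hashlib
import hmac
import tempfile
from pathlib import Path

def file_sha256(path: Path, chunk_size: int = 65536) -> str:
    """Hash a file in chunks so large files need not fit in memory."""
    hasher = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()

def matches_expected(path: Path, expected_hex: str) -> bool:
    """Compare the file's digest with the digest supplied by the sender."""
    return hmac.compare_digest(file_sha256(path), expected_hex.lower())

# Demo: hash a contract, then change one character and check again.
with tempfile.TemporaryDirectory() as tmp:
    contract = Path(tmp) / "contract.txt"
    contract.write_bytes(b"Payment due in 30 days.")
    sent_digest = file_sha256(contract)
    intact = matches_expected(contract, sent_digest)

    contract.write_bytes(b"Payment due in 90 days.")
    tampered_ok = matches_expected(contract, sent_digest)

print(intact, tampered_ok)  # True False
```

Reading in chunks is a design choice for large files; for small documents, hashing the whole byte string at once would work equally well.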
SHA-1 Algorithm
SHA-1 is a member of the Secure Hash Algorithm (SHA) family of cryptographic hash functions, which includes SHA-0, SHA-1, SHA-224, SHA-256, SHA-384 and SHA-512. The family has been developed by the US National Security Agency (NSA) since 1993.
Like all cryptographic hash functions, the SHAs produce a string of pre-established size.
SHA-0 and SHA-1 were the first versions, developed in the 1990s. The SHA-2 series, developed in the 2000s, includes SHA-224, SHA-256, SHA-384 and SHA-512. All are designed so that two documents with different contents almost always produce different hash values, which makes hash collisions hard to find.
The SHA-0 algorithm, first published in 1993 by NIST, was quickly withdrawn after a significant weakness was found. It was superseded in 1995 by SHA-1, which adds a computational step intended to address the weakness in SHA-0, the details of which were not disclosed at the time. Both algorithms hash a message of up to 2^64 - 1 bits into a 160-bit digest. Both use a 512-bit block size and a 32-bit word size.
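The SHA-1 parameters quoted above can be confirmed from Python's standard `hashlib` module, which reports digest and block sizes in bytes. This is only a quick check; the variable names are illustrative, and the maximum message length comes from the text rather than from the library.

```python
import hashlib

sha1 = hashlib.sha1()

# hashlib reports sizes in bytes, so multiply by 8 to get bits.
digest_bits = sha1.digest_size * 8   # 160
block_bits = sha1.block_size * 8     # 512

# The length limit stated in the text: messages of up to 2^64 - 1 bits.
max_message_bits = 2**64 - 1

print(digest_bits, block_bits, max_message_bits)
```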
SHA-1 is used in some common Internet protocols and security tools. These include IPsec, PGP, SSL, S/MIME, SSH and TLS. SHA-1 is also typically used as part of the security scheme for unclassified government documents. Some parts of the private sector also use this algorithm for some sensitive information. However, it was formally withdrawn from government use in 2010.
Why is there a Problem with SHA-1?
Over time, security standards usually become less effective for two reasons. Research finds weaknesses in them, and the plummeting cost of computing power makes computationally difficult attacks more practical. For example, SHA-1's predecessor, MD5, was in use well beyond the point that attacks on it were cheap and easy.
For years there were no practical attacks on SHA-1, but researchers warned that it was only a matter of time before they appeared. Security researchers had discovered an attack strategy that requires only about 2^61 computations, far fewer than the roughly 2^80 needed to find a collision by brute force. In 2012, noted security researcher Bruce Schneier reported the calculations of Intel researcher Jesse Walker, who estimated that the cost of a SHA-1 collision attack would be within reach of organized crime by 2018 and of a university project by 2021.
How can SHA-1 be Attacked?
Simply put, SHA-1 can be exploited by attackers to generate and install a fake certificate. For those interested in a more in-depth technical explanation, the sections below describe hash attacks in increasing order of difficulty for an attacker.
Collision Attacks
A collision attack occurs when it is possible to find two different messages that hash to the same value. A collision attack against a CA happens at the time of certificate issuance.
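Finding a collision on full SHA-1 is far beyond the reach of a short example, so the sketch below deliberately weakens the problem: it keeps only the first 24 bits of each SHA-1 digest (an assumption made purely to keep the search fast) and looks for two different messages that agree on those bits. The function names and the `message-N` inputs are made up for this illustration.

```python
import hashlib
from itertools import count
from typing import Dict, Tuple

def truncated_sha1(message: bytes, bits: int = 24) -> int:
    """Return only the first `bits` bits of the SHA-1 digest as an integer."""
    full = int.from_bytes(hashlib.sha1(message).digest(), "big")
    return full >> (160 - bits)

def find_collision(bits: int = 24) -> Tuple[bytes, bytes]:
    """Hash counter-derived messages until two share a truncated digest."""
    seen: Dict[int, bytes] = {}
    for i in count():
        message = f"message-{i}".encode()
        tag = truncated_sha1(message, bits)
        if tag in seen:
            return seen[tag], message
        seen[tag] = message
    raise RuntimeError("unreachable: count() never ends")

first, second = find_collision()
print(first, second, hex(truncated_sha1(first)))
```

With a 24-bit output, a match typically turns up after a few thousand attempts, because the expected work grows with the square root of the number of possible digests. The same reasoning gives roughly 2^80 attempts for a full 160-bit digest.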
In a past attack against MD5, the attacker was able to produce a pair of colliding messages, one of which represented the contents of a benign end-entity certificate, and the other of which formed the contents of a malicious CA certificate.
Once the end-entity certificate was signed by the CA, the attacker reused the digital signature to produce a fraudulent CA certificate. The attacker then used their CA certificate to issue fraudulent end-entity certificates for any domain.
Collision attacks may be mitigated by putting entropy into the certificate, which makes it difficult for the attacker to guess the exact content of the certificate that will be signed by the CA. Entropy is typically found in the certificate serial number or in the validity periods. SHA-1 is known to have weaknesses in collision resistance.
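As a rough illustration of the mitigation described above, a CA could draw each serial number from a cryptographically secure random source so that an attacker cannot predict the bytes that will be signed. The sketch below uses Python's standard `secrets` module; the function name and the 64-bit default are assumptions for this example and do not come from the text.

```python
import secrets

def random_serial_number(bits: int = 64) -> int:
    """Return a non-zero serial number drawn from `bits` bits of CSPRNG output."""
    serial = 0
    while serial == 0:  # serial numbers must be positive, so skip zero
        serial = secrets.randbits(bits)
    return serial

serial_a = random_serial_number()
serial_b = random_serial_number()
print(hex(serial_a), hex(serial_b))
```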
Second Preimage Attacks
In a second preimage attack, a second message can be found that hashes to the same value as a given message. This allows the attacker to create fraudulent certificates at any time, not just at the time of certificate issuance. SHA-1 is currently resistant to second preimage attacks.
Preimage Attacks
A preimage attack is against the one-way property of a hash function. In a preimage attack, a message can be determined that hashes to a given value. This could allow a password attack, where the attacker can determine a password based on the hash of the password found in a database. SHA-1 is currently resistant to preimage attacks.
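The password scenario can be illustrated without breaking SHA-1 at all. The sketch below is not a preimage attack on the hash function; it is a simple guessing attack against an unsalted SHA-1 password hash, shown only to make concrete what "determining a password from its hash" means. The word list, the stored password and the function names are invented for the example.

```python
import hashlib
from typing import Iterable, Optional

def sha1_hex(password: str) -> str:
    """Return the unsalted SHA-1 digest of a password as hexadecimal."""
    return hashlib.sha1(password.encode("utf-8")).hexdigest()

def guess_password(target_hex: str, candidates: Iterable[str]) -> Optional[str]:
    """Return the first candidate whose SHA-1 digest equals the target."""
    for candidate in candidates:
        if sha1_hex(candidate) == target_hex:
            return candidate
    return None

# Hypothetical database entry: the hash of a weak password.
stored_hash = sha1_hex("letmein")
wordlist = ["password", "123456", "letmein", "qwerty"]

recovered = guess_password(stored_hash, wordlist)
print(recovered)  # letmein
```

A true preimage attack would recover a matching input without a list of likely guesses, which is what SHA-1 still resists.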
The end of SHA-1
Doubts about the strength of SHA-1 began in 2005, when two researchers, Vincent Rijmen and Elisabeth Oswald, hypothesized that a total of 2^80 SHA-1 computations (a number on the order of 10^24, that is, a 1 followed by 24 zeros) could produce a collision, i.e. the same digest for two different files.
In 2015, a group of three researchers hypothesized that, using a cluster (a group of networked computers) of 64 GPUs at a cost of between 75,000 and 120,000 dollars, a collision could be found after 2^57 SHA-1 computations.
On February 23, 2017, after two years of research in collaboration with the CWI Institute in Amsterdam, Google announced the first practical SHA-1 collision, definitively breaking the algorithm.
For the attack, the researchers started from a 2013 paper by Marc Stevens that described a theoretical approach to generating a collision. By crafting specific PDF files, the team produced two files with different content and the same SHA-1 digest.
The Google attack, run on cloud computing infrastructure, involved about 9 quintillion SHA-1 computations (9,223,372,036,854,775,808, or 2^63) and is one of the largest computations ever completed. Google reported that it took the equivalent of about 6,500 years of CPU time for the first phase and 110 years of GPU time for the second.
Although these resources may seem large, the method identified by Google is still about 100,000 times faster than a brute-force attack, which would require the computing power of 12 million GPUs for an entire year to achieve the same result.
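The figures quoted above can be checked with a few lines of arithmetic. The sketch assumes that the brute-force cost is the generic birthday bound of 2^80 computations for a 160-bit digest, a standard estimate rather than a figure stated in this passage. The processor counts (110 and 12 million, each for one year) are taken from the text.

```python
# Number of SHA-1 computations in Google's attack, as quoted in the text.
attack_computations = 2**63
print(f"{attack_computations:,}")  # 9,223,372,036,854,775,808

# Assumed brute-force cost: birthday bound for a 160-bit digest.
brute_force_computations = 2**80

# Speed-up measured in hash computations.
speedup = brute_force_computations // attack_computations
print(f"{speedup:,}")  # 131,072

# Speed-up measured in processor-years, using the figures from the text.
attack_processor_years = 110
brute_force_processor_years = 12_000_000
processor_ratio = brute_force_processor_years / attack_processor_years
print(round(processor_ratio))  # 109091
```

Both ratios come out slightly above 100,000, which is consistent with the rounded figure in the text.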
The solution to the SHA-1 vulnerability, as Google itself stated in its announcement, is to move to more advanced and secure cryptographic hash functions, for which an attack requires exponentially more computational effort. SHA-256 is a sound choice.
SHA-224, SHA-256, SHA-384 and SHA-512 were published by NIST between 2001 and 2004. These four algorithms, also known as the SHA-2 family, are generally more robust than SHA-1.
SHA-224 and SHA-256 use the same maximum input message length, word size, and block size as SHA-1. They differ in output size: SHA-224 produces a 224-bit digest, while SHA-256 produces a 256-bit digest.
SHA-384 and SHA-512 increase the block size to 1024 bits, the word size to 64 bits, and the maximum input message length to 2^128 - 1 bits. The digest produced by SHA-384 is 384 bits long, while the SHA-512 digest contains 512 bits.
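The digest and block sizes described above can be listed side by side using Python's standard `hashlib` module. This is a quick comparison sketch; the `parameters` dictionary is only an illustrative container, and word size and maximum message length are not exposed by `hashlib`, so they are not shown.

```python
import hashlib

# Map each algorithm name to (digest size in bits, block size in bits).
parameters = {}
for name in ("sha1", "sha224", "sha256", "sha384", "sha512"):
    hasher = hashlib.new(name)
    parameters[name] = (hasher.digest_size * 8, hasher.block_size * 8)

print(f"{'algorithm':<10}{'digest bits':>12}{'block bits':>12}")
for name, (digest_bits, block_bits) in parameters.items():
    print(f"{name:<10}{digest_bits:>12}{block_bits:>12}")
```

In application code, moving from SHA-1 to SHA-2 is often as small a change as replacing `hashlib.sha1` with `hashlib.sha256`, although any stored digests and any protocol fields that carry them also need to accommodate the longer output.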
Today’s more powerful computers can create fraudulent messages that result in the same hash as the original, potentially compromising the authentic message. NIST had previously announced that federal agencies should stop using SHA-1 in situations where collision attacks are a critical threat.
As attacks on SHA-1 in other applications have become increasingly severe, NIST will stop using SHA-1 in its last remaining specified protocols by Dec. 31, 2030. By that date, NIST plans to:
- Publish FIPS 180-5 (a revision of FIPS 180) to remove the SHA-1 specification.
- Revise SP 800-131A and other affected NIST publications to reflect the planned withdrawal of SHA-1.
- Create and publish a transition strategy for validating cryptographic modules and algorithms.
The last item refers to NIST’s Cryptographic Module Validation Program (CMVP), which assesses whether modules — the building blocks that form a functional encryption system — work effectively. All cryptographic modules used in federal encryption must be validated every five years, so SHA-1’s status change will affect companies that develop modules.
“Modules that still use SHA-1 after 2030 will not be permitted for purchase by the federal government,” said Chris Celi, NIST computer scientist. “Companies have eight years to submit updated modules that no longer use SHA-1. Because there is often a backlog of submissions before a deadline, we recommend that developers submit their updated modules well in advance, so that CMVP has time to respond.” He added: “We recommend that anyone relying on SHA-1 for security migrate to SHA-2 or SHA-3 as soon as possible.”