A “hash function” is a one-way algorithm used primarily in cryptography. It is often described loosely as a shortened form of encryption, but unlike full-scale encryption, its output is not meant to be decrypted.
Hash vs. Encryption
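Encryption is a broad term. Hashing is often grouped with encryption schemes, but the two differ: encrypted data can be recovered by anyone holding the right key, whereas a hash is designed never to be reversed.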
Encryption – The process of converting information from its normal, comprehensible form into an obscured guise, unreadable without special knowledge.
Hash – A one-way transformation, often loosely called a special form of encryption and commonly used for passwords. Given a variable-length input (the message), the algorithm always produces the same fixed-length output for the same input; this output is called a hash or message digest.
Hash Collisions – A collision occurs when two different messages produce exactly the same hash. Hash algorithms are designed to make collisions infeasible to find, but for some, such as MD5, collisions have been demonstrated.
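The fixed-length and repeatability properties described above can be illustrated with a minimal sketch. It assumes Python's standard `hashlib` module and uses SHA-256 purely as an example algorithm; the helper name `digest_hex` is hypothetical.

```python
import hashlib

def digest_hex(message: str) -> str:
    """Return the SHA-256 digest of a text message as a hexadecimal string."""
    return hashlib.sha256(message.encode("utf-8")).hexdigest()

short = digest_hex("a")
long_ = digest_hex("a" * 10_000)

# Inputs of very different lengths give digests of the same length:
# 64 hexadecimal characters, i.e. 256 bits.
print(len(short), len(long_))

# The same input always gives the same digest.
print(digest_hex("mypass") == digest_hex("mypass"))

# Different inputs give different digests here; a collision would be two
# different inputs for which this comparison printed True.
print(digest_hex("mypass") == digest_hex("mypast"))
```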
A Hash Example
Website User Registration and subsequent Login
- A user goes to a website and clicks a button that says “New User Registration”.
- Unknown to the user, his browser has downloaded the hash algorithm as Java code, which begins running in the computer’s memory.
- His user ID is not hashed when he types it in. When he then types in a password (a short message), the Java hash routine converts it into a longer “message digest”, or hash. In this example, he types the password “mypass”, and before it is sent to the web server, the Java routine running on his machine turns it into the hash “5yfRRkrhJDbomacm2lsvEdg4GyY=”.
- The web server receives the hash and stores it (not the original message) in a database as “5yfRRkrhJDbomacm2lsvEdg4GyY=”. IMPORTANT: the web host never sees the actual password; it stores only the hash in its database.
- The next time the user connects to the site, he types in his ID and password (“mypass”), which the Java routine converts to “5yfRRkrhJDbomacm2lsvEdg4GyY=”. The server compares this value to the hash stored in its database; it matches, and the user is granted access. Since the server stores only the hash, it NEVER has to decrypt anything. NOTE: if he typed a wrong password, such as “mypass1”, an entirely different hash would be created; it would not match the hash in the server’s database, and he would be blocked.
IMPORTANT – How this protects the system from unauthorized users logging in: if an individual somehow intercepted the “password” as it was being sent to the server, or somehow got access to the server database, all they would have is the hash (5yfRRkrhJDbomacm2lsvEdg4GyY=), not the password (mypass).
So, when they connect to that website and are prompted for a login and password, they will get access only if they type “mypass”, but all they know is the hash, not the actual password. Even if they manage to view the Java code and see the exact algorithm that converted the password to the hash, with a well-designed hash it is computationally infeasible to reverse the process and find the password from the hash.
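The registration and login steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the site's actual code: the article does not name the algorithm, so SHA-1 with Base64 encoding is assumed only because it yields 28-character strings of the same shape as the one in the example. The names `user_db`, `hash_password`, `register`, and `login` are hypothetical, as is the user ID.

```python
import base64
import hashlib
import hmac

# Hypothetical in-memory stand-in for the server's user database.
user_db: dict[str, str] = {}

def hash_password(password: str) -> str:
    """Hash a password with SHA-1 and Base64-encode the 20-byte digest.

    Assumption: SHA-1 plus Base64 is chosen only because it produces
    28-character strings like those in the text.
    """
    digest = hashlib.sha1(password.encode("utf-8")).digest()
    return base64.b64encode(digest).decode("ascii")

def register(user_id: str, password: str) -> None:
    """Store only the hash, never the password itself."""
    user_db[user_id] = hash_password(password)

def login(user_id: str, password: str) -> bool:
    """Hash the submitted password and compare it with the stored hash."""
    stored = user_db.get(user_id)
    if stored is None:
        return False
    return hmac.compare_digest(stored, hash_password(password))

register("jsmith", "mypass")
print(login("jsmith", "mypass"))     # correct password
print(login("jsmith", "mypass1"))    # wrong password
print("mypass" in user_db.values())  # is the plaintext stored anywhere?
```

The sketch mirrors the plain hash-and-compare flow described in the text. Real systems usually also add a per-user salt and a deliberately slow hashing function, which are left out here to stay close to the example.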
Hash algorithms take a string (or message) of any length as input and produce a fixed-length string as output; not all such algorithms are suitable for use in cryptography. The output is sometimes termed a message digest or a digital fingerprint. The term “hash” is borrowed from the breakfast dish, which is made up of a bunch of mixed-up pieces of food. Dictionary senses of the word include:
- A dish of chopped meat, potatoes, and sometimes vegetables, usually browned.
- A jumble; a hodgepodge.
- A mess: made a hash of the project.
- To chop into pieces; mince.
SHA (Secure Hash Algorithm)
NIST specifies five hash algorithms, called SHA, for generating a condensed representation of a message (a message digest). The five algorithms are SHA-1, SHA-224, SHA-256, SHA-384, and SHA-512, and they are detailed in FIPS 180-2 (with Change Notice 1). When a message of any length < 2^64 bits (for SHA-1, SHA-224, and SHA-256) or < 2^128 bits (for SHA-384 and SHA-512) is input to an algorithm, the result is an output called a message digest. The message digests range in length from 160 to 512 bits, depending on the algorithm.
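The digest lengths quoted above can be checked with a short sketch, assuming Python's standard `hashlib` module, which exposes all five algorithms by name. The sample message `b"abc"` is arbitrary.

```python
import hashlib

# The five SHA algorithms named in the text, as exposed by Python's hashlib.
algorithms = ["sha1", "sha224", "sha256", "sha384", "sha512"]

message = b"abc"
digest_bits = {}
for name in algorithms:
    h = hashlib.new(name, message)
    digest_bits[name] = h.digest_size * 8  # digest_size is given in bytes

for name, bits in digest_bits.items():
    print(f"{name}: {bits} bits")
```

Whatever message is supplied, the reported sizes are 160, 224, 256, 384, and 512 bits respectively, because the digest length depends only on the algorithm.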
MD5 (Message-Digest algorithm 5)
MD5 is a widely used message-digest algorithm (also known as a cryptographic hash function) with a 128-bit hash value. It is not merely a checksum generator, though that term is sometimes imprecisely applied to it. It is one of a series of message-digest algorithms designed by Professor Ronald Rivest of MIT. When analytic work indicated that MD5’s predecessor, MD4, was likely to be insecure, MD5 was designed in response in 1991. This indication was subsequently confirmed when weaknesses were found in MD4 in 1994 (Dobbertin, 1998).
MD5 has been widely used and was initially thought to be cryptographically secure. However, work in Europe in 1994 uncovered weaknesses that make further use of MD5 questionable. Specifically, it has been shown that it is computationally feasible to generate a collision, that is, two different messages with the same hash. Unlike with MD4, it is still thought to be very difficult to produce a message with a given hash. In 2004, a distributed project named MD5CRK was initiated to demonstrate that MD5 is insecure by finding a collision. Because of these concerns, many security researchers and practitioners recommend that SHA-1 (or another high-quality cryptographic hash function) be used instead of MD5.
MD5 hashes (or message digests) are commonly represented as a 32-digit hexadecimal number, using the characters 0-9 and a-f. For example, the MD5 hash (sometimes called md5sum, for MD5 checksum) of a zero-length string is d41d8cd98f00b204e9800998ecf8427e.
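The 32-digit hexadecimal form can be reproduced with a brief sketch, assuming Python's standard `hashlib` module (on some systems running in a restricted FIPS mode, MD5 may be unavailable). The variable name `empty_digest` is only illustrative.

```python
import hashlib

# MD5 digest of a zero-length input, rendered as hexadecimal text.
empty_digest = hashlib.md5(b"").hexdigest()

print(empty_digest)       # d41d8cd98f00b204e9800998ecf8427e
print(len(empty_digest))  # 32 hex digits = 128 bits
```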
RIPEMD-160 (RACE Integrity Primitives Evaluation Message Digest)
RIPEMD-160 is a 160-bit message-digest algorithm (and cryptographic hash function) developed in Europe by Hans Dobbertin, Antoon Bosselaers, and Bart Preneel and first published in 1996. It is an improved version of RIPEMD, which was based on the design principles used in MD4. It is similar in both strength and performance to the more popular SHA-1.
There are also 128-, 256-, and 320-bit versions of this algorithm: RIPEMD-128, RIPEMD-256, and RIPEMD-320. The 128-bit version was intended only as a drop-in replacement for the original RIPEMD, which was also 128-bit and had been found to have questionable security. The 256- and 320-bit versions diminish only the chance of accidental collision. They do not offer higher levels of security than RIPEMD-128 and RIPEMD-160, respectively.
RIPEMD-160 was designed in the open academic community, in contrast to the NSA-designed SHA-1. On the other hand, RIPEMD-160 is a less popular and correspondingly less well-studied design.
Previous Example detailed – one-way Hash Encryption of a Password
This scenario is a perfect candidate for “one-way hash encryption”, also known as a message digest, one-way encryption, digital fingerprint, or cryptographic hash (and sometimes, loosely, a digital signature). It is referred to as “one-way” because although you can calculate a message digest from some data, you cannot feasibly work out what data produced a given message digest. A good digest algorithm is also collision-resistant: it is designed so that finding two different values that produce the same digest is computationally infeasible. Another property of the digest is that it is a condensed representation of a message or data file and, as such, has a fixed length.
There are several message-digest algorithms used widely today.
SHA-1 (Secure Hash Algorithm 1) is slower than MD5, but its message digest is larger (160 bits versus 128), making it more resistant to brute-force attacks. The Secure Hash Algorithm is therefore recommended over MD5 for all of your digest needs. Note that SHA-1 now has even stronger siblings, SHA-256, SHA-384, and SHA-512, which produce 256-, 384-, and 512-bit digests.
Typical Registration Scenario
Here is a typical flow of how our message-digest algorithm can be used to provide one-way password hashing:
1) A user registers with some site by submitting his data, including a username (“jsmith”) and a password (“mypass”).
2) Before the data is stored, a one-way hash of the password is created: “mypass” is transformed into “5yfRRkrhJDbomacm2lsvEdg4GyY=”.
The record stored in the database therefore holds this hash in place of the plaintext password.
3) When jsmith comes back to the site later and logs in with his credentials (jsmith/mypass), the password hash is created in memory (session) and compared to the one stored in the database. Both values equal “5yfRRkrhJDbomacm2lsvEdg4GyY=” because the same password, “mypass”, was submitted both times. Therefore, his login will be successful.
Note that any other plaintext password will produce a different sequence of characters. Even a similar password (“mypast”), differing by only one letter, results in an entirely different hash: “hXdvNSKB5Ifd6fauhUAQZ4jA7o8=”.
|plaintext password|encrypted password|
|---|---|
|mypass|5yfRRkrhJDbomacm2lsvEdg4GyY=|
|mypast|hXdvNSKB5Ifd6fauhUAQZ4jA7o8=|
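How thoroughly a one-letter change scrambles the digest can be measured with a small sketch that counts the differing bits between the digests of the two passwords. It assumes SHA-1 via Python's standard `hashlib` (the article does not state which algorithm produced its sample values); the helper names `sha1_bits` and `differing_bits` are hypothetical.

```python
import hashlib

def sha1_bits(text: str) -> int:
    """Return the SHA-1 digest of text as a 160-bit integer."""
    return int.from_bytes(hashlib.sha1(text.encode("utf-8")).digest(), "big")

def differing_bits(a: str, b: str) -> int:
    """Count the bit positions in which the two digests differ."""
    return bin(sha1_bits(a) ^ sha1_bits(b)).count("1")

diff = differing_bits("mypass", "mypast")
print(f"{diff} of 160 digest bits differ")
```

For a well-behaved hash, roughly half of the bits are expected to differ even when the inputs differ by a single character, which is why the two digests look unrelated.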
As mentioned above, provided that a robust hash algorithm such as SHA is used, it is computationally infeasible to reverse-engineer the hashed value “5yfRRkrhJDbomacm2lsvEdg4GyY=” back to “mypass”.
Therefore, even if a malicious hacker gets hold of your password digest, they won’t be able to determine what your password is.
- “FIPS Publication 180-2 (with Change Notice 1)”. Archived Publications. Accessed February 20, 2021.
- “RFC 3174 – US Secure Hash Algorithm 1 (SHA1)”. Accessed February 20, 2021.