Introduction
A number of useful programs use hashing to validate a data stream against published hashes. Essentially, these programs 1) must be easy to use, 2) must accurately compute hashes according to published algorithms, and 3) must present the information in a usable form. It is not important whether hashing is the primary purpose of the software or just an incidental feature of a broader application. What is important is that a useful capability is provided with as little "noise" (bugs and fluff) as possible.
The programs reviewed here provide three levels of functionality:
- Programs that compute hashes.
- Programs that also provide hash validation.
- Programs that also include a database of hashes for revalidation.
The reviewed applications implement their user interfaces in one of three ways:
- Windows console application (DOS command line).
- Windows Explorer context menu entry.
- Windows Explorer property page tab.
It cannot really be said that any one of these approaches is better than the others because each provides its own capabilities. A console application, for example, allows for scripting and ad-hoc programming that is not possible with graphical applications, but its user interface is somewhat limited. A Windows Explorer context menu entry provides quick access to a full-scale application, but it also switches the user to a new application context. An Explorer property page tab offers handy and familiar access to program controls without context switching, but the small physical window size places constraints on application features.
See also: What is Hashing? and Technical Discussion at the end of this article.
Rated Products

Hashtab
Calculate and display hash values using over two dozen popular hashing algorithms such as MD5, SHA1, SHA2, RIPEMD, HAVAL and Whirlpool.

Platforms/Download: Mac OS | Windows (Desktop)
Version reviewed: 5.1.0.23
Gizmos Freeware
Our Rating: 5/5

Hashing
A clean, fast and reliable application performing hash algorithms including MD5, SHA1 and SHA-2.
Platforms/Download: Windows (Desktop)
Version reviewed: n/a
Gizmos Freeware
Our Rating: 5/5

HashMyFiles
Compute hashes for a single file, a group of files or an entire file system using CRC, MD5, and SHA-1.
Platforms/Download: Windows (Desktop)
Version reviewed: 2.0
Gizmos Freeware
Our Rating: 4/5

HashCheck Shell Extension
An open source program that employs a Windows property page tab as its user interface and computes hashes using CRC, MD4, MD5 and SHA-1.
Platforms/Download: Windows (Desktop)
Version reviewed: 2.1.11
Gizmos Freeware
Our Rating: 3.5/5

Febooti Hash & CRC
Works as a tab on the file property page and computes MD5, SHA-1, CRC32 and other popular hash checksums of files.
Platforms/Download: Windows (Desktop)
Version reviewed: 3.5
Gizmos Freeware
Our Rating: 2.5/5

Microsoft File Checksum Integrity Verifier
A console application with a unique capability to create a database of hashes for many files through an entire file system.
Platforms/Download: Windows (Desktop)
Version reviewed: 1.0
Gizmos Freeware
Our Rating: 2/5
More Hash Programs
Here are some more hash programs. I haven't downloaded them yet, but here's the info I got off their websites.
- Hasher is a Windows application that computes MD5, SHA-1/224/256/384/512 hashes of a text string, disk file, or group of files. Hasher can save hash values to disk for future verification. Informative website. VB6 source code available. Visual Basic runtime required.
- HashCalc is a Windows application that computes MD-2/4/5, SHA-1/256/384/512, RIPEMD-160, PANAMA, TIGER, ADLER32, CRC32, and eDonkey/eMule hashes of a text string or disk file. Doesn't look like it supports hash comparison.
- FSUM is a command-line application that computes MD-2/4/5, SHA-1/256/384/512, RIPEMD-160, PANAMA, TIGER, ADLER32, CRC32, and eDonkey/eMule hashes of one or more disk files. It can compare hashes against a list and recurse subdirectories.
- WinHasher is a Windows applet and command-line program that computes MD5, SHA-1/256/384/512, RIPEMD-160, Whirlpool (2003), and Tiger (1995) hashes of a text string, disk file, or group of files. C# source code available. .NET v2 required.
- WinMD5 is a Windows app that computes the MD5 of a disk file. It needs MD5SUM files to automate file verification. .NET required.
- Crypto Hash Calculator is a portable Windows app that computes the MD-2/4/5, SHA, SSL3, MAC and HMAC of a text string or disk file. Doesn't look like it automates hash comparison.
- digestIT 2004 is a Windows Explorer context menu that calculates the MD5 or SHA-1 hash of a file or files. 64-bit version available.
- WinMd5Sum Portable is a portable Windows app that computes the MD5 of a file via drag-and-drop. Looks like you can paste a comparison hash value in the app for an automated verification.
- Hash on click The freeware version of this context menu add-on can calculate the CRC32, MD5 and SHA-1 of a file. Doesn't look like it automates hash comparison.
- MD5Summer is a stand-alone application that computes MD5 and SHA-1 hashes of a disk file or group of files. Can read/write GNU MD5sum files. Source code available. Developer warns that "this [program] may be buggy". Beta software (3/2011).
- Checksum is a context menu add-on that uses the MD5 and SHA1 hash routines. It's a portable app that can create hashes for files, groups of files, and recurse subdirectories. It also supports file masks (*.mp3), custom mask groups (music=*.mp3,*.wav,*.ogg) and ignore lists. Checksum reads and writes .md5 and .sha files to support file verification and can create .m3u and .pls music playlists. Checksum can also be run from the command-line. Logging of the program's actions is supported. Separate "Hashdrop", "Batch Runner" and "Simple Checksum" programs extend Checksum's functionality by adding batch processing and drag-and-drop support.
- Gizmo Hasher (unrelated to this site) is an Explorer context menu entry that computes SHA-1/256/384/512, CRC-16/32, RIPEMD-128/160, MD-2/4/5, MD2, HAVAL-5-256, HASH-32-5, GHASH-32-3, GOST, SizeHash-32, FCS-16/32 and Tiger hashes of a disk file, group of files, or directory. Can read/write hash values to disk for future verification.
- Nero MD5-Checksum computes the MD5 hash for a file. There's not much information about this utility on the Nero website.
- RapidCRC computes CRC32 and MD5 hashes. Supports file names with embedded CRC32, e.g. "MyFile [45DEF3A0].avi". Source code available via CVS. Program is in Beta (3/11).
- Easy Hash is a portable application and Explorer context menu that computes over 130 different hash functions for file directories, individual files and text strings. Compares two directories to find duplicates. It can save generated hashes to .CSV, .HTML, .SFV, .MD5 and .SHA1 files. Can associate itself with .sfv, .md5 and .sha1 files. Can install itself in Total Commander, Unreal Commander and/or Free Commander. Easy Hash can also "reverse hash" CRC-32 hash values, which can be used to recover passwords. Claims to reset passwords in MaNDOS, RAdmin, Mantis, Joomla, Wordpress, Mambo, vBulletin, TYPO3, phpBB, Drupal, Prestashop and Magento.
- ExactFile is a Windows application that calculates MD-2/4/5, SHA-1/256/384/512, CRC32, Adler32, GOST, RIPEMD-128/160, TIGER-128/160/192 hashes for files, directories, and optionally subdirectories. The program can be associated with .md5, .sha1 and .sfv file extensions and can use these file formats to verify checksums. A command-line version of the program called EXF is available. Programs are in Beta (3/11).
- ilSFV uses .sfv, .md5 and .sha1 file extensions to verify file hashes. As of 3/11, the website has no WOT rating, so be careful.
- Kana Checksum supports the CRC32 and MD5 hash algorithms. It can be used as a stand-alone program or integrated into Explorer as a context menu selection. Supports .md5 and .sfv files.
- Hasher supports the SHA1, MD5, CRC32 and ELF hash algorithms. It can calculate the hash of a file or text string.
- File Verifier ++ supports the CRC16/32, BZIP2 CRC, MPEG2 CRC, JamCRC, Posix CRC, ADLER32, MD4/5, EDONKEY2K, RIPEMD-128/160/256/320, SHA-1/224/256/384/512 and WHIRLPOOL algorithms. It's portable, but can also be integrated into the Explorer context menu. A command-line version of the program is included. The program can calculate the hash for files, directories, subdirectories, and text strings. File selection can also be done using regular expressions. Can compare hashes to previously calculated values. Website has no WOT rating, so be careful. Beta software (3/11).
- Jacksum is a Java application that can work as a Windows or command-line application. It can also be a "send to" context menu option. It supports 58 hash algorithms and can calculate the hash values of text strings, files, directories and sub-directories. It can write hash values to files (e.g. .md5, .sfv). Java Runtime Environment required. Not sure if/how it compares hash values. Source code available. Informative website.
- MultiHasher supports the CRC32, MD5 and SHA-1/256/384/512 hashing functions. It can calculate hashes for files, multiple files and text strings. Support for .md5 and .sfv files. Program is in Beta (3/11).
- FlyingBit Hash Calculator is an Explorer context menu addition that supports CRC16/32/64, eDonkey/eMule, RIPEMD-160, MD5/MD4, Tiger and SHA-1 algorithms. Reads and writes .md5 files. Website has no WOT rating and Siteadvisor warns of a security risk as of 3/2011, so be careful.
- eXpress CheckSum Verifier (XCSV) & eXpress CheckSum Calculator (XCSC) WOT warns that the website has a poor reputation, so I didn't research these products any further (3/11).
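Several of the programs listed above read and write .md5 or .sha1 checksum files for later verification. As a rough illustration of what such a file holds, here is a minimal sketch of a parser. It assumes the GNU md5sum line layout (hex digest, a space, then a space or an asterisk, then the file name); the function name and sample data are hypothetical and not taken from any of the listed programs.

```python
import re

# Assumed line layout (GNU md5sum style): "<hex digest> <space or *><file name>"
_LINE = re.compile(r"^([0-9a-fA-F]+) [ *](.+)$")

def parse_checksum_lines(text):
    """Return a {file name: lowercase hex digest} mapping from checksum-file text."""
    entries = {}
    for line in text.splitlines():
        match = _LINE.match(line)
        if match:  # lines that do not fit the layout (blank, comments) are skipped
            digest, name = match.groups()
            entries[name] = digest.lower()
    return entries

# Hypothetical sample: both entries carry the MD5 of empty input.
sample = (
    "d41d8cd98f00b204e9800998ecf8427e  empty.txt\n"
    "D41D8CD98F00B204E9800998ECF8427E *empty.bin\n"
)
print(parse_checksum_lines(sample))
```

Digests are lowercased on the way in so that later comparisons do not depend on how a given tool prints hexadecimal.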
What is Hashing?
Hashing is the process of computing a fixed-length string (called a "message digest") from a data stream usually for the purpose of validating, authenticating, or digitally signing that stream. The stream could be a disk file, an email message, or packets of data in network transport. Hashing is not encryption because the message digest cannot readily be transformed back into the original data from which it was computed. Instead, hashing is a mechanism for representing a block of data in a predictable way by the use of a standard, public algorithm.
The usefulness of hashing arises partly from the ease with which message digests can be computed and partly from the fact that no two data streams should ever produce the same message digest. These characteristics suggest some important uses to which hashing algorithms can be put.
Digitally signing an email message, for example, involves computing a cryptographic hash of the body of the message and all attachments that is then encrypted using the sender's private key. (Note: It is the hash that is encrypted and not the message itself when the message is only digitally signed and not encrypted. When the entire message is encrypted, the recipient's public key is used to do so.) The encrypted hash is then attached to the message for transport. On the receiving end, the digital signature accompanying the message is decrypted using the sender's public key, which could have accompanied the message or could have been drawn from a key escrow, and the resulting data, which was the original hash of the message body with attachments, is compared against a newly computed hash of the received message data. If the original hash and the newly computed one are identical, then there is a high degree of probability that the message was not altered in transit and that it did come from the person whose digital signature accompanied the message.
Another important use for cryptographic hashes is in the verification of an acquired data stream against a published hash for that stream. For example, individuals, companies, and organizations often provide file download services on web sites and in online databases. In addition to offering content in the form of downloadable files, providers often publish hashes of those files against which the consumer can validate the downloaded content. If the consumer's own computation of the hash of the content using his or her own tool, which can be different from the tool used by the content provider, is identical to the published hash, then there is a high degree of probability that the downloaded content is identical to the published data. This is useful for ensuring that the received file is what was published, both from the standpoint of malicious alteration and from the standpoint of accidental alteration or truncation in transit, which is much more likely.
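The verification described above can be sketched in a few lines of Python using the standard hashlib module. This is only an illustration: the function names are hypothetical, SHA-256 is an assumed choice of algorithm, and an in-memory stream stands in for a downloaded file.

```python
import hashlib
import hmac
import io

def sha256_of_stream(stream, chunk_size=65536):
    """Hash a binary stream in chunks so large files need not fit in memory."""
    digest = hashlib.sha256()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()

def matches_published(stream, published_hex):
    """Compare the computed digest with the published one, ignoring case
    and surrounding whitespace in the published value."""
    computed = sha256_of_stream(stream)
    return hmac.compare_digest(computed, published_hex.strip().lower())

# Stand-in for a downloaded file; a real check would use open(path, "rb").
download = io.BytesIO(b"Hello")
published = "185F8DB32271FE25F561A6FC938B2E264306EC304EDA518007D1764826381969"
print(matches_published(download, published))
```

A truncated download fails the same check, which covers the accidental-alteration case mentioned above.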
What the hash verification does not do is validate that the acquired data stream is harmless. Because of the ease with which hashes can be computed, malicious web site owners can publish hashes for infected content. Even "dear john" letters, which might be far from harmless to the recipient, can be digitally signed for email transport. The hashing involved in either case says nothing whatsoever about the nature of the content provided. Unsuspecting consumers might infer trust in the content from the existence of the published hash or digital signature when in fact all the hash can do is facilitate validation that the content received matches the content published. Trust in the content itself must be derived from other knowledge that the consumer/recipient possesses about the publisher/sender.
For suggested further reading, see Cryptographic Hashes: What They Are, and Why You Should be Friends
Technical Discussion
For those who are interested in knowing more about the various hashing algorithms in use, a technical discussion of these algorithms and their possible uses follows.
There are several hashing algorithms in common use having different purposes and varying degrees of reliability for error detection, data validation, and cryptographic security. Common algorithms include Cyclic Redundancy Check (CRC), Message Digest (MD), Secure Hash Algorithm (SHA), RACE Integrity Primitives Evaluation Message Digest (RIPEMD), and Whirlpool.
Virtually all algorithms have gone through revisions or replacements to improve their inherent security (SHA-2, for example, being more cryptographically secure than SHA-1, and the final version of Whirlpool more than earlier versions). In addition, some algorithms, such as SHA and RIPEMD, offer variations that reduce the likelihood of accidental collisions (two messages having the same hash). SHA-512, for example, is SHA-2 with a 512 bit (64 byte) message digest size that reduces the likelihood of accidental collisions versus SHA-256, but a larger digest size does not make an otherwise identical hash algorithm more secure. The larger digest sizes satisfy the needs of encryption algorithms that require them.
The security of a hashing algorithm, however, is defined by its resistance to certain kinds of attacks such as pre-image attacks and deliberate collision attacks irrespective of its digest size. (Digest size merely refers to the number of bits in the hash produced by the algorithm.)
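To make the notion of digest size concrete, the following small sketch asks Python's hashlib for the digest size of a few of the algorithms mentioned here. The selection of algorithms is just an example, and the snippet assumes a Python build in which all four are available.

```python
import hashlib

# Digest size in bits for a few algorithms discussed in this article.
digest_bits = {
    name: hashlib.new(name).digest_size * 8
    for name in ("md5", "sha1", "sha256", "sha512")
}

for name, bits in digest_bits.items():
    print(f"{name}: {bits} bits ({bits // 8} bytes)")
```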
CRC is a high-performance algorithm that can be implemented in hardware for the validation of data moving through the electronics of a computer or network device at high speed. Its purpose is to provide maximum performance in the detection of errors in the data stream. CRC is not suitable for cryptographic use because of its low collision resistance, but it does provide basic error checking when performance is paramount.
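As a small illustration of CRC used for error detection, the sketch below uses the CRC-32 implementation in Python's zlib module, which is one common CRC variant among several. The input string is the conventional nine-digit test input for that variant.

```python
import zlib

data = b"123456789"
checksum = zlib.crc32(data)
print(f"{checksum:08X}")

# Flip one bit in the first byte to simulate a transmission error.
corrupted = bytes([data[0] ^ 0x01]) + data[1:]

# The checksum no longer matches, so the error is detected.
print(zlib.crc32(corrupted) != checksum)
```

Detecting accidental corruption like this is what CRC is designed for; it offers no protection against someone who deliberately constructs a different input with the same checksum.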
The widely used MD5 algorithm is the latest of a series of algorithms in the same family. It produces a 128-bit message digest. It has been shown, however, that MD5 is not collision resistant. In 2007, two Danish researchers demonstrated that it is possible for two executable programs, one benign and the other not, to share the same MD5 message digest. It would be difficult for malicious coders to exploit the researchers' methodology because it would require the coders to insinuate themselves into the publication of the original program, but it is difficult to be sure that this could not lead to a practical attack vector. The researchers' conclusion was that MD5 ought not be used for code signing and cryptographic purposes.
SHA was created by the National Security Agency (NSA) of the United States Government, and it has been used as a general purpose algorithm for cryptographic applications since the mid-1990s. (Although hashing algorithms are not reversible encryption systems, they are used by such systems for various purposes.) Weaknesses in early versions, SHA-0 and SHA-1, led to the creation of SHA-2. (SHA-256 and SHA-512 are both SHA-2 algorithms with differing message digest sizes.) A public competition for a successor to SHA-2, which will become SHA-3, is currently being conducted by the National Institute of Standards and Technology (NIST). Among the current SHA algorithms, SHA-256 provides a good compromise between performance and security having no known collision vectors. One of the chief criticisms of SHA-1 and 2, however, has been that their development was conducted by a secret governmental agency.
Unlike SHA, RIPEMD was created by an open academic community--the COSIC group of Belgium's Katholieke Universiteit Leuven, which is the same group whose Rijndael encryption algorithm won the competition for the U.S. Government's Advanced Encryption Standard in 2001. RIPEMD comes in two versions, RIPEMD-128 (the faster) and RIPEMD-160 (the more secure) each of which has an extension for a larger hash result size (256 and 320 bits respectively). RIPEMD creators caution that the larger hash result sizes of the extensions should not be regarded as more secure than the base algorithms and are merely provided for applications that require larger message digests.
The Whirlpool hash algorithm was created by one of the co-creators, Vincent Rijmen, of the Rijndael encryption system that became the Advanced Encryption Standard. Whirlpool is actually based on Rijndael with certain key differences that make it a one-way hashing algorithm instead of a reversible encryption system. Whirlpool, which has a fixed message digest size of 512 bits, has been revised twice to deal with weaknesses found in early versions. These versions are referred to as Whirlpool-0, Whirlpool-T, and then just Whirlpool for the final published version. All implementations are expected to use the final version.
Programs that use any of these algorithms for file validation purposes ought to at least compute MD5 and SHA-1 hashes as these are the most widely used by software publishers. If both are used, the effects of their respective weaknesses can be canceled because it is extremely unlikely that a given malicious file could simultaneously exploit the weaknesses of both. For non-cryptographic purposes, this would be sufficient. Good supplemental algorithms to these would be SHA-256 and Whirlpool as these currently have no known weaknesses. The inclusion of other algorithms does not necessarily make a given program better. There are some differences in the algorithms used by the various programs reviewed here.
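Checking a file against both a published MD5 and a published SHA-1 hash does not require reading it twice. The following is a minimal sketch, with hypothetical function names, that feeds each chunk to both algorithms in a single pass and requires both digests to match.

```python
import hashlib
import io

def md5_and_sha1(stream, chunk_size=65536):
    """Read the stream once and return (md5_hex, sha1_hex)."""
    md5, sha1 = hashlib.md5(), hashlib.sha1()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        md5.update(chunk)
        sha1.update(chunk)
    return md5.hexdigest(), sha1.hexdigest()

def matches_both(stream, published_md5, published_sha1):
    """True only if both computed digests equal the published values."""
    md5_hex, sha1_hex = md5_and_sha1(stream)
    return md5_hex == published_md5.lower() and sha1_hex == published_sha1.lower()

# Empty input is used here only because its digests are well known.
print(md5_and_sha1(io.BytesIO(b"")))
```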
The various versions of SHA and RIPEMD and the latest version of Whirlpool are included in the International Organization for Standardization (ISO) standard ISO/IEC 10118-3:2004 for dedicated hash functions.
Understanding Digital Certificates and Code Signing
From: https://www.oracle.com/technetwork/java/javase/documentation/digitalcerts-codesigning-4312830.html
Aurelio Garcia-Ribeyro
2018-01-26
This document provides a somewhat simplified explanation [1] of the technology behind code signing and digital certificates.
Code signing relies on digital certificates to do its job. To understand certificates and how they are used we need a basic understanding of some concepts: Symmetric and Asymmetric Encryption, and Hashing.
Symmetric and Asymmetric Encryption, and Hashing.
Symmetric Encryption
Whenever we need to protect information it is common practice to encrypt it. This means encoding the information in a way that is not easy to understand unless you know how to translate it.
For example, instead of writing “GOOD MORNING” I could replace every letter with the letter that is 3 positions earlier in the alphabet, so “G” becomes “D”, “O” becomes “L”, etc., and write “DLLA JLOKFKD” instead. [2]
The encrypted message has all of the information of the original message but you need to know the encrypting algorithm (shift letters by a given number) and the encryption key (how many positions to shift) to be able to get the original message back.
We could have used a more complicated “key” like: “The first letter shifts 8 positions, second letter 12 positions, third letter 5 positions. Repeat the “8-12-5” sequence for every group of 3 letters until you encode the whole message”. Even with such a simple algorithm a long enough and random enough key could create something difficult to decipher without the key.
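Both the single shift and the repeating “8-12-5” key can be expressed with one small function. This is a sketch under stated assumptions: only the letters A to Z are shifted, spaces pass through without consuming a key position, and the 8-12-5 shifts move later in the alphabet (the text does not say which direction). The function names are made up for this example.

```python
import string

ALPHABET = string.ascii_uppercase

def shift_encrypt(message, shifts):
    """Shift each letter by the next value in `shifts`, repeating the sequence.
    Negative values move earlier in the alphabet; non-letters pass through."""
    out, i = [], 0
    for ch in message:
        if ch in ALPHABET:
            pos = (ALPHABET.index(ch) + shifts[i % len(shifts)]) % 26
            out.append(ALPHABET[pos])
            i += 1
        else:
            out.append(ch)
    return "".join(out)

def shift_decrypt(message, shifts):
    """Undo shift_encrypt by applying the opposite shifts."""
    return shift_encrypt(message, [-s for s in shifts])

# The single-shift example from the text: 3 positions earlier.
print(shift_encrypt("GOOD MORNING", [-3]))

# The repeating 8-12-5 key, then back again.
secret = shift_encrypt("GOOD MORNING", [8, 12, 5])
print(secret, "->", shift_decrypt(secret, [8, 12, 5]))
```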
In modern cryptography, it is common for the encryption algorithm to be publicly known and named alongside the encrypted message so that authorized users know how to decrypt it. The message remains safe only as long as we safeguard the key.
Even without knowing the key it is sometimes possible to decrypt a message.
It might be possible to “brute force” the solution. If the key is small enough a computer could try all possible combinations rather quickly. If there are only a few million combinations for the key, a modern computer could try them all and guess the result in less than a minute.
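The letter-shifting cipher from the earlier example has only 26 possible keys, so it makes a convenient toy target for brute force. The sketch below tries every shift and stops at the first result that contains a word the attacker expects to see. The function names and the idea of searching for a known word are assumptions made for this illustration.

```python
import string

ALPHABET = string.ascii_uppercase

def shift_text(text, shift):
    """Shift every letter `shift` positions later in the alphabet."""
    return "".join(
        ALPHABET[(ALPHABET.index(ch) + shift) % 26] if ch in ALPHABET else ch
        for ch in text
    )

def brute_force_shift(ciphertext, known_word):
    """Try all 26 shifts; return (shift, plaintext) for the first candidate
    that contains known_word, or None if no shift produces it."""
    for shift in range(26):
        candidate = shift_text(ciphertext, shift)
        if known_word in candidate:
            return shift, candidate
    return None

print(brute_force_shift("DLLA JLOKFKD", "MORNING"))
```

Real keys are chosen so that this kind of exhaustive search takes impractically long, which is why key size matters.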
Mathematicians and researchers are constantly looking for weaknesses in encryption algorithms. If they find one, it might be possible to decrypt the original message without guessing the key, or to reduce the number of possible keys enough to make a brute-force attack feasible.
Since computing power grows over time, and weaknesses are discovered in previously “safe” algorithms, we must assume that anything considered secure now will not always be secure. Therefore, cryptography comes with expiration dates.
Asymmetric Encryption
In our encryption example above we shifted letters “3 spaces”. The same value/key that we used for encrypting is used for decrypting: if you shifted each letter 3 positions earlier in the alphabet to encrypt, just shift each letter 3 positions later to decrypt. This type of encryption, where the same key is used for encrypting and decrypting, is called symmetric encryption and has the benefit of being fast and taking relatively few resources to compute. Symmetric encryption has the drawback that you have to share the key used for encrypting with everyone authorized to decrypt.
Around 1970, mathematicians and researchers came up with a method of encryption in which two seemingly unrelated values [3] could be used for encrypting/decrypting in such a way that if you encrypt with one value you can only decrypt with the other value and vice-versa. This is known as Asymmetric Encryption.
Asymmetric Encryption is the basis of what is called Public-Key Cryptography.
When compared with symmetric encryption asymmetric encryption takes a lot more processing power, making it slower and more expensive, but has the benefit that if I keep one of the values to myself (let’s call it private key because... well that’s what it’s called!) I can share the other value (you guessed it: the public key [4]) with everyone in the world and have the basis for code signing and TLS authentication.
The possibility of keeping one of the keys secret and making the other public means that I can do two important things by sharing the public key with everyone:
1) Everyone can then use the public key to encrypt anything [5] so that only the owner of the matching private key can decrypt it. This ensures secure “one-way” communication.
2) The owner of the private key can use it to confirm that they encrypted something. Anything that can be decrypted with a public key could only have been encrypted with the corresponding private key.
This is the cornerstone of digital signatures.
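The two uses of a key pair listed above can be tried out with textbook RSA, which is one well-known asymmetric scheme (the text does not name a specific one). The sketch below uses deliberately tiny primes so the numbers stay readable. It is for illustration only: real systems use far larger keys together with padding schemes that are omitted here.

```python
# Textbook RSA with tiny primes: for illustration only, never for real use.
p, q = 61, 53
n = p * q                      # 3233, shared by the public and private key
phi = (p - 1) * (q - 1)
e = 17                         # public exponent
d = pow(e, -1, phi)            # private exponent (modular inverse, Python 3.8+)

message = 65                   # a message encoded as a number smaller than n

# 1) Anyone can encrypt with the public key; only the private key decrypts.
ciphertext = pow(message, e, n)
decrypted = pow(ciphertext, d, n)

# 2) The owner transforms a value with the private key; anyone can check it
#    with the public key, which shows who produced it.
signature = pow(message, d, n)
recovered = pow(signature, e, n)

print(ciphertext, decrypted, recovered)
```

The public exponent `e` and the private exponent `d` look unrelated, yet each undoes the other modulo `n`.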
Hashing
Another useful technique used in cryptography is to calculate a unique [6] value for each message. This value is called a hash or a checksum.
Hashing is a one-way function. Unlike encrypted data, hashes of data do not contain all the information needed to re-create the original input. You can calculate the hash for any message but there is no way to get back the original message if all you have is the hash.
Good hashing functions will produce large variations in the result given even very small changes in the data. With some hash functions, the resulting value has the same size regardless of how long or short the input is.
For example, a commonly used hash function is SHA-256 which produces a 256-bit hash (writing this in hexadecimal requires 64 characters).
Here is the SHA256 checksum for two similar short texts:
Input Text | SHA256 Checksum
Hello | 185f8db32271fe25f561a6fc938b2e264306ec304eda518007d1764826381969
HellO | 4ff7975b53db6c029d88f6ac67bd78d12fed72cdb2e252a26556d594b87bc9d8
Simply by changing the last “o” in Hello to upper case the checksum or hash is very different.
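The comparison between the two short texts can be reproduced with Python's hashlib. The sketch below also counts how many of the 256 output bits differ between the two checksums. The helper names are hypothetical, and the inputs are assumed to be encoded as UTF-8.

```python
import hashlib

def sha256_hex(text):
    """SHA-256 checksum of a text string, as 64 hexadecimal characters."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

def differing_bits(hex_a, hex_b):
    """Number of bit positions in which two hex digests differ."""
    return bin(int(hex_a, 16) ^ int(hex_b, 16)).count("1")

a, b = sha256_hex("Hello"), sha256_hex("HellO")
print(a)
print(b)
print(differing_bits(a, b), "of 256 bits differ")
```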
The SHA256 checksum of large binaries looks similar to that of the small examples:
Input File | SHA256 Checksum
OpenJDK 9.0.1: package hosted on Java.net approx. 200 MB | a312ea3c51940361af738fda809e08e16972ae9dd314cb087e0d31e251b416a3
Ubuntu download ubuntu-16.04.3-desktop-amd64.iso approx. 1.5 GB | 1384ac8f2c2a6479ba2a9cbe90a585618834560c477a699a4a7ebe7b5345ddc1
In the previous examples, knowing the checksum for an Ubuntu download doesn’t let you “re-create” the complete download. You can’t even tell whether you are looking at the checksum of a large file or of a small value.
Although it is fairly straightforward and relatively fast to compute the checksum for a given value, the opposite, guessing an input that would produce a given checksum, is not possible [7].
Hashing is also used for other non-cryptographic applications like checking that something was transmitted correctly, creating efficient ways of indexing data, and storing passwords [8] securely.
Digital Signing
The idea of digital signing is straightforward:
- Take anything that you want to sign and compute its checksum or hash.
- Generate a private/public key and use the private key to encrypt the checksum/hash that you have calculated for the input [9]. Remember: anything encrypted with the private key can only be decrypted with the corresponding public key.
- Ship the information and include alongside it the encrypted checksum (“The Signature”) and the public key needed to validate it.
If later, someone wants to know if the information they received remains unchanged, they can compute the checksum for the information. Let’s call this the “Calculated checksum”.
Then, using the public key, they can decrypt the encrypted checksum that came with the information. Let’s call this the “Signed checksum”.
If the “Calculated checksum” and the “Signed checksum” match this tells us that 1) The information hasn’t changed (since the checksums are the same) and 2) that only someone with access to the private key that matches the public key could have created that signature.
Note that signing the data does not encrypt it. The idea of signing is not to keep the information secret but simply to ensure that the information has not been altered and that it was signed by someone who held a particular private key.
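The sign-then-verify procedure above can be sketched with textbook RSA and SHA-256. Everything specific here is an assumption made for illustration: the key pair is built from two small, publicly known Mersenne primes, the checksum is reduced modulo the key so that it fits, and the padding used by real signature schemes is left out. It is not a secure implementation.

```python
import hashlib

# Toy key pair built from two known Mersenne primes; far too small for real use.
P, Q = 2**61 - 1, 2**89 - 1
N = P * Q
E = 65537                               # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))       # private exponent (Python 3.8+)

def checksum_as_int(data):
    """SHA-256 checksum of the data, reduced modulo N to fit this toy key."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % N

def sign(data):
    """Transform the checksum with the private key to produce the signature."""
    return pow(checksum_as_int(data), D, N)

def verify(data, signature):
    """Recover the signed checksum with the public key and compare it
    with the checksum calculated from the received data."""
    signed_checksum = pow(signature, E, N)
    calculated_checksum = checksum_as_int(data)
    return signed_checksum == calculated_checksum

document = b"pay Alice 10 dollars"
signature = sign(document)
print(verify(document, signature))
print(verify(b"pay Alice 1000 dollars", signature))
```

Note that `document` itself is never encrypted; only its checksum goes through the private key.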
The problem is: how do we know if a public key really matches the private key from a particular person or company?
We need someone or something who can vouch for the authenticity of the public key…
Certificate Authorities
Certificate Authorities (CAs) are entities that act as trusted third parties. Once we have a trusted authority, we can use it to establish a “chain” of trust as follows:
- The person that wants to sign something needs to convince the certificate authority that they are who they claim to be.
- End users like you and I trust those certificate authorities and accept that “if this CA says they verified a user then we can trust that they did.”
Think of it as: “I don’t know you and you don’t know me. But we both know Sara. If Sara tells me you are really who you claim to be, since I trust Sara, I can now trust that I know who you are.” The way that Certificate Authorities tell the world that they know who you are is to give you a digital certificate.
Different Certificate Authorities have different processes for issuing a certificate but they all go something like this:
A person or company creates, in their own computer, a private/public key combination using some program available in most operating systems or by downloading a program specifically for this purpose.
That person or company saves the private key. They won’t share this with anybody, not even the Certificate Authority.
The person (let’s call them person from now on. It gets long to keep writing “person or company”) contacts the CA who will issue the certificate.
The CA will ask the person for the public key that he or she generated in the previous step, details on what they want the certificate for (e.g. code signing, TLS authentication, encrypting email, etc.), a name to put in the certificate, an address or location, the domain to authenticate, etc.
The CA will also ask you for details to validate that you are indeed who you claim to be. They might require a copy of your driver license, or a company’s articles of incorporation or, if you are asking for a TLS certificate, they might ask you to prove that you control the domain name for which you want the certificate by putting some text of their choosing in your website.
The exact details of what you will need to provide will change depending on what you will need the certificate for and which CA you use. A certificate for encrypting email for a single address will usually need less scrutiny than a TLS certificate for a web domain with the name of a well-known bank. This is why a digital certificate lists the things it is good for and will not be trusted for other purposes.
The CA will also want to know how long to issue the certificate for. Most Certificate Authorities will issue certificates for 1 to 3 years. They will usually charge for their services and charge more for certificates that expire later.
Once they are satisfied that you are who you claim to be and have met all their requirements they will produce a document that will contain:
- Some of the given information (name, address, URL or email, etc.)
- What the certificate is valid for (e.g. code signing and mail encryption)
- When the certificate is being issued, and until when it will be considered valid.
- The public key that you provided them
- Information about the issuing Certificate Authority.
They will then “sign” all this information with the CA’s own private key. The digital certificate is then simply: your information, your public key, a list of “what is this good for”, valid from and to dates, and the CA’s signature.
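The contents of a certificate as just described can be pictured as a small record. The sketch below is a simplified model rather than a real certificate format: the field names, the sample values, and the `acceptable_for` helper are all hypothetical, and the signature field is a placeholder instead of an actual CA signature.

```python
from dataclasses import dataclass
from datetime import date

@dataclass(frozen=True)
class Certificate:
    subject: str            # some of the given information (name, etc.)
    usages: frozenset       # what the certificate is valid for
    valid_from: date
    valid_to: date
    public_key: int         # the public key the applicant provided
    issuer: str             # information about the issuing CA
    ca_signature: int       # the CA's signature over the fields above

def acceptable_for(cert, usage, on_day):
    """True if the certificate lists this usage and the given day falls
    inside its validity period."""
    return usage in cert.usages and cert.valid_from <= on_day <= cert.valid_to

cert = Certificate(
    subject="Example Person",
    usages=frozenset({"code signing", "mail encryption"}),
    valid_from=date(2018, 1, 1),
    valid_to=date(2020, 12, 31),
    public_key=0x10001,     # placeholder value
    issuer="Example CA",
    ca_signature=0,         # placeholder; a real CA computes this with its private key
)
print(acceptable_for(cert, "code signing", date(2019, 6, 1)))
```

A complete check would also verify the CA's signature over these fields using the CA's public key.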
The question then becomes, how do you get the CA’s public key so you can verify that they created the digital certificate?
Certificate Authorities work with the developers of operating systems, browsers, and runtimes like the Java Runtime (JRE). The developers of those programs evaluate each candidate Certificate Authority by looking into auditors’ reports, industry certifications, how well established they are, and many other criteria that vary for each program. If they think a CA is trustworthy and meets their particular needs, the developers of the operating systems, browsers, and runtimes include the CA’s public key (which they receive directly from the CA) in their programs.
You can see what Certificate Authorities are included in your browsers. For example, in Firefox on macOS you can type “about:preferences#privacy” into the address bar (on Windows use "about:preferences#advanced") and scroll to the bottom where you will find the “Certificates” section. You can choose to view the complete list of trusted CAs’ certificates and see the details of what is in each certificate.
All browsers, some operating systems, and a few runtimes have similar (though not necessarily identical) lists of “Trusted Certificate Authorities”.
A browser or operating system will “trust” certificates issued by any of the certificate authorities whose public keys it includes in its own keystore. Certificates issued by any other CA will not be recognized and will be treated as “self-generated”.
Note, as in the example from the image above, that a certificate authority might have more than one public key included in a given browser or operating system with different expiration dates, different purposes, and different technical details.
And now for the real-world complexities
So far, the theoretical model is very elegant but the real world is never so simple. There are a few choices to be made that impact how well the system works.
Key-Length
When one creates an encryption key it is necessary to decide how long the key will be. Longer keys are harder to guess, allow for more secure encryption, and will be considered safe longer. However, the length of the key also determines how lengthy the encryption process is. Some algorithms can only handle key sizes up to a given size. In some cases, extremely long keys would make encryption too slow without providing significant benefit. In extreme cases, longer keys would mean that some devices or older software cannot handle the key.
Shorter keys will be faster to use but if you make a key too short it will weaken the protection of the encryption.
Certificate Authorities have guidelines for the minimum and maximum length of keys they will accept, and those guidelines change over time. Keys that were considered “too long” a few years ago are now considered “too short to be secure”. At any point in time there is a range of acceptable key lengths, and that range keeps moving.
Encryption and Key-Generating Algorithms
As with key length, new algorithms are being invented that might provide better security or need fewer resources to process. At the same time, vulnerabilities discovered in older algorithms sometimes make digital certificates insecure even before they have reached their planned expiration date.
Lost or stolen private keys
The security of this system depends on private keys being controlled by the person identified in a certificate. It is possible though that the computer that had the private key is destroyed, or worse it could be compromised and someone else could get access to the private key.
Bad Security Practices might be discovered
Some Certificate Authorities, through mistake or negligence, have been found to do things that compromise the integrity of the certificates they issue. Imagine discovering that a CA had incorrectly given certificates to someone claiming to be a well-known bank, who was really a scammer trying to set up a fake look-alike site to steal the bank’s users’ credentials.
Certificate Revocation
For the reasons listed above, and a few others, it is sometimes necessary to revoke a digital certificate before it reaches its expiration date.
Certificate Authorities keep track of all certificates they created that have been revoked and provide a revocation check mechanism. Part of the certificate validation process is to contact the CA that issued the certificate and ask what certificates have been revoked.
Rather than having every browser [10] contact a CA every time that they need to validate a certificate it is common to use local copies of revocation lists. When a browser needs to validate a digital certificate, it will ask the CA not only about that certificate but for a complete list of all the certificates that it has revoked. The browser will then save the list and use that for a while instead of contacting the CA for every certificate. For many programs, the default setting is to trust the list for up to one hour. Any certificate from that CA presented during that hour will be checked against the local copy. After one hour, the list is considered too old and is discarded. The next request will cause the browser to request a fresh copy of the list [11].
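The caching behaviour described above can be sketched roughly as follows. The class name, the `fetch_list` callback (standing in for "ask the CA for its complete revocation list") and the use of serial numbers as identifiers are assumptions for illustration. Real clients parse signed CRL files or use OCSP.

```python
import time
from typing import Callable, Dict, Set, Tuple

class RevocationCache:
    """Keeps a local copy of each CA's revocation list for a limited time."""

    def __init__(self,
                 fetch_list: Callable[[str], Set[str]],
                 max_age_seconds: float = 3600.0,
                 clock: Callable[[], float] = time.time) -> None:
        self._fetch_list = fetch_list    # hypothetical: downloads the CA's full list
        self._max_age = max_age_seconds  # one hour, the default mentioned in the text
        self._clock = clock
        self._cache: Dict[str, Tuple[float, Set[str]]] = {}

    def is_revoked(self, ca_name: str, serial: str) -> bool:
        now = self._clock()
        entry = self._cache.get(ca_name)
        if entry is None or now - entry[0] > self._max_age:
            # No local copy yet, or the copy is too old: request a fresh list.
            entry = (now, set(self._fetch_list(ca_name)))
            self._cache[ca_name] = entry
        return serial in entry[1]
```

Passing the clock in as a parameter is only a convenience so the expiry logic can be exercised without waiting an hour.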
When a CA issues a digital certificate, it also gives the owner of the new certificate instructions on how to ask for the certificate to be revoked if it becomes compromised.
In extreme cases, if a Certificate Authority itself is compromised, browser developers might stop including that Certificate Authority’s public keys and therefore stop trusting all certificates issued by that CA.
Certificate Chains
Certificate Authorities don’t use the same private key to sign every certificate issued under their name. Instead, they create intermediate certificates and even intermediate certificate authorities. This means that they use one of their “master” private keys to generate certificates that have, amongst their permissions, “generate other certificates under my name”. In some cases, those intermediate certificates carry restrictions like “This is only valid to generate digital certificates for a given geographic location”. For large companies, it might even be possible to get a “generic” digital certificate that says something like “can be used to generate TLS certificates for any website that ends in xyz.com”. The CA would give that intermediate certificate to an administrator of company xyz, and that administrator could then create certificates for www.xyz.com, support.xyz.com, mail.xyz.com, etc. without having to go through a separate validation process for each.
This means that the validation is not simply from the certificate that you receive to one of the public keys that ships with your browser. It follows the chain of certificates, validating each in turn, until you reach a public key that ships in your browser. For this reason, the certificates in the browser are also called “root certificates”.
Note that if any certificate in the chain is revoked, expired, issued (signed) with an algorithm that is no longer trusted, or missing the “can generate sub-certificates” permission, the whole chain from that point on gets broken and none of the final (or leaf) certificates issued through that chain will be considered valid.
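A simplified sketch of walking a chain is shown below. The `Cert` fields and the `chain_is_valid` function are hypothetical names, and the cryptographic signature check at each link is left out so that only the bookkeeping rules from the paragraph above remain.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Set

@dataclass
class Cert:
    subject: str
    issuer: str
    not_before: datetime
    not_after: datetime
    can_issue: bool = False   # the "can generate sub-certificates" permission
    algorithm: str = "sha256"
    revoked: bool = False

def chain_is_valid(chain: List[Cert], trusted_roots: Set[str],
                   now: datetime, trusted_algorithms: Set[str]) -> bool:
    """chain[0] is the leaf; each later entry issued the one before it."""
    if not chain:
        return False
    for i, cert in enumerate(chain):
        if cert.revoked or not (cert.not_before <= now <= cert.not_after):
            return False
        if cert.algorithm not in trusted_algorithms:
            return False
        if i > 0:
            # Every certificate above the leaf must be allowed to issue others
            # and must be the issuer named by the certificate below it.
            if not cert.can_issue or chain[i - 1].issuer != cert.subject:
                return False
    # The chain must end at a root certificate that ships with the browser.
    return chain[-1].subject in trusted_roots
```

A single failing link anywhere in the list makes the function return False, which mirrors how one bad intermediate invalidates every leaf certificate beneath it.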
Time-Stamping
Updating a TLS certificate once a year or so usually doesn’t involve too much work as it is stored in a centralized location. Signed code however is copied and distributed to many locations, frequently outside of a single organization. Updating signed code requires re-distribution of the signed code. In some cases, signed code is meant to be authenticated and used unchanged for a long time. Since code-signing certificates are issued for only 1 to 3 years it is sometimes necessary to distribute an update to the code where the only difference between the current version and the updated one is a new signature.
To extend the useful lifetime of a digital signature, and therefore minimize the number of times that code has to be signed, the concept of Timestamping was introduced.
The idea behind timestamping is that, if something was signed before the signing certificate expired, and the certificate has not been revoked, it is OK to continue trusting the signature even after the certificate expires. The problem then is how to validate that something was signed before the certificate expired.
Some Certificate Authorities created time-stamping services. They generate a digital certificate with a very long lifetime, sometimes up to ten years rather than the usual 1 to 3 years. They offer a mechanism for anyone to send the hash of a digital signature. The time-stamping service then appends the current time to the received hash and digitally signs both together with its own private key effectively creating a time-stamp that says: “this signature existed at this point in time.” [12] The signed code then adds this time-stamp to the signature as proof that the signature existed at that point in time.
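The exchange with a time-stamping service can be sketched roughly like this. The toy RSA numbers, the token layout and all function names are assumptions for illustration only. Real services use much larger keys and standardized token formats.

```python
import hashlib
from datetime import datetime

# Toy key pair for a hypothetical time-stamping service (illustration only).
TSA_N, TSA_E, TSA_D = 3233, 17, 2753

def _to_int(data: bytes) -> int:
    """Hash the payload and reduce it to fit the toy modulus."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % TSA_N

def make_timestamp(signature_hash: bytes, now: datetime) -> dict:
    """The service sees only a hash; it appends the time and signs both together."""
    stamped_at = now.isoformat()
    payload = signature_hash + stamped_at.encode("ascii")
    return {"hash": signature_hash.hex(),
            "time": stamped_at,
            "tsa_signature": pow(_to_int(payload), TSA_D, TSA_N)}

def timestamp_is_genuine(token: dict) -> bool:
    """Check the service's signature over hash + time with its public key."""
    payload = bytes.fromhex(token["hash"]) + token["time"].encode("ascii")
    return pow(token["tsa_signature"], TSA_E, TSA_N) == _to_int(payload)

def signature_still_trusted(token: dict, cert_not_before: datetime,
                            cert_not_after: datetime, cert_revoked: bool) -> bool:
    """Trust the signature if it provably existed while the certificate was valid."""
    signed_at = datetime.fromisoformat(token["time"])
    return (timestamp_is_genuine(token)
            and not cert_revoked
            and cert_not_before <= signed_at <= cert_not_after)
```

Note that the last function never compares against the current date: what matters is the time recorded in the token, not when the check happens.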
The time-stamping service however is subject to the same rules of expiration and revocation so the time-stamp itself will eventually become invalid [13]. Time-stamping certificate expiration however should happen very infrequently so the need to redistribute code with only the signature updated could be drastically reduced (but not eliminated!) [14].
Footnotes
[1] This is simplified! If you need to learn more details there are plenty of technical sites available to you. I meant to give you a high-level understanding so we can talk about expiration dates and algorithm strengths... not to let you debate technical details with an expert in this field.
[2] This one is one of the oldest encryption algorithms. It is believed that Julius Caesar used it over 2000 years ago. See https://en.wikipedia.org/wiki/Caesar_cipher . Modern algorithms are a lot more complicated.
[3] They are not really unrelated but you can’t calculate one if you only know the other one.
[4] The only thing that makes one key “private” and the other one “public” is which one I choose to share. There is nothing different between them that would make it so that I would have to choose one over the other for “private”.
[5] In practice, it would take too long to encrypt anything but a small message with asymmetric encryption so it is more common to simply make up a random symmetric key, use that for encrypting the message with a faster algorithm, and only encrypt the made-up key using asymmetric cryptography.
[6] Unique meaning the same message will always result in the same value, not that another message couldn’t have the same value.
[7] Or rather would take a very fast computer longer than the age of the universe to try enough values to have a good enough chance of guessing.
[8] When systems use passwords for authentication it is common to store a hash of the password, rather than the actual password, in case the database storing them is compromised. Applications calculate the hash of user-entered passwords and compare hashes instead of comparing actual passwords.
[9] Although we could encrypt the complete message rather than just a checksum, since the message might be long, and asymmetric encryption is slow, it ends up being more efficient to encrypt only a checksum of the message.
[10] Browser, or OS, or Runtime. Whoever is validating the certificate.
[11] There are some improvements like OCSP Stapling that make this process less cumbersome but the overall idea is the same: Somehow check to see if the certificate is revoked before trusting it. See https://en.wikipedia.org/wiki/OCSP_stapling.
[12] Note that the time-stamping service only receives a hash, not the complete document, not even the complete signature that is being time-stamped. It time-stamps “whatever hash you pass” regardless of whether it’s really a hash for a signature or a random set of characters. It doesn’t know or care.
[13] It is possible to daisy-chain time-stamps, certifying that a time-stamp was valid with a newer time-stamp before the original time-stamping certificate expires. Not all programs can validate daisy-chained time-stamps though, and it will still be necessary to redistribute the code with the newer time-stamp. In most cases it is easier to simply re-sign from scratch.
[14] To learn about time-stamping in more detail, an interesting resource is https://www.secureblackbox.com/kb/articles/11-TimeStamping.rst?page=all . Although it is meant to explain how a particular product uses time-stamping, it does a good job of describing the underlying processes.
One-Way Accumulators:
One-Way Accumulators (OWA) offer a decentralized alternative to Digital Signatures. Their advantages include the following:
- Most notably, the main advantage of OWAs over Digital Signatures is that no one need know how to authenticate, sign, or time stamp a message, thereby dispensing with the need for a CA.
- More particularly:
  - OWAs allow for a straightforward and efficient method of producing collective signatures.
  - 'Forgery' in the utilization of OWAs is infeasible because the putative forger cannot make a valid time-stamp of a document that was not expected at the time recorded on the stamp. For instance, a student who wishes to plagiarize a paper written on a given date (and so time-stamped) would be unable to change the time-stamp in order to misrepresent their (plagiarized) authorship as pre-dating the original's authorship.
  - OWAs are no less secure than one-way functions, and indeed many cryptographic protocols are based upon the presupposed 'hardness' of reversing one-way functions.
  - The relationship between OWAs and One-Way Trapdoor Functions is, at present, unknown.
In conclusion, it would appear that these one-way hash functions offer considerable advantages over traditional methods for authentication, membership testing, and time-stamping.
Disadvantages of one-way accumulators?
One-way accumulators are built upon a (quasi)-commutative one-way function. With quasi-commutativity, I refer to the following property:
For $f: X \times Y \to X$, it is true that $f(f(x, y_1), y_2) = f(f(x, y_2), y_1)$.
Although accumulators seem like a very useful cryptographic building block, I don't see them often in practical applications (in fact I can only think of Zerocoin). I suspect that this is because the scheme has certain disadvantages.
The accumulators that I know of seem to be based on number-theory (unlike conventional hash functions). This makes them a lot slower.
For example, Wikipedia describes the following function:
One trivial example is how large composite numbers accumulate their prime factors, as it's currently impractical to factor the composite number, but relatively easy to find a product and therefore check if a specific prime is one of the factors. New members may be added or subtracted to the set of factors simply by multiplying or factoring out the number respectively. More practical accumulators use a quasi-commutative hash function where the size (number of bits) of the accumulator does not grow with the number of members.
As they mention, this is clearly not practical because of the size of the output values.
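The product-of-primes idea from the quoted passage can be tried out in a few lines. The function names are made up for this sketch, and the primes are tiny so the result stays readable.

```python
def add_member(acc: int, prime: int) -> int:
    """Accumulate a prime by multiplying it in."""
    return acc * prime

def remove_member(acc: int, prime: int) -> int:
    """Remove a prime by dividing it back out."""
    if acc % prime != 0:
        raise ValueError("not a member")
    return acc // prime

def is_member(acc: int, prime: int) -> bool:
    """A prime is a member exactly when it divides the accumulator."""
    return acc % prime == 0

acc = 1
for p in (3, 5, 11):
    acc = add_member(acc, p)

print(acc)                # 165
print(is_member(acc, 5))  # True
print(is_member(acc, 7))  # False
```

The accumulator is the full product, so its bit length grows with every member added, which is the size problem noted above.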
Another example I have seen is $f(x, y) = x^y \bmod n$ where $n = pq$ (with $p$ and $q$ both safe primes). Even though this doesn't have the problem of the Wikipedia example, it is still not very efficient (even though you can do the exponentiations using the square-and-multiply method).
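A quick way to see that modular exponentiation is quasi-commutative is to try it with small numbers. The modulus below is an assumption chosen for illustration: 23 and 47 are safe primes (23 = 2*11 + 1 and 47 = 2*23 + 1), but a real modulus would be thousands of bits long.

```python
# Toy modulus built from two small safe primes (illustration only).
N = 23 * 47

def f(x: int, y: int) -> int:
    """f(x, y) = x^y mod n, using Python's built-in fast modular exponentiation."""
    return pow(x, y, N)

x, y1, y2 = 3, 5, 7

# The order in which y1 and y2 are absorbed does not matter:
# both sides equal x^(y1*y2) mod n.
left = f(f(x, y1), y2)
right = f(f(x, y2), y1)
print(left == right)
```

The output size stays bounded by the modulus no matter how many values are absorbed, unlike the product-of-primes example.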
An advantage of a cryptographic accumulator and actually the reason to use them is that due to the quasi commutativity you can compute witnesses for membership of values in the accumulator where the accumulator and the witnesses are of constant size.
Say you have a set $Y = \{y_1, y_2, y_3\}$ and compute the accumulator as $acc = f(f(f(x, y_1), y_2), y_3)$. If you want to compute a witness for a value, say $y_2$, then by quasi-commutativity the witness is $wit_{y_2} = f(f(x, y_1), y_3)$. Given $y_2$ and $wit_{y_2}$, you can check whether $y_2$ is in the accumulator $acc$ by checking whether $acc = f(wit_{y_2}, y_2)$ holds.
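The witness construction can be sketched with the modular exponentiation function f(x, y) = x^y mod n. The toy modulus, the base and the helper names are assumptions for illustration.

```python
from functools import reduce
from typing import Iterable, List

N = 23 * 47  # toy modulus from two small safe primes (illustration only)

def f(x: int, y: int) -> int:
    return pow(x, y, N)

def accumulate(x: int, values: Iterable[int]) -> int:
    """Fold every value into the accumulator, starting from base x."""
    return reduce(f, values, x)

def witness(x: int, values: List[int], member: int) -> int:
    """The witness for `member` is the accumulation of all the other values."""
    rest = list(values)
    rest.remove(member)
    return accumulate(x, rest)

def is_member(acc: int, wit: int, member: int) -> bool:
    """Absorbing the member into its witness must reproduce the accumulator."""
    return f(wit, member) == acc

x = 3
Y = [5, 7, 11]
acc = accumulate(x, Y)
wit_7 = witness(x, Y, 7)

print(is_member(acc, wit_7, 7))   # True
print(is_member(acc, wit_7, 13))  # False: 13 was never accumulated
```

Both the accumulator and each witness are single numbers below the modulus, regardless of how many values are in the set.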
Furthermore, existing accumulator schemes (CL02, C+09, N05) come with zero-knowledge proofs of accumulator membership: you do not have to reveal the value $y_2$ and the witness $wit_{y_2}$ directly, but instead provide a zero-knowledge proof of knowledge of such a pair, which makes them attractive for privacy-preserving applications. Such accumulators are typically also dynamic, i.e., they allow witnesses to be updated publicly if the accumulator is updated. Furthermore, there are also so-called universal accumulators, which can additionally produce witnesses for non-membership of a value in the accumulated set (see A+09 or L+07).
All known efficient accumulators are based on number-theoretic assumptions, but I would not say that they are inefficient. Note that in your last RSA example, the membership check requires one exponentiation, which is not really very expensive.
Is the function f weak in terms of, e.g., collision resistance, or is it not efficient enough?
For a secure accumulator one requires collision-freeness, i.e., it is computationally infeasible to find a witness for some value that is not accumulated in the accumulator. For RSA accumulators, that requires that you only accumulate primes (so you have to map the values you want to accumulate to primes with some deterministic algorithm). Otherwise, you could factor an accumulated value into two factors, exponentiate the witness by one of them, and present the other as the value to be checked, and the check would succeed. This is ruled out if you take primes. There are, however, other secure pairing-based accumulators that do not suffer from this problem.
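The reason for accumulating only primes can be seen in a small experiment. Here a single composite value, 15 = 3 * 5, is accumulated with f(x, y) = x^y mod n, and a witness is then forged for 5, which was never accumulated on its own. The modulus and base are toy assumptions.

```python
N = 23 * 47  # toy modulus (illustration only)

def f(x: int, y: int) -> int:
    return pow(x, y, N)

x = 2
acc = f(x, 15)          # the set contains only the composite value 15

# Honest case: the witness for 15 is the base itself.
honest_witness = x
print(f(honest_witness, 15) == acc)  # True

# Forgery: split 15 into 3 * 5, fold the factor 3 into the witness,
# and present the factor 5 as if it were a member.
forged_witness = f(x, 3)
print(f(forged_witness, 5) == acc)   # True, although 5 was never accumulated
```

If every accumulated value is prime there is nothing to split, so this particular trick no longer applies.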
Accumulators are used for various purposes, such as timestamping (the original application), membership testing, distributed signatures, redactable and sanitizable signatures as well as for revocation in group signatures and anonymous credential systems.
There are constructions for accumulators based on Bloom filters (see Nyberg, Fast accumulated hashing, FSE 1996). They do not rely on number-theoretic assumptions, but they are rather impractical.
Editor
This software review is maintained by volunteer editor Albert E. Lyngzeidetson, Ph.D. Registered members can contact the editor with any comments or questions they might have.
Comments
A thoroughly functional and productivity-enhancing hash generator/comparator is QuickHash. This is another product from Nick Shaw (Foolish IT) who has given us d7 and CryptoPrevent. It is portable, fast and feature rich. From its own page [https://www.foolishit.com/free-tech-tools/quickhash/]:
"QuickHash is a utility to quickly display the MD5, SHA1, SHA256, SHA384, SHA512, (and SHA3 in v2.x) hashes of any selected file, and optionally compare the hashes with any hash string."
"QuickHash comes in two versions:
v1.x – does not require the .NET Framework (suitable for usage in older Windows operating systems such as Windows XP – Windows 7 that may not have the .NET Framework installed, and also WinPE environments) but may hang or report invalid data on files approaching 2GB in size.
v2.x – adds the newest SHA3 hash support in addition to supporting files greater than 2GB in size, but requires the .NET 4 Client Profile (this is already installed by default on Windows versions 8.0 and newer.)"
"New Features in QuickHash v2.x
Ability to select and hash multiple files within a single tabbed interface
Ability to enable/disable hash calculations per hash type (click on text box area to calculate/recalculate on demand)
Ability to submit malicious files to Foolish IT for review (allowing hash definition creation for use within our other products such as CryptoPrevent Malware Prevention, dFunk (d7II PC Technician Software), KillEmAll v5, etc.)
Display the time in seconds it took to calculate the hash type
Built in updating feature
Added optional debug logging information
Using a config file to save settings for portability"
"License for all versions of QuickHash
QuickHash is FREE for both personal and commercial usage."
Hashtab
-------
Short note:
the link to hashtab doesn't work
http://www.implbits.com/Products/HashTab.aspx
ah well .. neither http://www.implbits.com/hashtab.aspx which is the 'More" button on their website... nor the screenshots - error: the requested content cannot be loaded...
to obtain the file you need to enter your email
also it says, under 'Next steps'
" Once you have installed HashTab, just right click on any file "
Igorware
--------
So far I have used Igorware Hasher
http://www.igorware.com/hasher
it really is very flexible: you can copy a SHA-1 string to clipboard, run igorware hasher and it automatically compares to clipboard.
it is portable.
Regretfully...! It does not support SHA-256.
HashMyFiles
-----------
It supports SHA-256, SHA-384, SHA-512 as well
Version 1.80: Added support for SHA-256 and SHA-512 hashes.
Version 1.85: Added support for SHA-384 hashes.
-
As for myself: I am still looking for a tool, similar to Igorware Hasher
(small, portable, compare file to a string in clipboard .. -with- SHA-256 support)
=
You could try DP Hash ( http://www.paehl.de/cms/hash_dp ), which is portable and supports 34 hash algorithms. One drawback is that it doesn't support drag & drop.
Another is Hasher Lite ( http://www.den4b.com/?x=products&product=hasher ) which is portable too. This program supports drag & drop and SHA-256 as well. It is free for private/personal use.
I'm looking for a utility to validate the integrity of my archived photos and detect bit rot. I'd like the program to store the checksum in the file properties, unless that would invalidate the checksum. If it does invalidate the checksum, then simply saving to a new file. The next feature is to run validation tests on a schedule, detecting bit rot and alerting me. The ideal solution will also have parity information so the photo can be recovered.
Photo bit rot of archived photos is a big concern of mine.
Thanks. Jake.
I downloaded HashTab from Softpedia and didn't have to give an email address. I'm just mentioning this for the benefit of anyone who doesn't want to give out their email address.
what about md5check
http://www.softpedia.com/get/System/File-Management/MD5-Check.shtml
and md5 checker
http://www.georgejopling.co.uk/md5check/md5check.html