What is a perceptual hash function?
When programmers need to create a shorter surrogate for a larger file or block of data, they often turn to hash functions. These programmers analyze a block of data and produce a short number that can act as a stand-in or shorthand for the larger collection of bytes, sometimes in an index and other times in a more complicated calculation.
Perceptual hash functions are tuned to produce the same result for similar images or sounds. They aim to imitate human perception by focusing on the types of features (colors and frequencies) that drive human sight and hearing.
Many popular non-perceptual hash functions are very sensitive to the smallest changes. Simply flipping one bit, say by changing the amount of blue in a pixel from 200 to 199 units, might change half of the bits in the hash functions. Perceptual hash functions are designed to return answers for images or sounds that a human might feel are similar. That is, small changes in the media don’t affect the output.
Hash functions simplify searching and indexing through databases and other data storage. Hash tables, a popular data structure known for fast response, rely on a good hash function as an index to quickly locate the larger block of data. Facial recognition algorithms, for instance, use a perceptual hash function to organize photos by the people in the image. The algorithms use the relative distances between facial features — like eyes, nose, and mouth — to construct a short vector of numbers that can organize a collection of images.
Some algorithms depend on hash functions to flag changes. These approaches, often called “check sums,” began as a quick way to look for mistransmitted data. Both the sender and receiver might add together all of the bytes in the data and then compare the answer. If both agree, the algorithm might assume no mistakes were made — an assumption that is not guaranteed. If the errors made in transmission happened in certain a way — say adding three to one byte while also subtracting three from a different one — the mistakes would cancel out and the checksum algorithm would fail to catch the error.
All hash functions are vulnerable to “collisions” when two different blocks of data produce the same hash value. This happens more often with hash functions that produce shorter answers because the number of possible data blocks is much, much greater than the number of potential answers.
Some functions, like the U.S. government’s standard Secure Hash Algorithm (SHA256), are designed to make it practically impossible for anyone to find a collision. They were designed using the same principles as strong encryption routines to prevent reverse engineering. Many cryptographic algorithms rely on secure hash functions like SHA256 as a building block, and some refer to them colloquially as the “duct tape” of cryptography.
Perceptual hash functions can’t be as resistant. They are designed so that similar data produces a similar hash value, something that makes it easy to search for a collision. This makes them vulnerable to spoofing and misdirection. Given one file, it is relatively easy to construct a second file that looks and appears quite different but produces the same perceptual hash value.
How do perceptual hash functions work?
READ full article: What is a perceptual hash function?