piximetry asked:
Calculate visual hash of image
We need to calculate a visual hash function for images, that is: images that look the same should be mapped to the same hash value.
Any ideas where to look or how to approach this problem?
It's not hard to map images that look the same to the same hash value if you also map images that look different to the same hash value.
It can get tricky if you also want to map images that look different to different hash values.
You could make a hash that changes if you make even the slightest change in the image.
But suppose you want an all-black image to map to a different hash value than an all-white image.
If you also want a 99.9% black image to map to the same hash value as an all-black image, a 99.8% black image to the same value as a 99.9% black image, a 99.7% black image to the same value as a 99.8% black image, and so on,
then if you continue this way, eventually a 0.01% black image, and finally a 0.00% black image, would all have to share the hash value of the all-black image.
So if you want a 0% black image to have a different hash than a 100% black image, then at some point you'd have to have, say, a 42.3% black image hash to a different value than a 42.4% black image, even though they may be hard to tell apart by looking at them.
Instead of a hash, you might extract a feature, such as blackness, and say that
images "look the same" if the numerical values of that feature are close.
But when people look at an image, they can pay attention to more than just a single feature, so there may be different images that are both 50% black, but
with the black arranged in different ways so that someone would think they look different.
You could extract a numerical value for several different features, and say that the pictures "look the same" if all of the features have close numerical values.
Taken to an extreme, you could call each pixel a different feature.
Or you could extract a smaller set of features that are deemed relevant for the set of images you have.
If your "looks like" criterion is complicated, the set of features may also be complicated.
You might be able to use a simplified set of features that is adequate for a given set of images, but if the set of features does not capture everything that your "looks like" criterion captures, then it may be possible to contrive either a pair of images whose features all have the same numerical values but which look different, or a pair of images whose features have different values but which look the same.
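A minimal sketch of the single-feature idea in Python with Pillow (the language, the library, and the 0.01 tolerance are assumptions for illustration, not anything specified in this thread):

from PIL import Image

def blackness(path):
    # Illustrative "blackness" feature: mean darkness of the grayscale image.
    img = Image.open(path).convert("L")               # grayscale, 0..255
    pixels = list(img.getdata())
    return 1.0 - (sum(pixels) / len(pixels)) / 255.0  # 1.0 = all black

def look_same(path_a, path_b, tol=0.01):
    # "Look the same" collapses to: the feature values are close.
    return abs(blackness(path_a) - blackness(path_b)) <= tol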
You might try searching for "image similarity metric" for some ways to approach this problem.
ASKER
Ah, let me clear that up: what we need is to match images that have been resampled and/or resized. Visually, they should appear *exactly* the same, but on the pixel/byte level, there will be minute differences.
It's okay if changes in colour or shade do not get "detected", we just need to match across resampling and resizing.
Then how about resampling the images to a standard size and seeing if the resulting pixels are within sampling error?
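A rough sketch of that suggestion in Python with Pillow (the 64x64 standard size and the tolerance of 8 grey levels are assumed values for illustration):

from PIL import Image

STD_SIZE = (64, 64)

def std_pixels(path):
    # Resample to a fixed size so differently sized copies become comparable.
    img = Image.open(path).convert("L").resize(STD_SIZE, Image.LANCZOS)
    return list(img.getdata())

def match_within_error(path_a, path_b, tol=8):
    # Accept a match if every resampled pixel pair is within the tolerance.
    a, b = std_pixels(path_a), std_pixels(path_b)
    return all(abs(pa - pb) <= tol for pa, pb in zip(a, b))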
ASKER
I need to compare about 100k files with one another, so doing a 1-on-1 match would be too expensive. Doing comparisons on just some kind of hash value is MUCH faster.
We can do such a more expensive comparison as a last step though, to rule out false positives.
You might resample to a standard size, round the pixel values by the sampling error, and take any hash of those values, but there could be false negatives where one image has values just on the edge to round up, and the other has values just on the edge to round down.
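A sketch of the round-then-hash idea (the 32x32 size and quantization step of 16 are assumptions; as noted above, two copies with values near a quantization boundary can still round apart, giving false negatives):

import hashlib
from PIL import Image

def quantized_hash(path, size=(32, 32), step=16):
    # Resample, quantize by the assumed sampling error, then hash the result.
    img = Image.open(path).convert("L").resize(size, Image.LANCZOS)
    quantized = bytes(p // step for p in img.getdata())
    return hashlib.md5(quantized).hexdigest()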
The problem is that you're trying to hash too far: a 500x500 pixel image, 8 bits deep, can represent 2^2,000,000 different images. If you hash this down to 32 bits, you still have 2^(2,000,000 - 32) different images mapping to each hash code.
What I'd do is resample the pics down to, say, 50x50 and compare a few random pixels of that for nearness; that should filter out 99.99% of the dissimilar ones. Then, if a pair passes the 50x50 test, compare at a somewhat higher resolution, say 1/4.
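A sketch of that coarse filter in Python with Pillow (the sample count, the fixed seed, and the tolerance are arbitrary assumptions; the fixed seed just guarantees both images are probed at the same spots):

import random
from PIL import Image

def sample_pixels(path, size, coords):
    img = Image.open(path).convert("L").resize(size, Image.LANCZOS)
    return [img.getpixel(xy) for xy in coords]

def coarse_match(path_a, path_b, n=10, tol=12, size=(50, 50)):
    rng = random.Random(0)   # same seed -> same probe positions for both images
    coords = [(rng.randrange(size[0]), rng.randrange(size[1])) for _ in range(n)]
    a = sample_pixels(path_a, size, coords)
    b = sample_pixels(path_b, size, coords)
    return all(abs(pa - pb) <= tol for pa, pb in zip(a, b))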
Try taking the averages of each colour plane across the whole image. The averaging should eliminate 'slight' differences and will obviously be independent of image size; if you use ozo's idea of allowing a small variance, then you should avoid false negatives.
This method WILL generate false positives, but hopefully not many.
Paul
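A sketch of the per-plane averaging using Pillow's ImageStat (the epsilon of 2 grey levels is an assumed value, following ozo's small-variance suggestion):

from PIL import Image, ImageStat

def channel_means(path):
    img = Image.open(path).convert("RGB")
    return ImageStat.Stat(img).mean          # [avg R, avg G, avg B]

def averages_match(path_a, path_b, eps=2.0):
    a, b = channel_means(path_a), channel_means(path_b)
    return all(abs(x - y) <= eps for x, y in zip(a, b))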
You have 100k files.
How many unique hash values do you want or expect?
Using grg99's technique:
If you have 100k unique files, you might want a 1 Meg (2^20) hash table.
You could hash on average RGB values for binary subregions of the image.
If you use 2 bits/4 levels for R, G, and B for each quadrant (4 subregions),
you would get a 24-bit hash value.
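A sketch of that 24-bit hash in Python with Pillow: average RGB per quadrant, keep the top 2 bits of each channel, and pack 4 quadrants x 3 channels x 2 bits = 24 bits (the library and the bit-packing details are illustrative assumptions):

from PIL import Image, ImageStat

def quadrant_hash(path):
    img = Image.open(path).convert("RGB")
    w, h = img.size
    boxes = [(0, 0, w // 2, h // 2), (w // 2, 0, w, h // 2),
             (0, h // 2, w // 2, h), (w // 2, h // 2, w, h)]
    value = 0
    for box in boxes:
        means = ImageStat.Stat(img.crop(box)).mean   # [avg R, avg G, avg B]
        for m in means:
            value = (value << 2) | (int(m) >> 6)     # top 2 bits: 4 levels
    return value                                     # 4 * 3 * 2 = 24 bits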
ASKER
The hash value doesn't need to be limited to 32 bits -- it can also be a binary string of up to 256 or so bytes.
ASKER
Just an idea... do you think it makes sense to use a Fourier Transform of an image instead of the image itself?
And a further benefit to averaging: if you convert the image into one of the luminance/chrominance colour spaces, you could bias your match towards the luminance channel while allowing a wider epsilon on the chrominance; that way you might even catch rebalanced images.
Paul
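A sketch of that biasing in Python with Pillow (the epsilons of 2 for luminance and 8 for chrominance are assumed values, chosen only to illustrate the tighter/looser split):

from PIL import Image, ImageStat

def ycbcr_means(path):
    img = Image.open(path).convert("YCbCr")
    return ImageStat.Stat(img).mean          # [avg Y, avg Cb, avg Cr]

def biased_match(path_a, path_b, eps_y=2.0, eps_c=8.0):
    ya, cba, cra = ycbcr_means(path_a)
    yb, cbb, crb = ycbcr_means(path_b)
    return (abs(ya - yb) <= eps_y and        # strict on luminance
            abs(cba - cbb) <= eps_c and      # looser on chrominance
            abs(cra - crb) <= eps_c)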
Rather than a single hash, it's probably better to extract a small enough set of linear features that it's feasible to do a closeness comparison on each feature, or on the sum of the squares of the differences.
A set of 64 features might be the pixels of an 8x8 resampling, or 8x8 Fourier components.
You might do a quick test with a few features and then a more discriminating test with more features.
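A sketch of the 64-feature version in Python with Pillow, using the 8x8 resample variant (the sum-of-squares threshold is an assumed value; an 8x8 block of Fourier components could be substituted for the raw pixels):

from PIL import Image

def features_8x8(path):
    # 64 features: the pixels of an 8x8 grayscale resample.
    img = Image.open(path).convert("L").resize((8, 8), Image.LANCZOS)
    return list(img.getdata())

def ssd_match(path_a, path_b, threshold=1000):
    # Compare by the sum of the squares of the feature differences.
    a, b = features_8x8(path_a), features_8x8(path_b)
    return sum((x - y) ** 2 for x, y in zip(a, b)) <= threshold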
ASKER
I would like to keep it open and get a few more comments... there are some usable comments here, but not a 100% winner yet.
What do you mean by "look the same"?
Same size? Subject matter? Color palette? Fourier transform?
How different do two images have to be before they get their own hash value?
Is this the sort of thing you are looking for?
http://cui.unige.ch/spc/Presentation/2004/baumann_3_12_04.pdf