We have 100k images from which we need to remove duplicates.
Two images count as duplicates when they are VISUALLY the same, regardless of image format, quality setting, dimensions, or small differences. Obviously we cannot use an MD5 sum of the file itself, so we need something more advanced.
Ideally, we'd like to compute some kind of hash value that is based solely on visual appearance.
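To make the idea concrete, here is a minimal sketch of one well-known technique of this kind, a difference hash (dHash). It is not necessarily the right answer for this data set, only an illustration of a hash derived from appearance rather than file bytes. It assumes each image has already been decoded to a 2D list of grayscale values (in practice an imaging library would do that step); the function names `resize_gray`, `dhash`, and `hamming` are made up for this example.

```python
def resize_gray(pixels, new_w, new_h):
    """Box-average a 2D list of grayscale values down to new_w x new_h."""
    old_h, old_w = len(pixels), len(pixels[0])
    out = []
    for y in range(new_h):
        y0 = y * old_h // new_h
        y1 = max(y0 + 1, (y + 1) * old_h // new_h)
        row = []
        for x in range(new_w):
            x0 = x * old_w // new_w
            x1 = max(x0 + 1, (x + 1) * old_w // new_w)
            block = [pixels[j][i] for j in range(y0, y1) for i in range(x0, x1)]
            row.append(sum(block) / len(block))
        out.append(row)
    return out


def dhash(pixels, hash_size=8):
    """Return a hash_size*hash_size-bit integer: one bit per adjacent pixel pair.

    A bit is 1 when the left pixel is brighter than its right neighbour, so
    the hash depends on the brightness gradients, not on exact pixel values.
    """
    small = resize_gray(pixels, hash_size + 1, hash_size)
    bits = 0
    for row in small:
        for left, right in zip(row, row[1:]):
            bits = (bits << 1) | (1 if left > right else 0)
    return bits


def hamming(a, b):
    """Number of differing bits between two hashes."""
    return bin(a ^ b).count("1")


if __name__ == "__main__":
    # Synthetic stand-ins for real images: a gradient and a 2x upscaled copy.
    original = [[255 - x for x in range(72)] for _ in range(64)]
    upscaled = [[255 - x // 2 for x in range(144)] for _ in range(128)]
    print(hamming(dhash(original), dhash(upscaled)))
```

With 100k images, the hash would be computed once per image, and two images treated as candidate duplicates when the Hamming distance between their hashes is below a small threshold. That threshold is an assumption to tune on your own data, not a fixed constant.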