Visipics and finding similar images and duplicates with ImageMagick

buchert
Posts: 36
Joined: 2015-02-13T11:15:29-07:00

Visipics and finding similar images and duplicates with ImageMagick

Post by buchert »

I'm a Linux user and I like to work from the command line. But I still use the Windows program VisiPics with Wine to find similar images. It gets better results than the Linux program findimagedupes, and it's better than any other Windows program I've tried.

I checked the installation folder for VisiPics and found that it uses ImageMagick: http://pastebin.com/NxP5KDMC

So I'd like to stop using Visipics and just use ImageMagick, but I don't know the commands the VisiPics developer used. He stopped developing it in 2004.

There are a million programs that find duplicate images, but I like VisiPics because it finds similar images. I'd like a command that can search through a folder of hundreds of jpgs and give me a text output of similar images. Even better would be if the command could find the similar images and duplicates and move them to a subdirectory of the working directory, placing them next to each other through a naming scheme.

I don't know if this is a tall order, but thought I'd ask.
fmw42
Posts: 25562
Joined: 2007-07-02T17:14:51-07:00
Location: Sunnyvale, California, USA

Re: Visipics and finding similar images and duplicates with ImageMagick

Post by fmw42 »

ImageMagick has a compare function for measuring how similar or different two images are. See

http://www.imagemagick.org/script/compare.php
http://www.imagemagick.org/Usage/compare/

One of the methods is a perceptual hash. See
viewtopic.php?f=4&t=24906

Sorry, I am not a Windows user and know nothing about Visipics or how it uses Imagemagick.
Bonzo
Posts: 2971
Joined: 2006-05-20T08:08:19-07:00
Location: Cambridge, England

Re: Visipics and finding similar images and duplicates with ImageMagick

Post by Bonzo »

The VisiPics developer mentions algorithms which I presume he wrote and will probably not be built into Imagemagick. For a start you can compare different size images and I believe Imagemagick can not do that.
snibgo
Posts: 12159
Joined: 2010-01-23T23:01:33-07:00
Location: England, UK

Re: Visipics and finding similar images and duplicates with ImageMagick

Post by snibgo »

Given that one image may be a copy of another, but slightly rotated and cropped and scaled and adjusted for colour and tone, with a pretty border added, there is no simple algorithm that will test this situation.

This is the kind of pattern-matching that humans are good at but computers are not.

Visipics seems to work by applying a number of processes in a brute-force manner. The author has done the hard work of combining the low-level operations from IM (or whatever) into high-level decisions, thinking about both false negatives and false positives, and building an image-management layer on that.

Not a trivial task.
snibgo's IM pages: im.snibgo.com
fmw42
Posts: 25562
Joined: 2007-07-02T17:14:51-07:00
Location: Sunnyvale, California, USA

Re: Visipics and finding similar images and duplicates with ImageMagick

Post by fmw42 »

Bonzo wrote:The VisiPics developer mentions algorithms which I presume he wrote and will probably not be built into Imagemagick. For a start you can compare different size images and I believe Imagemagick can not do that.
Perceptual hash

Code: Select all

compare -metric phash image1 image2 null:
can compare two different size images (without the -subimage-search). It is to some extent rotation and scale invariant and also insensitive to other kinds of changes (brightness, contrast, compression, distortion). See my link in my previous post.

It is, however, generally limited to comparing color images to color images. It will not compare color to grayscale and is not good at comparing grayscale to grayscale.
buchert
Posts: 36
Joined: 2015-02-13T11:15:29-07:00

Re: Visipics and finding similar images and duplicates with ImageMagick

Post by buchert »

Thanks for the information. The developer has been working on version 2 since 2008, which will have a cli. https://fr.linkedin.com/in/gfouet

I'm curious to try the following command out on a folder of jpgs. I mostly use Visipics for color images, so it's not a big deal if it can't work with grayscale. Could this command:

Code: Select all

compare -metric phash image1 image2 null:
be modified to compare a folder of jpgs and move the similar ones to a subfolder?
fmw42
Posts: 25562
Joined: 2007-07-02T17:14:51-07:00
Location: Sunnyvale, California, USA

Re: Visipics and finding similar images and duplicates with ImageMagick

Post by fmw42 »

You would have to write a script and that depends upon your OS. That is why we always ask for the IM version and platform.

P.S. Similar images have small compare values. Identical images have a value of zero. The larger the value, the more different the images are.
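
When testing those values in a script, keep in mind that the phash metric is a floating-point number, while bash arithmetic is integer-only and `[[ a < b ]]` compares strings. A minimal sketch of a numeric test, assuming awk is available; the helper name `is_similar` and the threshold of 20 are hypothetical choices for illustration, not something ImageMagick prescribes:

```shell
#!/bin/bash
# is_similar VALUE THRESHOLD   (hypothetical helper name)
# Succeeds (exit status 0) when VALUE is numerically below THRESHOLD.
# awk does the comparison because bash arithmetic cannot handle floats.
is_similar() {
    awk -v v="$1" -v t="$2" 'BEGIN { exit !(v + 0 < t + 0) }'
}

# Example: 3.75 is below 20, so this prints "similar".
if is_similar 3.75 20; then
    echo "similar"
else
    echo "different"
fi
```

A suitable threshold depends on the image set, so it is worth printing the values for a few known pairs before settling on one.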
buchert
Posts: 36
Joined: 2015-02-13T11:15:29-07:00

Re: Visipics and finding similar images and duplicates with ImageMagick

Post by buchert »

Manjaro Linux 16.06.1
ImageMagick 6.9.5.2-1

I tried this but it doesn't work:

Code: Select all

for image in *.jpg;
    do comp=$(compare -metric phash $image null:);
    mkdir -p $comp;
    mv $image $comp;
done
buchert
Posts: 36
Joined: 2015-02-13T11:15:29-07:00

Re: Visipics and finding similar images and duplicates with ImageMagick

Post by buchert »

I'm getting these errors with the bash command above:

Code: Select all

compare: missing an image filename `test0001.jpg' @ error/compare.c/CompareImageCommand/957.
mkdir: missing operand
Try 'mkdir --help' for more information.
mv: missing destination file operand after 'test0001.jpg'
Try 'mv --help' for more information.
snibgo
Posts: 12159
Joined: 2010-01-23T23:01:33-07:00
Location: England, UK

Re: Visipics and finding similar images and duplicates with ImageMagick

Post by snibgo »

buchert wrote:compare: missing an image filename
That means you are missing an image filename. "compare" needs two inputs and one output. (The output is often "NULL:".)
snibgo's IM pages: im.snibgo.com
fmw42
Posts: 25562
Joined: 2007-07-02T17:14:51-07:00
Location: Sunnyvale, California, USA

Re: Visipics and finding similar images and duplicates with ImageMagick

Post by fmw42 »

You need two loops to compare two different images.

Code: Select all

compare -metric phash image1 image2 null:
If you put this result into a variable, you need to redirect stderr to stdout, because compare writes the metric value to stderr:

Code: Select all

value=$(compare -metric phash image1 image2 null: 2>&1)
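
The capture above can be wrapped in a small function that also checks that the result is actually a number, since an error message from compare ends up in the same variable. This is only a sketch: the function name `phash_distance` is hypothetical, and it assumes compare prints the metric as the first token of its output. The exit status of compare is deliberately ignored, because compare returns 1 for dissimilar images and 2 for errors.

```shell
#!/bin/bash
# phash_distance IMAGE1 IMAGE2   (hypothetical helper name)
# Prints the phash metric for the pair, or returns 1 when compare did not
# produce a number (for example, a missing or unreadable file).
phash_distance() {
    local out
    local re='^[0-9]+([.][0-9]+)?([eE][-+]?[0-9]+)?$'
    # compare reports the metric on stderr, so fold stderr into stdout.
    # "|| true": a status of 1 only means the images are dissimilar.
    out=$(compare -metric phash "$1" "$2" null: 2>&1) || true
    # Keep only the first whitespace-separated token.
    out=${out%% *}
    [[ $out =~ $re ]] || return 1
    printf '%s\n' "$out"
}

# Usage (needs ImageMagick and two real files):
#   value=$(phash_distance first.jpg second.jpg) && echo "$value"
```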
buchert
Posts: 36
Joined: 2015-02-13T11:15:29-07:00

Re: Visipics and finding similar images and duplicates with ImageMagick

Post by buchert »

Thanks!

I'm doing something wrong. This is how I applied your code:

Code: Select all

#!/bin/bash

for image in *.jpg;
    do value=$(compare -metric phash image1 image2 null: 2>&1);
    mkdir -p $value;
    mv $value;
done
What do I put in between mv and $value? I tried $image but got strange results. A bunch of folders were created and all the jpgs were moved into one of the folders.
fmw42
Posts: 25562
Joined: 2007-07-02T17:14:51-07:00
Location: Sunnyvale, California, USA

Re: Visipics and finding similar images and duplicates with ImageMagick

Post by fmw42 »

buchert wrote:Thanks!

I'm doing something wrong. This is how I applied your code:

Code: Select all

#!/bin/bash

for image in *.jpg;
    do value=$(compare -metric phash image1 image2 null: 2>&1);
    mkdir -p $value;
    mv $value;
done
This will not work for several reasons. You have no files named image1 and image2. They were just placeholders in my last message.

Second, using *.jpg, you only get one image at a time per loop iteration, so you can only compare $image to $image which will always be the same. You need two nested loops, one over $image1 and the other over $image2 (as variables for the filenames). You will also need to trap out the filenames so that you do not compare one image to itself. You may also want to set up the loop so that you do not check each image against another image twice, once as image1 vs image2 and the other as image2 vs image1.

Third, $value is the value returned from compare and it does not make sense to use them for directories. You should create new directory(s) ahead of time. Then use a conditional test inside the inner loop to find values less than a threshold (similar image) and move them to the new pre-created directory. If you want to move images that differ from each other, then make your conditional greater than some threshold. Or you can test values for different ranges and put the corresponding images into their respective range directories. Values can range from 0 to huge values, so there will be too many values for directories unless you put images into ranges as directories.

mv will be used to move the images that conform to the directories to those directories. Your mv is trying to find an image named $value and there will be no images. The command only returns the comparison value.
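
The nested-loop idea described above, with each pair checked only once and no image compared to itself, might be sketched as follows. It only prints a text report and moves nothing. The function name `list_similar_pairs` and the tab-separated output format are assumptions made for this illustration.

```shell
#!/bin/bash
# list_similar_pairs THRESHOLD FILE...   (hypothetical helper name)
# Prints "value<TAB>file1<TAB>file2" for every unordered pair of files
# whose phash value is numerically below THRESHOLD.
list_similar_pairs() {
    local threshold=$1
    shift
    local files=("$@")
    local i j value
    for ((i = 0; i < ${#files[@]}; i++)); do
        # Starting j at i + 1 avoids self-comparison and means that
        # image1 vs image2 is never repeated as image2 vs image1.
        for ((j = i + 1; j < ${#files[@]}; j++)); do
            # The metric arrives on stderr; a status of 1 only means "dissimilar".
            value=$(compare -metric phash "${files[i]}" "${files[j]}" null: 2>&1) || true
            # Skip anything that does not start with a digit (error messages).
            [[ $value =~ ^[0-9] ]] || continue
            if awk -v v="$value" -v t="$threshold" 'BEGIN { exit !(v + 0 < t + 0) }'; then
                printf '%s\t%s\t%s\n' "$value" "${files[i]}" "${files[j]}"
            fi
        done
    done
}

# Usage (needs ImageMagick):
#   list_similar_pairs 20 *.jpg > similar.txt
```

For n images this runs n*(n-1)/2 comparisons, so a folder of a few hundred jpgs means tens of thousands of compare calls.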
fmw42
Posts: 25562
Joined: 2007-07-02T17:14:51-07:00
Location: Sunnyvale, California, USA

Re: Visipics and finding similar images and duplicates with ImageMagick

Post by fmw42 »

Here is a start. However, if you do not check the image1 vs image2 and image2 vs image1 issue, then you will end up moving both images.

Also note, you might find one smaller version of a larger image and it might move the smaller version if found first, since the phash will see them as the same. So you might want to add another conditional to compare sizes so you move the larger images of $image1 and $image2.

Code: Select all

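threshold=20
for image1 in *.jpg; do
    for image2 in *.jpg; do
        # Never compare an image with itself.
        [ "$image1" = "$image2" ] && continue
        # Skip files that an earlier iteration already moved.
        [ -e "$image1" ] && [ -e "$image2" ] || continue
        # compare prints the metric on stderr, hence 2>&1.
        value=$(compare -metric phash "$image1" "$image2" null: 2>&1)
        # The metric is a float; [[ $value < $threshold ]] would compare strings.
        if awk -v v="$value" -v t="$threshold" 'BEGIN { exit !(v < t) }'; then
            mv "$image1" "/path2emptydirectory/$image1"
            break
        fi
    done
done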
Just edited to fix the $ in the compare statement.
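
For the size check and the naming scheme asked about in the first post, one possible approach is to move both images of a similar pair into a subdirectory under a shared numeric prefix, with the larger image first. This is a sketch under assumptions, not a tested tool: `pixel_count` and `move_pair` are hypothetical names, the `0001_1_` prefix format is just one choice, and it assumes `identify -format '%w %h\n'` succeeds on each file.

```shell
#!/bin/bash
# pixel_count FILE   (hypothetical helper name)
# Prints width * height as reported by ImageMagick's identify.
pixel_count() {
    local w h
    read -r w h < <(identify -format '%w %h\n' "$1")
    echo $(( w * h ))
}

# move_pair GROUP DEST FILE1 FILE2   (hypothetical helper name)
# Moves both files into DEST as GROUP_1_name and GROUP_2_name, larger image
# first, so that similar images sort next to each other in a listing.
# GROUP must be a plain decimal integer without leading zeros.
move_pair() {
    local group dest=$2 first=$3 second=$4
    group=$(printf '%04d' "$1")
    if (( $(pixel_count "$second") > $(pixel_count "$first") )); then
        first=$4
        second=$3
    fi
    mkdir -p "$dest"
    mv -- "$first" "$dest/${group}_1_$(basename "$first")"
    mv -- "$second" "$dest/${group}_2_$(basename "$second")"
}

# Usage (needs ImageMagick and real files):
#   move_pair 1 similar photo_small.jpg photo_large.jpg
```

Moving both files of a pair leaves the decision about which copy to keep to a later manual review.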