Image "Similarity". Is this a metric in the database?

hockeyrink:
I have a website, and many of its images are low-res. The hi-res imagery on my server has a naming problem: in some cases the website product image ("img1234.jpg") should have been named "img1234-SKU.jpg", but ISN'T.

So when I searched my webserver for the largest version of "img1234.jpg", it often got it wrong and found a different image that happens to be named "img1234.jpg".

I'd like to know if the image similarity feature could work this out for me. Like:

* FIND the website's image name & similarity metric
* COMPARE it to all other images of the same name, then
* SORT by metric, THEN by filesize.
Is this possible in a ThumbsPlus database, or am I gonna have to build an ugly Bash script using ImageMagick?  :-[
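
For what it's worth, the SORT step on its own is simple once a similarity score exists for each candidate. A minimal sketch in Python, assuming a lower score means "more similar"; the `Candidate` record, the paths and the numbers below are made up for illustration and do not come from ThumbsPlus:

```python
from collections import namedtuple

# Hypothetical record: 'distance' stands in for whatever similarity score
# is eventually available (lower = more similar to the website image).
Candidate = namedtuple("Candidate", "path distance size_bytes")

def rank_candidates(candidates):
    """Sort by similarity distance (ascending), then by file size (largest first)."""
    return sorted(candidates, key=lambda c: (c.distance, -c.size_bytes))

# Made-up example: three server files that all share the same name.
candidates = [
    Candidate("hires/a/img1234.jpg", 12, 2_400_000),
    Candidate("hires/b/img1234.jpg", 3, 1_100_000),
    Candidate("hires/c/img1234.jpg", 3, 3_800_000),
]

best = rank_candidates(candidates)[0]
print(best.path)  # hires/c/img1234.jpg
```

The closest match wins, and file size only breaks ties between equally similar images.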

Daan van Rooijen:
When you open your database in Access, MDB Viewer or some other tool, you'll find fields named 'metric1' and 'metric2' in the Thumbnail table. But how those are used in similarity comparisons is probably only for Cerious to know.

As a user of the program, you could simply:

- Download your website's files
- Thumbnail all images
- Press Ctrl-F on any image that you suspect has differently-sized duplicates, and use the Image Similarity tab to locate them.

hockeyrink:
Gotcha. That's a reasonable starting point. Thanks for the pointer!

Daan van Rooijen:
Good, I hope it will help you fix the problem!

Of course, you could also use the "Edit | Find Similar" function (with a threshold setting of 5 or so), to find all different sets of similar images at one time. In the results list, you could Tag (press INS) all images that should be renamed to reflect their higher resolution. This would create a Tagged Images gallery that contains all images that need renaming. Maybe that's a faster method.

hockeyrink:
Hmm... Might be a plan if I can't automate this. I have literally 2,500 images to review to make sure I've got the largest file of each.

Did some reading up on image similarity (pHash), and those "metric1" and "metric2" fields may be the key for me. Each seems to be a 512-bit field, which could be the result of a 64x64 image analysis (color & contrast maybe?). Then you are supposed to do something called a "Hamming distance" analysis, which is essentially "how many bits of FOO have to be changed to match BAR?". The more similar the images, the lower the Hamming distance.
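
A minimal sketch of a Hamming distance in Python, assuming the two values are available as equal-length byte strings. Whether "metric1" and "metric2" are actually meant to be compared this way is unknown; the function name and sample values are only illustrative:

```python
def hamming_distance(a: bytes, b: bytes) -> int:
    """Count the bit positions at which two equal-length byte strings differ."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    # XOR leaves a 1 bit wherever the inputs differ; count those bits per byte.
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))

# Made-up 16-bit values, not real ThumbsPlus metrics.
foo = bytes.fromhex("ff00")
bar = bytes.fromhex("fe01")
print(hamming_distance(foo, bar))  # 2
```

Identical inputs give 0, and every differing bit adds 1.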

I'll update the forum on how the process goes...
