vitali-fedulov / imagehash2

Fast image similarity search with hash tables (Golang). Version 2 (LATEST)

License: MIT License

Go 100.00%
image-hash image-hashing image-similarity near-duplicate near-duplicate-detection similar-images similarity similarity-detection similarity-search

imagehash2's Introduction

Fast similar image search with Go (LATEST version)

Search for resized and near-duplicate images in very large image collections (thousands, millions, and more). The package generates 'real' hashes that can be used as hash-table keys, and consumes very little memory. It is recommended to cross-check each similarity result with the more precise images4 package.

Demo (a usage scenario for image similarity search).

Algorithm for nearest neighbour vector search by vector quantization.

Go doc.

Major (semantic) versions have their own repositories and are mutually incompatible:

Major version   Repository          Comment
2               imagehash2 (this)   recommended, with improved precision
1               imagehash           as fast, but has a generalization defect

Parameters

The most important parameter is numBuckets. It defines the granularity of hyperspace quantization: the higher the value, the more restrictive the comparison. When used together with the images4 package, a higher numBuckets also considerably accelerates the search, because fewer image ids fall into a single quantization cell.

The second parameter is epsilon, which can be safely set to 0.25.
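
To give a feel for how the two parameters interact, here is a minimal one-dimensional sketch of quantization with a border tolerance. It is a generic illustration, not the package's actual code: the helper name cells, the [0, 1) value range, and the reading of epsilon as a fraction of the cell width are all assumptions made for this example. A value always belongs to the cell it falls in, and a value close to a cell border is also assigned to the neighbouring cell. This is one way a single image can yield one central hash when stored and a set of hashes when queried.

```go
package main

import "fmt"

// cells returns the indices of the quantization cells that a value v
// in [0, 1) is assigned to. Hypothetical helper for illustration only;
// it is not part of imagehash2.
//
// Assumption: epsilon is measured as a fraction of one cell width.
func cells(v, epsilon float64, numBuckets int) []int {
	pos := v * float64(numBuckets) // position in units of cell width
	central := int(pos)
	if central >= numBuckets {
		central = numBuckets - 1
	}
	out := []int{central}

	// Distance from the lower border of the central cell.
	frac := pos - float64(central)
	if frac < epsilon && central > 0 {
		out = append(out, central-1) // close to the lower border
	}
	if frac > 1-epsilon && central < numBuckets-1 {
		out = append(out, central+1) // close to the upper border
	}
	return out
}

func main() {
	// Near the lower border of cell 1, so cell 0 is included too.
	fmt.Println(cells(0.28125, 0.25, 4)) // prints [1 0]

	// In the middle of cell 2, so only that cell is returned.
	fmt.Println(cells(0.625, 0.25, 4)) // prints [2]
}
```

In this sketch, raising numBuckets makes the cells narrower, so two values must be closer together to share a cell, which matches the "more restrictive" behaviour described above.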

Example of comparison for 2 photos

The demo shows only the hash-based similarity comparison, without building an actual hash table. A full implementation would use a hash table, typically a Go map.

package main

import (
	"fmt"
	"github.com/vitali-fedulov/imagehash2"
	"github.com/vitali-fedulov/images4"
)

const (
	// Recommended initial parameters.

	// Increase this value to get higher precision.
	numBuckets = 4

	// No need to change epsilon value.
	epsilon = 0.25
)

func main() {

	// Open and decode photos (skipping error handling for clarity).
	img1, _ := images4.Open("1.jpg")
	img2, _ := images4.Open("2.jpg")

	// Icons are compact image representations needed for comparison.
	icon1 := images4.Icon(img1)
	icon2 := images4.Icon(img2)

	// Hash table values.

	// Value to save to the hash table as a key with corresponding
	// image ids. Table structure: map[centralHash][]imageId.
	// imageId is simply an image number in a directory tree.
	// And centralHash type is uint64.
	centralHash := imagehash2.CentralHash9(icon1, epsilon, numBuckets)

	// Hash set to be used as a query to the hash table. Each hash from
	// the hashSet has to be checked against the hash table.
	hashSet := imagehash2.HashSet9(icon2, epsilon, numBuckets)

	foundSimilarImage := false

	// Check for hash matches. A full implementation that searches many
	// images would instead look up each hash in a hash table of type
	// map[centralHash][]imageId, where centralHash is a uint64.
	for _, hash := range hashSet {
		if centralHash == hash {
			foundSimilarImage = true
			break
		}
	}

	// Comparison result.
	if foundSimilarImage {

		fmt.Println("Images are *approximately* similar.")

		// It is recommended to cross-check the result with
		// the higher-precision func Similar from package images4.
		if images4.Similar(icon1, icon2) {
			fmt.Println("Images are similar")
		}

	} else {
		fmt.Println("Images are distinct.")
	}

}
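
The hash table mentioned in the comments above could be sketched as follows. This is a minimal illustration under stated assumptions, not code from the package: the type name index and its methods add and query are hypothetical, image ids are plain ints, and the hash values are placeholders. In real use the keys would come from imagehash2.CentralHash9 and the query hashes from imagehash2.HashSet9, assumed here to be a slice of uint64 values.

```go
package main

import "fmt"

// index maps a central hash to the ids of the images that produced it.
// Hypothetical type for illustration, following the table structure
// map[centralHash][]imageId described in the demo comments.
type index map[uint64][]int

// add stores an image id under its central hash.
func (ix index) add(centralHash uint64, imageID int) {
	ix[centralHash] = append(ix[centralHash], imageID)
}

// query returns the ids of all indexed images whose central hash
// matches any hash in the query hash set, without duplicates.
func (ix index) query(hashSet []uint64) []int {
	seen := make(map[int]bool)
	var ids []int
	for _, h := range hashSet {
		for _, id := range ix[h] {
			if !seen[id] {
				seen[id] = true
				ids = append(ids, id)
			}
		}
	}
	return ids
}

func main() {
	ix := make(index)

	// Placeholder central hashes for three indexed images.
	ix.add(101, 0)
	ix.add(101, 1)
	ix.add(202, 2)

	// Placeholder hash set for a query image.
	candidates := ix.query([]uint64{303, 101})
	fmt.Println(candidates) // prints [0 1]
}
```

The returned ids are only approximate candidates. As recommended above, each one would then be cross-checked against the query image with images4.Similar.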
