Coder Social home page Coder Social logo

imclust's Introduction

NAME

imclust - cluster images

USAGE

    imclust [OPTIONS] PATH...

DESCRIPTION

Imclust does cluster images from the directory, and produces a CSV-file output (or web visualization). It accepts also multiple directories as input, direct image files, or CSV files where the first column are paths to images, or combination of them.

Clustering is done in six steps:

  1. loading and resize of images,
  2. perception - transformation of images into vectors,
  3. reduction of dimensionality of vectors (optional),
  4. clustering,
  5. ordering/sorting of clusters,
  6. assembling the visualization or the output data-file.

Instead of perception/reduction vectors, precomputed vectors can be used for clustering using the -vec switch.

CACHING

    Caching of perception and reduction outputs is enabled by default.
    If in current directory or in any parent directory a "CACHE" directory
    is find, it will be used.  Otherwise inputs directory will be used
    for cache files.  Explicit cache directory can by requested by -cd,
    or no caching by -nc.

OPTIONS

      -h  This help.
      -v  Verbose.
      -f  Force recomputing all data, avoid cached.
   -html  Write html output instead of CSV.
 -o PATH  The base of the output file name.
  -j NUM  Number of threads for loading, dflt. 8.
 -cd DIR  Cache directory to use.
     -nc  Don't cache computed perception nor reduction vectors.
  -c NUM  Requested number of clusters.
  -n NUM  Number of clustering attempts/restarts.
  -m NUM  Limit the max number of images to cluster.
 -mt NUM  Number of members threshold for the cluster to be accepted.
 -dt NUM  Absolute distance threshold from the center cluster, for
          the image to be accepted.
-pt PERC  Percentual threshold.
  -b NUM  Batch size.
  -r NUM  Reduce vector dimensionality to NUM, dflt. auto from 3072.
 -rp NUM  No. of patterns to train reduction, dflt. auto from 8192.
 -nn STR  Model name, dflt. densenet201 (none, resnet50, resnet152v2, vgg16, inceptionv3, efficientnetb6, densenet121, densenet169, densenet201).
 -cl STR  Clustering algorithm, dflt. km (from km, bkm, kmd).
 -rd STR  Dimensionality reduction, dflt. pca (from none, pca).
  -s STR  Sorting of cluster centers, dflt. tsp (from none, size, tsp).
-vec STR  Suffix of files with precomputed vectors for every picture,
          for "dir/f_12.jpg" we expect "dir/f_12.vgg" if STR is "vgg".
          Comma separated list of suffixes is allowed, to concatenate
          several vectors into single input for clustering.
     -nm  No metric.
    -jpg  Jpg input files only.

CLUSTERING

      km  scikit KMeans
     bkm  scikit MiniBatchKMeans
     kmd  scikit KMedoids

EXAMPLES

  imclust -c 128 -nn resnet50 -r 512 -n 128 images

    Cluster images from the images directory to 128 clusters using resnet50
    embeddings reduced to 512 length with 128 runs of the algorithm.  Results
    will be in the images directory itself in images/.km32.resnet50-pca294.csv.
    Visulaize it by converting to html page with clusters, and open in browser:

  impage -tt images.km32.resnet50-pca152.csv
  or
  impage -tt -m images.km32.resnet50-pca152.csv
  firefox /tmp/impage/images.km32.resnet50-pca152/index.html

    Cluster directory dir according to both .model1 and .model2 precomputed
    raw vectors into 96 clusters as a best from 16 attempts:

  imclust -n 16 -c 96 -vec model1,model2 -o dir-cl.csv dir

SEE ALSO

impage -h

VERSION

imclust 0.5 (c) R.Jaksa 2021

imclust's People

Contributors

rudojaksa avatar

Watchers

 avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.