Coder Social home page Coder Social logo

dl002 / pix-plot Goto Github PK

View Code? Open in Web Editor NEW

This project forked from yaledhlab/pix-plot

0.0 0.0 0.0 5.5 MB

A WebGL viewer for UMAP or TSNE-clustered images

Home Page: https://s3-us-west-2.amazonaws.com/lab-apps/pix-plot/index.html

License: MIT License

Dockerfile 1.00% CSS 4.57% JavaScript 73.03% HTML 3.37% Python 18.03%

pix-plot's Introduction

PixPlot

This repository contains code that can be used to visualize tens of thousands of images in a two-dimensional projection within which similar images are clustered together. The image analysis uses Tensorflow's Inception bindings, and the visualization layer uses a custom WebGL viewer.

App preview

Dependencies

To install the Python dependencies, you can run (ideally in a virtual environment):

pip install -r utils/requirements.txt

If you have an NVIDIA GPU, consider replacing tensorflow with tensorflow-gpu in requirements.txt. You'll need to have CUDA and CUDNN working as well.

Image resizing utilities require ImageMagick compiled with jpg support:

brew uninstall imagemagick && brew install imagemagick

The html viewer requires a WebGL-enabled browser.

Quickstart

If you have a WebGL-enabled browser and a directory full of images to process, you can prepare the data for the viewer by installing the dependencies above then running:

git clone https://github.com/YaleDHLab/pix-plot && cd pix-plot
python utils/process_images.py "path/to/images/*.jpg"

To see the results of this process, you can start a web server by running:

# for python 3.x
python -m http.server 5000

# for python 2.x
python -m SimpleHTTPServer 5000

The visualization will then be available on port 5000.

Processing Data with Docker

Some users may find it easiest to use the included Docker image to visualize a dataset.

To do so, you must first install Docker. If you are on Windows 7 or earlier, you may need to install Docker Toolbox instead.

Once Docker is installed, start a terminal, cd into the folder that contains this README file, and run:

# build the container
docker build --tag pixplot --file Dockerfile .

# process images - use the `-v` flag to mount directories from outside
# the container into the container
docker run \
  -v $(pwd)/output:/pixplot/output \
  -v /Users/my_user/Desktop/my_images:/pixplot/images \
  pixplot \
  bash -c "cd pixplot && python3.6 utils/process_images.py images/*.jpg"

# run the web server
docker run \
  -v $(pwd)/output:/pixplot/output \
  -p 5000:5000 \
  pixplot \
  bash -c "cd pixplot && python3.6 -m http.server 5000"

Once the web server starts, you should be able to see your results on localhost:5000.

Curating Automatic Hotspots

By default, PixPlot uses k-means clustering to find twenty hotspots in the visualization. You can adjust the number of discovered hotspots by changing the n_clusters value in utils/process_images.py and re-running the script.

After processing, you can curate the discovered hotspots by editing the resulting output/plot_data.json file. (This file can be unwieldy in large datasets -- you may wish to disable syntax highlighting and automatic wordwrap in your text editor.) The hotspots will be listed at the very end of the JSON data, each containing a label (by default 'Cluster N') and the name of an image that represents the centroid of the discovered hotspot.

You can add, remove or re-order these, change the labels to make them more meaningful, and/or adjust the image that symbolizes each hotspot in the left-hand Hotspots menu. Hint: to get the name of an image that you feel better reflects the cluster, click on it in the visualization and it will appear suffixed to the URL.

Demonstrations

Collection # Images Collection Info Image Source
Per Bagge 29,782 Bio Lund University
Meserve-Kunhardt 27,000 Finding Aid Beinecke (Partial)

Acknowledgements

The DHLab would like to thank Cyril Diagne, a lead developer on the spectacular Google Arts Experiments TSNE viewer, for generously sharing ideas on optimization techniques used in this viewer.

pix-plot's People

Contributors

duhaime avatar pleonard212 avatar tokee avatar oliviersorba avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.