tfjs-tsne's Introduction

tSNE for TensorFlow.js

This library contains an improved tSNE implementation that runs in the browser.

Installation & Usage

You can use tfjs-tsne via a script tag or via NPM.

Script tag

To use tfjs-tsne via a script tag, you need to load tfjs first. Put the following tags into the head section of your HTML page to load the library.

<script src="https://cdn.jsdelivr.net/npm/@tensorflow/[email protected]"></script>
<script src="https://cdn.jsdelivr.net/npm/@tensorflow/tfjs-tsne"></script>

This library will create a tsne variable on the global scope. You can then do the following:

// Create some data
const data = tf.randomUniform([2000,10]);

// Get a tsne optimizer
const tsneOpt = tsne.tsne(data);

// Compute a T-SNE embedding, returns a promise.
// Runs for 1000 iterations by default.
tsneOpt.compute().then(() => {
  // tsneOpt.coordinates() returns a *tensor* with x, y coordinates of
  // the embedded data.
  const coordinates = tsneOpt.coordinates();
  coordinates.print();
});

Via NPM

yarn add @tensorflow/tfjs-tsne

or

npm install @tensorflow/tfjs-tsne

Then

import * as tf from '@tensorflow/tfjs';
import * as tsne from '@tensorflow/tfjs-tsne';

// Create some data
const data = tf.randomUniform([2000,10]);

// Initialize the tsne optimizer
const tsneOpt = tsne.tsne(data);

// Compute a T-SNE embedding, returns a promise.
// Runs for 1000 iterations by default.
tsneOpt.compute().then(() => {
  // tsneOpt.coordinates() returns a *tensor* with x, y coordinates of
  // the embedded data.
  const coordinates = tsneOpt.coordinates();
  coordinates.print();
});

API

tsne.tsne(data: tf.Tensor2D, config?: TSNEConfiguration)

Creates and returns a TSNE optimizer.

  • data must be a Rank 2 tensor. Shape is [numPoints, dataPointDimensions]
  • config is an optional object with the following params (all are optional):
    • perplexity: number — defaults to 18. Max value is defined by hardware limitations.
    • verbose: boolean — defaults to false
    • exaggeration: number — defaults to 4
    • exaggerationIter: number — defaults to 300
    • exaggerationDecayIter: number — defaults to 200
    • momentum: number — defaults to 0.8
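
As a rough sketch of how a partial config combines with the defaults listed above, the snippet below merges user-supplied values over those defaults. The `TSNE_DEFAULTS` constant and the `withTsneDefaults` helper are hypothetical names used for illustration only; they are not part of tfjs-tsne, and the library may resolve defaults differently internally.

```javascript
// Default values copied from the parameter list above.
const TSNE_DEFAULTS = Object.freeze({
  perplexity: 18,
  verbose: false,
  exaggeration: 4,
  exaggerationIter: 300,
  exaggerationDecayIter: 200,
  momentum: 0.8,
});

// Hypothetical helper (not part of tfjs-tsne): fill in any parameter
// the caller left out with its documented default.
function withTsneDefaults(config = {}) {
  return { ...TSNE_DEFAULTS, ...config };
}

const config = withTsneDefaults({ perplexity: 30, verbose: true });
console.log(config.perplexity); // 30 (overridden)
console.log(config.momentum);   // 0.8 (default)
```

In a page that has loaded tfjs-tsne, an object like this would be passed as the second argument, for example `tsne.tsne(data, config)`.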

.compute(iterations: number): Promise

The most direct way to get a tsne projection. Automatically runs the knn preprocessing and the tsne optimization. Returns a promise to indicate when it is done.

  • iterations: the number of iterations to run the tsne optimization for (1000 by default). The number of knn steps is calculated automatically.

.iterateKnn(iterations: number): Promise

When running tsne iteratively (see section below), this runs the knn preprocessing for the specified number of iterations.

.iterate(iterations: number): Promise

When running tsne iteratively (see section below), this runs the tsne step for the specified number of iterations.

.coordinates(normalize: boolean): tf.Tensor

Gets the current x, y coordinates of the projected data as a tensor. By default the coordinates are normalized to the range 0-1.

.coordsArray(normalize: boolean): Promise<number[][]>

Gets the current x, y coordinates of the projected data as a JavaScript array. By default the coordinates are normalized to the range 0-1. This function is async and returns a promise.
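
Because the coordinates are normalized to the range 0-1 by default, drawing them usually only requires scaling by the size of the drawing surface. The sketch below illustrates that step; `toPixels` is a hypothetical helper, not part of tfjs-tsne, and the `sample` array stands in for the value that `coordsArray()` resolves to.

```javascript
// Hypothetical helper (not part of tfjs-tsne): scale normalized [x, y]
// pairs in the range 0-1 to pixel positions on a width x height surface.
function toPixels(coords, width, height) {
  return coords.map(([x, y]) => [x * width, y * height]);
}

// Stand-in for the array that coordsArray() resolves to.
const sample = [[0, 0], [0.5, 0.5], [1, 1]];
const pixels = toPixels(sample, 400, 200);
console.log(pixels); // (0, 0), (200, 100), (400, 200)
```

With the real library this would typically be used as `tsneOpt.coordsArray().then(coords => toPixels(coords, canvas.width, canvas.height))`, where `canvas` is whatever element the page draws on.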

Computing tSNE iteratively

While the .compute method provides the most direct way to get an embedding, you can also compute the embedding iteratively and have more control over the process.

The first step is computing the KNN graph using iterateKnn.

Then you can compute the tSNE iteratively and examine the result as it evolves.

The code below shows what that would look like:

const data = tf.randomUniform([2000,10]);
const tsneOpt = tsne.tsne(data);

async function iterativeTsne() {
  // Get the suggested number of KNN iterations to perform.
  const knnIterations = tsneOpt.knnIterations();
  // Do the KNN computation. This needs to complete before we run tsne.
  for (let i = 0; i < knnIterations; ++i) {
    await tsneOpt.iterateKnn();
    // You can update knn progress in your UI here.
  }

  const tsneIterations = 1000;
  for (let i = 0; i < tsneIterations; ++i) {
    await tsneOpt.iterate();
    // Draw the embedding here...
    const coordinates = tsneOpt.coordinates();
    coordinates.print();
  }
}

iterativeTsne();
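
Redrawing after every single iteration can be wasteful, so one option is to run the optimization in batches and update the display once per batch. The sketch below shows that pattern. `iterateInBatches` is a hypothetical helper, not part of tfjs-tsne; it only assumes an object with an `iterate(n)` method that returns a promise, as described in the API section. The `stubOptimizer` is a stand-in so the snippet runs without WebGL.

```javascript
// Hypothetical helper (not part of tfjs-tsne). `optimizer` is any object
// with an iterate(n) method that returns a promise.
async function iterateInBatches(optimizer, total, batchSize, onBatch) {
  let done = 0;
  while (done < total) {
    // The last batch may be smaller than batchSize.
    const n = Math.min(batchSize, total - done);
    await optimizer.iterate(n);
    done += n;
    if (onBatch) await onBatch(done, total);
  }
  return done;
}

// Stub optimizer so the sketch runs without WebGL; it only counts steps.
const stubOptimizer = {
  steps: 0,
  async iterate(n = 1) { this.steps += n; },
};

const progress = [];
const finished = iterateInBatches(stubOptimizer, 50, 20, (done) => {
  progress.push(done); // a real page would redraw the embedding here
});
finished.then((n) => console.log(n, progress)); // 50 iterations in batches ending at 20, 40, 50
```

With the real optimizer, the KNN iterations would still need to complete before calling this helper.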

Example

We also have an example of using this library to perform TSNE on the MNIST dataset here.

Limitations

This library requires WebGL 2 support and thus will not work on certain devices, mobile devices especially. Currently it works best on desktop devices.

From our current experiments we suggest limiting the data size passed to this implementation to data with a shape of [10000,100], i.e. up to 10000 points with 100 dimensions each. You can do more but it might slow down.

Above a certain number of data points the computation of the similarities becomes a bottleneck, a problem that we plan to address in the future.
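
The size guideline above can be turned into a simple pre-flight check before creating the optimizer. This is only a sketch: `withinSuggestedSize` and the two constants are hypothetical names, not part of tfjs-tsne, and the limits are the suggested values from the text rather than hard limits enforced by the library.

```javascript
// Suggested upper bounds taken from the guideline above.
const SUGGESTED_MAX_POINTS = 10000;
const SUGGESTED_MAX_DIMS = 100;

// Hypothetical helper (not part of tfjs-tsne): returns true when a
// [numPoints, dataPointDimensions] shape is within the suggested size.
function withinSuggestedSize([numPoints, dims]) {
  return numPoints <= SUGGESTED_MAX_POINTS && dims <= SUGGESTED_MAX_DIMS;
}

console.log(withinSuggestedSize([2000, 10]));   // true
console.log(withinSuggestedSize([20000, 300])); // false
```

For a tensor, the shape to check is available as `data.shape`.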

Implementation

This work makes use of linear tSNE optimization for the optimization of the embedding and an optimized brute force computation of the kNN graph in the GPU.

Reference

Reference to cite if you use this implementation in a research paper:

@article{TFjs:tSNE,
  author = {Nicola Pezzotti and Alexander Mordvintsev and Thomas Hollt and Boudewijn P. F. Lelieveldt and Elmar Eisemann and Anna Vilanova},
  title = {Linear tSNE Optimization for the Web},
  year = {2018},
  journal={arXiv preprint arXiv:1805.10817},
}

tfjs-tsne's People

Contributors

1wheel, bldrvnlw, darthtrevino, dependabot[bot], dsmilkov, fil, tafsiri, zhihua-chen

tfjs-tsne's Issues

Using script tag sample code leads to an exception - works with older version of TFJS

Using the sample code at https://github.com/tensorflow/tfjs-tsne#script-tag we get the following exception

tfjs-tsne:1 Uncaught TypeError: n.ENV.findBackend is not a function
at F (tfjs-tsne:1)
at new t (tfjs-tsne:1)
at Object.t.tsne (tfjs-tsne:1)
at tsne-test.html:7

However, changing it to load an older TFJS works fine

<script src="https://cdn.jsdelivr.net/npm/@tensorflow/[email protected]"></script>

The following causes the exception

<script src="https://cdn.jsdelivr.net/npm/@tensorflow/tfjs"></script>
<script src="https://cdn.jsdelivr.net/npm/@tensorflow/tfjs-tsne"></script>
<script>
  // Create some data
  const data = tf.randomUniform([2000,10]);

  // Get a tsne optimizer
  const tsneOpt = tsne.tsne(data);

  // Compute a T-SNE embedding, returns a promise.
  // Runs for 1000 iterations by default.
  tsneOpt.compute().then(() => {
    // tsne.coordinate returns a *tensor* with x, y coordinates of
    // the embedded data.
    const coordinates = tsneOpt.coordinates();
    coordinates.print();
  });
</script>

Comparing compute and iterable t-SNE

Hi,

I am developing a small interface where the user can choose whether to compute t-SNE all at once or to observe the iterations. Unfortunately, I am not able to obtain comparable results using the base code published in the MANIFEST. I am using the Iris data.

var tsneOpt = tsne.tsne( tensorData, { perplexity : PERPLEXITY } );

// This is the 'compute' version
tsneOpt.compute( ITERATIONS ).then( () => {
  tsneOpt.coordsArray().then( coords => {
    console.log( 'Projection finished!' );
    coords.forEach( ( d, i ) => {
      data[ i ].x = d[ 0 ];
      data[ i ].y = d[ 1 ];
    } );
    drawProjection();
  } );
} );

// This is the iterable version
async function iterateProjection() {
  await tsneOpt.iterateKnn();
  const step = 20;
  for( let i = 0; i < ITERATIONS; i += step ) {
    await tsneOpt.iterate( step );
    tsneOpt.coordsArray().then( coords => {
      console.log( 'Projection stepped!' );
      coords.forEach( ( d, i ) => {
        data[ i ].x = d[ 0 ];
        data[ i ].y = d[ 1 ];
      } );
      drawProjection();
    } );
  }
}

Thank you in advance!

Cannot read property 'embedding' of undefined

Running the synthetic data example gives me this error:

tsne_optimizer.ts:579 Uncaught (in promise) TypeError: Cannot read property 'embedding' of undefined
    at tsne_optimizer.ts:579
    at Object.Tracking.tidy (tracking.js:36)
    at TSNEOptimizer.<anonymous> (tsne_optimizer.ts:577)
    at step (tsne_optimizer.ts:23)
    at Object.next (tsne_optimizer.ts:23)
    at fulfilled (tsne_optimizer.ts:23)

This happens immediately after the "tsne.ts:299 Initializing probabilities" message.

I tried lowering the number of points and dimensionality but it had no effect.

an example where knn is super slow & model is not converging

I'm trying to port https://beta.observablehq.com/@fil/tsne-js (made with https://github.com/karpathy/tsnejs) to tfjs-tsne, and I hit two problems:

  1. the kNN procedure takes forever
  2. then the model doesn't converge to anything meaningful

I'm not sure what I'm doing wrong… posting here as @Nicola17 seems to think the issue might be with tfjs-tsne.

Example under @observablehq:
https://beta.observablehq.com/@fil/hello-tfjs-tsne

and as a standalone page:
https://bl.ocks.org/Fil/ee79c135db0c3451d4f44c60925e9466

CUDA/OpenGL (non-WebGL) implementation

I'm excited about a tSNE implementation with linear complexity. I'd like to test it with large datasets, running in a desktop/server environment instead of in a browser, however. Do you have any plans to add support for a CUDA backend?

Adding a convenience for reshaping the coordinates tensor to [[x1,y1] ...]

After calling .coordinates if you want to plot the data you often will need to do something like

const coords = [];
for (let i = 0; i < coordinates.length; i += 2) {
  coords.push([coordinates[i], coordinates[i + 1]]);
}

What about providing a function that does this transformation (and awaits the tensor data) for users?

If you agree I could work on adding this. cc @Nicola17
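
The transformation proposed above could look roughly like the following. This is a sketch only: `toPairs` is a hypothetical name, not an existing tfjs-tsne function, and the `flat` array stands in for the values obtained by awaiting `data()` on the coordinates tensor.

```javascript
// Hypothetical helper: group a flat [x1, y1, x2, y2, ...] array
// (plain or typed) into [[x1, y1], [x2, y2], ...].
function toPairs(flat) {
  if (flat.length % 2 !== 0) {
    throw new Error('Expected an even number of values');
  }
  const pairs = [];
  for (let i = 0; i < flat.length; i += 2) {
    pairs.push([flat[i], flat[i + 1]]);
  }
  return pairs;
}

// Stand-in for the typed array that `await coordinates.data()` yields.
const flat = Float32Array.from([0.25, 0.5, 0.75, 1]);
const pairs = toPairs(flat);
console.log(pairs); // two points: (0.25, 0.5) and (0.75, 1)
```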

Nearly every coordinate is 0.5, 0.5

I used tfjs-tsne successfully on word embeddings from about 10 languages derived from https://fasttext.cc/docs/en/pretrained-vectors.html

But the Swedish one had all but a few of 20,000 items with coordinates 0.5, 0.5.

This was also a problem with Finnish but tfjs-tsne worked fine for English, German, Italian, French, Hindi, Chinese, and more.

I left all the defaults as with the sample code from https://github.com/tensorflow/tfjs-tsne

Here's a page for generating the Swedish coordinates - https://ecraft2learn.github.io/ai/word-embeddings/tsne-sv.html

In contrast this very similar German one worked fine - https://ecraft2learn.github.io/ai/word-embeddings/tsne-de.html

(These pages take a few minutes on my laptop.)

The word embeddings for Swedish seem to work fine when used to find the closest words so it is likely that tfjs-tsne is the cause and not the word embeddings.

Unclear cause of "Requested texture size [19991x2] greater than WebGL maximum..."

Using the same scripts for 13 languages, only Japanese yields a "Requested texture size [19991x2] greater than WebGL maximum" error. Note that Chinese works fine, and reducing the Japanese data from 20,000 entries to 16,000 entries also works fine. Clearly some datasets are too large for tfjs-tsne, but in this case all languages were 20000x300, so why is only Japanese causing this error?

https://ecraft2learn.github.io/ai/word-embeddings/tsne-ja16000.html (works fine)

https://ecraft2learn.github.io/ai/word-embeddings/tsne-ja20000.html (produces the following error)

Uncaught (in promise) Error: Requested texture size [19991x2] greater than WebGL maximum on this browser / GPU [16384x16384].
at validateTextureSize (tfjs:2)
at createAndConfigureTexture (tfjs:2)
at createFloat32MatrixTexture (tfjs:2)
at e.createFloat32MatrixTexture (tfjs:2)
at e.acquireTexture (tfjs:2)
at e.acquireTexture (tfjs:2)
at e.uploadToGPU (tfjs:2)
at e.compileAndRun (tfjs:2)
at e.slice (tfjs:2)
at ENV.engine.runKernel.$x (tfjs:2)

Wrong install instructions

Maybe I'm mistaken but when I tried to install the library with

yarn add tensorflow@tfjs-tsne

it did not work and I got a prompt where I had to specify the tensorflow version to install.

When I followed the instruction on the npm page it worked.

yarn add @tensorflow/tfjs-tsne

Maybe you should update the install instructions.

Best,
Jan

README.md script tag instructions don't work with tfjs v1

This throws an "Uncaught (in promise) TypeError: n is not a function" error in Chrome and Firefox:

<!DOCTYPE html>
<meta charset='utf-8'>

<script src="https://cdn.jsdelivr.net/npm/@tensorflow/tfjs"></script>
<script src="https://cdn.jsdelivr.net/npm/@tensorflow/tfjs-tsne"></script>

<script>
  const tfdata = tf.randomUniform([2000,10]);
  const tsneOpt = tsne.tsne(tfdata);
  tsneOpt.compute()
</script>

Downgrading tfjs fixes:

<script src="https://cdn.jsdelivr.net/npm/@tensorflow/[email protected]"></script>

T-SNE in browser example throws error

The example at this URL: https://storage.googleapis.com/tfjs-examples/tsne-mnist-canvas/dist/index.html throws an error when I hit Start TSNE.

Varyings over maximum register limit tfjs-tsne-mnist.js:25403
Uncaught (in promise) Error: Failed to link vertex and fragment shaders. tfjs-tsne-mnist.js:25403
at Object.linkProgram (tfjs-tsne-mnist.js:25403)
at Object.createVertexProgram (tfjs-tsne-mnist.js:28154)
at Object.createBruteForceKNNProgram (tfjs-tsne-mnist.js:28422)
at KNNEstimator.initilizeCustomWebGLPrograms (tfjs-tsne-mnist.js:28633)
at new KNNEstimator (tfjs-tsne-mnist.js:28586)
at TSNE.<anonymous> (tfjs-tsne-mnist.js:29959)
at step (tfjs-tsne-mnist.js:29884)
at Object.next (tfjs-tsne-mnist.js:29851)
at fulfilled (tfjs-tsne-mnist.js:29818)

System Info:
Chrome v67.0.3396.99 Official Build 64-bit
MacBook Pro 13" 2016 High Sierra 10.13.5
Intel Iris Graphics 550 1.5GB

The example is linked to from this page.

"Tensor is disposed" error in synthetic_data example

I observe an error when running the synthetic data example:

cd examples/synthetic_data
yarn
yarn watch

Specifically, it appears that the reshape op in the generateData function returns a disposed Tensor:

Uncaught (in promise) Error: Tensor is disposed.
    at Tensor.throwIfDisposed (tensor.js:253)
    at Tensor.add (tensor.js:386)
    at tf.tidy (index.js:45)
    at Object.Tracking.tidy (tracking.js:36)
    at generateData (index.js:42)
    at start (index.js:30)
    at HTMLDocument.document.addEventListener (index.js:123)

I confirmed that tf.version_core == "0.10.1" as expected.

Update NPM build

It has been 4 months since the last version of tfjs-tsne was published on NPM and the project has received some useful fixes. Is it possible to publish a new refreshed build on NPM?

Support for tfjs-node

Are there any plans to make tfjs-tsne work with tfjs-node so that it can be used in a non-browser environment?
