
phymucs / clustering_vae


This project is forked from afagarap/pt-clustering-ae

Code implementation for "Improved Clustering Performance using Latent Data Representation from Variational Autoencoder" by Agarap & Azcarraga (2019)

License: GNU General Public License v3.0

Languages: Jupyter Notebook 99.88%, Python 0.12%

clustering_vae's Introduction

Improved Clustering Performance using Latent Data Representation from Variational Autoencoder

by Abien Fred Agarap and Arnulfo Azcarraga, PhD

Abstract

Clustering is a well-studied task in terms of the development of efficient algorithms, better initialization algorithms, and better distance functions. Conventionally, datasets are used as they are, or with some normalization, before being used as input to a clustering algorithm. In this study, we present using the learned latent data representation of a variational autoencoder as input to a clustering algorithm. We compared the clustering on the original data representation, on the reconstructed data, and on the learned latent data representation. Results have shown that the best clustering was on the latent data representation, with a Davies-Bouldin Index (DBI) of 1.328, Silhouette Score (SS) of 0.238, and Calinski-Harabasz Score (CHS) of ~3609, while using the original data representation gained a DBI of 1.811, SS of 0.155, and CHS of ~1269, both on the Fashion-MNIST dataset. We had similar findings on the MNIST dataset, as the best clustering was on the latent data representation as well, with a DBI of 1.557, SS of 0.186, and CHS of ~1497, while using the original data representation gained a DBI of 2.857, SS of 0.061, and CHS of ~391. This stands to reason, since the latent data representation includes only the most salient features of the data, and its variance is minimized through the optimization of the VAE model.

Results

Experiment Setup

Experiments were done in Google Colaboratory, which provides free access to a Tesla K80 GPU with a 12 GB RAM allotment. We trained the VAE model with Adam optimization on the MNIST dataset (60,000 training images) for 60 epochs with mini-batches of 512, which took 3 minutes and 35 seconds, and on the Fashion-MNIST dataset (60,000 training images) for 60 epochs with mini-batches of 512, which took 3 minutes and 42 seconds.
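
The upstream repository (afagarap/pt-clustering-ae) suggests a PyTorch implementation. Below is a minimal sketch of such a training setup, assuming a fully-connected VAE with a 10-dimensional latent space (matching the latent dimensionality used for t-SNE later); the layer widths and learning rate are illustrative assumptions, not the repository's exact configuration.

```python
# Minimal VAE training sketch (PyTorch); layer widths and lr are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision import datasets, transforms


class VAE(nn.Module):
    def __init__(self, input_dim=784, hidden_dim=512, latent_dim=10):
        super().__init__()
        self.enc = nn.Linear(input_dim, hidden_dim)
        self.mu = nn.Linear(hidden_dim, latent_dim)
        self.logvar = nn.Linear(hidden_dim, latent_dim)
        self.dec1 = nn.Linear(latent_dim, hidden_dim)
        self.dec2 = nn.Linear(hidden_dim, input_dim)

    def encode(self, x):
        h = F.relu(self.enc(x))
        return self.mu(h), self.logvar(h)

    def reparameterize(self, mu, logvar):
        std = torch.exp(0.5 * logvar)
        return mu + std * torch.randn_like(std)  # z ~ N(mu, sigma^2)

    def forward(self, x):
        mu, logvar = self.encode(x)
        z = self.reparameterize(mu, logvar)
        x_hat = torch.sigmoid(self.dec2(F.relu(self.dec1(z))))
        return x_hat, mu, logvar


def vae_loss(x_hat, x, mu, logvar):
    # Reconstruction term plus KL divergence to the standard normal prior.
    bce = F.binary_cross_entropy(x_hat, x, reduction="sum")
    kld = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return bce + kld


train_loader = torch.utils.data.DataLoader(
    datasets.MNIST("data", train=True, download=True,
                   transform=transforms.ToTensor()),
    batch_size=512, shuffle=True)

model = VAE()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

for epoch in range(60):
    for x, _ in train_loader:
        x = x.view(-1, 784)
        optimizer.zero_grad()
        x_hat, mu, logvar = model(x)
        loss = vae_loss(x_hat, x, mu, logvar)
        loss.backward()
        optimizer.step()
```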

For clustering, we trained the k-Means clustering algorithm on the 60,000 training images (from both MNIST and Fashion-MNIST) in their (1) original data representation, (2) latent code representation (from the VAE), and (3) reconstructed data representation (from the VAE). Then, we used the trained clustering models on the 10,000 Gaussian noise-augmented test images in the same three data representations used in training.
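
Continuing the training sketch above, the three representations could be produced as follows. The noise standard deviation (0.5) is an illustrative assumption, as is the use of the encoder mean as the latent code (a sampled z would work as well).

```python
# Produce the three test-set representations used for clustering.
# Assumes `model` is the trained VAE from the sketch above.
import torch
from torchvision import datasets

test_set = datasets.MNIST("data", train=False, download=True)
x_test = (test_set.data.float() / 255.0).view(-1, 784)
x_noisy = (x_test + 0.5 * torch.randn_like(x_test)).clamp(0.0, 1.0)

model.eval()
with torch.no_grad():
    mu, logvar = model.encode(x_noisy)  # (2) latent code representation
    x_hat, _, _ = model(x_noisy)        # (3) reconstructed data representation

representations = {
    "original": x_noisy.numpy(),        # (1) original (noise-augmented) data
    "latent": mu.numpy(),
    "reconstructed": x_hat.numpy(),
}
```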

Visualization of Learned Representations

We visualize the data reconstructed by the VAE model on the MNIST and Fashion-MNIST datasets. Figure 1 shows the original data with Gaussian noise and a reconstructed sample from the trained VAE model on the MNIST dataset, and Figure 2 shows the same for the Fashion-MNIST dataset. This finding corroborates results from the literature on the capability of the VAE to denoise images.

Figure 1. The top row consists of the MNIST images with added Gaussian noise, while the bottom row shows the images reconstructed by the trained VAE.

Figure 2. The top row consists of the Fashion-MNIST images with added Gaussian noise, while the bottom row shows the images reconstructed by the trained VAE.

We then visualize the learned representation of the VAE model for the MNIST and Fashion-MNIST datasets. To this end, we used t-SNE to reduce the dimensionality of the learned latent representation z from 10 to 3, and of the original data x and the reconstructed data from 784 to 3. Figures 3 and 4 show scatter plots of the features. We can see in these figures that the clusters in the learned latent data representation are denser compared to the clusters in the original data and in the reconstructed data.

Figure 3. t-SNE visualization of the MNIST dataset with number of components = 3, and perplexity = 50. The left plot shows the t-SNE visualization for the original data, the middle for the reconstructed data, and the right for the latent code.

Figure 4. t-SNE visualization of the Fashion-MNIST dataset with number of components = 3, and perplexity = 50. The left plot shows the t-SNE visualization for the original data, the middle for the reconstructed data, and the right for the latent code.
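
A minimal sketch of this reduction with scikit-learn's TSNE, using the n_components = 3 and perplexity = 50 settings from Figures 3 and 4; the random seed and the stand-in input below are assumptions.

```python
# t-SNE reduction to 3 components, matching the settings in Figures 3 and 4.
import numpy as np
from sklearn.manifold import TSNE


def embed_3d(features: np.ndarray) -> np.ndarray:
    return TSNE(n_components=3, perplexity=50,
                random_state=42).fit_transform(features)


# Stand-in for the 10-dimensional latent codes; in practice, pass the
# `representations` arrays from the sketch above instead.
demo = np.random.default_rng(0).normal(size=(1000, 10))
demo_3d = embed_3d(demo)
print(demo_3d.shape)  # (1000, 3)
```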

Clustering on Samples

We used a k-Means clustering model on the MNIST and Fashion-MNIST datasets with k-means++ initialization and k = 10, since there are 10 classes in both MNIST and Fashion-MNIST, and trained it for 500 iterations per dataset input. The first input is the original data representation, the second is the data reconstructed by the VAE model, and the third is the latent code from the VAE model.
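
A minimal sketch of this setup with scikit-learn, assuming the `representations` dictionary from the earlier sketch; the `n_init` value and the random seed are assumptions.

```python
# Fit one k-Means model per data representation (k-means++, k=10, 500 iters).
from sklearn.cluster import KMeans


def fit_kmeans(features):
    return KMeans(n_clusters=10, init="k-means++", max_iter=500,
                  n_init=10, random_state=42).fit(features)


kmeans_models = {name: fit_kmeans(feats)
                 for name, feats in representations.items()}
```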

As we have seen in Figures 3 and 4, the clusters in the learned latent data representation are denser when compared to the clusters in the original data and in the reconstructed data. The clustering performance, as shown in the Voronoi diagrams (see Figures 5 and 6), is at its best when we use the latent data representation.

Figure 5. Voronoi diagram for the k-Means clustering algorithm prediction on the MNIST dataset. The left diagram shows the clustering on the original dataset, the middle for the reconstructed data, and the right for the latent code.

Figure 6. Voronoi diagram for the k-Means clustering algorithm prediction on the Fashion-MNIST dataset. The left diagram shows the clustering on the original dataset, the middle for the reconstructed data, and the right for the latent code.

We posit the following reasons for this density: (1) only the most salient features are included in the latent space representation of the data, a simpler and lower-dimensional data representation, which is also the reason behind its denoising capability; and (2) as per [12, 24, 27], learning the approximate inference inevitably results in lower variance, influencing the data points to cluster near their expected value [9], which consequently helps the objective of the clustering algorithm, i.e., variance minimization.
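
For reference, the variance-minimization objective referred to above is the within-cluster sum of squared distances that k-Means minimizes (a standard formulation, not specific to this repository):

```latex
\min_{C_1, \dots, C_k} \sum_{j=1}^{k} \sum_{x \in C_j} \lVert x - \mu_j \rVert^2
```

where mu_j is the centroid of cluster C_j; denser latent clusters yield smaller distances to the centroids, and hence a lower objective.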

Although these visual results provide some evidence that the clustering performance may be improved through the use of the latent code, we proceed to a more explicit measurement of clustering quality, as laid out in Tables 1 and 2.

Table 1. Clustering Evaluation on the original data, reconstructed data, and latent code for the MNIST dataset.

| Data | Davies-Bouldin Index | Silhouette Score | Calinski-Harabasz Score |
|---|---|---|---|
| Original | 2.857 | 0.061 | 391.245 |
| Reconstructed | 2.300 | 0.099 | 563.361 |
| Latent Code | 1.557 | 0.186 | 1497.287 |

Table 2. Clustering Evaluation on the original data, reconstructed data, and latent code for the Fashion-MNIST dataset.

| Data | Davies-Bouldin Index | Silhouette Score | Calinski-Harabasz Score |
|---|---|---|---|
| Original | 1.811 | 0.155 | 1269.635 |
| Reconstructed | 1.419 | 0.216 | 2010.841 |
| Latent Code | 1.328 | 0.238 | 3609.196 |

A low Davies-Bouldin Index denotes better cluster separation, while both a high Silhouette Score and a high Calinski-Harabasz Score denote better-defined clusters. As we can see, on both the MNIST and Fashion-MNIST datasets, the clustering was at its best on the latent code from the VAE.
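
All three metrics are available in scikit-learn; below is a minimal sketch, assuming the representations and fitted models from the earlier sketches (the metric function names are those of recent scikit-learn releases).

```python
# Compute the three clustering-quality metrics from Tables 1 and 2.
from sklearn.metrics import (calinski_harabasz_score, davies_bouldin_score,
                             silhouette_score)


def evaluate(features, labels):
    return {
        "DBI": davies_bouldin_score(features, labels),     # lower is better
        "SS": silhouette_score(features, labels),          # higher is better
        "CHS": calinski_harabasz_score(features, labels),  # higher is better
    }


for name, km in kmeans_models.items():
    print(name, evaluate(representations[name], km.labels_))
```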

License

Using latent code from variational autoencoder for clustering
Copyright (C) 2019  Abien Fred Agarap

This program is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.

This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
GNU General Public License for more details.

You should have received a copy of the GNU General Public License
along with this program.  If not, see <https://www.gnu.org/licenses/>.

clustering_vae's People

Contributors

afagarap

Watchers

paper2code-bot
