GAMO: Generative Adversarial Minority Oversampling

This repository implements an end-to-end deep oversampling approach for joint feature extraction and classification on class-imbalanced image datasets. The algorithm can be described as a game among three players: a classifier that performs its usual task, a generator that attempts to create convex combinations of points within a class that are likely to be misclassified by the classifier, and a discriminator that forces the generator to adhere to the class distribution.
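To make the "convex combination of points inside a class" idea concrete, here is a minimal NumPy sketch. It is not the repository's generator: the function name `convex_combination` and the random `logits` are hypothetical stand-ins for whatever a trained generator would output. The sketch only shows that softmax-normalised weights keep every synthetic point inside the convex hull of the real class samples.

```python
import numpy as np

def convex_combination(class_points, logits):
    """Mix the rows of class_points using softmax(logits) as weights."""
    # Softmax makes each row of weights non-negative and sum to one, so each
    # output row is a convex combination of the rows of class_points.
    shifted = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(shifted)
    weights /= weights.sum(axis=1, keepdims=True)
    return weights.dot(class_points), weights

rng = np.random.RandomState(0)
minority = rng.rand(5, 3)        # 5 real minority-class points, 3 features
logits = rng.randn(4, 5)         # hypothetical raw generator output, 4 samples
synthetic, weights = convex_combination(minority, logits)
```

Because the weights are constrained in this way, a synthetic point can never leave the region spanned by the real samples of its class, whatever the raw generator output is.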

Reference

@InProceedings{Mullick_2019_ICCV,
author = {Mullick, Sankha Subhra and Datta, Shounak and Das, Swagatam},
title = {Generative Adversarial Minority Oversampling},
booktitle = {The IEEE International Conference on Computer Vision (ICCV)},
month = {October},
year = {2019}
} 

Data and code files

The GAMO framework can be used on pre-computed feature vectors or flattened images. Additionally, GAMO can extract useful convolutional features from images by itself in an end-to-end fashion. To illustrate both modes, we provide two example codes: one for MNIST (the flattened image is used as the feature vector) and one for Fashion-MNIST (deep convolutional features are extracted by a network that is trained simultaneously with the classifier).

Dependencies:

You can use either Python 2.7 or Python 3. Additionally, you will need keras (with any backend), scikit-learn, scipy, numpy, opencv, and matplotlib as supporting libraries. The codes also use os, sys, and pickle, which ship with the Python standard library.

Data preparation:

Neither MNIST nor Fashion-MNIST is sufficiently imbalanced to test the efficacy of GAMO. Therefore, we subsample the different classes to form a new training set with an imbalance ratio of 100. You can download MNIST and Fashion-MNIST from their sources, convert them to CSV, and run the preprocessing scripts MNIST_process.py and fMNIST_process.py, respectively, to create training and test sets similar to those used in our experiments.
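The subsampling step can be sketched as follows. This is an illustration on toy data, not the contents of MNIST_process.py: the helper `subsample_per_class` and the per-class counts are hypothetical, and the imbalance ratio is assumed here to mean the size of the largest class divided by the size of the smallest.

```python
import numpy as np

def subsample_per_class(features, labels, counts, seed=0):
    """Keep counts[c] randomly chosen rows of each class c."""
    rng = np.random.RandomState(seed)
    keep = []
    for cls, n_keep in sorted(counts.items()):
        idx = np.flatnonzero(labels == cls)
        if n_keep > idx.size:
            raise ValueError("class %s has only %d samples" % (cls, idx.size))
        # Sample without replacement so no row is duplicated.
        keep.append(rng.choice(idx, size=n_keep, replace=False))
    keep = np.sort(np.concatenate(keep))
    return features[keep], labels[keep]

# Toy balanced data: 3 classes, 200 samples each, 4 features.
labels = np.repeat(np.arange(3), 200)
features = np.random.RandomState(1).rand(labels.size, 4)

# Hypothetical per-class counts; largest / smallest = 200 / 2 = 100.
counts = {0: 200, 1: 20, 2: 2}
X_imb, y_imb = subsample_per_class(features, labels, counts)

class_sizes = np.bincount(y_imb)
imbalance_ratio = class_sizes.max() / float(class_sizes.min())
```

The same helper could be applied to labels and pixel rows read from the converted CSV files, with counts chosen to reach the desired ratio.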

MNIST codes:

  • Data files (generated by MNIST_process.py):
    • Mnist_100_testData.csv (testing dataset)
    • Mnist_100_trainData.csv (training dataset)
  • Code files:
    • dense_net.py (network models)
    • dense_suppli.py (supplementary functions, for data loading, pre-processing, performance evaluation etc.)
    • dense_gamo_main.py (main code file for GAMO)
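This README does not list which performance measures dense_suppli.py computes. As a hedged illustration of evaluation under class imbalance, the sketch below computes two recall-based indices that are commonly used in this setting: the average of the class-specific accuracies and their geometric mean. The function name `per_class_recall` and the toy labels are assumptions made for the example.

```python
import numpy as np

def per_class_recall(y_true, y_pred, n_classes):
    """Fraction of correctly predicted samples within each true class."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    recalls = np.zeros(n_classes)
    for c in range(n_classes):
        mask = y_true == c
        # Assumes every class occurs at least once in y_true.
        recalls[c] = np.mean(y_pred[mask] == c)
    return recalls

# Toy imbalanced test set: 8 majority samples, 2 minority samples.
y_true = [0] * 8 + [1] * 2
y_pred = [0] * 8 + [1, 0]      # one minority sample is misclassified

recalls = per_class_recall(y_true, y_pred, n_classes=2)
accuracy = np.mean(np.asarray(y_true) == np.asarray(y_pred))
avg_class_accuracy = recalls.mean()
geometric_mean = recalls.prod() ** (1.0 / recalls.size)
```

In this toy case the plain accuracy is 0.9 while the average class-specific accuracy is 0.75, which shows why overall accuracy alone can hide poor minority-class performance.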

Fashion-MNIST codes:

  • Data files (generated by fMNIST_process.py):
    • fMnist_100_testData.csv
    • fMnist_100_trainData.csv
  • Code files (the naming convention follows that of the MNIST codes):
    • fashion_mnist_net.py
    • fashion_mnist_suppli.py
    • fashion_mnist_gamo_main.py

GAMO2pix:

GAMO2pix is a tool to visualize the feature vectors generated by GAMO in the original image space. This may be useful for applications that explicitly require the artificially generated images in addition to the trained classifier. As an example, we provide the code for Fashion-MNIST only.

  • Code files:
    • fashion_mnist_gamo2pix_net.py
    • fashion_mnist_gamo2pix__main.py
  • Additional requirements:
    • fashion_mnist_net.py
    • fashion_mnist_suppli.py
    • trained generator network and class-specific generator output processing networks
    • trained convolutional network
    • training and test dataset

Figure: examples of generated images for CIFAR10, Fashion-MNIST, and SVHN, as visualized by GAMO2pix. The imbalance ratio relative to the majority class decreases from the top.
