Coder Social home page Coder Social logo

nzxyin / fiwgan-ciwgan Goto Github PK

View Code? Open in Web Editor NEW

This project forked from gbegus/fiwgan-ciwgan

0.0 0.0 0.0 13.14 MB

fiwGAN/ciwGAN (Featural and Categorical InfoWaveGAN): Generative Adversarial Phonology and Semantics

License: Other

Python 96.81% Shell 3.19%

fiwgan-ciwgan's Introduction

fiwGAN (Featural InfoWaveGAN): Lexical Learning in Generative Adversarial Phonology

PAPER HERE: https://www.sciencedirect.com/science/article/pii/S0893608021001052

In fiwGAN.py. An architecture for modeling lexical learning from raw acoustic inputs called Featural InfoWaveGAN (fiwGAN) that combines Deep Convolutional GAN architecture for audio data (WaveGAN) with categorical variables in information theoretic proposal InfoGAN. Unlike InfoGAN, latent code is distributed binomially and the training is performed with sigmoid cross-entropy. Based on WaveGAN (Donahue et al. 2019) and InfoGAN (Chen et al. 2016), partially also on code by Rodionov (2018).

ciwGAN (Categorical InfoWaveGAN)

An architecture for modeling lexical learning from raw acoustic inputs called Categorical InfoWaveGAN that combines Deep Convolutional GAN architecture for audio data (WaveGAN) with categorical variables in information theoretic proposal InfoGAN.

Based on WaveGAN (Donahue et al. 2019) (https://github.com/chrisdonahue/wavegan) and WGAN-GP implementation of InfoGAN by Sergey Rodionov (https://github.com/singnet/semantic-vision/blob/master/experiments/concept_learning/gans/info-wgan-gp/10_originfo_sepQ_v2_lr1e-3/train.py).

In addition to the Generator and the Discriminator networks, the architecture introduces a network that learns to classify generated outputs and forces the Generator to encode lexical information in its latent space. Lexical and semantic encoding is represented with a set of categorical binary variables. The network is trained on five lexical items from TIMIT. The network learns to generate lexical items and encodes the identity of each item in categorical variables of the latent space. By manipulating the categorical variables in the latent space that encode lexical information, the network outputs the five lexical items, suggesting that each lexical item is represented with unique categorical code. Such representation can serve as the basis for lexical and semantic learning from raw acoustic input.

After 19,244 steps trained on oily, water, rag, suit and year from TIMIT, the network learns to output lexical items based on latent code. The following generated outputs are generated with the following values of c:

  1. [1, 0, 0, 0, 0]: suit
  2. [0, 1, 0, 0, 0]: year
  3. [0, 0, 1, 0, 0]: water
  4. [0, 0, 0, 1, 0]: oily
  5. [0, 0, 0, 0, 1]: rag

Audio sample 1 Audio sample 2

To change number of categorical latent variables:

--num_categ n

fiwgan-ciwgan's People

Contributors

chrisdonahue avatar gbegus avatar nzxyin avatar andimarafioti avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.