This project forked from mkusner/grammarvae

Code for the "Grammar Variational Autoencoder" https://arxiv.org/abs/1703.01925

Grammar Variational Autoencoder

This repository contains training and sampling code for the paper: Grammar Variational Autoencoder.

Requirements

Install the dependencies (CPU version) with pip install -r requirements.txt

For GPU compatibility, replace the fourth line in requirements.txt with: https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow_gpu-0.12.1-cp27-none-linux_x86_64.whl
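The line swap can also be scripted. The following is a minimal sketch and not part of the repository: the helper name replace_line is hypothetical, and it assumes that requirements.txt has the TensorFlow entry on its fourth line, as stated above.

```python
GPU_WHEEL = (
    "https://storage.googleapis.com/tensorflow/linux/gpu/"
    "tensorflow_gpu-0.12.1-cp27-none-linux_x86_64.whl"
)


def replace_line(path, line_number, new_text):
    """Replace the 1-indexed line `line_number` of `path` and return the old line."""
    with open(path) as handle:
        lines = handle.read().splitlines()
    if not 1 <= line_number <= len(lines):
        raise IndexError("file has only {} lines".format(len(lines)))
    old = lines[line_number - 1]
    lines[line_number - 1] = new_text
    with open(path, "w") as handle:
        handle.write("\n".join(lines) + "\n")
    return old


# Hypothetical usage, run from the repository root:
# replace_line("requirements.txt", 4, GPU_WHEEL)
```

Returning the old line makes it easy to confirm that the entry being replaced really was the CPU TensorFlow requirement.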

Creating datasets

Molecules

To create the molecule datasets, call:

  • python make_zinc_dataset_grammar.py
  • python make_zinc_dataset_str.py

Equations

The equation dataset can be downloaded here: grammar, string

Training

Molecules

To train the molecule models, call:

  • python train_zinc.py # the grammar model
  • python train_zinc.py --latent_dim=2 --epochs=50 # train a model with a 2D latent space for 50 epochs
  • python train_zinc_str.py # the string model

Equations

To train the equation models, call:

  • python train_eq.py # the grammar model
  • python train_eq.py --latent_dim=2 --epochs=50 # train a model with a 2D latent space for 50 epochs
  • python train_eq_str.py # the string model
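For readers unfamiliar with this flag style, the sketch below shows how options such as --latent_dim and --epochs are typically read with argparse. It is an illustration only: the function name parse_args and the default values are placeholders, and the training scripts in this repository may parse their arguments differently.

```python
import argparse


def parse_args(argv=None):
    """Parse the two training flags shown in the commands above."""
    parser = argparse.ArgumentParser(description="Sketch of the training flags")
    # The defaults are placeholders, not the values used by the repository.
    parser.add_argument("--latent_dim", type=int, default=8,
                        help="dimensionality of the latent space")
    parser.add_argument("--epochs", type=int, default=10,
                        help="number of training epochs")
    return parser.parse_args(argv)


# Equivalent to: python train_eq.py --latent_dim=2 --epochs=50
args = parse_args(["--latent_dim=2", "--epochs=50"])
print(args.latent_dim, args.epochs)
```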

Sampling

Molecules

The file molecule_vae.py can be used to encode and decode SMILES strings. For a demo run:

  • python encode_decode_zinc.py

Equations

The analogous file equation_vae.py can encode and decode equation strings. Run:

  • python encode_decode_eq.py
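One simple way to judge an encode/decode demo is to count how many inputs come back unchanged. The sketch below is illustrative only. It assumes a model object with encode(list_of_strings) and decode(latent_codes) methods, which is an assumption about the interface rather than something documented here, and it uses a hypothetical stand-in (ToyModel) so that it runs without trained weights.

```python
def reconstruction_rate(model, strings):
    """Fraction of strings that survive an encode/decode round trip unchanged."""
    if not strings:
        return 0.0
    latents = model.encode(strings)
    decoded = model.decode(latents)
    matches = sum(1 for original, result in zip(strings, decoded)
                  if original == result)
    return matches / float(len(strings))


class ToyModel(object):
    """Hypothetical stand-in: the 'latent' code is the list of character codes."""

    def encode(self, strings):
        return [[ord(ch) for ch in s] for s in strings]

    def decode(self, latents):
        return ["".join(chr(code) for code in z) for z in latents]


rate = reconstruction_rate(ToyModel(), ["CCO", "c1ccccc1", "sin(x)+2"])
print(rate)
```

The toy model is lossless by construction, so it reconstructs every input. A trained VAE decodes stochastically or approximately, so its rate will generally be lower.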

Bayesian optimization

The Bayesian optimization experiments use sparse Gaussian processes implemented in Theano.

We use a modified version of Theano with a few add-ons, e.g. to compute the log determinant of a positive definite matrix in a numerically stable manner. The modified version of Theano can be installed by going to the folder Theano-master and typing

  • python setup.py install
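To illustrate why a dedicated log-determinant routine is useful, here is a minimal pure-Python sketch of one standard approach: factor the matrix as A = L L^T (Cholesky) and sum the logs of the diagonal of L. This is not the code from the modified Theano, whose implementation may differ. The function names cholesky and logdet_pd are hypothetical.

```python
import math


def cholesky(a):
    """Lower-triangular factor L of a symmetric positive definite matrix a = L L^T."""
    n = len(a)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            partial = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                pivot = a[i][i] - partial
                if pivot <= 0.0:
                    raise ValueError("matrix is not positive definite")
                lower[i][j] = math.sqrt(pivot)
            else:
                lower[i][j] = (a[i][j] - partial) / lower[j][j]
    return lower


def logdet_pd(a):
    """log det(a) computed as 2 * sum(log(diag(L))), never forming det(a) itself."""
    lower = cholesky(a)
    return 2.0 * sum(math.log(lower[i][i]) for i in range(len(lower)))


# The determinant of this matrix (1e-400) underflows to 0.0 in double precision,
# so math.log(det) would fail, while the Cholesky route stays finite.
tiny = [[1e-200, 0.0], [0.0, 1e-200]]
print(logdet_pd(tiny))
```

Working with the logs of the Cholesky diagonal avoids the overflow and underflow that occur when the determinant is formed first, which matters for the large kernel matrices that arise in Gaussian processes.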

The experiments with molecules require the rdkit library, which can be installed as described in http://www.rdkit.org/docs/Install.html.

The Bayesian optimization experiments can be replicated as follows:

1 - Generate the latent representations of molecules and equations. For this, go to the folders

molecule_optimization/latent_features_and_targets_grammar/

molecule_optimization/latent_features_and_targets_character/

equation_optimization/latent_features_and_targets_grammar/

equation_optimization/latent_features_and_targets_character/

and type

  • python generate_latent_features_and_targets.py

2 - Go to the folders

molecule_optimization/simulation1/grammar/

molecule_optimization/simulation1/character/

equation_optimization/simulation1/grammar/

equation_optimization/simulation1/character/

and type

  • nohup python run_bo.py &

Repeat this step for all the simulation folders (simulation2, ..., simulation10). For speed, it is recommended to run them in parallel on a computer cluster.

3 - Extract the results by going to the folders

molecule_optimization/

equation_optimization/

and typing

  • python get_final_results.py
  • ./get_average_test_RMSE_LL.sh
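The three steps above can be summarized as a list of (folder, command) pairs. The sketch below is a hypothetical driver and not part of the repository. It assumes it is run from the repository root and that the simulation folders are named simulation1 through simulation10, as described above. The names build_plan and run_plan are illustrative.

```python
import itertools
import os
import subprocess

TASKS = ["molecule_optimization", "equation_optimization"]
MODELS = ["grammar", "character"]


def build_plan(n_simulations=10):
    """Return the (working_directory, command) pairs for steps 1 to 3, in order."""
    plan = []
    # Step 1: latent features and targets for each task and model.
    for task, model in itertools.product(TASKS, MODELS):
        folder = os.path.join(task, "latent_features_and_targets_" + model)
        plan.append((folder, ["python", "generate_latent_features_and_targets.py"]))
    # Step 2: one Bayesian optimization run per simulation folder.
    for task, i, model in itertools.product(
            TASKS, range(1, n_simulations + 1), MODELS):
        folder = os.path.join(task, "simulation{}".format(i), model)
        plan.append((folder, ["python", "run_bo.py"]))
    # Step 3: collect the results for each task.
    for task in TASKS:
        plan.append((task, ["python", "get_final_results.py"]))
        plan.append((task, ["./get_average_test_RMSE_LL.sh"]))
    return plan


def run_plan(plan):
    """Run every command sequentially, stopping at the first failure."""
    for folder, command in plan:
        subprocess.check_call(command, cwd=folder)


plan = build_plan()
print(len(plan))
```

With the default of ten simulations the plan holds 48 commands (4 for step 1, 40 for step 2, 4 for step 3). run_plan executes them one after another, which is slow. The step 2 entries are independent of each other, so they can instead be dispatched to separate cluster workers, as recommended above.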
