fdavidcl / ruta

Unsupervised Deep Architectures in R

Home Page: https://deivi.ch/ruta

License: GNU General Public License v3.0

deep-learning deep-neural-networks autoencoders unsupervised-learning r

ruta's Introduction

Ruta

Software for unsupervised deep architectures


Get uncomplicated access to unsupervised deep neural networks, from building their architecture to their training and evaluation.

Get started

How to install

In order to develop Ruta models, you will need to install its dependencies first and then get the package from CRAN.

Dependencies

Ruta is based on Keras, the well-known open source deep learning library integrated in TensorFlow, and on its R interface. To install them easily, you can use the keras::install_keras() function. Depending on whether you want to use the system installation, a Conda environment or a virtualenv, you may need to call use_condaenv() or use_virtualenv() from reticulate.

Another straightforward way to install these dependencies is a system-wide (sudo pip install) or user-wide (pip install --user) installation with pip. This is generally not recommended unless you are sure that you will not need alternative versions and that the packages will not clash with others. The following shell command would install all libraries expected by Keras:

$ pip install --user tensorflow tensorflow-hub tensorflow-datasets scipy requests pyyaml Pillow h5py pandas pydot

Otherwise, you can follow the official installation guides:

Check whether Keras is accessible from R by running:

keras::is_keras_available() # should return TRUE
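If a script should fail gracefully when the backend is missing, the check above can be wrapped in a small helper. This is only a sketch: keras_ready() is a hypothetical name that belongs to neither Ruta nor Keras, and it assumes that any error raised while probing should count as "not available".

```r
keras_ready <- function() {
  # The R package itself must be installed before probing the Python backend
  if (!requireNamespace("keras", quietly = TRUE)) {
    return(FALSE)
  }
  # Treat any error raised while probing as "not available"
  isTRUE(tryCatch(keras::is_keras_available(), error = function(e) FALSE))
}

if (keras_ready()) {
  message("Keras is available, Ruta models can be trained")
} else {
  message("Keras was not found, see the installation steps above")
}
```

The helper always returns a single TRUE or FALSE, so it can guard calls to train() in scripts that may run on machines without Keras.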

Ruta package

From an R interpreter such as the R REPL or the RStudio console, run one of the following commands to get the Ruta package:

# Just get Ruta from CRAN
install.packages("ruta")

# Or get the latest development version from GitHub
devtools::install_github("fdavidcl/ruta")

All R dependencies will be automatically installed. These include the Keras R interface and purrr.

First steps

The easiest way to start working with Ruta is to use the autoencode() function. It lets you select a type of autoencoder and transforms the feature space of a data set into another one whose desirable properties depend on the chosen type.

iris[, 1:4] |> as.matrix() |> autoencode(2, type = "denoising")

You can learn more about different variants of autoencoders by reading A practical tutorial on autoencoders for nonlinear feature fusion.

Ruta provides the functionality to build diverse neural architectures (see autoencoder()), train them as autoencoders (see train()) and perform different tasks with the resulting models (see reconstruct()), including evaluation (see evaluate_mean_squared_error()). The following is a basic example of a natural pipeline with an autoencoder:

library(ruta)

# Shuffle rows and normalize dataset
# (sample() applied to a data frame would permute its columns, not its rows)
x <- iris[sample(nrow(iris)), 1:4] |> as.matrix() |> scale()
x_train <- x[1:100, ]
x_test <- x[101:150, ]

autoencoder(
  input() + dense(256) + dense(36, "tanh") + dense(256) + output("sigmoid"),
  loss = "mean_squared_error"
) |>
  make_contractive(weight = 1e-4) |>
  train(x_train, epochs = 40) |>
  evaluate_mean_squared_error(x_test)
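To make clearer what the last step of this pipeline measures, here is a base R sketch that computes a mean squared reconstruction error without Ruta or Keras. It swaps the neural autoencoder for PCA, which acts as a linear encoder and decoder; this is a stand-in for illustration and not how Ruta implements its models. The names encode, decode and reconstruction are assumptions made for this example.

```r
# Base R sketch: PCA as a linear stand-in for an autoencoder (not Ruta code)
set.seed(42)

# Shuffle rows, then standardize the four numeric iris features
x <- scale(as.matrix(iris[sample(nrow(iris)), 1:4]))
x_train <- x[1:100, ]
x_test <- x[101:150, ]

# "Encoder": project onto the first two principal components of the training set
pca <- prcomp(x_train, center = TRUE, scale. = FALSE)
encode <- function(m) predict(pca, m)[, 1:2, drop = FALSE]

# "Decoder": map the 2-D codes back to the original 4-D feature space
decode <- function(z) sweep(z %*% t(pca$rotation[, 1:2]), 2, pca$center, "+")

reconstruction <- decode(encode(x_test))

# Mean squared reconstruction error on the held-out rows
mse <- mean((x_test - reconstruction)^2)
print(mse)
```

A lower value means the two-dimensional code keeps more of the information in the held-out rows; the same reading applies to the error reported for a trained autoencoder.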

For more details, see other examples and the documentation.

ruta's People

Contributors

fcharte, fdavidcl, hadley

ruta's Issues

Add deconvolution layers

Current workaround:

deconv <- function(filters, kernel_size, ...) {
  layer_keras("conv_2d_transpose", filters = filters, kernel_size = kernel_size, ...)
}

Global weight decay

Maybe with global defaults for a whole network? Regularizers, activations and so on

Refactoring

  • Check which methods are exported or whether to export them at all
  • Method consistency
  • Clean code

Interpret latent space

Hi - in general, I'd like to use your package for topic modeling on large-scale text collections. I managed to get document-level vectors with this code.

network <-
  input() +
  dense(256, "elu") +
  variational_block(4, seed = 42, epsilon_std = .5) +
  dense(256, "elu") +
  output("sigmoid")

learner <- autoencoder_variational(network, loss = "binary_crossentropy")
model <- learner %>% train(x_train_mat, epochs = 5)

# summary(learner$network)

inputs <- get_layer(model$models[[1]], index = 1)$input
latent_space <- get_layer(model$models[[1]], index = 7)$output

latent_model <- keras_model(
  inputs = inputs,
  outputs = latent_space
)

Via PCA the output looks like this:

[Figure: PCA projection of the latent space]

Now I have two questions:

  1. Do you know whether there is a better way of interpreting the latent space? LDA outputs topic proportions. I thought maybe a sigmoid (or softmax, for sparsity) activation at the bottleneck would help, but that way the model does not learn anything useful.

  2. Do you know how I could get word-level features that are associated with each latent dimension?

Thanks in advance!

Best, Simon

(PS: if you prefer a fully reproducible example, please let me know).

Refactoring

  • Reconstruction error name in losses
  • Loss names
  • object, x, and other parameter names in generics
  • Allow for more parameters in layers (e.g. dropout)

Machine-readable parameters and arguments

Provide an API or similar which specifies all possible argument values for each parameter.

  • Layers
  • Networks
  • Autoencoder models
  • Training
  • Regularizations
  • Other functionalities
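As a rough illustration of what such a machine-readable specification could look like, base R can already expose the allowed values of an argument when a function declares them as a default choices vector. The sketch below is hypothetical: dense_sketch() and allowed_values() are names invented for this example and are not part of Ruta's API.

```r
# Hypothetical layer constructor; the choices vector doubles as a specification
dense_sketch <- function(units = 10L,
                         activation = c("linear", "tanh", "sigmoid", "relu")) {
  activation <- match.arg(activation)
  list(units = units, activation = activation)
}

# Collect the allowed values of every parameter that declares a choices vector
allowed_values <- function(f) {
  defaults <- lapply(formals(f), function(d) if (is.call(d)) eval(d) else NULL)
  Filter(is.character, defaults)
}

print(allowed_values(dense_sketch))
```

This approach only covers arguments whose defaults are character vectors; numeric ranges or nested structures would need an explicit schema.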

Variational autoencoder error (ruta)

I tried to run the code for VAE - just copy pasting in RStudio - and I got the following error when executing the following command.

model <- learner %>% train(x_train, epochs = 5)

Error in py_call_impl(callable, dots$args, dots$keywords) :
_SymbolicException: Inputs to eager execution function cannot be Keras symbolic tensors, but found [<tf.Tensor 'z_mean_3/Identity:0' shape=(None, 3) dtype=float32>, <tf.Tensor 'z_log_var_3/Identity:0' shape=(None, 3) dtype=float32>]

Any clue on what I am doing wrong? Thanks.

PS: Figured out from more reading. Needed the following lines:

if (tensorflow::tf$executing_eagerly())
  tensorflow::tf$compat$v1$disable_eager_execution()

Working now.

Thanks.

Variational autoencoder is broken

Variational autoencoder tutorial gives the following error when it's run:

stop(structure(list(message = "TypeError: in user code:\n\n File \"/home/david/.local/share/r-miniconda/envs/r-reticulate/lib/python3.8/site-packages/keras/engine/training.py\", line 1021, in train_function *\n return step_function(self, iterator)\n File \"/home/david/.local/share/r-miniconda/envs/r-reticulate/lib/python3.8/site-packages/keras/engine/training.py\", line 1010, in step_function **\n outputs = model.distribute_strategy.run(run_step, args=(data,))\n File \"/home/david/.local/share/r-miniconda/envs/r-reticulate/lib/python3.8/site-packages/keras/engine/training.py\", line 1000, in run_step **\n outputs = model.train_step(data)\n File \"/home/david/.local/share/r-miniconda/envs/r-reticulate/lib/python3.8/site-packages/keras/engine/training.py\", line 860, in train_step\n loss = self.compute_loss(x, y, y_pred, sample_weight)\n File \"/home/david/.local/share/r-miniconda/envs/r-reticulate/lib/python3.8/site-packages/keras/engine/training.py\", line 918, in compute_loss\n return self.compiled_loss(\n File \"/home/david/.local/share/r-miniconda/envs/r-reticulate/lib/python3.8/site-packages/keras/engine/compile_utils.py\", line 239, in __call__\n self._loss_metric.update_state(\n File \"/home/david/.local/share/r-miniconda/envs/r-reticulate/lib/python3.8/site-packages/keras/utils/metrics_utils.py\", line 70, in decorated\n update_op = update_state_fn(*args, **kwargs)\n File \"/home/david/.local/share/r-miniconda/envs/r-reticulate/lib/python3.8/site-packages/keras/metrics.py\", line 178, in update_state_fn\n return ag_update_state(*args, **kwargs)\n File \"/home/david/.local/share/r-miniconda/envs/r-reticulate/lib/python3.8/site-packages/keras/metrics.py\", line 455, in update_state **\n sample_weight = tf.__internal__.ops.broadcast_weights(\n File \"/home/david/.local/share/r-miniconda/envs/r-reticulate/lib/python3.8/site-packages/keras/engine/keras_tensor.py\", line 254, in __array__\n raise TypeError(\n\n TypeError: You are passing 
KerasTensor(type_spec=TensorSpec(shape=(), dtype=tf.float32, name=None), name='Placeholder:0', description=\"created by layer 'tf.cast_5'\"), an intermediate Keras symbolic input/output, to a TF API that does not allow registering custom dispatchers, such as `tf.cond`, `tf.function`, gradient tapes, or `tf.map_fn`. Keras Functional model construction only supports TF API calls that *do* support dispatching, such as `tf.math.add` or `tf.reshape`. Other APIs cannot be called directly on symbolic Kerasinputs/outputs. You can work around this limitation by putting the operation in a custom Keras layer `call` and calling that layer on this symbolic input/output.\n", 
call = py_call_impl(callable, dots$args, dots$keywords), 
cppstack = structure(list(file = "", line = -1L, stack = c("/home/david/R/x86_64-pc-linux-gnu-library/4.1/reticulate/libs/reticulate.so(Rcpp::exception::exception(char const*, bool)+0x74) [0x7f06a41c1524]", 
"/home/david/R/x86_64-pc-linux-gnu-library/4.1/reticulate/libs/reticulate.so(Rcpp::stop(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)+0x29) [0x7f06a41b0bc4]", ...

Until this is fixed, users should implement variational autoencoders directly in Keras: https://keras.rstudio.com/articles/examples/variational_autoencoder.html. Sorry for any inconvenience.

Calculate correct layer for encoder in variational models

Seen in #32 that ruta's encoder does not have all necessary layers.

x_train_mat <- matrix(1:200, nrow = 10)

network <-
  input() +
  dense(256, "elu") +
  variational_block(4, seed = 42, epsilon_std = .5) +
  dense(256, "elu") +
  output("sigmoid")

learner <- autoencoder_variational(network, loss = "binary_crossentropy")
model <- learner %>% train(x_train_mat, epochs = 5)

## Current: the encoder exposed by ruta is missing some layers
latent_model <- model$models$encoder

## Correct:
inputs <- keras::get_layer(model$models[[1]], index = 1)$input
latent_space <- keras::get_layer(model$models[[1]], index = 7)$output

latent_model <- keras::keras_model(
  inputs = inputs,
  outputs = latent_space
)

Intelligent layer activations

When defining a network through an integer vector, decide on layer activations instead of leaving all as "linear"
