bimsbbioinfo / maui

Multi-omics Autoencoder Integration: Deep learning-based heterogeneous data analysis toolkit

License: GNU General Public License v3.0

Python 10.28% Jupyter Notebook 89.72%
bioinformatics deep-learning autoencoder latent-factor-model cancer-genomics multi-omics

maui's Introduction

maui

Multi-omics Autoencoder Integration (maui) is a Python package for multi-omics data analysis. It is based on a Bayesian latent factor model, with inference done using artificial neural networks. For details, check out our LSA paper: https://www.life-science-alliance.org/content/2/6/e201900517

Installation

maui works with Python 3.6 and TensorFlow 1.1. It does not yet support TensorFlow 2.0, which was unreleased at the time of writing. The easiest way to install it is from PyPI:

pip install -U maui-tools

This will install all necessary dependencies, including keras and tensorflow. The default (CPU) build of tensorflow will be installed. If the GPU build of tensorflow is needed, please install it before installing maui.

The development version may be installed by cloning this repo and running python setup.py install, or by using pip directly from GitHub:

pip install -e git+https://github.com/BIMSBbioinfo/maui.git#egg=maui

Optional dependencies

Survival analysis functionality is supplied by lifelines. It may be installed directly with pip install lifelines.

Usage

See the vignette, and check out the documentation.

Citation

Evaluation of colorectal cancer subtypes and cell lines using deep learning. Jonathan Ronen, Sikander Hayat, Altuna Akalin. Life Science Alliance Dec 2019, 2 (6) e201900517; DOI: 10.26508/lsa.201900517

Contributing

Open an issue, send us a pull request, or shoot us an e-mail.

License

maui is released under the GNU General Public License v3.0 or later.


@jonathanronen, BIMSBbioinfo, 2018

maui's People

Contributors

al2na, borauyar, jonathanronen

maui's Issues

installation

Hi @jonathanronen ,

I tried to install maui with pip, and it didn't produce any errors. However, importing maui and maui.tools fails. Any help is appreciated.

(py3.6) pcddas@beagle:~/SOFTWARES$ python
Python 3.6.0 | packaged by conda-forge | (default, Feb 9 2017, 14:36:55)
[GCC 4.8.2 20140120 (Red Hat 4.8.2-15)] on linux
Type "help", "copyright", "credits" or "license" for more information.

>>> import maui
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/pcddas/miniconda2/envs/py3.6/lib/python3.6/site-packages/maui/__init__.py", line 1, in <module>
    from .model import Maui
  File "/home/pcddas/miniconda2/envs/py3.6/lib/python3.6/site-packages/maui/model.py", line 8, in <module>
    from .autoencoders_architectures import stacked_vae, deep_vae, train_model
  File "/home/pcddas/miniconda2/envs/py3.6/lib/python3.6/site-packages/maui/autoencoders_architectures.py", line 7, in <module>
    from keras.models import Model
  File "/home/pcddas/miniconda2/envs/py3.6/lib/python3.6/site-packages/keras/__init__.py", line 25, in <module>
    from keras import models
  File "/home/pcddas/miniconda2/envs/py3.6/lib/python3.6/site-packages/keras/models.py", line 19, in <module>
    from keras import backend
  File "/home/pcddas/miniconda2/envs/py3.6/lib/python3.6/site-packages/keras/backend.py", line 36, in <module>
    from tensorflow.python.eager.context import get_config
ImportError: cannot import name 'get_config'
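
An ImportError raised inside keras while it imports from tensorflow often points to a keras/tensorflow version mismatch, although the issue itself does not confirm the cause. A first diagnostic step is to list the installed versions. The sketch below uses only the standard library; the helper name installed_versions is made up for illustration. Note that importlib.metadata requires Python 3.8 or later, so on the Python 3.6 environment shown in the traceback, pip freeze gives the same information.

```python
from importlib import metadata


def installed_versions(packages=("maui-tools", "keras", "tensorflow")):
    """Return {distribution name: version string, or None if not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name}: {version or 'not installed'}")
```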

return factor importance value rather than filtering them

Hey Jona.

Could you provide an option to return the actual R^2 value of each LF from the function maui.utils.filter_factors_by_r2, rather than only returning a filtered matrix?
The user could then sort LFs by this value for downstream applications. When the factors are filtered, the importance scores and rankings are lost.
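
As a rough illustration of what is being requested, the sketch below computes one R^2 score per latent factor and ranks the factors by it. It assumes the score is the R^2 of a linear regression that predicts the input features from a single latent factor, averaged uniformly over features; whether filter_factors_by_r2 computes it this way is an assumption, and factor_r2_scores is a hypothetical helper, not part of maui.

```python
import numpy as np


def factor_r2_scores(z, x):
    """R^2 of predicting the features in x from each single column of z.

    z: (n_samples, n_factors) latent factors
    x: (n_samples, n_features) input features
    Returns an array of length n_factors (R^2 averaged uniformly over features).
    """
    z = np.asarray(z, dtype=float)
    x = np.asarray(x, dtype=float)
    zc = z - z.mean(axis=0)
    xc = x - x.mean(axis=0)
    # With one predictor plus an intercept, R^2 is the squared Pearson correlation.
    cov = zc.T @ xc                    # (n_factors, n_features)
    z_ss = (zc ** 2).sum(axis=0)       # (n_factors,)
    x_ss = (xc ** 2).sum(axis=0)       # (n_features,)
    denom = np.outer(z_ss, x_ss)
    r2 = np.divide(cov ** 2, denom, out=np.zeros_like(cov), where=denom > 0)
    return r2.mean(axis=1)


rng = np.random.default_rng(0)
z = rng.normal(size=(100, 3))
x = np.column_stack([2 * z[:, 0] + 1, -z[:, 0]])  # features driven by factor 0 only
scores = factor_r2_scores(z, x)
ranking = np.argsort(scores)[::-1]  # factor indices, most explanatory first
```

Returning the scores alongside the filtered matrix would let users apply their own threshold or keep the full ranking.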

Include trained model from preprint and release all input data

Hi,

I was following the vignette and noticed that it makes some simplifications compared to the manuscript.

From what I can see, the repo is currently missing the smoothed mutation input data, which means one cannot train the model as you did for the preprint. Having it would be quite nice, as, for instance, the Kaplan-Meier analysis yields no difference between clusters when following the vignette.

Also, for reproducibility, I think it would be good to include the final model used in the preprint.

Kind regards,
Clemens

Sigmoid activation with inputs out of [0,1] range

Hi Jona @jonathanronen,

I was having another look at maui and now I have another question :)

As the final activation function, the model uses a sigmoid, so all output values will fall between 0 and 1.
On the other hand, inputs from RNA-seq are scaled, and at least in the vignette that leads to values outside of this range.

Is there any theoretical justification for this choice or did you choose it because it performs well in this setting? Did you try anything else, like the MinMaxScaler to start with [0,1] intervals for each feature?

Best wishes and thanks!
Clemens
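
The mismatch raised in this question can be shown numerically. The sketch below uses random numbers as a stand-in for expression data (not the vignette's data): standardized features fall outside [0, 1], a sigmoid can only produce values inside that interval, and a per-feature min-max rescaling (the idea behind MinMaxScaler) brings the inputs into the sigmoid's range. The function names are illustrative.

```python
import numpy as np


def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))


def min_max_scale(x):
    """Rescale each column (feature) of x to [0, 1]; constant columns map to 0."""
    x = np.asarray(x, dtype=float)
    lo = x.min(axis=0)
    span = x.max(axis=0) - lo
    return np.divide(x - lo, span, out=np.zeros_like(x), where=span > 0)


rng = np.random.default_rng(0)
expr = rng.normal(loc=5.0, scale=2.0, size=(200, 5))  # stand-in for expression values

standardized = (expr - expr.mean(axis=0)) / expr.std(axis=0)  # zero mean, unit variance
recon = sigmoid(standardized)      # whatever the input, a sigmoid output stays in [0, 1]
scaled = min_max_scale(expr)       # inputs that a sigmoid output can actually match

print(standardized.min(), standardized.max())  # extends below 0 and above 1
print(scaled.min(), scaled.max())
```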

fail to run the example code

When I run the code "time z = maui_model.fit_transform({'mRNA': gex, 'Mutations': mut, 'CNV': cnv})", I get the error: TypeError: compile() missing 1 required positional argument: 'loss'
Do you know where the trouble is?

Smoothing mutation data with PPI networks

  1. I can't find the parameter alpha that was used to run netsmooth to generate the smoothed mutation matrix. It would be nice to include it in the manuscript.

  2. The choice of protein-protein network to use is essentially another hyperparameter. Would another PPI network perform similarly? And, even more interesting, how would a random network of similar density perform? Looking at the methods section and the code, I am also a bit concerned that batch normalizing the binary mutation input features (batch size n=50 by default) could be problematic. Am I missing something?
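
To make the role of alpha and of the network concrete, here is a generic sketch of random-walk-with-restart smoothing, together with a random network of the same density as a control. It is not netsmooth's implementation: the column normalization, the closed-form solve, and both function names are assumptions made for illustration.

```python
import numpy as np


def smooth_over_network(profiles, adjacency, alpha):
    """Random-walk-with-restart smoothing of gene-level profiles.

    profiles:  (n_genes, n_samples) matrix, e.g. binary mutation calls
    adjacency: (n_genes, n_genes) symmetric 0/1 adjacency matrix
    alpha:     in [0, 1); 0 returns the input, larger values smooth more
    """
    profiles = np.asarray(profiles, dtype=float)
    adjacency = np.asarray(adjacency, dtype=float)
    degree = adjacency.sum(axis=0)
    # Column-normalize; isolated genes keep an all-zero column.
    a_norm = np.divide(adjacency, degree, out=np.zeros_like(adjacency), where=degree > 0)
    n = adjacency.shape[0]
    # Fixed point of F = alpha * a_norm @ F + (1 - alpha) * profiles
    return (1 - alpha) * np.linalg.solve(np.eye(n) - alpha * a_norm, profiles)


def random_network_like(adjacency, rng):
    """Random symmetric network with the same number of edges (same density)."""
    adjacency = np.asarray(adjacency)
    n = adjacency.shape[0]
    rows, cols = np.triu_indices(n, k=1)
    n_edges = int(adjacency[rows, cols].sum())
    chosen = rng.choice(len(rows), size=n_edges, replace=False)
    random_adj = np.zeros((n, n))
    random_adj[rows[chosen], cols[chosen]] = 1
    return random_adj + random_adj.T


adjacency = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]])  # path: gene0 - gene1 - gene2
mutations = np.array([[1.0], [0.0], [0.0]])              # one sample, gene0 mutated
smoothed = smooth_over_network(mutations, adjacency, alpha=0.5)
control = random_network_like(adjacency, np.random.default_rng(0))
```

Running the same downstream analysis on profiles smoothed over control instead of adjacency is one way to test how much the specific PPI network contributes.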

manuscript typos

Hi,

I was reading the preprint and just wanted to let you know some typos to fix for the next revision of the manuscript :)

  • adn
  • deleteions

Clemens

I can't reproduce vignette results

maui_vignette.zip

I like maui and am exploring the use of VAEs in a cancer subtyping project, but I am having a hard time reproducing your vignette results. The issue arises when plotting the losses: I have pretty much no recorded loss, and from then on everything goes south (ROC curves are deflated, etc.). I wonder if it has anything to do with me using Keras 2.3? I got this user warning:

miniconda3/envs/iomics/lib/python3.7/site-packages/keras/engine/training_utils.py:819: UserWarning: Output reconstruction missing from loss dictionary. We assume this was done on purpose. The fit and evaluate APIs will not be expecting any data to be passed to reconstruction.
  'be expecting any data to be passed to {0}.'.format(name))

It almost looks as if the fit function doesn't update the weights from one epoch to the next. Have you encountered such an error? I attach my vignette. I went a bit past the loss plot and then stopped testing, but at the end you can see my pip freeze.

support for scipy >= 1.3

scipy 1.3 introduced a rewrite of stats.pearsonr, which broke the test_utils.test_correlate_factors_and_features test.

This needs to be investigated - should only the test be rewritten, or the whole "feature correlations" thing?

Fix some sklearn warnings

lib/python3.6/site-packages/sklearn/base.py:420: FutureWarning: The default value of multioutput (not exposed in score method) will change from 'variance_weighted' to 'uniform_average' in 0.23 to keep consistent with 'metrics.r2_score'. To specify the default value manually and avoid the warning, please either call 'metrics.r2_score' directly or make a custom scorer with 'metrics.make_scorer' (the built-in scorer 'r2' uses multioutput='uniform_average').
  "multioutput='uniform_average').", FutureWarning)

This happens in drop_unexplanatory_factors() or in merge_similar_latent_factors().
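
The two averaging modes named in the warning can give quite different scores when the outputs have different variances. The NumPy sketch below reimplements both for illustration; in scikit-learn itself the mode is selected explicitly with metrics.r2_score(y_true, y_pred, multioutput=...), as the warning suggests. The helper names here are made up.

```python
import numpy as np


def r2_per_output(y_true, y_pred):
    """Return (per-output R^2, per-output total sum of squares)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = ((y_true - y_pred) ** 2).sum(axis=0)
    ss_tot = ((y_true - y_true.mean(axis=0)) ** 2).sum(axis=0)
    return 1.0 - ss_res / ss_tot, ss_tot


def r2_uniform_average(y_true, y_pred):
    scores, _ = r2_per_output(y_true, y_pred)
    return scores.mean()  # every output counts equally


def r2_variance_weighted(y_true, y_pred):
    scores, ss_tot = r2_per_output(y_true, y_pred)
    return np.average(scores, weights=ss_tot)  # high-variance outputs dominate


y_true = np.array([[0.0, 0.0], [1.0, 10.0], [2.0, 20.0]])
y_pred = np.array([[0.0, 5.0], [1.0, 10.0], [2.0, 15.0]])  # first output predicted exactly

uniform = r2_uniform_average(y_true, y_pred)     # (1.0 + 0.75) / 2
weighted = r2_variance_weighted(y_true, y_pred)  # (1.0 * 2 + 0.75 * 200) / 202
```

Passing multioutput explicitly wherever the score is computed would both silence the warning and pin down which behavior the factor filtering relies on.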
