http://kipoi.org


Kipoi: Model zoo for genomics

This repository implements a Python package and a command-line interface (CLI) to access and use models from Kipoi-compatible model zoos.

Installation

Kipoi requires conda to manage model dependencies. Make sure you have either anaconda (download page) or miniconda (download page) installed. If you are using OSX, see Installing python on OSX. Maintained python versions: >=3.6<=3.10.

Install Kipoi using pip:

pip install kipoi

Known issue: h5py

For systems using python 3.6 and 3.7, pretrained kipoi models of type kipoi.model.KerasModel and kipoi.model.TensorflowModel which were saved with h5py <3.* are incompatible with h5py >= 3.*. Please downgrade h5py after installing kipoi:

pip install h5py==2.10.0

This is not a problem with systems using python >=3.8<=3.10. More information available here
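The compatibility rule above can be expressed as a small check. This is a minimal sketch, not part of Kipoi: the function name `needs_h5py_downgrade` is hypothetical, and it only encodes the rule stated in this section (Python 3.6/3.7 combined with h5py 3 or newer).

```python
def needs_h5py_downgrade(python_version, h5py_version):
    """Return True when the note above applies and h5py should be pinned to 2.10.0.

    python_version: a tuple such as sys.version_info or (3, 7).
    h5py_version:   a version string such as h5py.__version__.
    """
    major_minor = tuple(python_version[:2])
    h5py_major = int(h5py_version.split(".")[0])
    # Only Python 3.6 and 3.7 are affected, and only with h5py >= 3.
    return major_minor in {(3, 6), (3, 7)} and h5py_major >= 3
```

In an installed environment this could be called as `needs_h5py_downgrade(sys.version_info, h5py.__version__)`.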

For systems using python >=3.8<=3.10, it is necessary to install hdf5 and pkgconfig prior to installing kipoi.

conda install --yes -c conda-forge hdf5 pkgconfig

Quick start

Explore available models on https://kipoi.org/groups/. Use-case oriented tutorials are available at https://github.com/kipoi/examples.

Installing all required model dependencies

Use kipoi env create <model> to create a new conda environment for the model. You can use the following commands to create common environments suitable for multiple models.

kipoi env create shared/envs/kipoi-py3-keras2-tf1
kipoi env create shared/envs/kipoi-py3-keras2-tf2
kipoi env create shared/envs/kipoi-py3-keras1.2

Before using a model in any way, activate the right conda environment:

source activate $(kipoi env get <model>)

Using pre-made containers

Alternatively, you can use the Singularity or Docker containers with all dependencies installed. Singularity containers can be used seamlessly with the CLI by adding the --singularity flag to kipoi predict commands. For an example, look at the singularity tab under this. You can also use the Docker containers directly. For more information, look at the docker tab under any model web page on kipoi.org such as this.

We currently offer two types of Docker images:

- A full-sized version (under the banner Get the full sized docker image) comes with conda pre-installed along with model (group) specific dependencies. Use this if you plan to experiment with conda functionalities.
- A slim version (under the banner Get the docker image) comes with all dependencies installed, but without a working conda package manager. Use the slim version if you plan to use it for Kipoi-related tasks only.

A note about installing singularity

Singularity has been renamed to Apptainer. However, it is also possible to use SingularityCE from Sylabs. Current versions of kipoi containers are compatible with the latest version of Apptainer (1.0.2) and SingularityCE 3.9. Install Apptainer from here or SingularityCE from here.

Python

Before using a model from python in any way, activate the right conda environment:

source activate $(kipoi env get <model>)

import kipoi

kipoi.list_models() # list available models

model = kipoi.get_model("Basset") # load the model

model = kipoi.get_model(  # load the model from a past commit
    "https://github.com/kipoi/models/tree/<commit>/<model>",
    source='github-permalink'
)

# main attributes
model.model # wrapped model (say keras.models.Model)
model.default_dataloader # dataloader
model.info # description, authors, paper link, ...

# main methods
model.predict_on_batch(x) # implemented by all the models regardless of the framework
model.pipeline.predict(dict(fasta_file="hg19.fa",
                            intervals_file="intervals.bed"))
# runs: raw files -[dataloader]-> numpy arrays -[model]-> predictions

For more information see: notebooks/python-api.ipynb and docs/using/python
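The pipeline call shown above can be wrapped so that missing input files are reported before the dataloader starts. This is a minimal sketch: the helper name `predict_from_files` is hypothetical, and the keyword arguments `fasta_file` and `intervals_file` are taken from the example above; other models may expect different dataloader arguments.

```python
import os


def predict_from_files(model, fasta_file, intervals_file):
    """Check that both input files exist, then run model.pipeline.predict.

    `model` is assumed to be an object returned by kipoi.get_model(...).
    """
    for path in (fasta_file, intervals_file):
        if not os.path.isfile(path):
            raise FileNotFoundError(f"Input file not found: {path}")
    # Same call as in the example above: raw files -> dataloader -> model -> predictions
    return model.pipeline.predict(
        dict(fasta_file=fasta_file, intervals_file=intervals_file)
    )
```

With a loaded model this would be used as `predict_from_files(model, "hg19.fa", "intervals.bed")`.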

Command-line

$ kipoi
usage: kipoi <command> [-h] ...

    # Kipoi model-zoo command line tool. Available sub-commands:
    # - using models:
    ls               List all the available models
    list_plugins     List all the available plugins
    info             Print dataloader keyword argument info
    get-example      Download example files
    predict          Run the model prediction
    pull             Download the directory associated with the model
    preproc          Run the dataloader and save the results to an hdf5 array
    env              Tools for managing Kipoi conda environments

    # - contributing models:
    init             Initialize a new Kipoi model
    test             Runs a set of unit-tests for the model
    test-source      Runs a set of unit-tests for many/all models in a source
    
    # - plugin commands:
    interpret        Model interpretation using feature importance scores like ISM, grad*input or DeepLIFT

# Run model predictions and save the results
# sequentially into an HDF5 file
kipoi predict <Model> --dataloader_args='{
  "intervals_file": "intervals.bed",
  "fasta_file": "hg38.fa"}' \
  --singularity \
  -o '<Model>.preds.h5'

Explore the CLI usage by running kipoi <command> -h. Also, see docs/using/cli/ for more information.
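When the predict command is assembled from a script, the JSON passed to --dataloader_args is easier to get right if it is generated rather than typed by hand. The sketch below is an illustration only: `build_predict_command` is a hypothetical helper, it uses just the flags shown in the example above, and the model name and file names are placeholders.

```python
import json
import shlex


def build_predict_command(model, dataloader_args, output, singularity=False):
    """Build the argument list for `kipoi predict` as shown in the example above."""
    cmd = [
        "kipoi", "predict", model,
        "--dataloader_args=" + json.dumps(dataloader_args),
    ]
    if singularity:
        cmd.append("--singularity")
    cmd += ["-o", output]
    return cmd


# Placeholder model and file names, for illustration only.
cmd = build_predict_command(
    "Basset",
    {"intervals_file": "intervals.bed", "fasta_file": "hg38.fa"},
    "Basset.preds.h5",
    singularity=True,
)
# Print a shell-safe version of the command for copy and paste.
print(" ".join(shlex.quote(arg) for arg in cmd))
```

The returned list can be passed directly to `subprocess.run(cmd, check=True)`, which avoids shell quoting issues altogether.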

Configure Kipoi in `.kipoi/config.yaml`

You can add your own (private) model sources. See docs/using/03_Model_sources/.

Contributing models

See docs/contributing getting started and docs/tutorials/contributing/models for more information.

Plugins

Kipoi supports plug-ins, which are published as additional Python packages. The currently available plug-in is:

kipoi_interpret

Model interpretation plugin for Kipoi. It allows using feature importance scores like in-silico mutagenesis (ISM), saliency maps or DeepLift with a wide range of Kipoi models. example notebook

pip install kipoi_interpret

Variant effect prediction with a subset of Kipoi models

Variant effect prediction annotates a VCF file using model predictions for the reference and alternative alleles. The output is written to a new TSV file. For more information see https://github.com/kipoi/kipoi-veff2.

Documentation

kipoi.org/docs

Tutorials

https://github.com/kipoi/examples - Use-case oriented tutorials
notebooks

Citing Kipoi

If you use Kipoi for your research, please cite the publication of the model you are using (see model's cite_as entry) and the paper describing Kipoi: https://doi.org/10.1038/s41587-019-0140-0.

@article{kipoi,
  title={The Kipoi repository accelerates community exchange and reuse of predictive models for genomics},
  author={Avsec, Ziga and Kreuzhuber, Roman and Israeli, Johnny and Xu, Nancy and Cheng, Jun and Shrikumar, Avanti and Banerjee, Abhimanyu and Kim, Daniel S and Beier, Thorsten and Urban, Lara and others},
  journal={Nature biotechnology},
  pages={1},
  year={2019},
  publisher={Nature Publishing Group}
}

Development

If you want to help with the development of Kipoi, you are more than welcome to join in!

For the local development setup, install all required dependencies using one of the provided dev-requirements(-py<36|37>).yml files.

For systems using python 3.6/3.7:

conda env create -f dev-requirements-py36.yml --experimental-solver=libmamba
or
conda env create -f dev-requirements-py37.yml --experimental-solver=libmamba
conda activate kipoi-dev
pip install -e .
git lfs install

For systems using python >=3.8<=3.10:

conda create --name kipoi-dev python=3.8  # or 3.9, 3.10
conda activate kipoi-dev
conda env update --name kipoi-dev --file dev-requirements.yml --experimental-solver=libmamba 
pip install -e .
conda install -c bioconda cyvcf2 pybigwig
git lfs install

A note about cyvcf2 and pybigwig

For python >= 3.10, cyvcf2 and pybigwig are not available in conda yet. Install them from source like here and here instead. Installing them using pip is not recommended, as it may lead to unexpected inconsistencies.

You can test the package by running py.test.

If you wish to run tests in parallel, run py.test -n 6.

License

Kipoi is MIT-style licensed, as found in the LICENSE file.

website's Issues

Add basic information to the webpage

add overview figure and a short description of what Kipoi is, above the model
Have an about/ page
Have links on top to:
- kipoi documentation - 'How to contribute?'
- github repo (models)
- github repo (kipoi)
- github repo (this webpage)

Create about link button in the nav bar linking to future about subpage

Add memcache

I propose to use Flask-Cache with the simple cache backend (all in memory):

http://pythonhosted.org/Flask-Cache/

We should also cache the slowest and most frequently used function:

kipoi.get_source("kipoi").list_models or kipoi.list_models
- https://github.com/kipoi/website/blob/master/app/models/views.py#L81
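The in-memory caching proposed here can be illustrated without any web framework. The sketch below is not the Flask-Cache API: `cached` is a hypothetical decorator written with the standard library, and `list_models_cached` uses a stand-in body with placeholder model names where the web app would call kipoi.list_models().

```python
import functools
import time


def cached(timeout):
    """Cache the result of a zero-argument function in memory for `timeout` seconds."""
    def decorator(func):
        state = {"value": None, "expires": None}

        @functools.wraps(func)
        def wrapper():
            now = time.monotonic()
            # Recompute on the first call and whenever the cached value has expired.
            if state["expires"] is None or now >= state["expires"]:
                state["value"] = func()
                state["expires"] = now + timeout
            return state["value"]
        return wrapper
    return decorator


calls = []  # records how often the slow function really runs


@cached(timeout=300)
def list_models_cached():
    # Stand-in for the slow call; placeholder model names for illustration.
    calls.append(1)
    return ["Basset", "CpGenie/model_a"]
```

A timeout is a simple way to pick up newly added models without restarting the app; the 300 seconds used here is an arbitrary choice.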

URL's in cite_as clickable in model and group views

The links are currently not clickable:

webpage not fully responsive on mobile

Have a front-page similar to encode

https://www.encodeproject.org/

Make charts on how many models there are in the zoo:

Number of models in the zoo: X
Number of model groups: z
Framework composition: x
Fraction of models supporting variant effect prediction y: ....
Fraction of model tags...
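The numbers behind such charts could be computed along these lines. This is a sketch under stated assumptions: `zoo_summary` is a hypothetical helper, and the record keys `model`, `type` and `veff` are illustrative names, not the actual columns returned by kipoi.list_models().

```python
from collections import Counter


def zoo_summary(models):
    """Summarise a list of model records for the front-page charts.

    Each record is assumed to be a dict with the hypothetical keys
    'model' (path-like name), 'type' (framework) and optionally 'veff' (bool).
    """
    n_models = len(models)
    # A model group is taken to be the first component of the model name.
    groups = {m["model"].split("/")[0] for m in models}
    frameworks = Counter(m["type"] for m in models)
    n_veff = sum(1 for m in models if m.get("veff"))
    return {
        "n_models": n_models,
        "n_groups": len(groups),
        "framework_composition": dict(frameworks),
        "veff_fraction": n_veff / n_models if n_models else 0.0,
    }
```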

Logos in nav bar and on the landing page are not consistent

I think that both logos should have same orientation of the leaves.

Update the twitter feed on the website to the KipoiZoo user

Styling: Shadows

I would suggest taking away the shadow of the boxes in the detailed view of a model. Especially with lower screen resolution, this creates a lot of padding between the boxes. Also, in my opinion, less is more. The same goes for the idle state of the "Type" and "Postprocessing" tags. I like the shadow on mouse-over though.

Add new ModelInfo fields to model list view and model detail view

The following PR - kipoi/kipoi#101 added new info fields to:

model.info (all strings) and pd.DataFrame returned by kipoi.list_models():
- cite_as (string)
- license (string)
- trained_on (string)
- training_procedure (string)
dataloader.info
- license (string)
model_group:
- cite_as (set of strings)
- license (set of strings)

Updates:

cite_as and license columns should be displayed on the model group page (both are sets of strings, so they should be concatenated with ,). They should appear after the Type column.
Also, cite_as and license should be displayed in the model list view after Type
Detail view
- top-right: add cite_as, license, trained_on to model arguments
- Dataloader: add license
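Displaying the group-level cite_as and license values amounts to collecting the distinct values from the member models and joining them. A minimal sketch, assuming each model's info is available as a plain dict; both function names are hypothetical, and the separator is ", " (comma plus space) for readability.

```python
def collect_group_field(model_infos, field):
    """Collect the distinct non-empty values of `field` across a group's models."""
    return {info[field] for info in model_infos if info.get(field)}


def format_set_field(values):
    """Join a set of strings with ', ' in sorted order so the output is stable."""
    return ", ".join(sorted(values))
```

Sorting before joining keeps the rendered cell identical between page loads, since Python sets have no guaranteed order.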

Add a model list button

It would be nice if the user could choose between viewing all the models (just as listed with kipoi.list_models()) and viewing the grouped models.

Toggle could be implemented either via some toggle button, or by adding a button "Model List" to the top:

On very high-res laptop screens the nav bar is too small

Laptop: Lenovo 470s

Suggestion: Tree-like list representation of models

For models where the same architecture is used in combination with X different weights, e.g. CpGenie, it would be nicer to have them in the overview only as one CpGenie entry that can then be expanded to the individual models. It might be enough to do that in the list view, so no info page for the "meta-model" has to be created. A rule as simple as this would do: if there is a "/" in the model name, then create a tree in the list view of models.
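The "/" rule described here can be sketched as a nested dictionary built from the model names. The function names and the example model names are illustrative only.

```python
def build_model_tree(model_names):
    """Build a nested dict from model names, splitting each name on '/'."""
    tree = {}
    for name in sorted(model_names):
        node = tree
        for part in name.split("/"):
            # Descend into the child node, creating it if needed.
            node = node.setdefault(part, {})
    return tree


def render_tree(tree, indent=0):
    """Return the tree as a list of text lines, two spaces per nesting level."""
    lines = []
    for key, children in tree.items():
        lines.append("  " * indent + key)
        lines.extend(render_tree(children, indent + 1))
    return lines
```

Models without a "/" in their name end up as leaves at the top level, so no separate "meta-model" page is required for them.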

User stories

The description of this issue will contain user stories for the webpage.

User

I as a user, would like to:

..., so that I can, ...
see all the available models, so that I can choose the right one.
search the available models
see the additional information about the chosen model
- author,
- license,
- publication (with URL),
- description,
- input, output modalities,
- required input files (with downloadable examples)
- framework (e.g. Keras)
- dependencies
- size
copy the CLI and python code to pull a chosen model locally to quickly make predictions from the
write and read comments about the model
see what functionality is available for this model (variant effect prediction, ...)
explore the hierarchical structure of the model zoo (already fulfilled by github, need a link to github)

Contributor

I as a contributor, would like to:

have the information about the model displayed correctly on the webpage
- all hyperlinks to the papers working

Front page is not rendering correctly

Model image is overlapping the text in certain browsers
Problem is addressed in #13 (comment)
Webpage is extended over the dimensions of screen size
Graphs are not responsive

Unify repositories link texts from nav bar and footer

Write the single-model view

Described in #1

Add information on how to install the dependencies to code snippets

Add information on how to install the dependencies to code snippets
add also source activate .. to kipoi env create
- run this before testing
add kipoi.pipeline.predict to the snippets (as for the example)
- to python and R
add score variants (optionally)

Update disqus url to kipoi.org

https://github.com/kipoi/website/blob/master/app/models/templates/models/model_details.html#L149

Host the webapp

write your proposal about where to host it
- I think we should choose something fully managed for the beginning (http://flask.pocoo.org/docs/0.12/deploying/):
  - heroku
  - google app engine?

domain: kipoi.org

Content - views

Regarding content: I think we should have 2 views:

Model list view

URL: /

lists all the models in a table
- table should have a search function - https://datatables.net/
- columns of the table would be similar to the output of kipoi.list_models() python-sdk.ipynb

It would also be nice to have the models represented in a hierarchical directory-like structure.

Single model view

URL: /models/<model_path>

The following information should be shown in individual <div> blocks:

Shows the information about the model.
- closely resembles the model.yaml file
- link to the github folder
- maybe show the number of model downloads
- everything that can have a URL should have it (like other web pages, paper DOI or user's github username)
README.md of the model
Nicely rendered dataloader.yaml
Small code snippet of how to use it in bash or python (with button - copy to clipboard)
- kipoi predict <model> --source=kipoi ...
Comments (maybe require a github account to comment)

https://disqus.com

Twitter feed (content related to the model)

as on https://www.biorxiv.org/content/early/2017/12/19/171827

Comments

krrome's comment

Thanks @Avsecz I don't have much to add to that.. My thoughts:

A nice short introduction to what Kipoi is, what the aims are, and what we think it should be used for, with a very simple flow chart.

The searchable model overview table sounds great.

in the single model view:

Again, I agree with Ziga, plus:

we should have only one model on the page and not concatenate the detailed views of all models.

how about we generate (at least) most of the CLI commands for the respective model so that people can just click "copy to clipboard" and that's all they have to do to run things.

Download tracker:

How can we keep track of model downloads? We would need to add that to the model API and have a database somewhere where we can dump that info... Alternative to the DB we could have a simple REST backend on the web server that saves download stats in a file..

Avsecz's comment:

Agree.

Download tracker: I don't know yet how, but a request call to the webserver for every "model pull" sounds like a great idea. We can set up a simple Flask REST API with Mongo backend.
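The file-based variant mentioned in the comments above could look roughly like this. It is a sketch of the storage step only, with no web endpoint: `record_download` and the JSON layout are assumptions, and the code is not safe against several processes writing at the same time, which is one reason a database backend may be preferable.

```python
import json
import os


def record_download(stats_path, model_name):
    """Increment the download count for `model_name` in a JSON file.

    Returns the new count for that model.
    """
    counts = {}
    if os.path.exists(stats_path):
        with open(stats_path) as f:
            counts = json.load(f)
    counts[model_name] = counts.get(model_name, 0) + 1
    # Write to a temporary file first, then swap it in, so that readers
    # never see a partially written stats file.
    tmp_path = stats_path + ".tmp"
    with open(tmp_path, "w") as f:
        json.dump(counts, f)
    os.replace(tmp_path, stats_path)
    return counts[model_name]
```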

Write the model-list view

searchable data.table

code snippets: Copy button doesn't work with `<-` use equal signs in R

Group by models function

implement a function that will return a pandas DataFrame for model groups to be displayed on the front page
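One possible shape for such a function is sketched below. To stay dependency-free it returns a list of row dicts rather than a DataFrame; the rows can be passed to `pandas.DataFrame(rows)` if a DataFrame is needed. The function name and the column names `group` and `N_models` are assumptions for illustration.

```python
from collections import defaultdict


def group_models(model_names):
    """Group model names by their first path component and count the members."""
    groups = defaultdict(list)
    for name in model_names:
        groups[name.split("/")[0]].append(name)
    # One row per group, sorted by group name for a stable table order.
    return [
        {"group": group, "N_models": len(members)}
        for group, members in sorted(groups.items())
    ]
```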

Use external (bigger) search bar for filtering model list.

Ideally, a bigger search bar should be used. Maybe even in the navigation. Also add autofocus.

Add N_models to model groups table

@gagneur suggested changes for http://www.kipoi.org/groups/:

Model name -> Model group
Add a column Models containing the variable N_models

Use flask-frozen and host the website on github

Idea: we could freeze our web-app and host it only through GitHub. This would allow us to do continuous updates and not have to worry about updates

tags not displayed

Host the app with nginx

we should be able to handle at least 100-1000 users in parallel with ease

Detailed model descriptions

I like the detailed model descriptions. Some models are not complete yet (e.g. input/output missing), but I guess this is expected at this stage.

I find the color code a bit confusing, since blue and orange seem to be assigned randomly, or is there a reason behind it?

I would call it "Output" instead of "Targets" and "Scheme/Outline" instead of "Schema".

Perhaps, the dimensions of the dataframes ("Shape") could be more nicely described or visualised?

Is the abstract always written by one of us? If not, I guess we would have to refer to the respective source. In general, for already published models, it would be good to include the references.

I would avoid the two "buttons" "type" and "postprocessing" and just directly write the details so that they are visible from the beginning. This could be added to the input part of the "Schema"?

add google analytics

Make the barcharts clickable

logo + background selection for top bar

Using the exact same logo-blue as a background makes the top leaf invisible:

keeping the background and inverting of the top leaf looks horrendous:

but I think selecting the logo colour as a background and darkening it by reducing brightness in the HSB colour space works:

Problems with the responsiveness in certain views

Landing page html needs some refactoring
On group and model list view when search bar is shown the nav-bar is not responsive
On some screen sizes, the table overflows the desired width
Getting started title is out of alignment
Search bar wraps when navbar collapses and repository buttons are not placed correctly

Design update proposal

In case we go for this logo:

we could update the navbar to grey/brown color like here: https://mdbootstrap.com/previews/templates/portfolio/works-4-columns.html

color: #44474b

https://mdbootstrap.com/css/colors/#dark-theme

Upgrade to a larger server and stress-test with loader.io

Minor model-view improvements

Improvements

Title

Make a link on 'KIPOI project:' to the 'Model list'.
Rename 'KIPOI project' to 'Kipoi' in bold

Dataloader

Rename the title 'Dataloaders' to 'Dataloader'
Rename 'Default dataloader' to 'Relative path'
Highlight optional arguments
We should highlight somehow that some of the dataloader arguments are optional:

Shall we just use:
target_file (optional): .... ?

Create the twitter account and update the feed on the website

Remove `/kipoi/` from the url

This was historically done to allow different model sources. However, since only one source at a time can be used, this part of the URL doesn't add much.

Choose the bootstrap-based template

the style should be slick, not too flashy, something in the material design direction

Make models and docs more obvious as buttons

Pytorch has a nicer navbar: http://pytorch.org/

Maybe we could also use the highlighted button:

Tooltips not rendered correctly (Safari)

In Safari 9.1.2, the model-type and model postprocessing tooltips in the detailed model views are not rendered properly; the title is just displayed as it is defined in the <a> element.

I guess that relates to the following error:

jQuery.Deferred exception: $('[data-toggle="tooltip"]').tooltip is not a function. (In '$('[data-toggle="tooltip"]').tooltip()', '$('[data-toggle="tooltip"]').tooltip' is undefined) (2)
"http://ec2-54-87-147-83.compute-1.amazonaws.com/model/kipoi/Q3BHZW5pZS9BNTQ5X0VOQ1NSMDAwRERJ:394:49
j@http://ec2-54-87-147-83.compute-1.amazonaws.com/static/js/jquery-3.2.1.min.js:2:30004
http://ec2-54-87-147-83.compute-1.amazonaws.com/static/js/jquery-3.2.1.min.js:2:30314"
undefined

Setup the layout and design - base.html

Colors in the navbar and footer are not consistent

Fix: convert ": Model zoo for genomics" to text

in #48 on the main page the ": Model zoo for genomics" should ideally be transformed in the logo + in text ": Model zoo for genomics" in proper alignment.

specify the model source through the cli

this will allow hosting a private model-zoo website

Structuring/Explaining of the models

Most issues that I had with the homepage were already mentioned in an issue. I also think that it would be too complicated to see all detailed models at the beginning. This might deter biologists from using kipoi. It would be great to have an overview of the models (perhaps boxes in different colors that contain the name of the model with a short subtitle) from which you can then choose to see a detailed explanation of the biological background (i.e. not yet input/output dimension, model framework etc., but just what the model aims to deliver in terms of biological understanding). If you then choose the model, a structure similar to the current one could appear that lets you choose the final model with all the details.

Fix the navbar when scrolling

Technical requirements

code should be well structured and well documented
- another web-dev should be able to easily continue working on the project without any supervision
overview documentation should be written in the repo wiki
webpage should be easily deployable by someone else

Bug: Some model schemas don't render correctly

Example: http://www.kipoi.org/models/deepTarget/