velorama's Introduction

Velorama - Gene regulatory network inference for RNA velocity and pseudotime data

http://cb.csail.mit.edu/cb/velorama/velorama_v5.png

Velorama is a Python library for inferring gene regulatory networks from single-cell RNA-seq data.

It is designed for the case where RNA velocity or pseudotime data is available. Here are some of the analyses that you can do with Velorama:

  • Infer temporally causal regulator-target links from RNA velocity cell-to-cell transition matrices.
  • Infer interactions over branching/merging trajectories using just pseudotime data, without having to separate them manually.
  • Estimate the relative speed of various regulators (i.e., how quickly they act on their targets).

Velorama offers support for both pseudotime and RNA velocity data.

Velorama is based on a Granger causal approach and models the differentiation landscape as a directed acyclic graph (DAG) of cells, rather than the linear total ordering required by previous approaches.

API Example Usage

Velorama is currently offered as a command-line tool that operates on AnnData objects. [Ed. Note: We are working on a clean API compatible with the scanpy ecosystem.] First, prepare an AnnData object of the dataset to be analyzed with Velorama. If you have RNA velocity data, make sure it is stored in the layers required by CellRank and scVelo, so that transition probabilities can be computed. We recommend performing standard single-cell normalization (i.e., normalize counts to the median per-cell transcript count and log-transform the normalized counts plus a pseudocount). Next, annotate the candidate regulators and targets in the var DataFrame of the AnnData object as follows.

# Boolean masks over adata.var marking candidate regulators and targets
adata.var['is_reg'] = [n in regulator_genes for n in adata.var.index.values]
adata.var['is_target'] = [n in target_genes for n in adata.var.index.values]

Here, regulator_genes is the set of gene symbols or IDs for the candidate regulators, and target_genes is the set of gene symbols or IDs for the candidate target genes. This AnnData object should be saved as {dataset}.h5ad.
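The preparation steps above can be sketched with plain numpy/pandas standing in for a real AnnData object (the gene names and counts below are made up for illustration; in practice you would operate on adata.X and adata.var):

```python
import numpy as np
import pandas as pd

# Toy counts matrix: 3 cells x 4 genes (made-up data)
counts = np.array([[4., 0., 2., 2.],
                   [10., 5., 0., 5.],
                   [1., 1., 1., 1.]])
var = pd.DataFrame(index=["Neurog3", "Pdx1", "Ins1", "Gcg"])

# Normalize each cell to the median per-cell transcript count, then log1p
per_cell = counts.sum(axis=1)
norm = counts / per_cell[:, None] * np.median(per_cell)
logged = np.log1p(norm)

# Annotate candidate regulators and targets, as in the snippet above
regulator_genes = {"Neurog3", "Pdx1"}
target_genes = {"Ins1", "Gcg"}
var['is_reg'] = [n in regulator_genes for n in var.index.values]
var['is_target'] = [n in target_genes for n in var.index.values]

print(int(var['is_reg'].sum()), int(var['is_target'].sum()))  # 2 2
```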

We provide an example dataset here: mouse endocrinogenesis. This dataset is from the scVelo vignette and is based on the study by Bergen et al. (2020).

The command below runs Velorama, saving the inferred Granger causal interactions and interaction speeds to a given directory.

velorama -ds $dataset -dyn $dynamics -dev $device -l $L -hd $hidden -rd $rd

Here:

  • $dataset: the name of the dataset associated with the saved AnnData object.
  • $dynamics: either "rna_velocity" or "pseudotime", depending on which data is used to construct the DAG.
  • $device: either "cuda" or "cpu".
  • $rd: the root directory that contains the saved AnnData object and where the outputs will be saved.
  • $L (optional): the maximum number of lags to consider (default=5).
  • $hidden (optional): the dimensionality of the hidden layers (default=32).

We encourage you to report issues at our GitHub page; you can also open pull requests there to contribute your enhancements. If Velorama is useful for your research, please consider citing our bioRxiv preprint (2022).

velorama's People

Contributors

alexw16, amudide, rs239


velorama's Issues

utils module not found

hi there,

thanks for the package which I will test soon. Regarding the installation I'd have the following feedback:

utils.py is not found, so either the packaging is not linking it properly or the import statements need to be specified differently. One workaround is to add the package path to the Python path (sys.path), or to simply run velorama inside the site-packages/velorama.....egg/velorama folder
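That sys.path workaround can be simulated in isolation; here a temporary directory stands in for the installed velorama package directory, and the module contents are made up:

```python
import importlib
import os
import sys
import tempfile

# Stand-in for the installed package directory that contains utils.py
pkg_dir = tempfile.mkdtemp()
with open(os.path.join(pkg_dir, "utils.py"), "w") as f:
    f.write("ANSWER = 42\n")

# The workaround: put that directory on sys.path so `import utils` resolves
sys.path.insert(0, pkg_dir)
utils = importlib.import_module("utils")
print(utils.ANSWER)  # 42
```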

cheers
Daniel

feature request

Hi there,

I have another feature request: I computed pseudotimes with tools other than scVelo, so I already have everything in my adata object, including an integrated embedding after batch correction that I would like to use.

It would therefore be great to be able to specify pre-existing PCAs or other embeddings, as well as to use non-scVelo pseudotimes (set iroot to None), as otherwise an error is thrown.

kind regards
Daniel

running velorama

Hi,

I've encountered several issues running velorama

  1. The precomputed pseudotime has to be saved in adata.obs['pseudotime'], though scVelo saves it as either dpt_pseudotime or velocity_pseudotime.
  2. The ray.init call in run.py kills the process on a cluster, because it tries to spawn as many processes as CPUs are detected; if this differs from the number of cores that were requested, the worker process is killed. An additional parameter to define the number of cores to use would be helpful. The same is true for memory: if the user has been allotted less than 10 GB, this should be another parameter.
  3. When creating is_target and is_reg as suggested by the code in the GitHub repo, each is an adata.var entry of the same length with logical values; the code simply takes the shape, so the status message prints the dimension of adata.var and not the actual number of regulators and targets. I hope this is just the print statement and does not affect downstream analysis. For a proper print statement, use X.sum() and Y.sum() instead.
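Points 1 and 3 above can be worked around on the user side for now; a minimal sketch, with pandas DataFrames standing in for adata.obs and adata.var (toy values):

```python
import pandas as pd

# Stand-ins for adata.obs and adata.var (made-up data)
obs = pd.DataFrame({"dpt_pseudotime": [0.0, 0.3, 0.9]})
var = pd.DataFrame({"is_reg": [True, True, False],
                    "is_target": [False, True, True]})

# Point 1: copy the scVelo pseudotime into the key Velorama expects
obs["pseudotime"] = obs["dpt_pseudotime"]

# Point 3: count regulators/targets with .sum(), not .shape
n_reg = int(var["is_reg"].sum())        # 2, not len(var)
n_target = int(var["is_target"].sum())  # 2
print(n_reg, n_target)  # 2 2
```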

best
Daniel

setup.py calls sklearn instead of scikit-learn and prevents install

Hi,

I am reporting a potential issue with the install/setup script. Velorama will not install natively on Linux, failing with the following error. Does setup.py depend on a deprecated version of sklearn?

Collecting sklearn
  Using cached sklearn-0.0.post10.tar.gz (3.6 kB)
  Preparing metadata (setup.py) ... error
  error: subprocess-exited-with-error
  
  × python setup.py egg_info did not run successfully.
  │ exit code: 1
  ╰─> [18 lines of output]
      The 'sklearn' PyPI package is deprecated, use 'scikit-learn'
      rather than 'sklearn' for pip commands.
      
      Here is how to fix this error in the main use cases:
      - use 'pip install scikit-learn' rather than 'pip install sklearn'
      - replace 'sklearn' by 'scikit-learn' in your pip requirements files
        (requirements.txt, setup.py, setup.cfg, Pipfile, etc ...)
      - if the 'sklearn' package is used by one of your dependencies,
        it would be great if you take some time to track which package uses
        'sklearn' instead of 'scikit-learn' and report it to their issue tracker
      - as a last resort, set the environment variable
        SKLEARN_ALLOW_DEPRECATED_SKLEARN_PACKAGE_INSTALL=True to avoid this error
      
      More information is available at
      https://github.com/scikit-learn/sklearn-pypi-package
      
      If the previous advice does not cover your use case, feel free to report it at
      https://github.com/scikit-learn/sklearn-pypi-package/issues/new
      [end of output]

subprocess-exited-with-error

Thank you for this package. I tried installing Velorama using
pip install velorama

But I get this error:
error: subprocess-exited-with-error

python setup.py egg_info did not run successfully.
exit code: 1

[8 lines of output]
Traceback (most recent call last):
  File "<string>", line 2, in <module>
  File "<pip-setuptools-caller>", line 34, in <module>
  File "C:\Users\ahmed\AppData\Local\Temp\pip-install-sgrrhble\sklearn_fa38950c8574492da0c4cb7e252dd14d\setup.py", line 10, in <module>
    LONG_DESCRIPTION = f.read()
  File "C:\Users\ahmed\miniconda3\envs\my_project\lib\encodings\cp1252.py", line 23, in decode
    return codecs.charmap_decode(input,self.errors,decoding_table)[0]
UnicodeDecodeError: 'charmap' codec can't decode byte 0x8f in position 7: character maps to <undefined>
[end of output]

note: This error originates from a subprocess, and is likely not a problem with pip.
error: metadata-generation-failed

Encountered error while generating package metadata.

See above for output.

note: This is an issue with the package mentioned above, not pip.
hint: See above for details.

I tried upgrading pip, wheel, and setuptools, but it didn't work.
Any help will be appreciated
Thank you

SERGIO simulation data with technical noise or not?

Hi,
Very nice work! I noticed that you used SERGIO to generate simulated data for benchmarking; does the simulated data have technical noise added to it?

If you added technical noise, could you please provide specific parameters?
If you did not add technical noise, I would like to know whether this would have a big impact on the benchmark results.
