waldronlab / curatedmetagenomicdataanalyses

Analyses in R and Python Using curatedMetagenomicData

Home Page: https://waldronlab.io/curatedMetagenomicDataAnalyses/

License: Creative Commons Attribution 4.0 International

Python 91.87% R 2.55% Dockerfile 0.17% Shell 5.41%
r microbiome-analysis microbiome-data bioconductor

curatedmetagenomicdataanalyses's Introduction

curatedMetagenomicDataAnalyses

This repository provides biologically relevant analyses using the curatedMetagenomicData package, in both R/Bioconductor and Python. You can run the R and Python analyses locally in the provided Docker container, or in the cloud for free.

Running in the Cloud (free)

A machine with all dependencies, the code from this repository, JupyterLab (with R and Python 3), and RStudio is available at http://app.orchestra.cancerdatasci.org/ (search for the Curated Metagenomic Analyses workshop). You can use these machines for up to 8 hours at a time.

Running locally using Docker

Requirements

You need Docker.

Getting Started

First build the image:

docker build -t "waldronlab/curatedmetagenomicanalyses" .

Then run a container based on the image with your password:

docker run -d -p 80:8888 --name cma \
  waldronlab/curatedmetagenomicanalyses

Visit localhost in your browser.

Running locally without Docker

Start with an installation of the current version of Bioconductor (see https://bioconductor.org/install/). Older versions probably will not work. Installation directly from GitHub requires first installing the remotes package, then:

BiocManager::install("waldronlab/curatedMetagenomicDataAnalyses", dependencies = TRUE)

Analyses

R Vignettes

Python Notebooks

Supplementary Materials

curatedmetagenomicdataanalyses's People

Contributors

d-golzato, gidave, jwokaty, lwaldron, schifferl

curatedmetagenomicdataanalyses's Issues

conda install available ?

Hi, I was wondering whether a conda install option is available for your tool, or whether you are planning one. I saw that there is one for curatedMetagenomicData but not for *Analyses. Thank you!

README in docker

@lwaldron I want to clarify the content for the README file that will be in the file explorer on the left side of Jupyter Lab. Are these just the instructions to run the analyses?

We could just add that to the repository's README and then I could just copy that file up one level from the curatedMetagenomicAnalyses repository in the docker. If that's confusing, I can make a different README for the docker.

BiocManager::install("waldronlab/curatedMetagenomicAnalyses") failed

> BiocManager::install("waldronlab/curatedMetagenomicAnalyses")
Bioconductor version 3.12 (BiocManager 1.30.16), R 4.0.5 (2021-03-31)
Installing github package(s) 'waldronlab/curatedMetagenomicAnalyses'
Downloading GitHub repo waldronlab/curatedMetagenomicAnalyses@HEAD
Running `R CMD build`...
* checking for file ‘/tmp/Rtmpifo5W6/remotes2b43787ca6aa/waldronlab-curatedMetagenomicAnalyses-677f1be/DESCRIPTION’ ... OK
* preparing ‘curatedMetagenomicAnalyses’:
* checking DESCRIPTION meta-information ... OK
* checking for LF line-endings in source and make files and shell scripts
* checking for empty or unneeded directories
* building ‘curatedMetagenomicAnalyses_0.4.0.tar.gz’
* installing *source* package ‘curatedMetagenomicAnalyses’ ...
** using staged installation
** R
** inst
** byte-compile and prepare package for lazy loading
Error: object ‘returnSamples’ is not exported by 'namespace:curatedMetagenomicData'
Execution halted
ERROR: lazy loading failed for package ‘curatedMetagenomicAnalyses’
* removing ‘/public/home/sample_lib/ckzhu/miniconda3/envs/R_4.0.0/lib/R/library/curatedMetagenomicAnalyses’
Warning message:
In i.p(...) :
  installation of package ‘/tmp/Rtmpifo5W6/file2b4327817dad/curatedMetagenomicAnalyses_0.4.0.tar.gz’ had non-zero exit status

Duplicate row names in dataDump

When I run the example in dataDump.Rd, I get the following error:

Error in `.rowNamesDF<-`(x, value = value) : 
  duplicate 'row.names' are not allowed
In addition: Warning message:
non-unique values when setting 'row.names': ‘MH0001’, ‘MH0002’, ‘MH0003’, ‘MH0004’, ‘MH0005’, ‘MH0006’, ‘MH0007’, ‘MH0008’, ‘MH0009’, ‘MH0010’, ‘MH0011’, ‘MH0012’, ‘MH0013’, ‘MH0014’, ‘MH0015’, ‘MH0016’, ‘MH0017’, ‘MH0018’, ‘MH0019’, ‘MH0020’, ‘MH0021’, ‘MH0022’, ‘MH0023’, ‘MH0024’, ‘MH0025’, ‘MH0026’, ‘MH0027’, ‘MH0028’, ‘MH0030’, ‘MH0031’, ‘MH0032’, ‘MH0033’, ‘MH0034’, ‘MH0035’, ‘MH0036’, ‘MH0037’, ‘MH0038’, ‘MH0039’, ‘MH0040’, ‘MH0041’, ‘MH0042’, ‘MH0043’, ‘MH0044’, ‘MH0045’, ‘MH0046’, ‘MH0047’, ‘MH0048’, ‘MH0049’, ‘MH0050’, ‘MH0051’, ‘MH0052’, ‘MH0053’, ‘MH0054’, ‘MH0055’, ‘MH0056’, ‘MH0057’, ‘MH0058’, ‘MH0059’, ‘MH0060’, ‘MH0061’, ‘MH0062’, ‘MH0063’, ‘MH0064’, ‘MH0065’, ‘MH0066’, ‘MH0067’, ‘MH0068’, ‘MH0069’, ‘M [... truncated] 
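One way to inspect a failure like this is to list which sample IDs repeat before they are used as row names. The sketch below is only illustrative: it assumes (the report does not confirm this) that the repeats come from the same sample ID appearing in more than one study, and both helper names and the study labels are hypothetical.

```python
from collections import Counter


def duplicated_ids(sample_ids):
    """Return the IDs that occur more than once, in first-seen order."""
    counts = Counter(sample_ids)
    return [sid for sid in dict.fromkeys(sample_ids) if counts[sid] > 1]


def qualify_ids(study_names, sample_ids):
    """Prefix each sample ID with its study name to disambiguate repeats.

    Assumption: a (study, sample ID) pair is unique even when the ID is not.
    """
    return [f"{study}:{sid}" for study, sid in zip(study_names, sample_ids)]


# Hypothetical input: the same ID reported by two different studies
ids = ["MH0001", "MH0002", "MH0001"]
studies = ["study_A", "study_A", "study_B"]
print(duplicated_ids(ids))
print(qualify_ids(studies, ids))
```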

Error in access data

Hello everyone,

I am currently trying to access the data in the curatedMetagenomicData repository, but I get the following error message:

britol = BritoIL_2016.metaphlan_bugs_list.stool()
Error in UseMethod("filter_") : 
  no applicable method for 'filter_' applied to an object of class "c('tbl_SQLiteConnection', 'tbl_dbi', 'tbl_sql', 'tbl_lazy', 'tbl')"

How can I address this issue?
Thank you very much for your support.

retrieve raw data

Hi,
Is there a way to get the raw metagenomics fastq files from the listed studies?

Add link to paper and fix figure

I think we're missing a reference to a paper in @paolinomanghi's notebook:

The flag "-m" will attach the per-sample metadata available in curatedMetagenomicData 3 to their taxonomic 
profiles. We now switch to a python 3 set of instructions that can be used to perform the main analysis of 
Figure 2, panel a, of the paper "***".

And I think there's still some issue with the figure:

[Screenshot: screen_shot_notebook_figure]

Changes to Python analysis notebook

@paolinomanghi I wanted to make the following recommendations to help the notebook run and also to separate it into an analysis notebook and an installation notebook. I could probably write the installation notebook since I needed to install everything for the docker--should I do that?

Remove the installation instructions and the git clone line so that you have

This notebook contains the instructions to run a meta-analysis of sex-related contrasts in the human gut microbiome, using curatedMetagenomicDataCLI and a set of freely-available python programs. See `installation.ipynb` for installation instructions.

As described here, we are now going to: 
1) create a folder called **species_abundances_from_cMD3CLI**
2) go in that directory
3) download all the taxonomic profiles from the **curatedMetagenomicDataCLI** workflow

I thought it might be helpful to add the comment "This step will take some time." after step 3, and anywhere else that processing may take a while.

When making species_abundances_from_cMD3CLI, I thought we could put the code outside of the repository. For example

%%bash
mkdir /home/waldronlab/species_abundances_from_cMD3CLI
cd /home/waldronlab/species_abundances_from_cMD3CLI
curatedMetagenomicData -m "*relative_abundance"
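A bare `mkdir` fails when the folder already exists, so re-running the cell above would stop with an error. The following is a minimal Python sketch of the same steps that can be run repeatedly. It assumes the `curatedMetagenomicData` command is on the PATH; `prepare_download` is a hypothetical helper, not part of the repository, and it only builds the command rather than executing it.

```python
from pathlib import Path


def prepare_download(base_dir, pattern="*relative_abundance"):
    """Create the output folder idempotently and build the CLI command.

    Hypothetical helper for illustration; not part of the repository.
    """
    out_dir = Path(base_dir) / "species_abundances_from_cMD3CLI"
    # exist_ok=True makes re-running the cell safe, unlike a bare `mkdir`
    out_dir.mkdir(parents=True, exist_ok=True)
    # Same command as the bash cell: -m attaches the per-sample metadata
    command = ["curatedMetagenomicData", "-m", pattern]
    return out_dir, command
```

To actually start the download, the result could be passed on with `subprocess.run(command, cwd=out_dir, check=True)`, which runs the CLI inside the new folder without changing the notebook's own working directory.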

Later, when we import your python modules and tools, we append as follows:

sys.path.append("../python_modules/")
sys.path.append("../python_tools/")
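Relative entries such as `../python_modules/` only resolve correctly when the notebook is started from the expected folder. As a hedged alternative, the sketch below appends absolute paths instead and skips entries that are already present. `add_module_dirs` is a hypothetical helper, and the folder names are taken from the two lines above.

```python
import sys
from pathlib import Path


def add_module_dirs(repo_root, names=("python_modules", "python_tools")):
    """Append the repository's module folders to sys.path as absolute paths."""
    added = []
    for name in names:
        path = str(Path(repo_root).resolve() / name)
        # Avoid duplicate entries when the cell is re-run
        if path not in sys.path:
            sys.path.append(path)
        added.append(path)
    return added
```

From a notebook one level below the repository root, the call would be `add_module_dirs("..")`.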

For all the "help" sections, I suggest that we make them runnable code, to keep the notebook small and make it interactive. For meta_analysis_data you can do the following:

%%bash
python ../python_tools/meta_analysis_data.py -h

We should also change the path in the params:

params = {
    'input_folder': "/home/waldronlab/species_abundances_from_cMD3CLI/",
    "output_dataset": "a_dataset_for_the_sex_contrast_in_gut_species.tsv",
    "min": ["age:16"],
    "max": [],
    "cat": ["study_condition:control", "body_site:stool"], 
    "multiple": -1,
    "min_perc": ["gender:25"],
    "cfd":["BMI"], 
    "iqr": [],
    "minmin": "gender:40",
    "study_identifier": "study_name", 
    "verbose": False, 
    "debug": False,
    "binary": [],
    "search": [],
    "exclude": []
}
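Several of the values above follow a `field:value` pattern (for example `"age:16"`). A small pre-flight check can catch a malformed entry before the analysis starts. This is a sketch under assumptions: it only checks the shape of the strings and makes no claim about how `meta_analysis_data.py` interprets them; `split_param` and `check_params` are hypothetical helpers.

```python
def split_param(entry):
    """Split a 'field:value' string such as 'age:16' into (field, value)."""
    field, sep, value = entry.partition(":")
    if not sep or not field or not value:
        raise ValueError(f"expected 'field:value', got {entry!r}")
    return field, value


def check_params(params, keys=("min", "max", "cat", "min_perc")):
    """Return {key: [(field, value), ...]} for the list-valued keys checked.

    Keys missing from `params` are treated as empty lists.
    """
    return {key: [split_param(e) for e in params.get(key, [])] for key in keys}
```

Calling `check_params(params)` on the dictionary above raises a `ValueError` that names the offending entry if any string lacks the colon separator.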

Then I believe everything runs. Here is the notebook with some of those edits for reference:
curatedMetagenomicData 3 CLI interface, sex-contrast microbiome meta-analysis.zip

README.md links don't work from pkgdown

In the pkgdown site (https://waldronlab.io/curatedMetagenomicAnalyses/index.html), the following links from the README.md file lead to waldronlab.io/ etc. when they should lead to github.com/waldronlab/curatedMetagenomicAnalyses, and give "no content found" messages as a result:

Analyses

R Vignettes

Python Notebooks

Supplementary Materials

possible to change repo name

Hi @jwokaty, would it be ok to change the repo name to curatedMetagenomicDataAnalyses? I've been trying to move everything towards consistent naming and it would help. Hope it's not too much to ask!

whether the study factor needs to be considered?

Thank you and your team for developing the curatedMetagenomicData package. I have run into some confusion when using it. First, has the relative abundance table already been standardized, so that study batch effects do not need to be considered? Is this the final relative abundance table?
Second, I downloaded cancer data from several studies using curatedMetagenomicData. Does the study factor need to be considered when identifying significantly different microbes using MaAsLin? Alternatively, what other analysis methods are recommended for finding significantly different microbes with curatedMetagenomicData? Or should I follow this tutorial (vignettes/Sex_metaanalysis_vignette.Rmd)? Looking forward to your reply.

Error Assay.type

Hello everyone,
When I try to use the function to convert to phyloseq, I get an error message that I do not know how to handle:

 makePhyloseqFromTreeSummarizedExperiment(alcoholStudy, abund_values = "relative_abundance")
Error: 'assay.type' must be a valid name of assays(x)

How can it be solved?
Thank you
