yingtongaamandawu / monkeyflower_ampliconseq_dna_data_codes

This is a repository of amplicon sequencing data and bioinformatics pipeline code for the nectar microbiome of monkeyflowers.

amplicon-sequencing bioinformatics-pipeline dada2 microbiome networkanalysis

monkeyflower_ampliconseq_dna_data_codes's Introduction

This repository contains data, R scripts, and Linux command-line code for the bioinformatics analyses of amplicon sequencing data from the fungal and bacterial microbiomes in nectar samples of Sticky Monkeyflower (Diplacus aurantiacus).

The repository comprises four folders:

01_Data includes two CSV files and four XLSX files:

  1. "sampling_sheet_regional_survey_2015_final_corrected.csv": metadata for the DNA samples, documenting the site ID, plant ID, and flower ID of each sample, as well as the corresponding concentration of fungal and bacterial colony-forming units (CFUs) in each sample.
  2. "2015_survey_siteinfo_location_envi.csv": documents the environmental data and coordinates for each flower.
  3. "Wu_Metagenome.environmental.1.0_B_N_SUB13559541.xlsx": documents the biosample information for each flower, for sample names starting with B through N.
  4. "Wu_Metagenome.environmental.1.0_O_S_SUB13567828.xlsx": documents the biosample information for each flower, for sample names starting with O through S.
  5. "SRA_metadata_site_started_with_B_N.xlsx": documents the sequencing information for each fastq file, as related to the biosample info in "Wu_Metagenome.environmental.1.0_B_N_SUB13559541.xlsx".
  6. "SRA_metadata_site_started_with_O_S.xlsx": documents the sequencing information for each fastq file, as related to the biosample info in "Wu_Metagenome.environmental.1.0_O_S_SUB13567828.xlsx".

02_Rscripts includes the R scripts used in the bioinformatics analyses:

  1. "make_Map_20230413.Rmd": R code that makes the Figure 1 map, showing the distributions and locations of samples.
  2. "Bioinformatics_ITS1_DADA2_CONSTAXtaxa_20230202.Rmd": R code that implements the DADA2 pipeline on fungal ITS1 sequences.
  3. "Bioinformatics_16S_DADA2_SILVAtaxa_20230207.Rmd": R code that implements the DADA2 pipeline on bacterial 16S sequences.
  4. "make_phyloseq_objects_&_run_CLAM_test_20220203.Rmd": R code that generates separate phyloseq objects for the ITS1 and 16S sequences; the code also uses the plating data (densities of bacterial and fungal colony-forming units, CFUs) to classify each nectar sample as (1) a bacteria-dominated flower, (2) a fungi-dominated flower, (3) a co-dominated flower, or (4) a flower with too few microbes to be classified into any of the other three groups.
  5. "diversity_analyses_fungi_threshold=clam_20230413.Rmd": R code that analyzes the alpha and beta diversity of the fungal sequences (ITS1); the analyses include pairwise two-sample permutation tests for alpha diversity, permutational multivariate ANOVA for species composition, differential abundance analyses, etc.
  6. "diversity_analyses_bacteria_threshold=clam_20230413.Rmd": R code that analyzes the alpha and beta diversity of the bacterial sequences (16S); the analyses are similar to those in "diversity_analyses_fungi_threshold=clam_20230413.Rmd".
  7. "ASVlevel_Co_occurence_network_NetCoMi_pearson_sparcc_r0.1_clam_20220429.rmd": R code that conducts co-occurrence network analyses for the fungal and bacterial sequences using the Pearson correlation and SparCC (Sparse Correlations for Compositional data) network methods.
  8. "ASVlevel_co_occurence_network_NetCoMi_spieceasi_r0.1_clam_20220504.rmd": R code that conducts co-occurrence network analyses for the fungal and bacterial sequences using the SPIEC-EASI (Sparse InversE Covariance estimation for Ecological Association and Statistical Inference) network method.
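
The diversity scripts above mention pairwise two-sample permutation tests for alpha diversity. As a rough illustration of that idea only, here is a minimal Python sketch (the repository's own analyses are written in R and may differ in the diversity index, test statistic, and number of permutations). The Shannon index, the difference-in-means statistic, the group names, and the read counts below are all assumptions made for the example.

```python
import math
import random


def shannon_index(counts):
    """Shannon diversity H' = -sum(p * ln p) over ASVs with nonzero reads."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def mean(values):
    return sum(values) / len(values)


def permutation_test(group_a, group_b, n_perm=9999, seed=1):
    """Two-sided permutation p-value for the difference in group means."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_perm):
        # Randomly reassign the pooled values to the two groups.
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed - 1e-12:  # tolerance for floating-point ties
            extreme += 1
    # Count the observed arrangement itself so p is never exactly zero.
    return (extreme + 1) / (n_perm + 1)


# Made-up read counts per ASV for three flowers in each hypothetical group.
bacteria_dominated = [[90, 5, 5], [80, 15, 5], [95, 3, 2]]
fungi_dominated = [[40, 30, 30], [35, 35, 30], [50, 25, 25]]

h_a = [shannon_index(sample) for sample in bacteria_dominated]
h_b = [shannon_index(sample) for sample in fungi_dominated]
p_value = permutation_test(h_a, h_b)
print(f"mean H': {mean(h_a):.3f} vs {mean(h_b):.3f}, p = {p_value:.4f}")
```

With only three samples per group, the smallest attainable two-sided p-value is about 0.1, which is why real comparisons need more flowers per group than this toy data set has.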

03_Output includes key output files from the bioinformatics pipeline:

  1. "ITS1.unpooled.ASVs.fa": fasta file that documents the representative sequence for each fungi ITS1 ASV (amplicon sequence variant);
  2. "16S.unpooled.ASVs.fa": fasta file that documents the representative sequence for each bacteria 16S ASV (amplicon sequence variant);
  3. "Appendix1_ITS1_fungi_ASV.count.ordered.csv": csv file that documents the total counts of reads, relative abundance within all samples, and species taxonomy of each fungi ITS1 ASV.
  4. "Appendix2_16S_bacteria_ASV.count.ordered.csv": csv file that documents the total counts of reads, relative abundance within all samples, and species taxonomy of each bacteria 16S ASV.
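
For readers who want to inspect these outputs outside R, the following Python sketch shows one way to read an ASV FASTA file and to compute the total reads and relative abundance per ASV from a count table. The count-table layout (first column is the ASV ID, remaining columns are per-sample read counts), the column names, and the sequences are assumptions for illustration; the actual appendix CSVs already contain these summaries and may be laid out differently.

```python
import csv
import io


def read_fasta(handle):
    """Return {sequence_id: sequence} from an open FASTA text handle."""
    records = {}
    name = None
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first whitespace-delimited token of the header as the ID.
            name = line[1:].split()[0]
            records[name] = ""
        elif name is not None:
            # Sequences may wrap over several lines; join them.
            records[name] += line
    return records


def summarize_counts(handle):
    """Return {asv_id: (total_reads, relative_abundance)} from a CSV count table.

    Assumed layout: a header row, then one row per ASV with the ASV ID in the
    first column and integer read counts for each sample in the other columns.
    """
    reader = csv.reader(handle)
    next(reader)  # skip the header row
    totals = {row[0]: sum(int(x) for x in row[1:]) for row in reader if row}
    grand_total = sum(totals.values())
    return {
        asv: (n, n / grand_total if grand_total else 0.0)
        for asv, n in totals.items()
    }


# Tiny in-memory stand-ins for the real files (all values are made up).
fasta_text = ">ASV_1 example\nACGT\nACGT\n>ASV_2\nTTGCA\n"
counts_text = "asv_id,flower_1,flower_2\nASV_1,10,20\nASV_2,4,6\n"

sequences = read_fasta(io.StringIO(fasta_text))
summary = summarize_counts(io.StringIO(counts_text))
for asv, (total, rel) in summary.items():
    print(asv, len(sequences[asv]), total, f"{rel:.2f}")
```

To use real files, replace the `io.StringIO` objects with `open(path, newline="")` handles.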

04_Docs includes a file that documents other bioinformatics steps not conducted in R:

  1. "Wu_Monkeyflower_bioinformatics_steps.docx": a docx file that records the bioinformatics steps from demultiplexing to species assignment, performed in a Linux environment.
