CBI Reference Genomes

References

| Species | Source | Build | Contigs | TopHat | HiSat | Notes |
|---|---|---|---|---|---|---|
| Homo sapiens | UCSC | hg19 | 25 | | | iGenomes |
| | | hg38 | 195 | | | Includes unlocalized (random) and unplaced (chrUn) contigs, plus EBV. iGenomes |
| | | hg38full | 456 | | | hg38 (above) plus alternate haplotypes (261 sequences) |
| | HISAT2 | hg19 | 93 | | | Pre-built by CCB |
| | HISAT2 | hg38 | 455 | | | Pre-built by CCB |
| Mus musculus | Ensembl | GRCm38 | 22 | | | iGenomes |
| Ceratotherium simum (white rhino) | UCSC | cerSim1 | 3087 | | | |
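
The Contigs column appears to report the number of sequences in each build. This README does not say how the counts were produced, so the following is only a minimal sketch of one way to reproduce them, assuming a count is simply the number of FASTA header lines. The function name `count_contigs` and the demo file are hypothetical and not part of this repository.

```python
import gzip
import os
import tempfile


def count_contigs(fasta_path):
    """Count sequences in a FASTA file by counting header lines (lines starting with '>').

    Assumption: plain-text or gzip-compressed FASTA, detected by a '.gz' suffix.
    """
    opener = gzip.open if str(fasta_path).endswith(".gz") else open
    with opener(fasta_path, "rt") as handle:
        return sum(1 for line in handle if line.startswith(">"))


# Demo on a tiny made-up FASTA file (not real genome data).
with tempfile.TemporaryDirectory() as tmp:
    demo = os.path.join(tmp, "demo.fa")
    with open(demo, "w") as out:
        out.write(">chr1\nACGT\n>chr2\nGGCC\n>chrM\nTTAA\n")
    print(count_contigs(demo))  # 3
```

Under this assumption the counts in the table are internally consistent: hg38 (195) plus the 261 alternate haplotype sequences gives the 456 listed for hg38full.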

Plant References

| Species | Name | Plant Type | Source | Build | Contigs |
|---|---|---|---|---|---|
| Aegilops_tauschii | Tausch's goatgrass | Grass | Gramene | ASM34733v1 | 429892 |
| Amborella_trichopoda | Understory shrub | Shrubs/Trees | Gramene | AMTR1.0 | 5745 |
| Arabidopsis_lyrata | Lyrate Rock Cress | Flowering | Gramene | v.1.0 | 695 |
| Arabidopsis_thaliana | Thale Cress | Flowering | Gramene | TAIR10 | 7 |
| Beta_vulgaris | Beet | Herbaceous | Gramene | RefBeet-1.2.2 | 40245 |
| Brachypodium_distachyon | Purple False Brome | Grass | Gramene | v1.0 | 83 |
| Brassica_napus | Rapeseeds | Flowering | Gramene | AST_PRJEB5043_v1 | 20899 |
| Brassica_oleracea | Wild Cabbage | Vegetable | Gramene | v2.1 | 32928 |
| Brassica_rapa | Field Mustard | Herbaceous | Gramene | IVFCAASv1 | 40367 |
| Chlamydomonas_reinhardtii | Green Algae | Algae | Gramene | v3.1 | 1558 |
| Chondrus_crispus | Red Algae/Carageen | Algae | Gramene | ASM35022v2 | 926 |
| Corchorus_capsularis | White Jute | Shrubs | Gramene | CCACVL1_1.0 | 16522 |
| Cyanidioschyzon_merolae | Red Algae | Algae | Gramene | ASM9120v1 | 22 |
| Galdieria_sulphuraria | Red Algae | Algae | Gramene | ASM34128v1 | 433 |
| Glycine_max | Soybean | Bean | Gramene | V1.0 | 1168 |
| Hordeum_vulgare | Barley | Grass | Gramene | Hv_IBSC_PGSB_v2 | 9 |
| Leersia_perrieri | Rice | Grass | Gramene | Lperr_V1.4 | 12 |
| Medicago_truncatula | Barrel Clover | Legume | Gramene | MedtrA17_4.0 | 2186 |
| Musa_acuminata | Banana | Fruit | Gramene | MA1 | 12 |
| Oryza_barthii | African Wild Rice | Grass/Rice | Gramene | O.barthii_v1 | 12 |
| Oryza_brachyantha | Wild Rice | Grass/Rice | Gramene | Oryza_brachyantha.v1.4b | 7485 |
| Oryza_glaberrima | African Rice | Grass/Rice | Gramene | AGI1.1 | 1951 |
| Oryza_glumaepatula | S. American Wild Rice | Grass/Rice | Gramene | ALNU02000000 | 12 |
| Oryza_indica | Asian Rice (Indian) | Grass/Rice | Gramene | ASM465v1 | 10490 |
| Oryza_longistaminata | Long-Staminate Rice | Grass/Rice | Gramene | O_longistaminata_v1.0 | 60198 |
| Oryza_meridionalis | Australian Wild Rice | Grass/Rice | Gramene | Oryza_meridionalis_v1.3 | 12 |
| Oryza_nivara | Wild Rice (Indian) | Grass/Rice | Gramene | AWHD00000000 | 12 |
| Oryza_punctata | Red Rice (African) | Grass/Rice | Gramene | AVCL00000000 | 12 |
| Oryza_rufipogon | Brown Beard Rice | Grass/Rice | Gramene | OR_W1943 | 12 |
| Oryza_sativa | Cultivated rice | Grass/Rice | Gramene | IRGSP-1.0 | 61 |
| Ostreococcus_lucimarinus | Green Algae | Algae | Gramene | ASM9206v1 | 21 |
| Physcomitrella_patens | Earthmoss | Moss | Gramene | ASM242v1 | 2106 |
| Populus_trichocarpa | Black Cottonwood | Tree | Gramene | JGI2.0 | 2518 |
| Prunus_persica | Peach | Fruit | Gramene | Prupe1_0 | 202 |
| Selaginella_moellendorffii | Spikemoss | Lycophyte | Gramene | v1.0 | 759 |
| Setaria_italica | Foxtail Millet | Grass | Gramene | JGIv2.0 | 336 |
| Solanum_lycopersicum | Tomato | Fruit | Gramene | SL2.50 | 3144 |
| Solanum_tuberosum | Potato | Vegetable | Gramene | SolTub_3.0 | 13 |
| Sorghum_bicolor | Broom-corn | Grass | Gramene | Sorghum_bicolor_v2 | 1535 |
| Theobroma_cacao | Cacao tree | Tree | Gramene | Theobroma_cacao_20110822 | 711 |
| Trifolium_pratense | Red Clover | Flowering | Gramene | Trpr | 8595 |
| Triticum_aestivum | Common Wheat | Grass | Gramene | TGACv1 | 735945 |
| Triticum_urartu | Red Wild Einkorn | Grass | Gramene | ASM34745v1 | 499222 |
| Vitis_vinifera | Common Grape | Fruiting | Gramene | IGGP_12x | 33 |
| Zea_mays | Maize/Corn | Vegetable | Gramene | AGPv4 | 267 |

Building References

Scripts for building reference genomes. See build_xx.sh

Directory Structure

References/

This directory contains the reference genome sequences, indexes, and annotations. The contents of this directory are kept in sync with /lustre/groups/cbi/shared/References using the rsync_lustre.py script.

The following directory structure is used for each build:

References/[Species]/[Source]/[Build]

For example, the human reference genome with UCSC annotations, build hg19, is located at References/Homo_sapiens/UCSC/hg19.

Within each build directory there are two optional subdirectories:

  • Sequence/: contains sequence data (FASTA) and sequence indexes, including BLAST, Bowtie, HISAT2, etc.
  • Annotation/: contains annotation data, including SNVs, genes, and Tophat2Index.
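
The layout above can be expressed as a small path helper. This is a minimal sketch, not code from this repository: the function names `build_dir` and `optional_subdirs` are hypothetical, and replacing spaces in the species name with underscores is an assumption based on the Homo_sapiens example.

```python
from pathlib import Path


def build_dir(root, species, source, build):
    """Return the path root/[Species]/[Source]/[Build].

    Assumption: spaces in the species name become underscores,
    as in the Homo_sapiens example.
    """
    return Path(root) / species.replace(" ", "_") / source / build


def optional_subdirs(build_path):
    """Return whichever of Sequence/ and Annotation/ exist under a build directory."""
    candidates = (build_path / name for name in ("Sequence", "Annotation"))
    return [path for path in candidates if path.is_dir()]


hg19 = build_dir("References", "Homo sapiens", "UCSC", "hg19")
print(hg19.as_posix())  # References/Homo_sapiens/UCSC/hg19
```

Because both subdirectories are optional, `optional_subdirs` checks the filesystem and returns only those that are present, rather than assuming either one exists.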

Archive/

Contains the original files used to build the reference genomes, including iGenomes archives and gzipped FASTA files.

scripts/

Utility scripts used for building reference genomes.

Contributors

mlbendall
