Reference data

Capture region BED files

This repository collects commonly used capture region BED files, which are installed and available for use in bcbio analyses. It includes files for hg19 (chr1, chr2, chr3... naming) and GRCh37 (1, 2, 3... naming).
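The two naming styles differ only by the "chr" prefix on the primary chromosomes, so a BED file in one style can be rewritten in the other. The following is a minimal sketch of that conversion, not code from this repository or from bcbio: the function names are hypothetical, and it deliberately handles only chromosomes 1-22, X and Y. Mitochondrial and unplaced contigs are skipped because they do not map by simply dropping the prefix.

```python
# Hypothetical helpers for illustration; not part of this repository or bcbio.
PRIMARY = {str(n) for n in range(1, 23)} | {"X", "Y"}


def hg19_to_grch37_name(chrom):
    """Return the GRCh37-style name for a primary hg19 chromosome, else None."""
    if chrom.startswith("chr") and chrom[3:] in PRIMARY:
        return chrom[3:]
    return None


def convert_bed_lines(lines):
    """Yield BED lines with hg19 chromosome names rewritten in GRCh37 style.

    Comment, track and browser lines pass through unchanged. Lines on
    chromosomes outside the primary set are dropped, since they need an
    explicit mapping rather than a prefix change.
    """
    for line in lines:
        stripped = line.rstrip("\n")
        if not stripped.strip() or stripped.startswith(("#", "track", "browser")):
            yield stripped
            continue
        fields = stripped.split("\t")
        new_name = hg19_to_grch37_name(fields[0])
        if new_name is None:
            continue
        fields[0] = new_name
        yield "\t".join(fields)
```

Because the function takes any iterable of lines, it can be fed an open file handle directly, for example `convert_bed_lines(open("regions.hg19.bed"))`, where the file name is a placeholder.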

Canonical transcripts

Files matching transcripts/cancer_transcripts_*_ensembl.txt contain the IDs of the canonical (longest) transcripts that the SnpEff variant prediction tool uses when it is run with the -canon flag. This applies only to the Ensembl-based reference databases, GRCh37.** and GRCh38.** in SnpEff notation. Because not every ID in these lists represents the most cancer-relevant isoform, transcripts/canon_cancer_replacement.txt provides a map of replacement transcripts to pass with the -canonList option:

java -jar snpEff.jar GRCh37.75 test.vcf -canon -canonList transcripts/canon_cancer_replacement.txt

To use the canonical transcripts for variant annotation in bcbio, add the following to your configuration YAML file:

algorithm:
  effects_transcripts: canonical

To use the cancer-relevant replacement transcripts instead, set:

algorithm:
  effects_transcripts: canonical_cancer

The full list of genes with replaced transcripts:

AKT1     ENST00000555528
BRCA1    ENST00000357654
CD79B    ENST00000006750
CDKN2A   ENST00000304494
CHEK1    ENST00000534070
CHEK2    ENST00000328354
ESR1     ENST00000206249
FANCL    ENST00000233741
FGFR1    ENST00000447712
FGFR2    ENST00000457416
FGFR3    ENST00000440486
MET      ENST00000397752
MYD88    ENST00000396334
PPP2R2A  ENST00000380737
RAD51D   ENST00000345365
RAD54L   ENST00000371975
GNAS     ENST00000371085
TP53     ENST00000269305
ARID1B   ENST00000350026
TET2     ENST00000380013
CEBPA    ENST00000498907
PIK3C2G  ENST00000538779
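
To look up the replacement transcript for a gene programmatically, the listing above can be loaded into a dictionary. This is a minimal sketch that parses the two-column gene and transcript layout shown here. Whether transcripts/canon_cancer_replacement.txt uses exactly this layout is an assumption, and `parse_replacements` is a hypothetical helper name.

```python
def parse_replacements(text):
    """Map gene symbol -> Ensembl transcript ID from whitespace-separated lines.

    Assumes one "GENE TRANSCRIPT_ID" pair per line, as in the listing above;
    blank or malformed lines are ignored.
    """
    mapping = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 2:
            continue
        gene, transcript = parts
        if not transcript.startswith("ENST"):
            raise ValueError(f"unexpected transcript ID for {gene}: {transcript}")
        mapping[gene] = transcript
    return mapping


# A few rows copied from the listing above.
listing = """\
AKT1     ENST00000555528
BRCA1    ENST00000357654
TP53     ENST00000269305
"""

replacements = parse_replacements(listing)
print(replacements["TP53"])
```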
