areebapatel / smcounter

This project forked from xuchang116/smcounter

smCounter: a versatile UMI-aware variant caller that detects both somatic and germline SNVs and indels. Published in the article "Detecting very low allele fraction variants using targeted DNA sequencing and a novel molecular barcode-aware variant caller", BMC Genomics, 2017 18:5. https://bmcgenomics.biomedcentral.com/articles/10.1186/s12864-016-3425-4

License: MIT License

Languages: Python 97.36%, R 1.98%, Shell 0.66%

smcounter's Introduction

NOTE: The upgraded version, smCounter2, is available at https://github.com/qiaseq/qiaseq-dna. It is described in the paper "smCounter2: an accurate low-frequency variant caller for targeted sequencing data with unique molecular identifiers", Bioinformatics, 06 September 2018.

This repository contains scripts and data files supporting smCounter, a versatile UMI-aware variant caller that detects both somatic and germline SNVs and indels with high sensitivity and specificity. An example of running smCounter is included. The algorithm and validation results were published in "Detecting very low allele fraction variants using targeted DNA sequencing and a novel molecular barcode-aware variant caller", BMC Genomics, 2017 18:5.

File description

  • smCounter.py -- Python script for smCounter, a barcode-aware somatic variant caller that integrates molecular barcode information into the variant calling algorithm. The script was developed and tested under Python v2.7.3. Required Python modules: pysam and scipy (third-party), plus math, random, and multiprocessing (standard library). Samtools v0.1.19 and Bedtools are also required.
  • run_log.py -- Custom Python script that redirects stdout to a log file.
  • ds.mt.py -- Python script for downsampling barcodes over the entire target region.
  • ds.reads.withinMT.py -- Python script for downsampling reads within barcodes.
  • ds.allele.fraction.py -- Python script for reducing the variant allele fraction at given variant loci.
  • primers.NA12878-194-genes-63-indels.10867.coding.bed -- BED file for the target region of the N0030 panel.
  • SR_LC_SL.nochr.bed -- BED file of simple repeat, low complexity, and satellite regions.
  • simpleRepeat.bed -- BED file of tandem repeat regions.
  • example -- Folder containing an example of running smCounter on Illumina paired-end reads with UMIs (library prepared using QIAseq targeted DNA panels). The BAM file is a subset of the N0030 data (see the BMC Genomics paper) that covers part of the BRCA1 gene.
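
The two downsampling scripts listed above differ in what they thin out: ds.mt.py reduces the number of barcodes, while ds.reads.withinMT.py reduces the number of reads inside each barcode. The sketch below illustrates that distinction only; it is not the code from those scripts. It assumes reads are already available as in-memory (read ID, barcode) pairs rather than BAM records, it targets Python 3 rather than the Python 2.7 the repository was tested under, and every function name, parameter, and sample value in it is hypothetical.

```python
import random
from collections import defaultdict


def group_by_barcode(reads):
    """Group (read_id, barcode) pairs into {barcode: [read_id, ...]}."""
    groups = defaultdict(list)
    for read_id, barcode in reads:
        groups[barcode].append(read_id)
    return dict(groups)


def downsample_barcodes(groups, fraction, seed=0):
    """Keep a random subset of barcodes; all reads of a kept barcode survive."""
    rng = random.Random(seed)
    n_keep = int(round(len(groups) * fraction))
    kept = rng.sample(sorted(groups), n_keep)
    return {bc: list(groups[bc]) for bc in kept}


def downsample_reads_within_barcodes(groups, max_reads, seed=0):
    """Keep every barcode, but cap the number of reads in each one."""
    rng = random.Random(seed)
    out = {}
    for bc in sorted(groups):
        read_ids = groups[bc]
        if len(read_ids) > max_reads:
            out[bc] = rng.sample(read_ids, max_reads)
        else:
            out[bc] = list(read_ids)
    return out


# Hypothetical toy input: seven reads spread over four barcodes.
reads = [
    ("r1", "AAC"), ("r2", "AAC"), ("r3", "AAC"),
    ("r4", "GGT"), ("r5", "GGT"),
    ("r6", "TTA"),
    ("r7", "CCG"),
]
groups = group_by_barcode(reads)

# Fewer barcodes, each with its full set of reads.
fewer_barcodes = downsample_barcodes(groups, fraction=0.5)

# Same barcodes, at most two reads per barcode.
capped = downsample_reads_within_barcodes(groups, max_reads=2)
```

Grouping by barcode first keeps the two operations symmetric: the first samples over the dictionary keys, the second samples within each value. A fixed seed makes a downsampling run repeatable.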
