
Provides graphical and .json output for paired-end Next-Generation Sequencing (NGS) data as well as genome coverage data from .bed files. Requires Python 3.9.1 or higher.


NGS_Plot_Widgets

1. Installation Instructions

A Python virtual environment is recommended for installing the required modules. The Python version should be 3.9 or higher. After changing directory (cd) into the folder where you want the Python virtual environment, run the following command:

virtualenv -p /apps/x86_64/python/3.9.1/bin/python ./

The path given to virtualenv -p may differ according to where the Python binary is installed on your system. Next, install the three prerequisite Python modules: Biopython, Matplotlib, and pandas:

bin/pip install biopython
bin/pip install matplotlib
bin/pip install pandas
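
As an optional check after the installs above, the following minimal sketch reports whether the interpreter meets the 3.9 minimum and whether the three modules can be found. It is not part of NGS_Plot_Widgets; the function name check_environment is hypothetical, and the import names Bio, matplotlib, and pandas are the standard ones for these packages. Run it with bin/python so that it inspects the virtual environment.

```python
import importlib.util
import sys

# Import names for the three prerequisites (Biopython is imported as "Bio").
REQUIRED_MODULES = ("Bio", "matplotlib", "pandas")


def check_environment(min_version=(3, 9), modules=REQUIRED_MODULES):
    """Return (python_ok, missing_modules) for the running interpreter."""
    python_ok = sys.version_info[:2] >= min_version
    # find_spec returns None when a top-level module cannot be located.
    missing = [name for name in modules if importlib.util.find_spec(name) is None]
    return python_ok, missing


if __name__ == "__main__":
    python_ok, missing = check_environment()
    print("Python version OK:", python_ok)
    print("Missing modules:", ", ".join(missing) if missing else "none")
```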

Then, install NGS_Plot_Widgets by git clone:

git clone https://github.com/darlenewagner/NGS_Plot_Widgets.git

Finally, test fullPlotShuffledFastq.py using the included test .fastq.gz file:

bin/python NGS_Plot_Widgets/fullPlotShuffledFastq.py NGS_Plot_Widgets/EnterovirusD70_SRR13402413_Pairs.fastq.gz

2. Description and usage of fullPlotShuffledFastq.py

fullPlotShuffledFastq.py computes sequence lengths and average PHRED quality scores for shuffled paired reads in FASTQ format.
It expects a single fastq(.gz) input and writes a Readstatistics.README.txt, a Readstatistics.json, and a .png image showing PHRED quality histograms for forward (R1) and reverse (R2) reads, all in a folder named after the input filename.fastq(.gz). The number and location of output files can be varied with --outputType: F for full output,... J for .json only, and N for no image.

python fullPlotShuffledFastq.py filepath/filename.fastq(.gz) --outputType [F/J/N]
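
To illustrate the kind of per-read statistics described above, here is a minimal standard-library sketch. It is not the code in fullPlotShuffledFastq.py. It assumes Phred+33 quality encoding, four lines per record, and that "shuffled" means R1 and R2 records alternate in the file; the function names are hypothetical.

```python
import gzip
import io
from statistics import mean


def open_fastq(path):
    """Open a plain or gzip-compressed FASTQ file in text mode."""
    if str(path).endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "r")


def read_stats(handle, phred_offset=33):
    """Yield (length, mean PHRED) for each 4-line FASTQ record."""
    while True:
        header = handle.readline()
        if not header:
            break  # end of file
        if not header.strip():
            continue  # tolerate blank lines between records
        sequence = handle.readline().strip()
        handle.readline()  # '+' separator line
        quality = handle.readline().strip()
        scores = [ord(ch) - phred_offset for ch in quality]
        yield len(sequence), (mean(scores) if scores else 0.0)


def split_interleaved(stats):
    """Assign alternating records to R1 and R2 (assumed interleaved order)."""
    stats = list(stats)
    return stats[0::2], stats[1::2]


# Tiny in-memory example: one read pair ('I' is PHRED 40, '5' is PHRED 20).
example = io.StringIO(
    "@pair1/1\nACGT\n+\nIIII\n"
    "@pair1/2\nACG\n+\n555\n"
)
r1, r2 = split_interleaved(read_stats(example))
print("R1:", r1)
print("R2:", r2)
```

For a real file, the same functions could be applied as read_stats(open_fastq("filepath/filename.fastq.gz")).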

3. Description of plotBedCoverage.py

plotBedCoverage.py creates a line plot .png image from a 3-column .bedGraph file created by bedtools genomecov.
The plotting window can be varied by passing an integer to the optional '--window' parameter.
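
One plausible reading of the window option is that coverage is averaged over fixed-size bins before plotting. The sketch below shows that idea only; how plotBedCoverage.py actually applies --window is not stated here, so the binning scheme, the assumed column order (sequence name, position, depth), and the function names are all assumptions.

```python
def parse_depth_lines(lines):
    """Parse whitespace-separated (name, position, depth) rows into (position, depth)."""
    rows = []
    for line in lines:
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        rows.append((int(fields[1]), float(fields[2])))
    return rows


def window_means(rows, window):
    """Average depth over consecutive, non-overlapping groups of `window` rows."""
    if window < 1:
        raise ValueError("window must be a positive integer")
    means = []
    for start in range(0, len(rows), window):
        chunk = rows[start:start + window]
        mean_depth = sum(depth for _, depth in chunk) / len(chunk)
        # Label each bin with the position of its first row.
        means.append((chunk[0][0], mean_depth))
    return means


example_lines = [
    "chr1\t1\t10",
    "chr1\t2\t20",
    "chr1\t3\t30",
    "chr1\t4\t50",
]
binned = window_means(parse_depth_lines(example_lines), window=2)
print(binned)
```

The resulting (position, mean depth) pairs are the kind of series a line plot would draw; a larger window gives a smoother curve at the cost of resolution.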

4. Venn diagram plotting utility for single nucleotide polymorphism (SNP) positions

vennDiagramPlotColumn.py creates a 2-set Venn diagram from two input files containing unique SNP positions. The script relies upon matplotlib-venn, which is a separate package from matplotlib. In the usage example below, --outputType P specifies that a matplotlib_venn plot will be created as output, --title "my title" is a user-supplied string for annotating both the plot and its filename, and --plotScale [W/U] gives either a weighted or an unweighted Venn diagram, respectively.

bin/python vennDiagramPlotColumn.py SC2_MiSeq_SNPs.tsv SC2_iSeq_SNPs.tsv --outputType P --title "Coronavirus Venn" --plotScale U
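
The three region sizes of a 2-set Venn diagram follow directly from set arithmetic on the two lists of positions. The sketch below computes them with the standard library only; it is not the script's own code, and it assumes that positions sit in the second tab-separated column (as in the bcftools output described in this section). The function names are hypothetical.

```python
import csv
import io


def read_positions(handle, column=1):
    """Collect the integer values of one tab-separated column into a set."""
    positions = set()
    for row in csv.reader(handle, delimiter="\t"):
        # Skip short rows and non-numeric entries such as header labels.
        if len(row) > column and row[column].strip().isdigit():
            positions.add(int(row[column]))
    return positions


def venn_counts(set_a, set_b):
    """Return (only in A, only in B, shared) sizes for a 2-set Venn diagram."""
    return len(set_a - set_b), len(set_b - set_a), len(set_a & set_b)


# In-memory stand-ins for two SNP tables (made-up positions for illustration).
file_a = io.StringIO("ref\t100\tA\tG\nref\t250\tC\tT\nref\t400\tG\tA\n")
file_b = io.StringIO("ref\t250\tC\tT\nref\t400\tG\tA\nref\t900\tT\tC\n")
counts = venn_counts(read_positions(file_a), read_positions(file_b))
print(counts)
```

A tuple in this order (only A, only B, shared) is the form that matplotlib_venn's venn2 function accepts as its subsets argument.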

The files SC2_MiSeq_SNPs.tsv and SC2_iSeq_SNPs.tsv are based upon output from the following command-line processing of .vcf files:

bcftools query -f '%CHROM\t%POS\t%REF\t%ALT\t%QUAL\t%INFO/AO\t%INFO/DP\n' sample_1.vcf.gz | perl -ne '@F=split(/\s+/, $_); printf "%s\t%d\t%s\t%s\t%d\t%d\t%0.4f\n", $F[0], $F[1], $F[2], $F[3], $F[4], $F[6], $F[5]/$F[6]' >> input1.table.tsv
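
The perl step reorders the seven bcftools fields and replaces the alternate-allele count (AO) with the ratio AO/DP. For readers less familiar with perl, a rough Python equivalent of that formatting step is sketched below. It assumes one alternate allele per row and numeric QUAL, AO, and DP values (the perl version silently coerces other values, while this sketch raises an error); format_snp_row is a hypothetical name.

```python
def format_snp_row(line):
    """Reformat one bcftools query row: CHROM POS REF ALT QUAL AO DP."""
    chrom, pos, ref, alt, qual, ao, dp = line.split()[:7]
    frequency = float(ao) / float(dp)  # fraction of reads supporting ALT
    # Output columns: CHROM, POS, REF, ALT, QUAL (truncated), DP, AO/DP.
    return "%s\t%d\t%s\t%s\t%d\t%d\t%0.4f" % (
        chrom, int(pos), ref, alt, int(float(qual)), int(dp), frequency
    )


# Made-up row for illustration only.
example_row = "NC_045512.2\t241\tC\tT\t1234.5\t95\t100"
formatted = format_snp_row(example_row)
print(formatted)
```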
