bracken_plot

Figure: example output of the bracken_plot app.

The bracken_plot application allows for quick and easy visualization of merged Bracken data with stacked bar plots. This repository contains a how-to guide, example files, and the app.R source code.

If you want more control over the plot style and parameters, you can download and run the plotting function locally.

Getting Bracken data

Bracken is a companion program to Kraken that allows for estimation of relative abundance at any taxonomic level. For information regarding installation of Bracken and Kraken, see their GitHub pages:

https://github.com/DerrickWood/kraken2
https://github.com/jenniferlu717/Bracken

I recommend creating a conda environment for each tool and installing both from Bioconda:

https://anaconda.org/bioconda/kraken2
https://anaconda.org/bioconda/bracken

Here is an example script for running Kraken and Bracken on paired-end reads:

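# Sample name, Kraken 2 database, and FASTQ directory (edit for your system)
name=sample_A
kdb=/home/refdbs/kraken/Standard_DB
fq=/workdir/fastq
export OMP_NUM_THREADS=8
source /home/miniconda3/bin/activate
conda activate kraken2

# Write all output for this sample into its own directory
mkdir -p "${name}"
cd "${name}"

# Classify the paired-end reads; Bracken reads the --report file
kraken2 \
        --gzip-compressed \
        --paired \
        --report "${name}.report.txt" \
        --db "$kdb" \
        --threads "$OMP_NUM_THREADS" \
        --output "${name}.out.txt" \
        "${fq}/${name}_R1.fastq.gz" "${fq}/${name}_R2.fastq.gz"

conda activate bracken

# Estimate abundances at each taxonomic level.
# -r is the read length; the database needs a k-mer distribution built for it.
levels=P,C,O,F,G,S,S1
for level in $(echo $levels | sed "s/,/ /g"); do

    bracken \
            -d "$kdb" \
            -i "${name}.report.txt" \
            -o "${name}.bracken_${level}.txt" \
            -r 75 \
            -l "${level}"

done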
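
Before combining reports, it may be worth confirming that every sample has a non-empty Bracken report at every level. The following is a minimal sketch, not part of this repository: the function name `list_missing_reports` is hypothetical, and it assumes the `<sample>/<sample>.bracken_<level>.txt` layout produced by the script above.

```shell
# list_missing_reports LEVELS SAMPLE...
# Prints the path of every expected Bracken report that is missing or empty.
# Hypothetical helper; assumes <sample>/<sample>.bracken_<level>.txt paths.
# The body runs in a subshell ( ... ) so it does not overwrite variables
# such as $name or $levels in the calling script.
list_missing_reports() (
    levels=$1
    shift
    for name in "$@"; do
        for level in $(echo "$levels" | sed "s/,/ /g"); do
            report="${name}/${name}.bracken_${level}.txt"
            # -s is true only for files that exist and are non-empty
            [ -s "$report" ] || echo "$report"
        done
    done
)

# Example call, run from the directory that holds the sample directories:
#   list_missing_reports P,C,O,F,G,S,S1 sample_A sample_B sample_C
```

No output means every expected report was found.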

Once you have Bracken reports for each sample at the desired taxonomic levels, reports can be combined by level using combine_bracken_outputs.py:

source /home/miniconda3/bin/activate
conda activate bracken

levels=P,C,O,F,G,S,S1
for level in $(echo $levels | sed "s/,/ /g"); do

    combine_bracken_outputs.py \
    --files ./*/*.bracken_${level}.txt \
    --names sample_A,sample_B,sample_C \
    --output ./merged_bracken_${level}.txt

done

Note that the shell expands the glob in sorted (alphanumeric) order, so the sample identifiers passed to --names must be listed in that same order; otherwise the columns of the merged file will be mislabeled.
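
One way to avoid mislabeled columns is to build the --names value from the same paths the glob produces. This is a sketch under the assumption that each report sits in a directory named after its sample, as in the scripts above; `names_from_paths` is a hypothetical helper, not part of Bracken.

```shell
# names_from_paths PATH...
# Prints a comma-separated list of sample names, one per path, in the
# order the paths were given. The sample name is taken to be the name of
# the directory containing each report (an assumption about the layout).
names_from_paths() (
    names=""
    for path in "$@"; do
        sample=$(basename "$(dirname "$path")")
        # append with a comma unless this is the first name
        names="${names:+$names,}$sample"
    done
    printf '%s\n' "$names"
)

# Possible use inside the per-level loop (the glob expands identically
# both times, so names and files stay in the same order):
#   combine_bracken_outputs.py \
#       --files ./*/*.bracken_${level}.txt \
#       --names "$(names_from_paths ./*/*.bracken_${level}.txt)" \
#       --output ./merged_bracken_${level}.txt
```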

Using the app

Upload your merged Bracken file and click "Create Plot". To plot an example, you can download Bracken output files from this repository. The app automatically detects the taxonomic level and draws a stacked bar plot showing the relative abundance of each taxon. Often, many taxa have near-zero abundances, and plotting all of them makes the labels ambiguous. If this is the case, use the "Maximum number of taxa to plot" field to subsample the dataset. Subsampling keeps only the n taxa with the greatest median relative abundance across samples. The relative abundances of all taxa outside that subset are summed and plotted as "other". Once a plot is rendered, click "Get PDF" to download a PDF version.
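
To preview which taxa would survive a given "Maximum number of taxa to plot" value, the ranking by median can be approximated on the command line. This is only a sketch and not the app's own code (the app is written in R). It assumes the merged file is tab-separated with a header row, that columns 1 to 3 are name, taxonomy_id, and taxonomy_lvl, and that each sample then contributes a `_num` column followed by a `_frac` column. The function name `top_taxa_by_median` is hypothetical.

```shell
# top_taxa_by_median FILE N
# Prints the names of the N taxa with the greatest median fraction across
# samples. Assumes fractions are in columns 5, 7, 9, ... of a tab-separated
# merged Bracken file with one header line.
top_taxa_by_median() {
    awk -F'\t' '
        NR == 1 { next }                       # skip the header row
        {
            n = 0
            for (i = 5; i <= NF; i += 2) v[++n] = $i + 0
            # insertion sort of v[1..n], ascending
            for (i = 2; i <= n; i++) {
                x = v[i]
                for (j = i - 1; j >= 1 && v[j] > x; j--) v[j + 1] = v[j]
                v[j + 1] = x
            }
            # middle value, or mean of the two middle values
            med = (n % 2) ? v[(n + 1) / 2] : (v[n / 2] + v[n / 2 + 1]) / 2
            printf "%.10f\t%s\n", med, $1
        }
    ' "$1" | sort -k1,1nr | head -n "$2" | cut -f2
}

# Example: top_taxa_by_median merged_bracken_G.txt 10
```

Taxa that are not printed correspond to those the app would sum into "other". Ties in the median are broken arbitrarily here.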

Custom color palettes can be added as a string of comma-separated hexadecimal values without spaces or # characters. Colors are recycled in cases where the number of taxa exceeds the number of colors in a palette. If subsampling taxa, make sure that custom palettes do not contain the color used for the "other" label (gray 808080 by default). Some example palettes:
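
A palette string can be checked against these rules before pasting it into the app. The sketch below assumes six-digit hexadecimal values, as in the example palettes; `check_palette` is a hypothetical helper and does not reflect how the app itself validates input.

```shell
# check_palette PALETTE [OTHER_COLOR]
# Returns 0 if PALETTE is a comma-separated list of six-digit hex colors
# with no spaces or '#' characters and does not contain OTHER_COLOR
# (808080 by default). Returns 1 and prints a message otherwise.
check_palette() {
    palette=$1
    other=${2:-808080}
    # whole string must be hex,hex,...,hex
    if ! printf '%s\n' "$palette" | grep -Eq '^[0-9A-Fa-f]{6}(,[0-9A-Fa-f]{6})*$'; then
        echo "palette must be comma-separated six-digit hex values" >&2
        return 1
    fi
    # compare case-insensitively, one color per line
    other_lc=$(printf '%s' "$other" | tr 'A-F' 'a-f')
    if printf '%s\n' "$palette" | tr 'A-F' 'a-f' | tr ',' '\n' | grep -qx "$other_lc"; then
        echo "palette contains the color reserved for \"other\" ($other)" >&2
        return 1
    fi
    return 0
}

# Example: check_palette 5c2751,ef798a,f7a9a8,00798c && echo "palette ok"
```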

Default Palette

5c2751,ef798a,f7a9a8,00798c,6457a6,9dacff,76e5fc,a30000,ff7700,f5b841

Alternate Palette 1

05a8aa,b8d5b8,d7b49e,dc602e,bc412b,791e94,2f4858,293f14,386c0b,550527

Alternate Palette 2

99d5c9,6c969d,645e9d,392b58,2d0320,f9c784,fcaf58,ff8c42,cc2936,ebbab9

Custom palettes

Coolors.co is great for manually picking your own palettes. fbparis's palette tool attempts to maximize the perceived distinctness between colors and is a good option if a large number of colors is desired.

Troubleshooting

Feel free to open an issue if you run into errors or would like to see specific features implemented in future updates. If you want to create and manipulate Bracken relative abundance plots as vector images, please download and run the plotting function provided in the R Markdown document.

Note: bracken_plot is currently hosted on shinyapps.io under a free account, which means the app is restricted to 25 active hours per month.

bracken_plot's People

Contributors

acvill

bracken_plot's Issues

Taxonomy IDs not matching for species XXX: (216819 190338)

Dear bracken_plot development team,

One error occurred during processing Family and Genus level files.

Processing Output File 1_F_nt.bracken:: Sample 3
Taxonomy IDs not matching for species Xylophagaidae: (92613 183505)
PROGRAM START TIME: 02-02-2023 21:27:09
Processing Output File 10_G_nt.bracken:: Sample 1
Taxonomy IDs not matching for species Cyclophora: (216819 190338)
PROGRAM START TIME: 02-02-2023 21:27:09

Do you have any idea how to deal with this problem?

Best,

Bing

Is it possible to use it to plot KrakenUniq outputs too?

Here is what a KrakenUniq report looks like:

# KrakenUniq v1.0.3 DATE:2023-08-06T00:57:24Z DB:/gss/work/asga9989/krakenuniq/microbial-nt DB_SIZE:706079654188 WD:/gss/work/asga9989/all_samples_run
# CL:/cm/shared/uniol/software/8.3/KrakenUniq/1.0.3-intel-2019b/bin/krakenuniq --db /gss/work/asga9989/krakenuniq/microbial-nt --threads 20 --fastq-input --gzip-compressed --paired --output 07_TAXONOMY_references_removed/_PL17_99indbel.krakenuniq --report-file 07_TAXONOMY_references_removed/_PL17_99indbel-krakenuniq-report.tsv 01_QC/_PL17_99indbel-FILTERED_R1.fastq.gz 01_QC/_PL17_99indbel-FILTERED_R2.fastq.gz
%       reads   taxReads        kmers   dup     cov     taxID   rank    taxName
43.16   783047  783047  132171775       3.13    NA      0       no rank unclassified
56.84   1031262 3084    504258  49.4    8.57e-06        1       no rank root
56.47   1024531 4231    491054  20.4    8.674e-06       131567  no rank   cellular organisms
54.03   980284  199152  475276  13.8    1.16e-05        2       superkingdom        Bacteria
31.88   578351  139858  424790  6.27    2.301e-05       1224    phylum        Pseudomonadota
12.22   221695  129781  337596  4.34    3.528e-05       1236    class           Gammaproteobacteria
3.011   54630   7       1869    62.1    7.733e-07       91347   order             Enterobacterales

Plot not generated

Hello acvill,
I have tried to run bracken_plot, however no plot is being generated.
[two screenshots]

I am getting this outcome.

Please look into this.

With regards,
In2S
