Bioinformatics Tools (bit)


Overview

There are of course several great and widely used packages of bioinformatics helper programs out there, such as seqtk, fastX-toolkit, and bbtools, all of which I use regularly and which have helped me do the things I was trying to get done. But tasks always crop up for which we can't find a helper program or script that has already been written.

bit is a collection of one-liners, short scripts, and programs that I have been adding to over several years, all of which run in a Unix-like command-line environment. Anytime I need to write something for a task that has more than a one-off, ad hoc use, I consider adding it here. This includes things like:

| Purpose | Script(s) |
| --- | --- |
| quickly summarizing nucleotide assemblies | bit-summarize-assembly |
| splitting a fasta file based on headers | bit-parse-fasta-by-headers |
| renaming sequences in a fasta | bit-rename-fasta-headers |
| re-ordering a fasta file | bit-reorder-fasta |
| pulling out sequences from a fasta by their coordinates | bit-extract-seqs-by-coords |
| pulling amino-acid or nucleotide sequences out of a GenBank file | bit-genbank-to-AA-seqs, bit-genbank-to-fasta |
| counting the number of bases per sequence in a fasta file | bit-count-bases-per-seq |
| calculating variation in each column of a multiple-sequence alignment | bit-calc-variation-in-msa |
| filtering a table based on wanted IDs | bit-filter-table |
| downloading NCBI assemblies in different formats by just providing accession numbers | bit-dl-ncbi-assemblies |
| searching the (stellar) Genome Taxonomy Database by taxonomy and getting their NCBI accessions | bit-get-accessions-from-GTDB |
| getting full lineage info from a list of taxon IDs (making use of the also stellar TaxonKit) | bit-get-lineage-from-taxids |
| filtering KOFamScan results | bit-filter-KOFamScan-results |
| getting information about a specific GO term | bit-get-go-term-info |
| summarizing GO annotations | bit-summarize-go-annotations |
| summarizing kraken2 outputs in a table with counts of full taxonomic lineages, and combining multiple samples | bit-kraken2-to-taxon-summaries, bit-combine-kraken2-taxon-summaries |
| combining bracken outputs and adding full taxonomic lineage info | bit-combine-bracken-and-add-lineage |
| generating color/mapping/data files for use with trees being viewed on the Interactive Tree of Life site | bit-gen-iToL-map, bit-gen-iToL-colorstrip, bit-gen-iToL-text-dataset, bit-gen-iToL-binary-dataset |

There are also other convenient things that are nice to have handy, like removing soft line wraps that some fasta files have (bit-remove-wraps), and printing out the column names of a TSV with numbers (bit-colnames) to quickly see which columns we want to provide to things like cut or awk 🙂
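
To give a feel for what those two conveniences do, here is a minimal sketch using only standard awk, head, and tr. It is not how bit-remove-wraps or bit-colnames are implemented; the function names `unwrap_fasta` and `number_columns`, and the demo files, are made up for illustration.

```shell
# Hypothetical demo inputs written to temporary files.
demo_fasta=$(mktemp)
demo_tsv=$(mktemp)
printf '>seq1\nACGT\nACGT\n>seq2\nGG\nCC\n' > "$demo_fasta"
printf 'sample\tcontig\tlength\nA\tc1\t100\n' > "$demo_tsv"

# Join soft-wrapped sequence lines so each record is a header plus one sequence line.
unwrap_fasta() {
    awk '/^>/ { if (NR > 1) printf "\n"; print; next }
         { printf "%s", $0 }
         END { if (NR > 0) printf "\n" }' "$1"
}

# Print each column name from a TSV header next to its 1-based position.
number_columns() {
    head -n 1 "$1" | tr '\t' '\n' | awk '{ print NR "\t" $0 }'
}

unwrap_fasta "$demo_fasta"
number_columns "$demo_tsv"
```

The positions printed by `number_columns` are the ones you would pass to `cut -f` or refer to as `$N` in awk.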

Each command has a help menu accessible by either entering the command alone or by providing -h as the only argument. Once installed, you can see all available commands by entering bit- and pressing tab twice.

bit runs in a Unix-like environment, and installing it with conda, as shown below, is recommended.


Conda install

If you are new to the wonderful world of conda and want to learn more, one place you can start learning about it is here 🙂

Due to increasing program restrictions as bit has grown, it's easiest to install it in its own environment as shown below (though I still put it in my base environment when I can given how much I rely on it ¯\_(ツ)_/¯):

conda create -n bit -c conda-forge -c bioconda -c defaults -c astrobiomike bit
conda activate bit
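
After activating the environment, a quick check that the install worked might look like the sketch below. It relies only on bit-version, which reports the installed version; the `check_bit_install` function name and the message it prints are hypothetical.

```shell
# Hypothetical helper: report whether bit is available in the current environment.
check_bit_install() {
    if command -v bit-version >/dev/null 2>&1; then
        # bit-version reports the version of bit in use
        bit-version
    else
        echo "bit-version not found on PATH; try 'conda activate bit' first"
        return 1
    fi
}

# '|| true' keeps a failed check from aborting a script run with 'set -e'.
check_bit_install || true
```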



Citation info

If you happen to find bit useful in your work, please be sure to cite it 🙂

Lee M. bit: a multipurpose collection of bioinformatics tools. F1000Research 2022, 11:122. https://doi.org/10.12688/f1000research.79530.1

You can get the version you are using by running bit-version.

If you are using a program in bit that also leverages another program, please be sure to cite that program too. For instance, bit-get-lineage-from-taxids uses TaxonKit, and bit-slim-down-go-terms uses goatools. When a bit script relies on other programs like those, this is indicated in the help menu of the bit program.


Shameless plug

For phylogenomics, check out GToTree 🙂

