Coder Social home page Coder Social logo

hivdrg's Introduction

hivdrg

overview

hivdrg is a R package to enable synonymous / non-synonymous characterisation of HIV genetic data for the PANGEA project. Also provides HIV antiviral drug resistance genotyping. Accepted inputs are FASTA (whole genomes & fragments) which will be mapped to selected reference NGS variant data assembled to a supported reference is accepted in VCF >= ver4.0 & Varscan2 tab formats.

Database

A text database extracted in July 2020 from the Stanford resistance database. Shafer RW(2006). Rationale and Uses of a Public HIV Drug-Resistance Database. Journal of Infectious Diseases 194 Suppl 1:S51-8

Installation

You can install the current version from with: GitHub

Dependencies for FASTA file handling are MAFFT and SNP-Sites available preferably via conda. snp-sites >= 2.3 has been tested.

conda config --add channels bioconda
conda install snp-sites
conda install mafft

Usage - resistance genotyping

note: there are 5 supported variants in hivdrg, vcf and tab files must be assembled against one of these. fasta files will be aligned to the chosen reference and variants called.

library("hivdrg")

# select a fasta, vcf, or varscan tab file
my_sample = system.file("testdata", "example.vcf", package = "hivdrg")

# hivdrg provides the following function to return a table of annotated variants
data = call_resistance(infile = my_sample, all_mutations = F, ref = 5)
#> [1] "ref should be an integer between 1 and 5, used to identify the HIV reference genome below"
#> [1] "AG_L39106.1"  "C_AF067155.1" "G_U88826.1"   "JX239390.1"   "K03455.1"


head(data[,c("GENEID","aachange", "freq", "phenotype")])
#>   GENEID aachange   freq phenotype
#> 1     rt    K103N 91.86% Resistant
#> 2     rt    K238T   1.9% Resistant
#> 3     rt    M184V 56.97% Resistant
#> 4     rt    P225H 20.85% Resistant

Usage - synonymous / non-synonymous characterisation

## call all variants

mutations_all = call_resistance(infile = my_sample, all_mutations = T,ref = 5)
#> [1] "ref should be an integer between 1 and 5, used to identify the HIV reference genome below"
#> [1] "AG_L39106.1"  "C_AF067155.1" "G_U88826.1"   "JX239390.1"   "K03455.1"


# are there any non-synonymous (DNA variants that result in a change of amino acid) variants in resistance genes
mutations_nonsyn = mutations_all[mutations_all$CONSEQUENCE == "nonsynonymous",]


# here the top 3 mutations are nonsynonymous, with no identified resistance effect.
head(mutations_nonsyn[,c("GENEID","aachange", "freq", "CONSEQUENCE","phenotype")])
#>    GENEID aachange   freq   CONSEQUENCE phenotype
#> 2      rt    D250E 59.64% nonsynonymous      <NA>
#> 5      rt    I178V  1.27% nonsynonymous      <NA>
#> 7      rt    K103N 91.86% nonsynonymous Resistant
#> 8      rt    K103R  1.01% nonsynonymous      <NA>
#> 11     rt    K238T   1.9% nonsynonymous Resistant
#> 14     rt    M184V 56.97% nonsynonymous Resistant

Getting help

If you encounter a clear bug, please file an issue with a minimal reproducible example on the GitHub Issues page. For questions and other discussions feel free to contact. Oscar Charles - Developer

hivdrg's People

Contributors

ojcharles avatar steven-kemp avatar

Stargazers

 avatar

hivdrg's Issues

refactor res db

in future:

  • we will store the stanford db text files
  • write a parser to allow use in add_resistance_info.R
  • Look for combinations in an additional step after add_resistance_info.R
  • i.e. by initially calling only lines without a "+", then looking for combinations in lines with a "+"

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.