pathoscore's People

Contributors

brentp, davemcg, jimhavrilla, ryanlayer

pathoscore's Issues

better options for missing scores

Currently, in evaluate, missing scores are essentially ignored. There should be a penalty for this.
One option is to scale each value in the ROC output by the fraction of variants that were scored, so that if only 80% of variants are scored (20% are missing), the ROC curve tops out at 0.8.
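
The scaling idea above could be sketched roughly as follows. This is not pathoscore's code: the function names, the use of None for a missing score, and the choice to scale only the TPR axis are assumptions made for illustration, and higher scores are assumed to mean more pathogenic.

```python
def roc_points(scores, labels):
    """ROC points (fpr, tpr) over scored variants, sweeping from the highest score down."""
    pairs = sorted(zip(scores, labels), key=lambda p: p[0], reverse=True)
    n_pos = sum(1 for _, is_patho in pairs if is_patho)
    n_neg = len(pairs) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one scored pathogenic and one scored benign variant")
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(pairs):
        end = i
        # Consume all variants tied at this score before emitting a point.
        while end < len(pairs) and pairs[end][0] == pairs[i][0]:
            if pairs[end][1]:
                tp += 1
            else:
                fp += 1
            end += 1
        points.append((fp / n_neg, tp / n_pos))
        i = end
    return points


def penalized_roc(scores, labels):
    """Scale TPR by the fraction of variants that have a score (None = missing)."""
    scored = [(s, l) for s, l in zip(scores, labels) if s is not None]
    frac_scored = len(scored) / len(scores)
    points = roc_points([s for s, _ in scored], [l for _, l in scored])
    # Assumption: only the TPR axis is scaled, so the curve tops out at frac_scored.
    return [(fpr, tpr * frac_scored) for fpr, tpr in points], frac_scored


def trapezoid_auc(points):
    """Area under a piecewise-linear curve given as (x, y) points sorted by x."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Hypothetical data: 5 variants, one of which has no score.
scores = [0.9, 0.8, None, 0.3, 0.1]
labels = [True, True, True, False, False]
points, frac_scored = penalized_roc(scores, labels)
print(frac_scored, trapezoid_auc(points))
```

An alternative with a similar effect would be to count unscored pathogenic variants as false negatives and unscored benign variants as true negatives, which penalizes a method only for the pathogenic variants it fails to score.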

ftp for clinvar true set has changed

ClinVar has released a new version of the VCF, and the file with the 20170802 data has been archived. Here is the updated line for make.sh with the new location:

wget ftp://ftp.ncbi.nih.gov/pub/clinvar/vcf_GRCh37/vcf_2.0/archive/2017/clinvar_${date}.vcf.gz

Script to make gnomad benign truth set (filtered on ExAC) and filter out variants on AA change

The filter script is https://github.com/quinlan-lab/regionanalysis/blob/master/parvarfilter.py

Frequency and genes can also be filtered with https://github.com/quinlan-lab/regionanalysis/blob/master/secondfilter.py

It can be used like this:

python parvarfilter.py -x $DATA/clinvar-gnomad.txt -n clinvar -c -s patho -e gnomad -d genescreens/ad_genecards_clean.txt -f

This creates a file called $DATA/clinvar-patho-gnomad.txt (you have to add back a VCF header, but that's an easy fix).

python parvarfilter.py -x $DATA/gnomad-exac.txt -n gnomad -s benign -e exac -d genescreens/ad_genecards_clean.txt -f

This creates a set of gnomAD benign variants called gnomad-benign-exac.txt (gnomAD, benign set, filtered on ExAC). It filters on AA change/allele matching and, optionally, on the AD gene set.

as in:
https://github.com/quinlan-lab/regionanalysis/blob/master/pathocompare.sh

measure and plot of most constrained variants

In practice, the most interesting variants will be in the very top percentile.

pathoscore should report a confusion matrix for the top-N or top-p-percentile variants by score. This matches how a user might actually filter.
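
A minimal sketch of such a report, assuming higher scores mean more pathogenic and that missing scores have already been dropped. The function name and its parameters are hypothetical, not part of pathoscore. The top-N (or top-p-percent) variants are called pathogenic and everything else benign.

```python
import math


def top_confusion(scores, labels, top_n=None, top_pct=None):
    """Confusion matrix when only the top-N (or top-pct percent) variants are called pathogenic.

    labels are True for pathogenic, False for benign. Ties at the cutoff are
    broken by input order, which is an arbitrary choice for this sketch.
    """
    if (top_n is None) == (top_pct is None):
        raise ValueError("give exactly one of top_n or top_pct")
    if top_n is None:
        top_n = math.ceil(len(scores) * top_pct / 100)
    # Indices of variants ordered from highest to lowest score.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    called = set(order[:top_n])
    tp = sum(1 for i in called if labels[i])
    fp = len(called) - tp
    fn = sum(1 for i in range(len(scores)) if labels[i] and i not in called)
    tn = len(scores) - len(called) - fn
    return {"TP": tp, "FP": fp, "TN": tn, "FN": fn}


# Hypothetical data: 10 variants, 4 of them pathogenic.
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.0]
labels = [True, True, False, True, False, False, False, False, True, False]
print(top_confusion(scores, labels, top_n=3))
print(top_confusion(scores, labels, top_pct=20))
```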

text output

pathoscore should output 1 row per method with columns:

  • method
  • J
  • se(J)
  • TPR @ J
  • FPR @ J
  • AUC
  • TP
  • FP
  • TN
  • FN
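
One way such a row could be computed is sketched below. It assumes J is Youden's J statistic (TPR minus FPR, maximized over score cutoffs), that higher scores mean more pathogenic, and that se(J) is approximated by treating TPR and FPR as independent binomial proportions. These are assumptions, as are the function names; the issue does not specify them.

```python
import math

COLUMNS = ["method", "J", "se(J)", "TPR @ J", "FPR @ J", "AUC", "TP", "FP", "TN", "FN"]


def summarize(method, scores, labels):
    """One summary row for a method; labels are True for pathogenic, False for benign."""
    pairs = sorted(zip(scores, labels), key=lambda p: p[0], reverse=True)
    n_pos = sum(1 for _, is_patho in pairs if is_patho)
    n_neg = len(pairs) - n_pos
    tp = fp = 0
    auc = 0.0
    prev_fpr = prev_tpr = 0.0
    best_j, best_tp, best_fp = -1.0, 0, 0
    i = 0
    while i < len(pairs):
        end = i
        # Consume all variants tied at this score, then evaluate the cutoff.
        while end < len(pairs) and pairs[end][0] == pairs[i][0]:
            if pairs[end][1]:
                tp += 1
            else:
                fp += 1
            end += 1
        tpr, fpr = tp / n_pos, fp / n_neg
        auc += (fpr - prev_fpr) * (tpr + prev_tpr) / 2  # trapezoid rule
        if tpr - fpr > best_j:
            best_j, best_tp, best_fp = tpr - fpr, tp, fp
        prev_fpr, prev_tpr = fpr, tpr
        i = end
    tpr, fpr = best_tp / n_pos, best_fp / n_neg
    # Assumed approximation: independent binomial variances of TPR and FPR.
    se_j = math.sqrt(tpr * (1 - tpr) / n_pos + fpr * (1 - fpr) / n_neg)
    return [method, best_j, se_j, tpr, fpr, auc,
            best_tp, best_fp, n_neg - best_fp, n_pos - best_tp]


def format_row(row):
    """Tab-separated text, with floats rounded to 4 decimal places."""
    return "\t".join(f"{v:.4f}" if isinstance(v, float) else str(v) for v in row)


# Hypothetical method name and data.
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2]
labels = [True, True, False, True, True, False, False, False]
print("\t".join(COLUMNS))
print(format_row(summarize("example_method", scores, labels)))
```

The sketch requires at least one pathogenic and one benign variant per method; otherwise the rates are undefined.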
