biona001 / ghostknockoffgwas

Package for analyzing GWAS summary statistics data

License: MIT License

Julia 84.15% R 15.85%
conditional-independence false-discovery-rate fdr genomics gwas knockoffs summary-statistics variable-selection

ghostknockoffgwas's Introduction

Here are some of my past and ongoing projects, which I think are pretty cool. Not all projects are being actively developed, but I will certainly respond to issues and pull requests.

I am the main developer for (more recent projects are listed first):

  • GhostKnockoffGWAS: Package for performing knockoff-based analysis for GWAS summary statistics data
  • Ghostbasil.jl: (WIP) Provides Julia wrappers to the C++ code of ghostbasil
  • Knockoffs.jl: Implements the knockoff filter framework for variable selection, which performs conditional independence testing and controls the FDR (false discovery rate)
  • groupknockoffs: Simple app to solve group knockoff optimization, without Julia installed!
  • EasyLD.jl: Julia utilities for downloading and reading LD (linkage disequilibrium) matrices stored in Hail's BlockMatrix format
  • knockoffspy: A Python package that provides a direct wrapper over Knockoffs.jl
  • knockoffsr: An R package that provides a direct wrapper over Knockoffs.jl
  • MendelIHT.jl: Implements iterative hard thresholding (l0 penalized regression solver). It is highly optimized for handling compressed (binary PLINK) genotype data
  • MendelImpute.jl: A package for genotype imputation, phasing, and (global/local) ancestry inference utilizing a reference haplotype panel. It is significantly faster than existing methods but slightly less accurate
  • Thyrosim.jl: An updated version of THYROSIM, Thyrosim.jl produces individualized thyroid hormone predictions (TT4/TT3/TSH) based on a rather complicated ODE model
  • VCFTools.jl: Julia utilities for handling VCF (Variant Call Format) files
  • fastPHASE.jl: Julia wrapper for the famous fastPHASE genetics software. The primary use case is to allow the original program to run on binary PLINK data.

I am a contributor for (at least 5 commits):

  • QuasiCopula.jl: Implements a new class of distribution (Quasi-Copulas) that captures correlation among non-Gaussian random variables efficiently
  • SnpArrays.jl: Julia package for handling binary PLINK formatted data. It has the fastest (matrix)-(vector) multiplication routine for compressed PLINK files as far as I know.
  • MendelKinship.jl: Calculates various empirical and theoretical kinship coefficients, based on pedigree or genotype data.

ghostknockoffgwas's People

Contributors

biona001

Forkers

xuelei-dai

ghostknockoffgwas's Issues

Error in CMplot

As reported by @GuJQ5, an error appears when drawing the manhattan plot:

Error in CMplot(x1t, plot.type = "m", LOG10 = FALSE, col = c("grey30",  :

  unused argument (bin.range = c(0, 500))

The reason is that the CMplot package was updated on Jan 19, 2024, and the argument bin.range was removed in the latest version.

very slow manhattan plots

When there are many discoveries (e.g. more than 100), generating the resulting Manhattan plots seems to take a very long time (over 20 hours).

FileIO error

As reported by @GuJQ5, reading .h5-formatted LD files fails with a FileIO error. The command and its full output are shown here:

$ app_linux_x86/bin/GhostKnockoffGWAS --zfile UCB_GWAS.txt.gz --LD-files EUR --N 44161 --genome-build 19 --out UCB_GWAS_results

┌ Warning: Error requiring `UnPack` from `JLD2`
│ exception = 
│ IOError: stat("/home/groups/sabatti/.julia/packages/JLD2/ryhNR/ext/UnPackExt.jl"): permission denied (EACCES) 
│ Stacktrace: 
│ [1] uv_error 
│ @ ./libuv.jl:100 [inlined] 
│ [2] stat(path::String) 
│ @ Base.Filesystem ./stat.jl:152 
│ [3] isfile(path::String) 
│ @ Base.Filesystem ./stat.jl:461 
│ [4] macro expansion 
│ @ /home/groups/sabatti/.julia/packages/Requires/Z8rfN/src/Requires.jl:37 [inlined] 
│ [5] top-level scope 
│ @ /home/groups/sabatti/.julia/packages/JLD2/ryhNR/src/JLD2.jl:607 
│ [6] eval 
│ @ ./boot.jl:370 [inlined] 
│ [7] eval 
│ @ /home/groups/sabatti/.julia/packages/JLD2/ryhNR/src/JLD2.jl:1 [inlined] 
│ [8] (::JLD2.var"#131#134")() 
│ @ JLD2 /home/groups/sabatti/.julia/packages/Requires/Z8rfN/src/require.jl:101 
│ [9] macro expansion 
│ @ ./timing.jl:393 [inlined] 
│ [10] err(f::Any, listener::Module, modname::String, file::String, line::Any) 
│ @ Requires /home/groups/sabatti/.julia/packages/Requires/Z8rfN/src/require.jl:47 
│ [11] (::JLD2.var"#130#133")() 
│ @ JLD2 /home/groups/sabatti/.julia/packages/Requires/Z8rfN/src/require.jl:100 
│ [12] withpath(f::Any, path::String) 
│ @ Requires /home/groups/sabatti/.julia/packages/Requires/Z8rfN/src/require.jl:37 
│ [13] (::JLD2.var"#129#132")() 
│ @ JLD2 /home/groups/sabatti/.julia/packages/Requires/Z8rfN/src/require.jl:99 
│ [14] listenpkg(f::Any, pkg::Base.PkgId) 
│ @ Requires /home/groups/sabatti/.julia/packages/Requires/Z8rfN/src/require.jl:20 
│ [15] macro expansion 
│ @ /home/groups/sabatti/.julia/packages/Requires/Z8rfN/src/require.jl:98 [inlined] 
│ [16] __init__() 
│ @ JLD2 /home/groups/sabatti/.julia/packages/JLD2/ryhNR/src/JLD2.jl:606 
└ @ Requires /home/groups/sabatti/.julia/packages/Requires/Z8rfN/src/require.jl:51 

┌ Warning: Error requiring `FileIO` from `HDF5` 
│ exception = 
│ IOError: stat("/home/groups/sabatti/.julia/packages/HDF5/HtnQZ/src/fileio.jl"): permission denied (EACCES) 
│ Stacktrace: 
│ [1] uv_error 
│ @ ./libuv.jl:100 [inlined] 
│ [2] stat(path::String) 
│ @ Base.Filesystem ./stat.jl:152 
│ [3] isfile(path::String) 
│ @ Base.Filesystem ./stat.jl:461 
│ [4] top-level scope 
│ @ /home/groups/sabatti/.julia/packages/Requires/Z8rfN/src/Requires.jl:37 
│ [5] eval 
│ @ ./boot.jl:370 [inlined] 
│ [6] eval 
│ @ /home/groups/sabatti/.julia/packages/HDF5/HtnQZ/src/HDF5.jl:1 [inlined] 
│ [7] (::HDF5.var"#115#121")() 
│ @ HDF5 /home/groups/sabatti/.julia/packages/Requires/Z8rfN/src/require.jl:101 
│ [8] macro expansion 
│ @ ./timing.jl:393 [inlined] 
│ [9] err(f::Any, listener::Module, modname::String, file::String, line::Any) 
│ @ Requires /home/groups/sabatti/.julia/packages/Requires/Z8rfN/src/require.jl:47 
│ [10] (::HDF5.var"#114#120")() 
│ @ HDF5 /home/groups/sabatti/.julia/packages/Requires/Z8rfN/src/require.jl:100 
│ [11] withpath(f::Any, path::String) 
│ @ Requires /home/groups/sabatti/.julia/packages/Requires/Z8rfN/src/require.jl:37 
│ [12] (::HDF5.var"#113#119")() 
│ @ HDF5 /home/groups/sabatti/.julia/packages/Requires/Z8rfN/src/require.jl:99 
│ [13] listenpkg 
│ @ /home/groups/sabatti/.julia/packages/Requires/Z8rfN/src/require.jl:20 [inlined] 
│ [14] macro expansion 
│ @ /home/groups/sabatti/.julia/packages/Requires/Z8rfN/src/require.jl:98 [inlined] 
│ [15] __init__() 
│ @ HDF5 /home/groups/sabatti/.julia/packages/HDF5/HtnQZ/src/HDF5.jl:119 
└ @ Requires /home/groups/sabatti/.julia/packages/Requires/Z8rfN/src/require.jl:51 

Welcome to GhostKnockoffGWAS analysis! 
You have specified the following options: 
zfile = /oak/stanford/groups/zihuai/UCB data/Mar_13_James/UCB_GWAS.txt.gz 
LD_files = /oak/stanford/groups/zihuai/UCB data/Mar_13_James/EUR 
N (sample size) = 44161 
hg_build = 19 
outdir = /oak/stanford/groups/zihuai/UCB data/Mar_13_James/ 
outfile = /oak/stanford/groups/zihuai/UCB data/Mar_13_James/UCB_GWAS_results 
seed = 2023 
verbose = true 
random_shuffle = true 
skip_shrinkage_check = false 

count_matchable_snps processed chr 1, cumulative SNPs = 46145 
count_matchable_snps processed chr 2, cumulative SNPs = 94450 
count_matchable_snps processed chr 3, cumulative SNPs = 135010 
count_matchable_snps processed chr 4, cumulative SNPs = 173294 
count_matchable_snps processed chr 5, cumulative SNPs = 209902 
count_matchable_snps processed chr 6, cumulative SNPs = 253171 
count_matchable_snps processed chr 7, cumulative SNPs = 285925 
count_matchable_snps processed chr 8, cumulative SNPs = 316982 
count_matchable_snps processed chr 9, cumulative SNPs = 343328 
count_matchable_snps processed chr 10, cumulative SNPs = 373313 
count_matchable_snps processed chr 11, cumulative SNPs = 403207 
count_matchable_snps processed chr 12, cumulative SNPs = 431788 
count_matchable_snps processed chr 13, cumulative SNPs = 452614 
count_matchable_snps processed chr 14, cumulative SNPs = 471730 
count_matchable_snps processed chr 15, cumulative SNPs = 489924 
count_matchable_snps processed chr 16, cumulative SNPs = 510232 
count_matchable_snps processed chr 17, cumulative SNPs = 528617 
count_matchable_snps processed chr 18, cumulative SNPs = 546334 
count_matchable_snps processed chr 19, cumulative SNPs = 561544 
count_matchable_snps processed chr 20, cumulative SNPs = 576703 
count_matchable_snps processed chr 21, cumulative SNPs = 585317 
count_matchable_snps processed chr 22, cumulative SNPs = 594385 
Error encountered while load FileIO.File{FileIO.DataFormat{:HDF5}, String}("EUR/chr1/LD_start100826405_end102041015.h5"). 

Fatal error: 
ERROR: HDF5 load error: neither load nor fileio_load is defined 
due to FileIO.SpecError(HDF5, :load) 
Will try next loader. 

Stacktrace: 
[1] action(::Symbol, ::Vector{Union{Base.PkgId, Module}}, ::FileIO.Formatted; options::Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}) 
@ FileIO /home/groups/sabatti/.julia/packages/FileIO/BE7iZ/src/loadsave.jl:209 
[2] action 
@ /home/groups/sabatti/.julia/packages/FileIO/BE7iZ/src/loadsave.jl:196 [inlined] 
[3] action(::Symbol, ::Vector{Union{Base.PkgId, Module}}, ::Symbol, ::String; options::Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}) 
@ FileIO /home/groups/sabatti/.julia/packages/FileIO/BE7iZ/src/loadsave.jl:185 
[4] action 
@ /home/groups/sabatti/.julia/packages/FileIO/BE7iZ/src/loadsave.jl:185 [inlined] 
[5] load(::String; options::Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}) 
@ FileIO /home/groups/sabatti/.julia/packages/FileIO/BE7iZ/src/loadsave.jl:113 
[6] load 
@ /home/groups/sabatti/.julia/packages/FileIO/BE7iZ/src/loadsave.jl:109 [inlined] 
[7] macro expansion 
@ /home/groups/sabatti/.julia/dev/GhostKnockoffGWAS/src/ghostbasil_parallel.jl:131 [inlined] 
[8] macro expansion 
@ ./timing.jl:393 [inlined] 
[9] ghostknockoffgwas(LD_files::String, z::Vector{Float64}, chr::Vector{Int64}, pos::Vector{Int64}, effect_allele::Vector{String}, non_effect_allele::Vector{String}, N::Int64, hg_build::Int64, outdir::String; outname::String, seed::Int64, target_chrs::Vector{Int64}, A_scaling_factor::Float64, kappa_lasso::Float64, LD_shrinkage::Bool, target_fdrs::Vector{Float64}, verbose::Bool, skip_shrinkage_check::Bool, random_shuffle::Bool) 
@ GhostKnockoffGWAS /home/groups/sabatti/.julia/dev/GhostKnockoffGWAS/src/ghostbasil_parallel.jl:130 
[10] macro expansion 
@ /home/groups/sabatti/.julia/dev/GhostKnockoffGWAS/src/app.jl:43 [inlined] 
[11] macro expansion 
@ ./timing.jl:393 [inlined] 
[12] julia_main() 
@ GhostKnockoffGWAS /home/groups/sabatti/.julia/dev/GhostKnockoffGWAS/src/app.jl:42 
[13] top-level scope 
@ none:1

Stacktrace: 
[1] handle_error(e::FileIO.LoaderError, q::Base.PkgId, bt::Vector{Union{Ptr{Nothing}, Base.InterpreterIP}}) 
@ FileIO /home/groups/sabatti/.julia/packages/FileIO/BE7iZ/src/error_handling.jl:61 
[2] handle_exceptions(exceptions::Vector{Tuple{Any, Union{Base.PkgId, Module}, Vector}}, action::String) 
@ FileIO /home/groups/sabatti/.julia/packages/FileIO/BE7iZ/src/error_handling.jl:56
[3] action(::Symbol, ::Vector{Union{Base.PkgId, Module}}, ::FileIO.Formatted; options::Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}) 
@ FileIO /home/groups/sabatti/.julia/packages/FileIO/BE7iZ/src/loadsave.jl:228 
[4] action 
@ /home/groups/sabatti/.julia/packages/FileIO/BE7iZ/src/loadsave.jl:196 [inlined] 
[5] action(::Symbol, ::Vector{Union{Base.PkgId, Module}}, ::Symbol, ::String; options::Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}) 
@ FileIO /home/groups/sabatti/.julia/packages/FileIO/BE7iZ/src/loadsave.jl:185 
[6] action 
@ /home/groups/sabatti/.julia/packages/FileIO/BE7iZ/src/loadsave.jl:185 [inlined] 
[7] load(::String; options::Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}) 
@ FileIO /home/groups/sabatti/.julia/packages/FileIO/BE7iZ/src/loadsave.jl:113 
[8] load 
@ /home/groups/sabatti/.julia/packages/FileIO/BE7iZ/src/loadsave.jl:109 [inlined] 
[9] macro expansion 
@ /home/groups/sabatti/.julia/dev/GhostKnockoffGWAS/src/ghostbasil_parallel.jl:131 [inlined] 
[10] macro expansion 
@ ./timing.jl:393 [inlined] 
[11] ghostknockoffgwas(LD_files::String, z::Vector{Float64}, chr::Vector{Int64}, pos::Vector{Int64}, effect_allele::Vector{String}, non_effect_allele::Vector{String}, N::Int64, hg_build::Int64, outdir::String; outname::String, seed::Int64, target_chrs::Vector{Int64}, A_scaling_factor::Float64, kappa_lasso::Float64, LD_shrinkage::Bool, target_fdrs::Vector{Float64}, verbose::Bool, skip_shrinkage_check::Bool, random_shuffle::Bool) 
@ GhostKnockoffGWAS /home/groups/sabatti/.julia/dev/GhostKnockoffGWAS/src/ghostbasil_parallel.jl:130 
[12] macro expansion 
@ /home/groups/sabatti/.julia/dev/GhostKnockoffGWAS/src/app.jl:43 [inlined] 
[13] macro expansion 
@ ./timing.jl:393 [inlined] 
[14] julia_main() 
@ GhostKnockoffGWAS /home/groups/sabatti/.julia/dev/GhostKnockoffGWAS/src/app.jl:42 
[15] top-level scope 
@ none:1
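
Both warnings in the log report `permission denied (EACCES)` from `stat` on files inside the shared `.julia` directory. On POSIX systems `stat` returns EACCES when search (execute) permission is missing on some directory in the path, so one way to narrow this down is to find the first ancestor directory the current user cannot traverse. The following is only a diagnostic sketch in Python, not part of GhostKnockoffGWAS; the function name `first_blocked_component` and its `can_search` parameter are made up for illustration.

```python
import os
from pathlib import Path


def first_blocked_component(path, can_search=lambda p: os.access(p, os.X_OK)):
    """Return the first ancestor directory of `path` that cannot be traversed.

    `can_search` decides whether a directory is traversable; by default it
    asks the OS whether the current user has execute (search) permission.
    Returns None when every ancestor directory is traversable.
    """
    target = Path(path).absolute()
    # Path.parents lists the nearest ancestor first, so reverse it
    # to walk from the filesystem root down toward the file.
    for ancestor in list(target.parents)[::-1]:
        if not can_search(ancestor):
            return ancestor
    return None
```

Calling it with one of the paths quoted in the warnings (for example the `UnPackExt.jl` path) under the account that runs the binary would show which directory, if any, needs its permissions changed. A `None` result means the ancestors are traversable and the cause lies elsewhere.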

Improve documentation

  • The README should describe GhostKnockoffGWAS in relation to mainstream GWAS methods, as most users won't be familiar with knockoff-based inference.
  • The developer documentation is somewhat out of date and should be revamped. In particular, the syntax for calling BlockGroupGhostMatrix (which depended on the R package ghostbasil) should be updated to the syntax used by Ghostbasil.jl.
  • Gallery: We should include a few Manhattan plots comparing marginal vs knockoff analysis.

Roadmap until 1.0 release

Features to add:

  1. Add knockoff q-values to output (see eq19 of this paper)
  2. Ability to read different columns in zfile, based on user input
  3. Add code to make manhattan plot and show this in documentation
  4. Within-group knockoff filter (see this paper)
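
For item 1, a rough illustration of what a knockoff q-value could look like is given here. It is a sketch in Python under an assumed definition: the q-value of a feature is the smallest estimated false discovery proportion over all thresholds at or below its statistic, using the standard knockoff+ estimate (1 + #{W_k ≤ -t}) / max(1, #{W_k ≥ t}). This may differ from the equation the roadmap refers to, and the function name `knockoff_qvalues` is hypothetical.

```python
def knockoff_qvalues(W):
    """Compute knockoff q-values from feature importance statistics W.

    Assumed definition: for W[j] > 0 the q-value is the minimum knockoff+
    FDP estimate over thresholds t <= W[j]; for W[j] <= 0 it is 1.0.
    """
    # Candidate thresholds are the distinct nonzero magnitudes of W.
    thresholds = sorted({abs(w) for w in W if w != 0})

    # Knockoff+ FDP estimate at each threshold, capped at 1.
    fdp = []
    for t in thresholds:
        negatives = sum(1 for w in W if w <= -t)
        positives = sum(1 for w in W if w >= t)
        fdp.append(min(1.0, (1 + negatives) / max(1, positives)))

    qvalues = []
    for w in W:
        if w <= 0:
            qvalues.append(1.0)
        else:
            # Smallest FDP estimate among thresholds this feature passes.
            qvalues.append(min(f for t, f in zip(thresholds, fdp) if t <= w))
    return qvalues
```

Under this definition, selecting every feature whose q-value is at most a target FDR gives the same set as applying the knockoff+ threshold at that target, so a single q-value column could stand in for separate selections at several FDR levels.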

Non-linux support?

Rather than compiling binaries for GhostKnockoffGWAS via PackageCompiler.jl, consider providing a Docker image (see this example). A Docker image provides an alternative way to run GhostKnockoffGWAS, even on non-Linux machines without Julia installed.
