
xtonyjiang / gnova

A principled framework to estimate annotation-stratified genetic covariance using GWAS summary statistics.

Home Page: http://www.cell.com/ajhg/abstract/S0002-9297(17)30453-6

License: GNU General Public License v3.0

Python 100.00%

gnova's People

Contributors

cinnamonfish, rlpowles, xtonyjiang

gnova's Issues

default sumstats args

Running this as a module in my code I got this error:

Traceback (most recent call last):
File "ldsc_thin.py", line 451, in
df = pr.pre_function(args)
File "/net/zhao/rlp48/ldsc_test/ldsc/custom/prep.py", line 51, in pre_function
dfs.append(munge_sumstats.munge_sumstats(args, p=False))
File "/net/zhao/rlp48/ldsc_test/ldsc/custom/munge_sumstats.py", line 523, in munge_sumstats
if args.no_alleles and args.merge_alleles:
AttributeError: 'Namespace' object has no attribute 'no_alleles'

I think the problem here is that the munge_sumstats parser never gets called when the script isn't run as main, so the defaults it would normally set are missing from args. You'll need to add some code to set any default arg attributes that the munge_sumstats parser sets (starts at line 444 of sumstats.py). You should probably also test your code more thoroughly to make sure that munge_sumstats runs the way you expect it to.
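One way to set those defaults is sketched below. It is only an illustration: `build_parser` is a hypothetical stand-in for whatever builds the munge_sumstats parser, the two flags are assumptions based on the attribute names in the traceback, and `fill_missing_defaults` is not part of the codebase. It relies on `parse_args([])` returning every default, which only works if the parser has no required arguments.

```python
import argparse


def build_parser():
    # Hypothetical stand-in for the munge_sumstats parser; the real flags may differ.
    parser = argparse.ArgumentParser()
    parser.add_argument("--no-alleles", action="store_true", default=False)
    parser.add_argument("--merge-alleles", default=None)
    return parser


def fill_missing_defaults(args, parser):
    """Copy the parser's defaults onto args for every attribute args lacks."""
    # Parsing an empty argument list yields a Namespace holding only defaults.
    defaults = parser.parse_args([])
    for name, value in vars(defaults).items():
        if not hasattr(args, name):
            setattr(args, name, value)
    return args


# A Namespace built by a different parser, missing the munge_sumstats attributes.
caller_args = argparse.Namespace(sumstats1="trait1.txt")
caller_args = fill_missing_defaults(caller_args, build_parser())
print(caller_args.no_alleles, caller_args.merge_alleles)
```

Attributes the caller has already set are left untouched, so explicit choices still win over the defaults.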

document ldscore format

Hello,
I have my own LD scores computed with the ldsc software and was hoping to use them with GNOVA; however, the GNOVA CSV format for LD files seems to be different from the LDSC output. Could you document your LD file format, or provide an example?

Problem with NaNs

Hi Tony,

I've been trying to use GNOVA in my project. For some of the sumstats files I'm using, I get the following error:

Traceback (most recent call last):
  File "gnova.py", line 86, in <module>
    pipeline(parser.parse_args())
  File "gnova.py", line 47, in pipeline
    out = calculate(gwas_snps, ld_scores, annots, N1, N2)
  File "/Volumes/BD/GNOVA/calculate.py", line 72, in calculate
    m1 = linear_model.LinearRegression().fit(ld_scores, pd.DataFrame((Z_x) ** 2), sample_weight=w1)
  File "/Volumes/Users/Library/Python/2.7/lib/python/site-packages/sklearn/linear_model/base.py", line 458, in fit
    y_numeric=True, multi_output=True)
  File "/Volumes/Users/Library/Python/2.7/lib/python/site-packages/sklearn/utils/validation.py", line 750, in check_X_y
    dtype=None)
  File "/Volumes/Users/Library/Python/2.7/lib/python/site-packages/sklearn/utils/validation.py", line 568, in check_array
    allow_nan=force_all_finite == 'allow-nan')
  File "/Volumes/Users/Library/Python/2.7/lib/python/site-packages/sklearn/utils/validation.py", line 56, in _assert_all_finite
    raise ValueError(msg_err.format(type_err, X.dtype))
ValueError: Input contains NaN, infinity or a value too large for dtype('float64').

I tracked it down to the prep.py, where the

df = pd.merge(bim, dfs[1], on=['SNP']).merge(dfs[0], on=['SNP'])

produces "NaN". I think this shouldn't be the case. Could this be a matter of the pandas version used?
My workaround was to introduce a

df.dropna(inplace=True)

but I don't think that's how it's meant to be.

Rainer
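A minimal sketch of how the non-finite rows could be inspected before the regression, rather than dropped silently. The toy tables and the column names `Z_x` and `Z_y` are assumptions for illustration, as is the idea that the NaNs are already present in the sumstats inputs; the issue does not confirm where they come from.

```python
import numpy as np
import pandas as pd

# Toy stand-ins for the bim table and the two sumstats tables (assumed columns).
bim = pd.DataFrame({"SNP": ["rs1", "rs2", "rs3"]})
trait1 = pd.DataFrame({"SNP": ["rs1", "rs2", "rs3"], "Z_x": [1.2, np.nan, -0.4]})
trait2 = pd.DataFrame({"SNP": ["rs1", "rs2", "rs3"], "Z_y": [0.3, 0.8, np.inf]})

# Same merge shape as in prep.py: inner joins on the SNP column.
df = pd.merge(bim, trait2, on=["SNP"]).merge(trait1, on=["SNP"])

# An inner merge does not remove NaN or inf values already in the inputs,
# so flag every row whose Z scores are not all finite.
z_cols = ["Z_x", "Z_y"]
finite = np.isfinite(df[z_cols].to_numpy()).all(axis=1)
print("dropping", int((~finite).sum()), "of", len(df), "SNPs")

# Keep only the rows that scikit-learn's finiteness check would accept.
clean = df[finite].reset_index(drop=True)
```

Unlike `dropna`, this also catches infinite values, which trigger the same ValueError in scikit-learn.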

MAF stratification

Could you give any information on your MAF stratified annotations from 1000g, what were the MAF cut-offs for each quartile? I haven't been able to find this anywhere (including in the AJHG paper). Sorry in advance if I've missed it!

TIA!

Reference and annotation files not available

Hi!

Thank you for such an amazing tool. I'm trying to access the plink reference file and the annotation files, but it seems that they are not available. Is this a temporary error, or will they no longer be accessible?

Thanks!

Judit

Problem preparing files for gnova

Hi,

I keep getting this error, even though I have formatted my files according to the readme specifications. Could you help me with this issue? Thanks!

Preparing files for analysis...
Traceback (most recent call last):
File "./gnova.py", line 85, in
pipeline(parser.parse_args())
File "./gnova.py", line 34, in pipeline
args.sumstats2)
File "/Users/usuario/Desktop/pipeline/prep.py", line 67, in prep
len_b, len_a = len(bim_files), len(annot_files)
TypeError: object of type 'NoneType' has no len()

Running error

TypeError: __init__() got multiple values for keyword argument 'keep_snps'

Bug

TypeError: __init__() got multiple values for keyword argument 'keep_snps'

Multi-chromosome pattern match too liberal

Replacing the @ sign with a * wildcard could cause problems if there are multiple versions of the bim file in the directory. For example you'd end up globbing chr1.bim and chr1_nosex.bim in the call and have multiple copies of each chromosome. You need a regex here that will specifically match one or two digits only (we may also want to avoid picking up sex chromosome files for now, I'll check with Qiongshi).
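A stricter match along those lines might look like the sketch below. The helper name `expand_chrom_pattern` and the choice to accept only chromosomes 1 to 22 are assumptions for illustration, not the project's actual code.

```python
import re


def expand_chrom_pattern(pattern, filenames):
    """Map chromosome number to filename, where '@' stands for 1 to 22."""
    prefix, sep, suffix = pattern.partition("@")
    if not sep:
        raise ValueError("pattern must contain '@'")
    # Escape the literal parts so that '.' in '.bim' is not a regex wildcard,
    # and allow exactly one or two digits where the '@' was.
    regex = re.compile(re.escape(prefix) + r"(\d{1,2})" + re.escape(suffix) + r"\Z")
    matches = {}
    for name in filenames:
        m = regex.match(name)
        # Assumed policy: autosomes only, so 23 and X/Y are skipped.
        if m and 1 <= int(m.group(1)) <= 22:
            matches[int(m.group(1))] = name
    return matches


files = ["chr1.bim", "chr1_nosex.bim", "chr2.bim", "chr22.bim", "chrX.bim", "chr23.bim"]
print(expand_chrom_pattern("chr@.bim", files))
```

Keying the result by chromosome number also makes duplicate copies of a chromosome easy to detect if the rule is ever loosened.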

Interpreting corrected p-value

I understood that the corrected p-value is the p-value adjusted for sample overlap. However, in my case I am sure that there is no sample overlap. Could I therefore use only the raw p-value, or does the corrected p-value take into account something else besides sample overlap?
Also, sometimes the raw p-value is not significant but the corrected p-value is, even though there is no real sample overlap in my data. How is that possible?

Thank you very much in advance!

Program frozen for a long time without error/output

Hi,

Thanks for this tool.

I ran the code to estimate the rg between two traits, but the code got stuck for more than 3 hours, without outputting any files.

Preparing files for analysis...
Calculating LD scores...
~/software/gnova/lib/python2.7/site-packages/numpy/core/fromnumeric.py:56: FutureWarning: Series.nonzero() is deprecated and will be removed in a future version.Use Series.to_numpy().nonzero() instead
  return getattr(obj, method)(*args, **kwds)

Is this normal???

Thanks
