secastel / afc

Calculates allelic Fold Change (aFC) using standard input files for fastQTL.

Python 100.00%

afc's Issues

Estimating aFC with multiple causal variants

Not sure if this tool is still supported but I was interested in using it to calculate aFC in some eQTL data I am working with. Some of the genes in this data set are predicted to have multiple causal variants, and I'm wondering if you have any recommendations for how best to calculate the aFC of each causal variant.

In your manuscript, I see that you calculate aFC for GTEx genes with two eQTLs, but I wasn't sure how you actually went about doing that, and I don't see any way to handle this use case in the tool. A naive approach would be just to run the tool as is, which would provide aFC estimates of each variant, but would not account for the additive effects of the variants.

One thought I had is that for each tested eQTL, I could include the other eQTLs for that gene as covariates and regress them out along with the other covariates before estimating aFC. But I'm not sure that would properly handle the discrepancies between cases where the "high expression" alleles are on the same vs. different haplotypes.

Any guidance is greatly appreciated.
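The covariate idea described above could be sketched roughly as follows. This is only an illustration of the poster's proposal, not a method confirmed by the tool's author: the function name `regress_out` and the use of an ordinary least-squares fit on genotype dosages are assumptions. It also does not address the phase concern raised above, because it treats each additional variant as an additive term in log expression without looking at which haplotype carries which allele.

```python
import numpy as np

def regress_out(log_expr, covariates):
    """Remove the linear effect of covariates from log-scale expression.

    log_expr:   array of shape (n_samples,), log-transformed expression
    covariates: array of shape (n_samples,) or (n_samples, n_covariates),
                e.g. dosages (0/1/2) of the other eQTL variants (hypothetical input)
    """
    y = np.asarray(log_expr, dtype=float)
    X = np.asarray(covariates, dtype=float)
    if X.ndim == 1:
        X = X[:, None]
    # Fit intercept + covariates by ordinary least squares.
    design = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    # Subtract only the covariate contribution; the intercept is kept.
    return y - X @ coef[1:]

# Toy data: expression depends only on a second variant's dosage.
other_dosage = np.array([0, 1, 2, 1, 0, 2])
log_expr = 3.0 + 2.0 * other_dosage
adjusted = regress_out(log_expr, other_dosage)
```

The adjusted values would then be used as the phenotype when estimating aFC for the variant of interest.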

Question about ASE calculation

Hello,

We are preparing a manuscript and want to use the same method to show the correlation between log2 aFC and the ASE-based estimate (similar to Figure 5b/c in the Genome Research paper). We have already calculated log2 aFC from eQTL data (fastQTL input files). Could you let us know how to calculate the ASE-based estimate, so that we can then compute the correlation? A script for this would be very helpful. We will cite your paper in any future publication. Thanks.
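One plausible reading of an ASE-based estimate is sketched below. It is an assumption, not the author's confirmed procedure: for each individual heterozygous for the eQTL variant, take log2 of the ratio of reads on the haplotype carrying the alternative eQTL allele to reads on the haplotype carrying the reference allele, then take the median across individuals. The function name `ase_log2_afc`, the pseudocount, and the minimum-coverage filter are all hypothetical choices, and the input assumes phased haplotype-level counts are already available.

```python
import math
from statistics import median

def ase_log2_afc(hap_counts, pseudocount=1.0, min_total=10):
    """Median log2(alt-haplotype reads / ref-haplotype reads).

    hap_counts: list of (ref_hap_reads, alt_hap_reads) tuples, one per
                individual heterozygous for the eQTL variant (assumed phased).
    Returns None if no individual passes the coverage filter.
    """
    log_ratios = []
    for ref_reads, alt_reads in hap_counts:
        # Skip individuals with too few reads to give a stable ratio.
        if ref_reads + alt_reads < min_total:
            continue
        ratio = (alt_reads + pseudocount) / (ref_reads + pseudocount)
        log_ratios.append(math.log2(ratio))
    if not log_ratios:
        return None
    return median(log_ratios)

# Toy counts for three heterozygous individuals.
estimate = ase_log2_afc([(15, 31), (7, 15), (3, 7)])
```

Per-gene estimates of this kind could then be compared with the eQTL-based log2 aFC values, for example with a rank correlation.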

Predicting CPU utilization?

I'm trying to run aFC on a very large dataset, literally millions of QTLs. I've broken the work up into small chunks and am running hundreds of parallel jobs on a compute cluster. The problem I'm running into is that I'm having a very hard time predicting the CPU usage of the jobs.

What I'm seeing is that many of the jobs run at ~100% of one CPU for most of their runtime. Then once in a while a bunch of jobs will spike up significantly, consuming anywhere from 300%-1200% CPU (i.e. 3-12 cores). This is causing quite a problem for me because I'm left with the choice of either scheduling the jobs with 1 CPU each and dealing with the mayhem that ensues when a non-trivial number of jobs spike, or scheduling multiple CPUs per job and watching my compute farm sit half or more idle most of the time.

I've taken a brief read through the source code and can't see any references to multi-processing, threads or parallelism, but I'm also not experienced with numpy/pandas, so it's very possible I'm missing something.

Any pointers or insight into what might be causing the CPU spikes and how to deal with them would be greatly appreciated.
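One possible explanation, offered as an assumption and not verified against this tool, is that the spikes come from the multithreaded BLAS/OpenMP backend that numpy links against, which can use several cores for linear-algebra calls even when the Python code has no explicit parallelism. A minimal sketch of capping those thread pools from inside a script is shown below; the environment variables are the standard ones read by OpenMP, OpenBLAS, and MKL, and they need to be set before numpy is first imported in the process.

```python
import os

# Cap the thread pools used by common numerical backends.
# These must be set before numpy (or pandas/scipy) is imported.
for var in ("OMP_NUM_THREADS", "OPENBLAS_NUM_THREADS", "MKL_NUM_THREADS"):
    os.environ[var] = "1"

import numpy as np  # the backend reads the variables when it is loaded

# Any BLAS-backed call made after this point should stay on one core.
a = np.ones((200, 200))
product = a @ a
```

Exporting the same variables in the job submission script before launching the tool has the same effect and avoids editing any code.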

Generate the QTL file from the VCF (genotypes) and BED (phenotypes)

@secastel
I was looking at the [GTEx portal](https://gtexportal.org/home/datasets) and I don't see a comprehensive list of QTLs (SNP id + gene id). I am thinking I can create my own from the VCF (genotypes) and BED (phenotypes) by simply finding all pairs of overlapping SNPs and genes.

This may produce a file with multiple rows having the same gene id [pid] since multiple SNPs may overlap a single gene.
If there are genes/phenotypes [pid] that overlap, this would mean that there may be multiple rows having the same SNP id [sid].

Does this seem like a valid strategy to create a comprehensive QTL file if I want to test ALL QTLs?
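The pairing strategy described above could be sketched as follows. This is a minimal illustration under stated assumptions, not a confirmed recipe for the tool: variants and genes are assumed to be already parsed into tuples, BED coordinates are treated as 0-based half-open and VCF positions as 1-based, and the `window` parameter (to include variants near but outside a gene) is a hypothetical addition. The output uses the `pid` and `sid` column names mentioned above.

```python
def pair_variants_with_genes(variants, genes, window=0):
    """Return (pid, sid) pairs where a variant falls inside a gene interval.

    variants: iterable of (sid, chrom, pos), pos is 1-based as in VCF
    genes:    iterable of (pid, chrom, start, end), 0-based half-open as in BED
    window:   extra bases added on both sides of each gene (hypothetical)
    """
    genes_by_chrom = {}
    for pid, chrom, start, end in genes:
        genes_by_chrom.setdefault(chrom, []).append((pid, start, end))

    pairs = []
    for sid, chrom, pos in variants:
        for pid, start, end in genes_by_chrom.get(chrom, []):
            # A 1-based position p lies in a 0-based half-open [start, end)
            # interval when start < p <= end.
            if start - window < pos <= end + window:
                pairs.append((pid, sid))
    return pairs

genes = [("geneA", "chr1", 100, 200), ("geneB", "chr1", 150, 300)]
variants = [("rs1", "chr1", 160), ("rs2", "chr1", 100),
            ("rs3", "chr2", 160), ("rs4", "chr1", 250)]
pairs = pair_variants_with_genes(variants, genes)

# Tab-separated table with one row per gene-variant pair.
qtl_lines = ["pid\tsid"] + [f"{pid}\t{sid}" for pid, sid in pairs]
```

As noted above, the same `pid` or `sid` can appear on several rows. The nested loop is quadratic per chromosome, so a genome-wide run would likely need sorted intervals or an interval index.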

Can I use aFC to calculate the effect size of trans-eQTLs?

Hi,
Thank you for developing this software. I know this tool is designed for cis-eQTLs, but I want to use it to calculate the allelic fold change for trans-eQTLs. Do you think that is appropriate? I'm afraid it's not suitable for trans-eQTLs. In addition, my expression values have been transformed to the quantiles of the standard normal distribution. Should I still use the "--log_base" parameter? Thank you very much.

Best wishes
