secastel / afc
Calculates allelic Fold Change (aFC) using standard input files for fastQTL.
Not sure if this tool is still supported but I was interested in using it to calculate aFC in some eQTL data I am working with. Some of the genes in this data set are predicted to have multiple causal variants, and I'm wondering if you have any recommendations for how best to calculate the aFC of each causal variant.
In your manuscript, I see that you calculate aFC for GTEx genes with two eQTLs, but I wasn't sure how you actually went about doing that, and I don't see any way to handle this use case in the tool. A naive approach would be just to run the tool as is, which would provide aFC estimates of each variant, but would not account for the additive effects of the variants.
One thought I had is that for each tested eQTL, I could include the other eQTLs for that gene as covariates and regress them out along with the other covariates before estimating aFC. But I'm not sure that would properly handle the discrepancies between cases where the "high expression" alleles are on the same vs. different haplotypes.
Any guidance is greatly appreciated.
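The "regress out the other eQTLs" idea from this post could be sketched roughly as follows. This is a minimal illustration, not the tool's own covariate handling: `residualize` is a hypothetical helper, it handles a single secondary variant with an ordinary least-squares fit, and it assumes genotypes are coded as alternative-allele dosages (0/1/2). It does not address the haplotype phase concern raised above.

```python
def residualize(expression, dosage):
    """Remove the linear effect of one variant's dosage from expression.

    expression: per-sample expression values (hypothetical input)
    dosage:     per-sample alt-allele dosages (0, 1, 2) of the *other* eQTL
    Returns the least-squares residuals, one per sample.
    """
    n = len(expression)
    if n != len(dosage) or n < 2:
        raise ValueError("need two equal-length vectors with at least 2 samples")

    mean_x = sum(dosage) / n
    mean_y = sum(expression) / n
    sxx = sum((x - mean_x) ** 2 for x in dosage)

    if sxx == 0:
        # No genotype variation: nothing to regress out, only center the values.
        return [y - mean_y for y in expression]

    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(dosage, expression))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return [y - (intercept + slope * x) for x, y in zip(dosage, expression)]


# Toy data: expression is exactly 1 + 2 * dosage, so the residuals vanish.
other_eqtl_dosage = [0, 1, 2, 1, 0]
expr = [1 + 2 * d for d in other_eqtl_dosage]
residuals = residualize(expr, other_eqtl_dosage)
```

The residuals would then stand in for the original expression values when estimating the effect of the variant of interest. Whether that is appropriate on the scale the tool expects is an open question in this thread.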
Hello:
We are preparing a manuscript and want to use the same method to show the correlation between log2 aFC and the ASE-based estimate (similar to Figure 5b/c in the Genome Research paper). We have already calculated log2 aFC from eQTL data (fastQTL input files). Can you let us know how to calculate the ASE-based estimate, so that we can then calculate the correlation? If you can provide a script to do this, it would be very helpful to us. We will cite your paper in a future publication. Thanks.
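One plausible way to get an ASE-based estimate, stated here as an assumption rather than as the authors' confirmed procedure, is to take individuals heterozygous for the eQTL, compute the log2 ratio of expression from the haplotype carrying the alternative allele to the haplotype carrying the reference allele, and summarize with the median. The sketch below assumes phased haplotype-level counts are already available; `ase_log2_afc` and `pearson` are hypothetical helper names.

```python
import math
from statistics import median


def ase_log2_afc(hap_counts):
    """Median log2(alt-haplotype count / ref-haplotype count).

    hap_counts: list of (alt_hap_count, ref_hap_count) pairs, one per
    individual heterozygous for the eQTL (assumed to be phased already).
    Individuals with a zero count on either haplotype are skipped.
    """
    ratios = [math.log2(alt / ref) for alt, ref in hap_counts if alt > 0 and ref > 0]
    return median(ratios) if ratios else None


def pearson(xs, ys):
    """Plain Pearson correlation between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length vectors with at least 2 values")
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant vector")
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / math.sqrt(sxx * syy)


# Toy example: three heterozygotes for one gene.
estimate = ase_log2_afc([(20, 10), (40, 10), (10, 10)])
```

With one ASE-based value per gene, `pearson(eqtl_log2_afc, ase_values)` gives the correlation across genes. A rank-based correlation may be the better choice if the estimates contain outliers.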
I'm trying to run aFC on a very large dataset - literally millions of QTLs. I've broken the work up into small chunks and am running 100s of parallel jobs on a compute cluster. The problem I'm running into is that I'm having a very hard time predicting the CPU usage of the jobs.
What I'm seeing is that many of the jobs run at ~100% of one CPU for most of their runtime. Then once in a while a bunch of jobs will spike up significantly, consuming anywhere from 300%-1200% CPU (i.e. 3-12 cores). This is causing quite a problem for me because I'm left with the choice of either scheduling the jobs with 1 CPU each and dealing with the mayhem that ensues when a non-trivial number of jobs spike, or scheduling multiple CPUs per job and watching my compute farm sit half or more idle most of the time.
I've taken a brief read through the source code and can't see any references to multi-processing, threads or parallelism, but I'm also not experienced with numpy/pandas, so it's very possible I'm missing something.
Any pointers or insight into what might be causing the CPU spikes and how to deal with them would be greatly appreciated.
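One possible explanation, not confirmed anywhere in this thread, is that the linear algebra backend behind numpy (typically OpenBLAS or MKL) starts its own thread pool for some operations, even when the Python code itself has no threading. If that is the cause, capping the backend's threads through its environment variables before numpy is first imported should keep each job on one core. A minimal sketch, where `limit_math_threads` is a hypothetical helper:

```python
import os

# Environment variables read by common numerical backends at load time.
THREAD_VARS = (
    "OMP_NUM_THREADS",
    "OPENBLAS_NUM_THREADS",
    "MKL_NUM_THREADS",
    "NUMEXPR_NUM_THREADS",
)


def limit_math_threads(n=1):
    """Cap backend thread pools; must run before numpy/pandas are imported."""
    for var in THREAD_VARS:
        os.environ[var] = str(n)


limit_math_threads(1)
# Import numpy / pandas only after this point, otherwise the backend
# may already have sized its thread pool.
```

The same effect can be had without touching the script by exporting these variables in the job submission script, which keeps the scheduler's 1-CPU request and the process's actual usage in agreement.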
@secastel
I was looking at the [GTEx portal](https://gtexportal.org/home/datasets) and I don't see a comprehensive list of QTLs (SNP id + gene id). I am thinking I can create my own from the VCF (genotypes) and BED (phenotypes) by simply finding all pairs of overlapping SNPs and genes.
Does this seem like a valid strategy to create a comprehensive QTL file if I want to test ALL QTLs?
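The pairing step described here could be sketched as below. It is only an illustration: it assumes variant positions and gene intervals have already been parsed out of the VCF and BED files into tuples, treats all coordinates as 1-based and inclusive, and uses a hypothetical `candidate_pairs` helper with an optional `window` to extend each gene interval. Whether the resulting table matches the column layout the tool expects is not checked here.

```python
from bisect import bisect_left, bisect_right


def candidate_pairs(variants, genes, window=0):
    """Pair every gene with the variants that fall inside its interval.

    variants: iterable of (sid, chrom, pos)
    genes:    iterable of (pid, chrom, start, end)
    window:   bases added on both sides of each gene interval
    Returns a list of (pid, sid, chrom, pos) tuples.
    """
    # Index variants per chromosome, sorted by position for binary search.
    by_chrom = {}
    for sid, chrom, pos in variants:
        by_chrom.setdefault(chrom, []).append((pos, sid))

    index = {}
    for chrom, items in by_chrom.items():
        items.sort()
        index[chrom] = ([p for p, _ in items], [s for _, s in items])

    pairs = []
    for pid, chrom, start, end in genes:
        if chrom not in index:
            continue
        positions, sids = index[chrom]
        lo = bisect_left(positions, start - window)
        hi = bisect_right(positions, end + window)
        for i in range(lo, hi):
            pairs.append((pid, sids[i], chrom, positions[i]))
    return pairs


# Toy data standing in for parsed VCF and BED records.
variants = [("rs1", "chr1", 100), ("rs2", "chr1", 5000), ("rs3", "chr2", 150)]
genes = [("geneA", "chr1", 50, 200)]
overlapping = candidate_pairs(variants, genes)
nearby = candidate_pairs(variants, genes, window=5000)
```

Strict overlap (`window=0`) only captures variants inside the gene interval. A wider window grows the number of pairs quickly, which matters for runtime if every pair is then tested.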
Hi,
Thank you for developing this software. I know this tool is applied to cis-eQTLs, but I want to use it to calculate the allelic fold change for trans-eQTLs. Do you think that is appropriate? I'm afraid it's not suitable for trans-eQTLs. In addition, my expression values have been transformed to the quantiles of the standard normal distribution. In that case, should I use the "--log_base" parameter? Thank you very much.
Best wishes