afc's People

Contributors

brentp, danielduyvo, eflynn90, evanbiederstedt, francois-a, secastel

afc's Issues

Estimating aFC with multiple causal variants

Not sure if this tool is still supported but I was interested in using it to calculate aFC in some eQTL data I am working with. Some of the genes in this data set are predicted to have multiple causal variants, and I'm wondering if you have any recommendations for how best to calculate the aFC of each causal variant.

In your manuscript, I see that you calculate aFC for GTEx genes with two eQTLs, but I wasn't sure how you actually went about doing that, and I don't see any way to handle this use case in the tool. A naive approach would be just to run the tool as is, which would provide aFC estimates of each variant, but would not account for the additive effects of the variants.

One thought I had is that for each tested eQTL, I could include the other eQTLs for that gene as covariates and regress them out along with the other covariates before estimating aFC. But I'm not sure that would properly handle the discrepancies between cases where the "high expression" alleles are on the same vs. different haplotypes.

Any guidance is greatly appreciated.
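The covariate idea described above can be sketched as follows. This is a minimal illustration, not the tool's own method: `residualize` is a hypothetical helper, the data are simulated, and the regression is an ordinary least-squares fit on whatever scale the expression values are in. It removes the linear effect of the other eQTL's genotype before aFC is estimated for the tested variant. It does not address the haplotype-phase concern raised in the post.

```python
import numpy as np

def residualize(expr, covariates):
    """Remove the least-squares fit of covariates (plus an intercept) from expr.

    Hypothetical helper for illustration; not part of the aFC tool.
    """
    X = np.column_stack([np.ones(len(expr)), covariates])
    beta, *_ = np.linalg.lstsq(X, expr, rcond=None)
    return expr - X @ beta

# Simulated example: one gene with two eQTLs (genotype dosages 0/1/2).
rng = np.random.default_rng(0)
n = 200
tested_eqtl = rng.integers(0, 3, size=n).astype(float)
other_eqtl = rng.integers(0, 3, size=n).astype(float)
expr = 1.0 + 0.5 * tested_eqtl + 0.8 * other_eqtl + rng.normal(0, 0.1, n)

# Regress out the other eQTL (real use would add the usual covariates too).
resid = residualize(expr, other_eqtl.reshape(-1, 1))
```

The residuals are uncorrelated with the other eQTL's dosage by construction, so any remaining genotype signal is attributable to the tested variant under this purely linear model.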

Question about ASE calculation

Hello:

We are preparing a manuscript and want to use the same method to show the correlation between log2 aFC and the ASE-based estimate (similar to Figure 5b/c in the Genome Research paper). We have already calculated log2 aFC from eQTL data (fastQTL input files). Could you let us know how to calculate the ASE-based estimate, so that we can then calculate the correlation? If you can provide a script to do this, it would be very helpful to us. We will cite your paper in any future publication. Thanks.
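One way such an ASE-based estimate could be computed is sketched below. This is an assumption for illustration, not a confirmed description of the authors' script: for individuals heterozygous at the eQTL, take the log2 ratio of read counts on the haplotype carrying the alternative eQTL allele to counts on the reference haplotype, then summarize with the median. The function name `ase_log2_afc` and the `min_total` coverage threshold are hypothetical.

```python
import numpy as np

def ase_log2_afc(alt_hap_counts, ref_hap_counts, min_total=10):
    """Median log2(alt/ref) haplotype count ratio across eQTL heterozygotes.

    Assumes counts are already phased relative to the eQTL alleles.
    min_total is an illustrative coverage filter, not a published value.
    """
    alt = np.asarray(alt_hap_counts, dtype=float)
    ref = np.asarray(ref_hap_counts, dtype=float)
    keep = (alt + ref >= min_total) & (alt > 0) & (ref > 0)
    if not keep.any():
        return float("nan")
    return float(np.median(np.log2(alt[keep] / ref[keep])))

# Toy example: three heterozygous individuals, each with a 2:1 ratio.
estimate = ase_log2_afc([20, 40, 80], [10, 20, 40])

# Correlating per-gene estimates from the two approaches (made-up values).
eqtl_log2_afc = np.array([0.9, -0.4, 0.1, 1.5])
ase_estimates = np.array([1.0, -0.5, 0.2, 1.3])
r = np.corrcoef(eqtl_log2_afc, ase_estimates)[0, 1]
```

A rank-based correlation may be preferable when the estimates contain outliers.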

Predicting CPU utilization?

I'm trying to run aFC on a very large dataset: literally millions of QTLs. I've broken the work up into small chunks and am running hundreds of parallel jobs on a compute cluster. The problem I'm running into is that I'm having a very hard time predicting the CPU usage of the jobs.

What I'm seeing is that many of the jobs run at ~100% of one CPU for most of their runtime. Then, once in a while, a bunch of jobs will spike significantly, consuming anywhere from 300% to 1200% CPU (i.e. 3-12 cores). This is causing quite a problem for me because I'm left with the choice of either scheduling the jobs with 1 CPU each and dealing with the mayhem that ensues when a non-trivial number of jobs spike, or scheduling multiple CPUs per job and watching my compute farm sit half or more idle most of the time.

I've taken a brief read through the source code and can't see any references to multi-processing, threads or parallelism, but I'm also not experienced with numpy/pandas, so it's very possible I'm missing something.

Any pointers or insight into what might be causing the CPU spikes and how to deal with them would be greatly appreciated.
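One possible cause, not confirmed anywhere in this thread, is that numpy's linear algebra routines call a multithreaded BLAS library (OpenBLAS or MKL), which can use many cores even when the Python code contains no explicit parallelism. A minimal sketch of capping those thread pools is shown below. The environment variable names are the standard ones for OpenMP, OpenBLAS, and MKL; `limit_blas_threads` is a hypothetical helper.

```python
import os

def limit_blas_threads(n=1):
    """Cap common numeric thread pools; call before numpy/pandas are imported."""
    for var in ("OMP_NUM_THREADS", "OPENBLAS_NUM_THREADS", "MKL_NUM_THREADS"):
        os.environ[var] = str(n)

limit_blas_threads(1)

# Import numpy only after the limits are set, because the BLAS library
# reads these variables when it is first loaded.
```

When the tool is run as a script rather than imported, exporting the same variables in the job submission script before the command has the same effect.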

Generate the QTL file from the VCF (genotypes) and BED (phenotypes)

@secastel
I was looking at the GTEx portal (https://gtexportal.org/home/datasets) and I don't see a comprehensive list of QTLs (SNP id + gene id). I am thinking I can create my own from the VCF (genotypes) and BED (phenotypes) by simply finding all pairs of overlapping SNPs and genes.

  • This may produce a file with multiple rows having the same gene id [pid] since multiple SNPs may overlap a single gene.
  • If there are genes/phenotypes [pid] that overlap, this would mean that there may be multiple rows having the same SNP id [sid].

Does this seem like a valid strategy to create a comprehensive QTL file if I want to test ALL QTLs?
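The pairing strategy described above can be sketched as follows. This is an illustration under stated assumptions: `all_pairs` is a hypothetical helper, gene and SNP records are passed in as already-parsed tuples rather than read from VCF/BED files, coordinates are treated as inclusive, and the output column names `sid_chr` and `sid_pos` are assumed alongside the `pid` and `sid` columns mentioned in the post and should be checked against the tool's README.

```python
def all_pairs(genes, snps, window=0):
    """List every SNP-gene pair where the SNP lies within the gene +/- window.

    genes: iterable of (pid, chrom, start, end)
    snps:  iterable of (sid, chrom, pos)
    The nested loop is quadratic; real data would need an interval index.
    """
    rows = []
    for pid, g_chr, g_start, g_end in genes:
        for sid, s_chr, s_pos in snps:
            if s_chr == g_chr and g_start - window <= s_pos <= g_end + window:
                rows.append({"pid": pid, "sid": sid,
                             "sid_chr": s_chr, "sid_pos": s_pos})
    return rows

# Toy data: two overlapping genes on chr1 and three SNPs.
genes = [("G1", "chr1", 100, 200), ("G2", "chr1", 150, 300)]
snps = [("rs1", "chr1", 160), ("rs2", "chr1", 250), ("rs3", "chr2", 160)]
pairs = all_pairs(genes, snps)
```

In this toy input, `rs1` falls inside both genes, so it appears in two rows, which is the repeated-`sid` case anticipated in the second bullet.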

Can I use aFC to calculate the effect size of trans-eQTLs?

Hi,
Thank you for developing this software. I know this tool is designed for cis-eQTLs, but I would like to use it to calculate the allelic fold change for trans-eQTLs. Do you think that is appropriate? I am afraid it may not be suitable for trans-eQTLs. In addition, my expression values have been transformed to the quantiles of the standard normal distribution. Should I then use the "--log_base" parameter? Thank you very much.

Best wishes
