Comments (5)
Hi,
This is an issue with the G statistic that I'm not sure was ever resolved. The equation is taken directly from Magwene et al., 2011, so it isn't really an error in my code.
However, I think that unless you have amazingly perfect bulking, this rarely happens in real data. In any case, you are smoothing over many SNPs, so they shouldn't all be exactly zero.
I've thought of adding 0.5 or so to the observed counts when they are zero, but I didn't want to change the method.
Let me know your thoughts.
Ben
from qtlseqr.
Hi Ben,
Thanks for the quick reply, and thank you for your code. Many of my friends are using it, and it's very practical! I just want to discuss this with you and learn from you.
If I understand correctly, "LowRef" represents the "REF_depth" of a SNP site in the low pool. If any element of obs is equal to zero, such as LowRef = 0, log(0) will occur.
This is my understanding of the formula, but I am not sure. Am I right?
Thanks
from qtlseqr.
Thanks for the comments :-) I hope the package is useful to you all!
This is a good discussion, and I've been meaning to raise it with the authors of the original paper (Magwene et al., 2011).
Here are two screenshots from the paper:
And my code for the G statistic:
Lines 29 to 44 in 5e76137
The function takes in the allele depth values for the above table (i.e. the n_i) and first calculates a vector of the expected values for the denominator (roughly half of the read depth).
It then takes the observed value of each n_i and puts it in the numerator.
Thus, if any n_i is exactly zero, log(0) will occur, as you say.
Again, as far as I can see, this is an imperfection in how Magwene et al. implement the G statistic,
unless my code or my interpretation of their formula is terribly wrong (in which case please let me know!).
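To make the log(0) case concrete, here is a minimal sketch in Python (not the package's R code). It assumes the expected counts are the usual row total times column total over grand total of the 2x2 table, which is my reading of the formula and not copied from the paper. The names `g_statistic` and `r_like_log` are made up for this illustration. `r_like_log` mimics R, where `log(0)` is `-Inf` and `0 * -Inf` is `NaN`.

```python
import math


def r_like_log(x):
    """Natural log that returns -inf at 0 (as R's log() does) instead of raising."""
    return -math.inf if x == 0 else math.log(x)


def g_statistic(low_ref, high_ref, low_alt, high_alt):
    """G = 2 * sum(obs * ln(obs / exp)) over the four cells of the 2x2 table."""
    obs = [low_ref, high_ref, low_alt, high_alt]
    total = sum(obs)
    if total == 0:
        return math.nan
    # Assumed expected counts: (row total * column total) / grand total.
    exp = [
        (low_ref + high_ref) * (low_ref + low_alt) / total,
        (low_ref + high_ref) * (high_ref + high_alt) / total,
        (low_alt + high_alt) * (low_ref + low_alt) / total,
        (low_alt + high_alt) * (high_ref + high_alt) / total,
    ]
    g = 0.0
    for o, e in zip(obs, exp):
        if e == 0:
            return math.nan  # an empty margin leaves the ratio undefined
        g += o * r_like_log(o / e)  # 0 * -inf evaluates to nan here
    return 2 * g


print(g_statistic(10, 10, 10, 10))  # balanced table: 0.0
print(g_statistic(0, 10, 10, 10))   # one zero cell: nan
```

A single zero count is enough to turn the whole SNP's G value into NaN, because the zero cell contributes `0 * -inf` to the sum.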
I do, however, stand by my previous comments: because of Poisson noise in sequencing and imperfect bulking, this is unlikely to happen across large chunks of SNPs, so these NaN values eventually get ignored during the smoothing of G.
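As a rough illustration of why isolated NaN values need not break the smoothed statistic, here is a simplified Python sketch of a tricube-weighted window average that skips NaN entries. It is not the package's smoothing code. The function name `tricube_smooth` and its parameters are hypothetical, and dropping NaNs before averaging is an assumption about how "ignored" works.

```python
import math


def tricube_smooth(positions, values, center, half_window):
    """Tricube-weighted mean of values within half_window of center, skipping NaNs."""
    num = 0.0
    den = 0.0
    for pos, val in zip(positions, values):
        d = abs(pos - center) / half_window  # scaled distance from the window centre
        if d >= 1 or math.isnan(val):
            continue  # outside the window, or a NaN G value: contributes nothing
        w = (1 - d ** 3) ** 3  # tricube kernel weight
        num += w * val
        den += w
    # If every SNP in the window was NaN (or the window is empty), nothing can be averaged.
    return num / den if den > 0 else math.nan


positions = [100, 200, 300]
g_values = [2.0, math.nan, 4.0]
print(tricube_smooth(positions, g_values, center=200, half_window=150))
```

In this sketch the smoothed value only becomes NaN when every SNP in the window is NaN, which matches the point that a few zero-count SNPs should not matter.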
There are zero-substitution procedures available, but after playing with some of them I found that they affected G in ways that depended on read depth etc., so I opted not to implement them.
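For anyone who wants to experiment, here is one possible zero-substitution sketch in Python: it replaces only the zero cells with a fixed pseudocount (0.5, as floated earlier in the thread) before computing G. This is not implemented in the package. The function name `g_with_pseudocount` is hypothetical, and the expected counts again assume the standard 2x2 margins formula.

```python
import math


def g_with_pseudocount(counts, pseudocount=0.5):
    """G statistic for (low_ref, high_ref, low_alt, high_alt) with zeros substituted.

    pseudocount must be > 0 so that every log argument is defined.
    """
    # Substitute only the zero cells; non-zero counts are left untouched.
    low_ref, high_ref, low_alt, high_alt = [
        c if c > 0 else pseudocount for c in counts
    ]
    obs = [low_ref, high_ref, low_alt, high_alt]
    total = sum(obs)
    # Assumed expected counts: (row total * column total) / grand total.
    exp = [
        (low_ref + high_ref) * (low_ref + low_alt) / total,
        (low_ref + high_ref) * (high_ref + high_alt) / total,
        (low_alt + high_alt) * (low_ref + low_alt) / total,
        (low_alt + high_alt) * (high_ref + high_alt) / total,
    ]
    return 2 * sum(o * math.log(o / e) for o, e in zip(obs, exp))


# The same fixed pseudocount applied at two different read depths.
shallow = g_with_pseudocount((0, 10, 10, 10))
deep = g_with_pseudocount((0, 100, 100, 100))
print(shallow, deep)
```

Comparing the two calls is one way to explore how a fixed substitution interacts with depth: 0.5 is a much larger fraction of the table at 10x coverage than at 100x.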
-Ben
from qtlseqr.
OK, you are such a responsible guy, ha ha
I understand. Thank you very much.
from qtlseqr.
Great! No problem.
I will leave the issue open in case anyone wants to contribute.
from qtlseqr.