Coder Social home page Coder Social logo

General guidelines for Pgen about igor HOT 3 OPEN

jeremycfd avatar jeremycfd commented on August 26, 2024
General guidelines for Pgen

from igor.

Comments (3)

qmarcou avatar qmarcou commented on August 26, 2024

Hi Jeremy,
Sorry for the late reply. These are two interesting questions, see my answers below:

  1. From what we have observed only gene usage and/or alleles sequences vary among the recombination machinery of different individuals. If you were to have a few hundreds of out of frame sequences I would have recommended to re-learn only the gene usage distributions. Now in your case the best would probably be to use the provided model as such. Anyway, these gene usage variability is not what's controlling most of Pgen variations.

  2. There are two different things here: the number of scenarios explored by IGoR and the number of scenarios that IGoR outputs. Even by specifying --scenarios 50 IGoR will explore many more of them, however only 50 of them will be written into file in the output directory. What is controlling the number of scenarios IGoR explores during an Expectation-Maximization step are the --P_ratio_thresh and/or --MLSO commands. In theory the more scenarios have been explored the best, in practice there is a balance with runtime, but the probability ratio threshold should not be set too high.

Hope this answers your questions

from igor.

jeremycfd avatar jeremycfd commented on August 26, 2024

Thanks for the explanation! I find that setting --P_ratio_thresh to 0.0 causes issues (every Pgen comes back as nan) but I can set it to extremely small values (e.g., 1E-10) without issue. What is the default P_ratio_thresh? (Perhaps that could be added to man igor).

Cheers.

from igor.

qmarcou avatar qmarcou commented on August 26, 2024

Mmm that is odd, as explained in here setting it to 0.0 should explore all possible scenarios (yielding a very slow execution time) at first thought I don't see why this should return nan. Could you attach a sample of the pgen, and inference_logs files for debugging purposes?
The default value for this parameter is 10^{-5}, I actually thought it was in the README, this will be added, thanks for pointing this out!
Thanks!

from igor.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.