
Minimap2 vs Last running time (nanosim issue, closed)

bcgsc commented on May 30, 2024
Minimap2 vs Last running time

from nanosim.

Comments (5)

cheny19 commented on May 30, 2024

Hi Natasha,

When NanoSim enters the model fitting phase, it no longer has anything to do with the aligner. read_analysis.py was trying to find the best parameter combinations for the error model, and that search took nearly 10 hours on the alignment results from minimap2. Could you send me your alignment files or raw data so I can test? I'm curious why it took so long.

Thanks,
Chen


npavlovikj commented on May 30, 2024

Thanks for your reply and explanation, @cheny19.

I used two different datasets, and in both cases the run time with "minimap2" was significantly longer.
Please find the datasets uploaded here, https://drive.google.com/drive/folders/1p9OSIXseyGoXoKv9oNYaoP8PhaLqYygI?usp=sharing.
I used "nanosim-2.0.0", "minimap2, v2.10-r761" with 4 threads, and "last, v876".

Please let me know if you need any additional information.

Thank you,
Natasha


cheny19 commented on May 30, 2024

Hi Natasha,

Sorry I didn't reply until now. Over the last few weeks I looked into the code, and I think the runtime is heavily dragged down by R. So I rewrote the model fitting part in Python (which is faster than the R version) and it now supports multiprocessing. The model fitting stage can now finish within an hour. Please download the latest commit and give it a try. I haven't made a new release yet because I have more testing to do, but it works fine on your dataset.
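For readers who want a picture of what "parameter search with multiprocessing" can look like, here is a minimal sketch. It is not NanoSim's code: the toy data, the single-parameter geometric model, and the names `ERROR_LENGTHS`, `score`, and `fit` are all assumptions made for illustration. It scores each candidate parameter by the largest gap between the empirical and model CDFs and keeps the best one, optionally spreading the candidates over worker processes.

```python
from multiprocessing import Pool

# Toy data (hypothetical): observed lengths of consecutive-error stretches.
ERROR_LENGTHS = [1, 1, 1, 2, 1, 3, 1, 2, 1, 1, 4, 2, 1, 1, 2]


def empirical_cdf(lengths, max_len):
    """Fraction of observations <= k, for k = 1..max_len."""
    n = len(lengths)
    return [sum(1 for x in lengths if x <= k) / n for k in range(1, max_len + 1)]


def geometric_cdf(p, max_len):
    """Geometric CDF on 1, 2, ...: P(X <= k) = 1 - (1 - p)^k."""
    return [1.0 - (1.0 - p) ** k for k in range(1, max_len + 1)]


def score(p):
    """Return (distance, p), where distance is the largest CDF gap."""
    max_len = max(ERROR_LENGTHS)
    emp = empirical_cdf(ERROR_LENGTHS, max_len)
    mod = geometric_cdf(p, max_len)
    return max(abs(e - m) for e, m in zip(emp, mod)), p


def fit(candidates, processes=1):
    """Score every candidate and return the (distance, p) pair with the smallest distance."""
    if processes == 1:
        # Serial path: no worker processes are started.
        results = list(map(score, candidates))
    else:
        # Parallel path: candidates are independent, so they can be scored concurrently.
        with Pool(processes) as pool:
            results = pool.map(score, candidates)
    return min(results)
```

A call such as `fit([i / 100 for i in range(5, 100, 5)], processes=4)` should be made from under an `if __name__ == "__main__":` guard so that worker processes can import the module safely on every platform.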

The error model is a bit different between minimap2 and LAST, and the originally proposed model may not fit errors inferred from minimap2 alignments very well. That is also why it ran for so long in previous versions. NanoSim will throw a warning if the fitted model does not pass the statistical test, but don't worry, it's still close to the empirical errors. I'll keep looking to see if there are better models.
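The "warn but keep the fit" behaviour can be sketched as follows. This thread does not say which statistical test NanoSim applies, so the Kolmogorov-Smirnov style distance, the asymptotic 5% critical value of 1.36 / sqrt(n), and the names `ks_distance` and `check_fit` are assumptions for illustration only.

```python
import math
import warnings


def ks_distance(samples, model_cdf):
    """Largest gap between the empirical CDF of samples and model_cdf."""
    xs = sorted(samples)
    n = len(xs)
    gap = 0.0
    for i, x in enumerate(xs):
        f = model_cdf(x)
        # Compare the model against the empirical step just after and just before x.
        gap = max(gap, abs((i + 1) / n - f), abs(i / n - f))
    return gap


def check_fit(samples, model_cdf, coeff=1.36):
    """Return (distance, passed); warn instead of failing when the test is not passed."""
    d = ks_distance(samples, model_cdf)
    critical = coeff / math.sqrt(len(samples))  # assumed asymptotic 5% threshold
    passed = d <= critical
    if not passed:
        warnings.warn(
            f"fitted model did not pass the test (D={d:.3f} > {critical:.3f}); "
            "keeping it anyway"
        )
    return d, passed
```

The point of the design is that a failed test is reported but does not stop the run, so the caller still gets a model and can judge from the distance how far it is from the empirical distribution.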

Thanks for pointing this out!


npavlovikj commented on May 30, 2024

Hi @cheny19 - all this sounds great - thank you so much for improving NanoSim!
I will have some time next week to test this out if that is ok.
In the meantime, does this mean that I can still use the simulated reads generated with Minimap2 in the previous version, or do I need to re-run the simulation now?

Thank you,
Natasha


cheny19 commented on May 30, 2024

You can still use them.
