
Comments (2)

nextgenusfs commented on August 20, 2024

Yeah, there is some basic, rudimentary processing in the codebase already, so demultiplexing the reads is easy and already done. The most difficult part will be dealing with the error rates and de novo clustering: with ONT simplex reads at, say, 96% accuracy, you have a hard time splitting the raw data into appropriate bins to create OTUs. I tried several things a while ago and wasn't too happy with the results, although I had old ONT data, so that is also part of the issue. Newer data, i.e. from the LSK112/R10.4 setup, has much better single-read accuracy. I also tried some clustering with isONClust, which works sort of okay (it's built for clustering transcript data), but it's still quite difficult to sort through the noisy reads and get reliable clusters/OTUs. Perhaps duplex reads would have high enough accuracy that you could just cluster with something like uclust/vsearch.
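To put rough numbers on why reads at about 96% accuracy are hard to bin, here is a back-of-the-envelope sketch. It assumes sequencing errors are independent and never coincide between two reads, a hypothetical 600 bp amplicon, and 97% identity as an example OTU cutoff. None of these values come from the data discussed in this thread.

```python
def expected_errors(read_accuracy: float, amplicon_length: int) -> float:
    """Expected number of erroneous positions in a single read."""
    return (1.0 - read_accuracy) * amplicon_length


def expected_pairwise_identity(read_accuracy: float) -> float:
    """Approximate identity between two independent reads of one template.

    Simplifying assumption: errors are independent and never coincide, so a
    position matches only when both reads are correct at that position.
    """
    return read_accuracy ** 2


if __name__ == "__main__":
    accuracy = 0.96        # simplex accuracy mentioned in the comment above
    length = 600           # hypothetical amplicon length (assumption)
    otu_threshold = 0.97   # example clustering cutoff (assumption)

    errors = expected_errors(accuracy, length)
    pair_id = expected_pairwise_identity(accuracy)
    print(f"expected errors per read: about {errors:.0f}")
    print(f"expected identity between two reads: about {pair_id:.2f}")
    print(f"below the {otu_threshold:.2f} cutoff: {pair_id < otu_threshold}")
```

Under these assumptions, two reads of the same template agree at only about 92% of positions, so a fixed 97% cutoff would tend to split reads from a single specimen into separate clusters.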

What sequencing kits did you run these with? If you know what should be in these samples (i.e. high-quality Sanger data for every specimen), then yes, that would help immensely in trying to figure out a de novo approach.

from amptk.

hyphaltip commented on August 20, 2024

PeterKennedy showed those early ONT results that you had worked on, @nextgenusfs, at MSA22. They were definitely of limited value for the older amplicons, but we also talked with others doing PacBio on amplicons who are getting really good results. I think newer data is definitely worth a look.

Also, Ryan Wick's Twitter post on doing short-read assembly with ONT demonstrates how accuracy has improved with R10.4: https://twitter.com/rrwick/status/1548926644085108738

It might be worth looking at a different clustering approach, since the error model in usearch/vsearch might not model ONT errors well.
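As a rough illustration of why a fixed identity threshold struggles with noisy reads, here is a simplified sketch of greedy centroid clustering. It is written in the spirit of threshold-based tools, not as the actual usearch/vsearch algorithm or error model. The function names, the edit-distance identity measure, and the toy reads are all assumptions made for illustration.

```python
from typing import List


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed with a rolling dynamic-programming row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def identity(a: str, b: str) -> float:
    """Crude identity: 1 minus edit distance over the longer length."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - edit_distance(a, b) / longest


def greedy_cluster(reads: List[str], threshold: float) -> List[List[str]]:
    """Assign each read to the first centroid it matches, else start a cluster."""
    centroids: List[str] = []
    clusters: List[List[str]] = []
    for read in reads:
        for idx, centroid in enumerate(centroids):
            if identity(read, centroid) >= threshold:
                clusters[idx].append(read)
                break
        else:
            # No existing centroid is close enough: this read seeds a new cluster.
            centroids.append(read)
            clusters.append([read])
    return clusters


if __name__ == "__main__":
    # Toy reads: the third differs from the first by a single substitution.
    reads = ["ACGTACGTAC", "ACGTACGTAC", "ACGTTCGTAC", "TTTTTTTTTT"]
    for threshold in (0.97, 0.85):
        print(threshold, len(greedy_cluster(reads, threshold)), "clusters")
```

In this toy case the strict threshold places the single-error read in its own cluster, while the looser one merges it with its template. With real ONT error rates, loosening the threshold enough to absorb sequencing errors also risks merging distinct taxa, which is one reason an approach that models the errors explicitly may be worth exploring.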

