
Comments (3)

lara-whoi commented on July 18, 2024

Hi again -
After digging deeper it seems I should just be able to trim within DADA2 to have identical length sequences...

Here's one approach:
I'll trim each run to the same region (i.e. the same trimLeft for merged paired-end data, and the same trimLeft and truncLen for single-end data) so the sequence tables can be merged later.

I plan to use the DADA2 function mergeSequenceTables:
mergetab <- mergeSequenceTables(seqtab1, seqtab2, seqtab3)
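
To show why the runs need to be trimmed to the same region before merging, here is a minimal Python sketch of the idea behind merging per-run tables. It is not DADA2's implementation; the function name `merge_sequence_tables`, the `{sample: {sequence: count}}` layout, and the toy sequences are assumptions made for illustration. Columns are matched on the exact sequence string, so a run left at a different length shares no columns with the others.

```python
def merge_sequence_tables(*tables):
    """Merge per-run count tables shaped {sample: {sequence: count}}.

    Hypothetical helper for illustration only. Sequences are matched as
    exact strings, so the same amplicon cut at two different lengths ends
    up in two different columns.
    """
    merged = {}
    for table in tables:
        for sample, counts in table.items():
            if sample in merged:
                raise ValueError(f"duplicate sample name: {sample}")
            merged[sample] = dict(counts)
    # Every sample gets a column for every sequence seen in any run.
    all_seqs = sorted({seq for counts in merged.values() for seq in counts})
    return {
        sample: {seq: counts.get(seq, 0) for seq in all_seqs}
        for sample, counts in merged.items()
    }


# Toy data: two runs trimmed to 6 bases, one run left at 8 bases.
run_illumina = {"illumina_s1": {"ACGTAC": 10, "ACGTAA": 2}}
run_454_a = {"454a_s1": {"ACGTAC": 4}}
run_454_b = {"454b_s1": {"ACGTACGG": 7}}  # not trimmed to the same length

merged = merge_sequence_tables(run_illumina, run_454_a, run_454_b)
for sample, counts in merged.items():
    print(sample, counts)
```

In the toy output, the two runs trimmed to the same length share the `ACGTAC` column, while the untrimmed run gets a separate `ACGTACGG` column even though it starts with the same bases.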

For reference - Samples Include:
• DNA: sequenced region 16S rRNA V1+2+3, primers 27F/519R (region length: ~453 bases with primers removed)

With the Illumina PE run – I'll have all 453 bp covered (entire fragment, primers removed), but will trim to 325 bp
With the 1st 454 SE runs – I've trimmed to 325 bp (maintaining a quality score of ~30)
With the 2nd 454 SE runs – I'll trim to 325 bp (to maintain a quality score of ~25)

Thank you very much for your time and help!

from astrobiomike.github.io.

AstrobioMike commented on July 18, 2024

Hey there, Lara!

I don't know of any reason you shouldn't use bbmap/bbduk. It has an insane amount of functionality and is a great tool/set of tools. I just tend to use more user-friendly things where possible, and cutadapt seems more straightforward to me. That particular example dataset from the page is a little extra tricky because there is full overlap, so you need to look for both forward and reverse primers in either orientation on each read. I remember bbmap being the first thing I found that could handle it when I initially put the page together. Later, when I revisited cutadapt for something else, I realized it had the same functionality, and it seemed more intuitive to me is all :)

Whatever you use, just make sure you look at some of your reads after cutting to be sure that the primers were removed as expected.
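
One way to spot-check this is to count how many reads still contain a primer, in either orientation, after trimming. Below is a minimal Python sketch under stated assumptions: the helper names are hypothetical, the 27F sequence is a commonly cited variant and should be replaced with the primers actually used, and the reads are made-up toy strings. It only finds exact matches (allowing IUPAC degenerate bases), so primers with sequencing errors would be missed.

```python
import re

# IUPAC nucleotide codes expanded to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")


def reverse_complement(seq):
    """Reverse-complement a sequence, including IUPAC degenerate bases."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def primer_regex(primer):
    """Compile a primer with degenerate bases into a regular expression."""
    return re.compile("".join(IUPAC[base] for base in primer.upper()))


def count_primer_hits(reads, primers):
    """Count reads containing each primer or its reverse complement."""
    patterns = {}
    for name, seq in primers.items():
        patterns[name] = primer_regex(seq)
        patterns[name + "_rc"] = primer_regex(reverse_complement(seq))
    return {
        name: sum(1 for read in reads if pattern.search(read.upper()))
        for name, pattern in patterns.items()
    }


# Assumed primer sequence (commonly cited 27F variant); swap in your own.
primers = {"27F": "AGAGTTTGATCMTGGCTCAG"}

# Toy reads: primer still present, primer removed, primer present as
# its reverse complement.
reads = [
    "AGAGTTTGATCCTGGCTCAGGACGAACG",
    "GACGAACGCTGGCGGCGTGC",
    "TTTT" + reverse_complement("AGAGTTTGATCATGGCTCAG"),
]

print(count_primer_hits(reads, primers))
```

After successful primer removal, all of the counts should be at or near zero; a handful of reads drawn from the trimmed files is usually enough for a first look.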

Just a passing thought on merging the datasets: keep in mind that if they were done with different primers, you might have differences because of that in the mix too. It might not matter at all, but you could also miss something totally with one set of primers vs another. To be clear, I don't think you need to do anything differently. I would just keep that present in your mental landscape, so when you start trying to find signals in all this data, if something very strong stands out and correlates with the primer differences (if there are any), you can be thinking about it in a more full context – and can report it as such and/or try to figure out if that's likely to be the cause :)


lara-whoi commented on July 18, 2024

Thanks so much Mike!
Good to know.
And thank you for the thoughts on primer selectivity - all three runs were with the same primers, 27F/519R (just different platforms), so we should be in the clear.
The quality was certainly lower on the 454 runs than on Illumina!

Next step is to cut all sequences at the same location so I can merge the 3 data sets in phyloseq.
Because I have SE 454 reads - I'll have to cut everything at around 225bp to maintain a quality score >25.

Would you just look for a conserved region in the area to be cut and essentially design a primer for that region and then use bbduk (or cutadapt possibly)? I'm in new territory here..
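
For what it's worth, cutting every read at the same position does not in itself need a conserved region or a designed primer, as long as all reads start at the same base (i.e. the forward primer has already been removed). A minimal Python sketch of fixed-position truncation follows. The function name and toy reads are hypothetical, and the behaviour is modelled on my understanding of DADA2's truncLen/trimLeft options (reads shorter than the cut point are discarded), not copied from its code.

```python
def truncate_reads(reads, trunc_len, trim_left=0):
    """Cut every read at the same position.

    Reads shorter than trunc_len are dropped; the rest keep bases
    trim_left..trunc_len, so all survivors have identical length
    (trunc_len - trim_left).
    """
    kept = []
    for seq in reads:
        if len(seq) < trunc_len:
            continue  # too short to reach the common cut point
        kept.append(seq[trim_left:trunc_len])
    return kept


# Toy reads of different lengths, all starting at the same base.
toy_reads = ["ACGTACGT", "ACG", "ACGTAC"]

print(truncate_reads(toy_reads, trunc_len=6))
print(truncate_reads(toy_reads, trunc_len=6, trim_left=2))
```

With real data the cut point would be the 225 bp mentioned above, applied identically to every run.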

thanks a bunch -
Lara
