Hi Mike - I am getting back to a series of datasets that I had processed months ba

HappyBelly workflow change: cut adapt vs bbmap/bbduk about astrobiomike.github.io HOT 3 CLOSED

lara-whoi commented on July 18, 2024

HappyBelly workflow change: cut adapt vs bbmap/bbduk

from astrobiomike.github.io.

Comments (3)

lara-whoi commented on July 18, 2024

Hi again -
After digging deeper it seems I should just be able to trim within DADA2 to have identical length sequences...

Here's one approach:
I'll trim each run to the same region (i.e. same trimLeft for merged paired end data, and same trimLeft and truncLen for single-end data) to allow merging later.

I plan to use the DADA2 function mergeSequenceTables:
mergetab <- mergeSequenceTables(seqtab1, seqtab2, seqtab3)
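
To illustrate why every run has to be cut to the same region before merging, here is a minimal Python sketch. It is not DADA2 code: `count_table` and `merge_tables` are hypothetical helpers and the reads are made up. It only shows that tables keyed by exact sequence share entries when all runs are truncated to one length, and share none when the lengths differ.

```python
from collections import Counter

def count_table(reads, trunc_len):
    """Truncate each read to trunc_len bases and count identical sequences.
    Reads shorter than trunc_len are dropped."""
    return Counter(r[:trunc_len] for r in reads if len(r) >= trunc_len)

def merge_tables(tables):
    """Combine per-run tables into {sequence: {run_name: count}}."""
    merged = {}
    for run, table in tables.items():
        for seq, n in table.items():
            merged.setdefault(seq, {})[run] = n
    return merged

# Toy reads, invented for this example only.
illumina = ["ACGTACGTAC", "ACGTACGTAC", "ACGTTTGTAC"]
run454 = ["ACGTACGT", "ACGTACG", "ACGTTTGTA", "ACG"]

# Same truncation length in both runs: sequences line up across runs.
same = merge_tables({"illumina": count_table(illumina, 6),
                     "454": count_table(run454, 6)})

# Different truncation lengths: no sequence is shared between runs.
mixed = merge_tables({"illumina": count_table(illumina, 8),
                      "454": count_table(run454, 6)})

print(len(same), "sequences, all present in both runs")
print(len(mixed), "sequences, each present in only one run")
```

In the `mixed` case the same organisms show up as different sequences in each run, which is the situation the common trimming length is meant to avoid.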

For reference - Samples Include:
• DNA: sequenced region: 16S rRNA V1+2+3, primers 27F/519R (region length: ~453 bases with primers removed)

With the Illumina PE run – I’ll have all 453 bp covered (entire fragment, primers removed), but will trim to 325 bp
With the 1st 454 SE runs – I’ve trimmed to 325 bp (maintaining a quality score ~30)
With the 2nd 454 SE runs – I’ll trim to 325 bp (to maintain a quality score of ~25)

Thank you very much for your time and help!


AstrobioMike commented on July 18, 2024

Hey there, Lara!

There is no reason I know of that you shouldn't use bbmap/bbduk. It has an insane amount of functionality and is a great tool/set of tools. I just tend to use more user-friendly things where possible, and cutadapt seems more straightforward to me. That particular example dataset from the page is a little extra tricky because the reads fully overlap, so you need to look for both forward and reverse primers in either orientation on each read, and I remember finding that bbmap could handle that first when I initially put the page together. Later, when I revisited cutadapt for something else, I realized it had that functionality too, and it seemed more intuitive to me is all :)

Whatever you use, just make sure you look at some of your reads after cutting to be sure that the primers were removed as expected.
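
One way to do that spot check, sketched in Python under some assumptions: the reads are already loaded as plain strings, and the primer shown (`GATCMTGG`) is a made-up placeholder, not 27F or 519R. The hypothetical helper `reads_with_primer` expands IUPAC degenerate bases into a regular expression and looks for each primer and its reverse complement anywhere in each read.

```python
import re

# IUPAC nucleotide codes expanded to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")

def revcomp(seq):
    """Reverse complement, including degenerate bases."""
    return seq.translate(COMPLEMENT)[::-1]

def primer_regex(primer):
    """Compile a primer with degenerate bases into a regex."""
    return re.compile("".join(IUPAC[base] for base in primer))

def reads_with_primer(reads, primers):
    """Return reads that still contain any primer in either orientation."""
    patterns = [primer_regex(p)
                for primer in primers
                for p in (primer, revcomp(primer))]
    return [r for r in reads if any(p.search(r) for p in patterns)]

# Placeholder primer and toy reads, invented for this example only.
reads = ["TTGATCATGGAAC", "AACCAGGATCTT", "ACGTACGTACGT"]
leftover = reads_with_primer(reads, ["GATCMTGG"])
print(f"{len(leftover)} of {len(reads)} reads still contain a primer")
```

This only finds exact matches (allowing for degenerate bases), so a primer with a sequencing error in it would slip through. Eyeballing a handful of reads alongside a check like this is still worthwhile.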

Just a passing thought on merging the datasets: keep in mind that if they were done with different primers, you might have differences in the mix because of that too. It might not matter at all, but you could also miss something totally with one set of primers vs another. To be clear, I don't think you need to do anything differently. I would just keep that present in your mental landscape, so that when you start trying to find signals in all this data, if something very strong stands out and correlates with the primer differences (if there are any), you can think about it in a fuller context – and can report it as such and/or try to figure out if that's likely to be the cause :)


lara-whoi commented on July 18, 2024

Thanks so much Mike!
Good to know.
And thank you for the thoughts on primer selectivity - all three runs were with the same primers 27F/519R (just different platforms), so we should be in the clear.
The quality was certainly lower on the 454 runs than on Illumina!

Next step is to cut all sequences at the same location so I can merge the 3 data sets in phyloseq.
Because I have SE 454 reads - I'll have to cut everything at around 225bp to maintain a quality score >25.

Would you just look for a conserved region in the area to be cut and essentially design a primer for that region and then use bbduk (or cutadapt possibly)? I'm in new territory here..

thanks a bunch -
Lara

