Hi Jon, I would like to use amptk filter

amptk filter - structure of otutable.demux.fq for new OTU table (amptk issue, closed)

nextgenusfs commented on July 20, 2024

Comments (3)

nextgenusfs commented on July 20, 2024

You can demultiplex each of the files using amptk illumina2 and then concatenate the results. The command would look something like:

amptk illumina2 -i bcl2fastq_output1_R1.fastq --reverse bcl2fastq_output1_R2.fastq \
     -f GTGARTCATCGARTCTTTG -r ITS4 --barcode_fasta forwardTags.fa  \
     --reverse_barcode reverseTags.fa -o output1

If you run a command like this for each of the files demultiplexed by the sequencing center and then concatenate the outputs (output1.demux.fq, output2.demux.fq, etc.), you should be all set for any of the downstream AMPtk steps. The forward and reverse tags need to be in multi-FASTA files, i.e.

>Tag1F
GATCCGATA
>Tag2F
GATTTTAAG
...
...
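The two file-handling steps described above (building a tag multi-FASTA file and concatenating the per-pool demux output) could be sketched in Python as follows. This is only an illustration: the helper names write_tag_fasta and concatenate_demux are hypothetical and not part of AMPtk, and the sketch assumes every input FASTQ file ends with a newline so that records do not run together.

```python
import shutil


def write_tag_fasta(tags, path):
    """Write {tag_name: sequence} pairs as a multi-FASTA file.

    Hypothetical helper, not an AMPtk function.
    """
    with open(path, "w") as handle:
        for name, seq in tags.items():
            handle.write(f">{name}\n{seq.upper()}\n")


def concatenate_demux(inputs, output):
    """Append each per-pool *.demux.fq file to one combined FASTQ file.

    Assumes each input file ends with a newline.
    """
    with open(output, "wb") as out:
        for path in inputs:
            with open(path, "rb") as src:
                shutil.copyfileobj(src, out)
```

For example, write_tag_fasta({"Tag1F": "GATCCGATA", "Tag2F": "GATTTTAAG"}, "forwardTags.fa") would produce the forward tag file shown above, and concatenate_demux(["output1.demux.fq", "output2.demux.fq"], "combined.demux.fq") would build a single input for the downstream steps (the name combined.demux.fq is just a placeholder).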

Another setting to consider is the --trim_len parameter. The default is 300, so since you have 2x250 reads you may want to reduce it to something like 240 bp so that forward reads can be rescued when merging of the PE reads isn't successful. If the reverse reads are high quality, the default should be okay, but keep in mind that if you specify a --trim_len larger than your read length, only reads that are properly merged will be used. If the files are too large to be merged with 32-bit usearch (the free version), you can switch PE merging to vsearch by adding --merge_method vsearch to the above commands. If it was a MiSeq run, you shouldn't have to do this.
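The --trim_len rule of thumb above can be written down as a tiny Python check. This is a deliberately simplified sketch of the statement in this comment, not a reproduction of AMPtk's internal logic; the function name usable_read_sources is hypothetical, and it ignores the bases removed with tags and primers.

```python
def usable_read_sources(read_length, trim_len):
    """Return which read types can contribute under the rule stated above.

    Simplified assumption: unmerged forward reads can only be rescued
    when trim_len does not exceed the raw read length.
    """
    sources = ["merged"]
    if trim_len <= read_length:
        sources.append("forward_only")
    return sources
```

With 2x250 reads, usable_read_sources(250, 300) leaves only merged reads, while usable_read_sources(250, 240) also allows forward reads to be rescued.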

nextgenusfs commented on July 20, 2024

What format is your data in now? The most important part of the analysis is probably the pre-processing of reads, so I would caution against doing primer stripping and quality filtering with other methods. The pre-processing scripts are fairly versatile and might work with your raw data as it is. I'm also willing to add another method if it is a commonly used format.

You can generate a compatible OTU table with vsearch, but the FASTQ headers need to identify the sample each read originated from. AMPtk uses the ;barcodelabel=sample; format.
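As a rough illustration of the ;barcodelabel=sample; convention, the following Python sketch adds a sample label to a read ID and recovers it again. The function names are hypothetical, and the regular expression assumes sample names contain no semicolons.

```python
import re

# Matches the ";barcodelabel=<sample>;" field described above.
BARCODE_RE = re.compile(r";barcodelabel=([^;]+);")


def label_header(read_id, sample):
    """Append a sample label to a read ID (hypothetical helper)."""
    return f"{read_id};barcodelabel={sample};"


def sample_from_header(header):
    """Return the sample name from a labelled header, or None if absent."""
    match = BARCODE_RE.search(header)
    return match.group(1) if match else None
```

Headers produced by other tools would need to be rewritten into this form before building an OTU table that the downstream AMPtk steps can use.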

jack1120 commented on July 20, 2024

My samples were prepared (MiSeq, PE, 2x250bp) with unique combinations of tags on the 5' end of both the forward (gITS7) and reverse (ITS4) primers. These samples were split into pools and Illumina indices were ligated onto the constructs (TruSeq PCR-Free LT Library Prep Kit), with a different set of indices used for each pool. The final construct looks like:

P5-Index1-For_Tag:For_Primer-Amplicon-Rev_Primer:Rev_Tag-Index2-P7

The sequencing center demultiplexed the sequences by the indices such that I received files for each of my pooled samples that, after merging, would theoretically look like:

For_Tag1:For_Primer-Amplicon-Rev_Primer:Rev_Tag1
For_Tag2:For_Primer-Amplicon-Rev_Primer:Rev_Tag2
For_Tag3:For_Primer-Amplicon-Rev_Primer:Rev_Tag3
etc.

Where each unique tag combination represents amplicons from a given sample (but all in a single fastq file). Although these are Illumina sequences, the structure is more similar to what you describe for processing Ion Torrent or 454 data, but with an additional tag attached to the reverse primer.
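To make the dual-tag layout concrete, here is a minimal Python sketch that assigns a merged read to a sample from its tag pair. It assumes the merged read is in the forward orientation, so the reverse tag appears reverse-complemented at the 3' end, and it only accepts exact matches. The names assign_sample and sample_by_tags are hypothetical, and this is not how AMPtk itself demultiplexes.

```python
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def assign_sample(merged_read, sample_by_tags):
    """Look up the sample for a merged read by its (forward, reverse) tag pair.

    sample_by_tags maps (forward_tag, reverse_tag) tuples to sample names.
    Returns None when no tag pair matches exactly.
    """
    for (fwd_tag, rev_tag), sample in sample_by_tags.items():
        if merged_read.startswith(fwd_tag) and merged_read.endswith(
            reverse_complement(rev_tag)
        ):
            return sample
    return None
```

A real implementation would likely need to tolerate mismatches in the tags and to strip the tags and primers once the sample has been identified.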

I am currently demultiplexing the R1 (using the For_Tags) and R2 (using the Rev_Tags) files independently, looking in both the 5'-3' and 3'-5' orientations (how does amptk handle mixed orientations?). The tags and primers are then trimmed, and the sequences are fed into DADA2 where they are quality filtered, merged, and denoised. The resulting OTU tables are then re-formatted into a vsearch-compatible format for further processing.
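One way to picture the mixed-orientation check described above is the Python sketch below: an R1 read that starts with a forward tag is treated as forward-oriented, and one that starts with a reverse tag as reverse-oriented. This is an assumption about the approach in this comment, not a description of what AMPtk does; the function name detect_orientation and the exact-match rule are hypothetical.

```python
def detect_orientation(r1, forward_tags, reverse_tags):
    """Guess the orientation of an R1 read from the tag it starts with.

    forward_tags and reverse_tags map tag names to sequences.
    Forward tags are checked first, so a sequence shared by both sets
    is reported as forward. Returns (orientation, tag_name), or
    (None, None) when no tag matches.
    """
    for name, tag in forward_tags.items():
        if r1.startswith(tag):
            return ("forward", name)
    for name, tag in reverse_tags.items():
        if r1.startswith(tag):
            return ("reverse", name)
    return (None, None)
```

Reads reported as reverse-oriented would then need R1 and R2 swapped (or the merged read reverse-complemented) before they are pooled with the forward-oriented reads.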
