Describe the bug hello, sorry to disturb you again. I tri

[BUG] DADA2 ERROR while running my data (quadram-institute-bioscience/dadaist2)

Comments (4)

telatin commented on August 29, 2024

Howdy!
The problem needs to be fixed at its source: too many reads are being filtered out.
One option, of course, is simply to lower the threshold so that more aggressive filtering is tolerated, but here only about 4% of the total reads are retained, which looks worth investigating. Adjusting the parameters (truncQ, maxee, trunc...) may let fewer reads be filtered in the first place.
Different providers or sequencing cores can produce very different output: once you have tuned the parameters for your usual supplier, you should be able to adjust the pipeline quickly.
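
To make the effect of the truncation and expected-error parameters more concrete, here is a minimal Python sketch of the filtering rule as DADA2 documents it: reads shorter than the truncation length are discarded, and after truncation a read is dropped when its expected errors, EE = sum(10^(-Q/10)), exceed maxEE. The function names and the Phred+33 offset are assumptions made for this illustration; this is not code from dadaist2 or DADA2.

```python
def expected_errors(quality: str, phred_offset: int = 33) -> float:
    """Sum of per-base error probabilities: EE = sum(10 ** (-Q / 10))."""
    return sum(10 ** (-(ord(ch) - phred_offset) / 10) for ch in quality)


def passes_filter(quality: str, max_ee: float, trunc_len: int = 0) -> bool:
    """Illustrative (hypothetical) version of the truncLen + maxEE rule."""
    if trunc_len:
        # Reads shorter than the truncation length are discarded outright.
        if len(quality) < trunc_len:
            return False
        quality = quality[:trunc_len]
    # Keep the read only if its expected errors do not exceed max_ee.
    return expected_errors(quality) <= max_ee


# Toy reads: 290 bases at Q20 ('5') versus 290 bases at Q40 ('I').
q20_read = "5" * 290
q40_read = "I" * 290
print(f"EE at Q20 over 290 bases: {expected_errors(q20_read):.2f}")
print("Q20 read kept with maxEE=1:", passes_filter(q20_read, 1.0, 290))
print("Q40 read kept with maxEE=1:", passes_filter(q40_read, 1.0, 290))
print("200-base read kept with truncLen=290:", passes_filter("I" * 200, 1.0, 290))
```

A long truncation length combined with a low maxEE is strict, because every extra low-quality base at the end of the read adds to the expected-error total.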

A way to investigate the biggest loss is checking the dada2-stats file where you'll see the number of reads retained at each step. Can you please post it?


najibveto commented on August 29, 2024

Thank you for your reply.
The dada2-stats file is as follows:

	input	filtered	denoised	merged	non-chimeric
W2111_R1.fastq.gz	71411	19455	19455	13308	2653
W2201_R1.fastq.gz	76186	23861	23861	20669	2085
W2202_R1.fastq.gz	69819	25303	25303	21829	2245
W2203_R1.fastq.gz	91891	46711	46711	34512	6685

I tried lowering the loss to 5% as suggested, but I got the same error.


telatin commented on August 29, 2024

From the stats, there is a significant loss at the filtering step, but it is not dramatic. A further significant loss occurs at the non-chimeric step. Relaxing the initial filtering may improve the process. If the chimera detection is right, maybe the library was amplified a lot (or could there be other sources?).
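
The per-step losses can be read off the stats table more easily as fractions. The following is a small Python sketch that takes the four rows posted earlier in this thread and reports, for each sample, the share of reads kept at each step relative to the previous one. The tab-separated layout and the function name `step_retention` are assumptions for illustration.

```python
import csv
import io

# Counts copied from the dada2-stats table posted in this thread.
STATS_TSV = (
    "\tinput\tfiltered\tdenoised\tmerged\tnon-chimeric\n"
    "W2111_R1.fastq.gz\t71411\t19455\t19455\t13308\t2653\n"
    "W2201_R1.fastq.gz\t76186\t23861\t23861\t20669\t2085\n"
    "W2202_R1.fastq.gz\t69819\t25303\t25303\t21829\t2245\n"
    "W2203_R1.fastq.gz\t91891\t46711\t46711\t34512\t6685\n"
)


def step_retention(tsv_text: str) -> dict:
    """Return, per sample, the fraction of reads kept at each step."""
    rows = [r for r in csv.reader(io.StringIO(tsv_text), delimiter="\t") if r]
    steps = rows[0][1:]  # skip the empty header cell above the sample names
    report = {}
    for row in rows[1:]:
        counts = [int(x) for x in row[1:]]
        # Each step is compared with the step immediately before it.
        kept = {steps[i]: counts[i] / counts[i - 1] for i in range(1, len(counts))}
        kept["overall"] = counts[-1] / counts[0]
        report[row[0]] = kept
    return report


for sample, kept in step_retention(STATS_TSV).items():
    summary = ", ".join(f"{step}: {frac:.1%}" for step, frac in kept.items())
    print(f"{sample}  {summary}")
```

With these numbers, filtering keeps roughly 27% to 51% of the input reads, while chimera removal keeps only about 10% to 20% of the merged reads.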

I do not recommend lowering the loss parameter to bypass the issue, as this is not a bug but a sanity check to prevent misinterpreting potentially noisy results. That said, if the message is "DADA2 filtered too many reads: 4.7926%", you should try with 4% (or simply 1% to effectively disable the check :) ).
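
As an illustration of why 4% should pass where 5% fails, here is a minimal Python sketch of such a sanity check. It assumes the run is rejected when the fraction of reads surviving DADA2 falls below the `--max-loss` value; the function name and the exact comparison are hypothetical and are not taken from the dadaist2 source. The read counts are the ones reported in the log of this thread.

```python
def check_max_loss(total_reads: int, retained_reads: int, max_loss: float):
    """Hypothetical check: pass only if the retained fraction reaches max_loss."""
    retained_fraction = retained_reads / total_reads
    return retained_fraction >= max_loss, retained_fraction


# Totals from the log line "DADA2 filtered 4.7926% from total 486266 to 23305".
TOTAL, RETAINED = 486266, 23305

for threshold in (0.05, 0.04, 0.01):
    ok, fraction = check_max_loss(TOTAL, RETAINED, threshold)
    status = "pass" if ok else "abort"
    print(f"--max-loss {threshold}: retained {fraction:.4%} -> {status}")
```

Under this assumption a threshold of 0.05 aborts, since about 4.79% is below 5%, while 0.04 and 0.01 both pass.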


najibveto commented on August 29, 2024

Hello,
I tried both loss values (1% and 4%) and both of them worked: I got the phyloseq object as well as the MicrobiomeAnalyst files. When I tried with 5% loss, I got the usual error.
Here is the log from the run with 4% loss.


╭╴ at  ~ via  v3.9.13 via 🅒  dadaist
╰─ dadaist2  --max-loss 0.04 -i metagenome/16S/ -o water -m metadata.tsv -d ~/refs/silva_nr_v138_train_set.fa.gz
    ____            __      _      __ ___
   / __ \____ _____/ /___ _(_)____/ /|__ \
  / / / / __ `/ __  / __ `/ / ___/ __/_/ /
 / /_/ / /_/ / /_/ / /_/ / (__  ) /_/ __/
/_____/\__,_/\__,_/\__,_/_/____/\__/____/

1.2.5

[WARNING] Output directory found.
 This is a warning but in future releases this might require to specify --force to proceed.
[2022-08-16 09:20:21] Ready to log in /home/najib/water/dadaist.log
[2022-08-16 09:20:21] dadaist2 1.2.5
[2022-08-16 09:20:21] Taxonomy database found: /home/najib/refs/silva_nr_v138_train_set.fa.gz
[2022-08-16 09:20:21] Parameter: taxonomy-type: dada2
[2022-08-16 09:20:21] Parameter: taxonomy-db: /home/najib/refs/silva_nr_v138_train_set.fa.gz
 * Input directory: metagenome/16S/
 * Output directory: /home/najib/water/
 * Metadata: metadata.tsv
 * Reference database: /home/najib/refs/silva_nr_v138_train_set.fa.gz
 * Threads: 6
 * Temporary directory: /tmp/dadaist2_1fIjRN
 * QC strategy: skip
[2022-08-16 09:20:21] QC: Checking quality profile with SeqFu
[2022-08-16 09:20:22] SeqFu quality truncation at (trunc-len-1 and trunc-len-2): 290 - 231
[2022-08-16 09:20:22] Checking dependencies
 * RScript: R scripting front-end version 4.0.5 (2021-03-31)
 * Taxonomy: dadaist2-assigntax 1.1.3
 * assign-taxonomy: dadaist2-assigntax 1.1.3
 * clustalo: 1.2.4
 * dada2 (lib): <pass>
 * exporter: dadaist2-exporter 1.4.0
 * fastp: fastp 0.23.2
 * fasttree: FastTree version 2.1.11 Double precision (No SSE3):
 * fu-primers: fu-primers 1.12.0
[2022-08-16 09:20:27] Temporary directory: /tmp/dadaist2_1fIjRN
[2022-08-16 09:20:27] Threads: 6
[2022-08-16 09:20:27] Output directory: /home/najib/water/
[2022-08-16 09:20:27] Checked metadata for autumn
[2022-08-16 09:20:27] Checked metadata for spirng
[2022-08-16 09:20:27] Checked metadata for summer
[2022-08-16 09:20:27] Checked metadata for winter
[2022-08-16 09:20:27] Input directory "metagenome/16S/": 4 found (paired-end)
[2022-08-16 09:20:27] (1/4) Processing autumn: skip
[2022-08-16 09:20:27] Copying input reads for DADA2
[2022-08-16 09:20:27] (2/4) Processing spirng: skip
[2022-08-16 09:20:27] Copying input reads for DADA2
[2022-08-16 09:20:27] (3/4) Processing summer: skip
[2022-08-16 09:20:27] Copying input reads for DADA2
[2022-08-16 09:20:27] (4/4) Processing winter: skip
[2022-08-16 09:20:27] Copying input reads for DADA2
[2022-08-16 09:20:27] Running DADA2...
[2022-08-16 09:20:27] Dada2 script parameters:
 * [1] forward_reads: /tmp/dadaist2_1fIjRN/for
 * [2] reverse_reads: /tmp/dadaist2_1fIjRN/rev
 * [3] feature_table_output: /tmp/dadaist2_1fIjRN/dada2/dada2.tsv
 * [4] stats_output: /tmp/dadaist2_1fIjRN/dada2/stats.tsv
 * [5] filt_forward: /tmp/dadaist2_1fIjRN/for/filtered
 * [6] filt_reverse: /tmp/dadaist2_1fIjRN/rev/filtered
 * [7] truncLenF: 290
 * [8] truncLenR: 231
 * [9] trimLeftF: 0
 * [10] trimLeftR: 0
 * [11] maxEEF: 1
 * [12] maxEER: 1.5
 * [13] truncQ: 10
 * [14] chimeraMethod: consensus
 * [15] minFold: 1
 * [16] threads: 6
 * [17] nreads_learn: 0
 * [18] baseDir: /tmp/dadaist2_1fIjRN
 * [19] doPlots: do_plots
 * [20] taxonomyDb: /home/najib/refs/silva_nr_v138_train_set.fa.gz
 * [21] saveRDS: no
 * [22] noMerge: 0
 * [23] processPool: 0
[2022-08-16 09:28:44] DADA2 Finished.
[2022-08-16 09:28:44] Converting dada2 taxonomy output: /tmp/dadaist2_1fIjRN/taxonomy.tsv
[2022-08-16 09:28:44] 922 representative sequences found.
[2022-08-16 09:28:44] DADA2 filtered 4.7926% from total 486266 to 23305
[2022-08-16 09:28:44] Multiple sequence alignment and tree generation
[2022-08-16 09:29:20] Feature tree generated
[2022-08-16 09:29:20] Exporting MicrobiomeAnalyst
[2022-08-16 09:29:24] Generating PhyloSeq object
[2022-08-16 09:29:26] Rhea normalization/alpha finished.
[2022-08-16 09:29:26] Dadaist finished, output files saved:
 * dada-taxonomy-table: /home/najib/water/taxonomy.txt
 * feature-table: /home/najib/water/feature-table.tsv
 * features-tree: /home/najib/water/rep-seqs.tree
 * mba-files: /home/najib/water/MicrobiomeAnalyst
 * multiple-alignment: /home/najib/water/rep-seqs.msa
 * phyloseq: /home/najib/water/R/phyloseq.rds
 * rep-seqs: /home/najib/water/rep-seqs.fasta
 * rhea: /home/najib/water/Rhea
