hildebra / lotus2
Amplicon sequencing pipelines suitable for SSU (16S, 18S), LSU (23S, 28S) and ITS.
Home Page: http://lotus2.earlham.ac.uk/
License: GNU General Public License v3.0
Dear all,
We are testing Lotus2 for some analyses, but we would like to know whether it is possible to set pool=TRUE when running DADA2, or whether pooling is already enabled by default?
Best,
Ramiro
I am using header names derived from hashing ASV sequences to integrate ASV tables from different datasets. I found that there were duplicated ASV sequences. Why?
Here is my lotus2 command:
lotus2 -i $PWD -m $PWD/1_miSeqMap.sm.txt
-s /mnt/d/Myfile/DATA/beforework/lotus2/1sdm_miSeq.txt
-o lotus2_output
-p miSeq -amplicon_type SSU -tax_group bacteria
-forwardPrimer $front_f
-reversePrimer $front_r
-CL dada2 -refDB SLV -taxAligner lambda
-rdp_thr 0.7 -buildPhylo 0 -t 6 -sdmThreads 6
The problem still occurs when the LULU option is disabled (-lulu 0).
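The hashing approach described above can be sketched as follows (a minimal illustration, assuming MD5 over the uppercased, whitespace-stripped sequence; the exact hash in your workflow may differ). Identical sequences always hash to identical header names, so duplicated headers in a merged table usually mean the same sequence occurs twice, possibly differing only in case or whitespace:

```python
import hashlib
from collections import Counter

def asv_id(seq: str) -> str:
    # Stable header name from the sequence itself (assumption: MD5 of the
    # uppercased, whitespace-stripped sequence; adapt to your actual scheme).
    return hashlib.md5(seq.strip().upper().encode()).hexdigest()

def find_duplicate_ids(seqs):
    # Return hashed IDs that occur more than once across the merged datasets.
    counts = Counter(asv_id(s) for s in seqs)
    return [h for h, n in counts.items() if n > 1]

# Example: the same ASV from two datasets, differing only in case,
# collides on its hashed header name.
merged = ["ACGTACGT", "acgtacgt", "TTTTCCCC"]
dups = find_duplicate_ids(merged)
```

Checking the raw sequences behind each duplicated ID this way should reveal whether the collisions come from truly identical sequences or from normalization differences between datasets.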
I discovered that Lotus2 does not remove the primer when I specify the primer sequence in a mapping file, but it works correctly when I specify the primer sequence on the command line.
Hi Lotus team.
Using data that has run perfectly on Lotus2 before (with both RDP and SLV DBs), I changed the clustering option to dada2 as I wanted ASVs rather than OTUs. However, upon adding -CL dada2, I got a huge range of errors according to the progout log. Am I just missing something in my command (below), or is something wrong?
perl ./lotus2 -i Will -o Will/outputSLV_DADA2 -m WillMAP.txt -refDB SLV -taxAligner blast -CL dada2
many thanks
Will
Hi, thank you for this easy-to-use amplicon read analysis pipeline. I was wondering if there is a way to get the number of reads that pass each step. DADA2 provides read counts across samples for each processing step through:
getN <- function(x) sum(getUniques(x))
track <- cbind(out, sapply(dadaFs, getN), sapply(mergers, getN), rowSums(seqtab), rowSums(seqtab.nochim))
# If processing a single sample, remove the sapply calls: e.g. replace sapply(dadaFs, getN) with getN(dadaFs)
colnames(track) <- c("input", "filtered", "denoised", "merged", "tabled", "nonchim")
rownames(track) <- sample.names
head(track)
It would be very nice if lotus2 could do that.
I'm running lotus2 mostly with the default settings, but for the RDPclassifier I get an error in the following line:
systemL "mv $outdir/hierachy_cnt.tax $outdir/cnadjusted_hierachy_cnt.tax $extendedLogD/;";
Lotus2 stops because cnadjusted_hierachy_cnt.tax does not exist.
When looking through the output folder the hierachy_cnt.tax file is located in a subfolder (ExtraFiles).
If I place cnadjusted_hierachy_cnt.tax in the output folder (before Lotus2 executes RDP), it works.
SDM requires some libraries which are not in the default yum repos. Is there a workaround for the installation on CentOS?
lotus2/bin/sdm: /lib64/libstdc++.so.6: version `GLIBCXX_3.4.20' not found (required by lotus2/bin/sdm)
lotus2/bin/sdm: /lib64/libstdc++.so.6: version `CXXABI_1.3.9' not found (required by lotus2/bin/sdm)
lotus2/bin/sdm: /lib64/libstdc++.so.6: version `CXXABI_1.3.11' not found (required by lotus2/bin/sdm)
lotus2/bin/sdm: /lib64/libstdc++.so.6: version `GLIBCXX_3.4.22' not found (required by lotus2/bin/sdm)
lotus2/bin/sdm: /lib64/libstdc++.so.6: version `GLIBCXX_3.4.21' not found (required by lotus2/bin/sdm)
Thanks in advance :)
Ulrike
Hi,
I am processing a set of samples (miseq) and I want to compare OTUs and zOTUs. I have added usearch11 to LotuS and run two commands:
perl lotus2 -i /media/fulgencio/DATOS/species_divergence -m para_LOTUS.txt -o zotus_output -s sdm_miSeqDEF.txt -threads 24 -p miSeq -clustering unoise3 -refDB SLV --taxAligner 1 -derepMin 0 -lulu 0 -buildPhylo 0 -verbosity 3
perl lotus2 -i /media/fulgencio/DATOS/species_divergence -m para_LOTUS.txt -o otus_output -s sdm_miSeqDEF.txt -threads 24 -p miSeq -refDB SLV --taxAligner 1 -derepMin 0 -lulu 0 -buildPhylo 0 -verbosity 3
When I look at LotuS_run.log I can see in both cases that OTU id=0.97, and when running unoise3 the message says "UNOISE core routine Cluster at 97%". Should I pass the -id parameter as 1?
Another question: when looking at demulti.log I see that most of my reverse reads are rejected, and from that log I cannot understand why this is happening.
Reads processed: 11,632,069; 11,632,069 (pair 1;pair 2)
Rejected: 4,335,473; 10,141,773
Below I paste my sdm file.
Thank you very much in advance.
Manuel
#sdm options file to control sequence quality filtering, demultiplexing and preparation (can also be used without demultiplexing)
#* indicates alternative quality filtering options, saved in *.add.fna etc. files separately from initial quality filtered dataset
#sequence length refers to sequence length AFTER removal of primers, barcodes and trimming. This ensures that downstream analysis tools will have appropriate sequence information
#options with a star in front are lenient parameters for mid qual sequences (only used for estimating OTU abundance, not for OTU building itself).
minSeqLength 250
maxSeqLength 256
minAvgQuality 27
*minSeqLength 170
*minAvgQuality 20
#truncate total Sequence length to X (length after Barcode, Adapter and Primer removals, set to -1 to deactivate)
TruncateSequenceLength -1
#Ambiguous bases in Sequence
maxAmbiguousNT 0
*maxAmbiguousNT 1
#sequence is discarded if it contains a homonucleotide run longer than this
maxHomonucleotide 8
#Filter whole sequence if one window of quality scores is below average
QualWindowWidth 50
QualWindowThreshhold 25
#Trim the end of a sequence if a window falls below the quality threshold. Useful for removing low-quality trailing ends of a sequence
TrimWindowWidth 20
TrimWindowThreshhold 25
#Probabilistic max number of accumulated sequencing errors. After this length, the rest of the sequence will be deleted. Complementary to TrimWindowThreshhold. (-1) deactivates this option.
maxAccumulatedError 0.75
*maxAccumulatedError -1
#Binomial error model of expected errors per sequence (see https://github.com/fpusan/moira), to deactivate, set BinErrorModelAlpha to -1
BinErrorModelMaxExpError 2.5
BinErrorModelAlpha -1
#Max Barcode Errors
maxBarcodeErrs 0
maxPrimerErrs 0
#keep barcode / primer sequence in the output fasta file - in a normal 16S analysis this should be deactivated (0) for both barcode and primer
keepBarcodeSeq 0
keepPrimerSeq 0
#set fastqVersion to 1 if you use Sanger, Illumina 1.8+ or NCBI SRA files. Set fastqVersion to 2, if you use Illumina 1.3+ - 1.7+ or Solexa fastq files. "auto" will look for typical characteristics of either of these and choose the quality offset score automatically.
fastqVersion auto
#if one or more files have a technical adapter still included (e.g. TCAG 454) this can be removed by setting this option
TechnicalAdapter
#delete X NTs (e.g. if the first 5 bases are known to have strange biases)
TrimStartNTs 0
#correct PE header format (1/2); this accommodates the Illumina MiSeq paired-end annotations 2="@xxx 1:0:4" instead of 1="@XXX/1". Note that the format will be automatically detected
PEheaderPairFmt 1
#sets if sequences without match to reverse primer (ReversePrimer) will be accepted (T=reject ; F=accept all); default=F
RejectSeqWithoutRevPrim F
#*RejectSeqWithoutRevPrim F
#sets if sequences without a forward (LinkerPrimerSequence) primer will be accepted (T=reject ; F=accept all); default=F
RejectSeqWithoutFwdPrim F
#*RejectSeqWithoutFwdPrim F
#this option should be "T" if your amplicons are possibly shorter than a single read in a paired end sequencing run (e.g. if the 16S amplicon length is 200bp in a 250x2 miSeq run, set this to "T"). This option increases runtime by 10%, if in doubt just set to "T". Requires LinkerPrimerSequence and ReversePrimer to be defined in mapping file.
AmpliconShortPE T
#options for difficulties during sequencing library construction
#checks if pair1 and pair2 were switched (ignore if single read data)
CheckForMixedPairs F
#checks if whole amplicon was reverse-transcribed sequenced (not switched, just reverse translated)
CheckForReversedSeqs F
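The two window options in the file above do different things, which the comments only hint at. A toy Python sketch of the logic as described in the comments (not sdm's actual code): QualWindowWidth/QualWindowThreshhold reject the whole read if any sliding window's average quality falls below the threshold, while TrimWindowWidth/TrimWindowThreshhold instead truncate the read at the first failing window:

```python
def window_means(quals, width):
    # Average quality of each sliding window of the given width.
    return [sum(quals[i:i + width]) / width
            for i in range(len(quals) - width + 1)]

def passes_qual_window(quals, width=50, thresh=25):
    # QualWindow*: reject the whole read if any window average < thresh.
    if len(quals) < width:
        return True
    return all(m >= thresh for m in window_means(quals, width))

def trim_qual_window(quals, width=20, thresh=25):
    # TrimWindow*: cut the read at the first window whose average < thresh.
    for i, m in enumerate(window_means(quals, width)):
        if m < thresh:
            return quals[:i]
    return quals
```

So a read with a low-quality tail may survive TrimWindow* (shortened) yet still be rejected by QualWindow* if the bad stretch is severe enough, which is why both thresholds appear in the config.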
First of all, happy new year!
I reinstalled lotus2 after trying to use the update function in lotus to work with KSGP; the update was first not successful and then deleted autoInstall.pl in the process.
However, after reinstalling, I get the following error running lotus, which seems to be related to pigz. Any suggestions on how to handle this?
pigz: abort: missing parameter after -p
sh: line 1: 108829 Aborted lotus2/bin/lambda3 searchn -t 40 --percent-identity 75 --num-matches 200 --e-value 1e-8 -q Lotus2_KSGP_Alex120124/OTU.fa -i lotus2//DB//KSGP_v1.0.fasta.lba.gz -o Lotus2_KSGP/tmpFiles//tax.m8 --output-columns 'qseqid sseqid pident length mismatch gapopen qstart qend sstart send qlen' >> Lotus2_KSGP/LotuSLogS/LotuS_progout.log 2>&1
CMD failed: lotus2//bin//lambda3 searchn -t 40 --percent-identity 75 --num-matches 200 --e-value 1e-8 -q Lotus2_KSGP/OTU.fa -i lotus2//DB//KSGP_v1.0.fasta.lba.gz -o Lotus2_KSGP/tmpFiles//tax.m8 --output-columns 'qseqid sseqid pident length mismatch gapopen qstart qend sstart send qlen'
same for SLV:
Building LAMBDA index anew for lotus2//DB//SLV_138.1_SSU.fasta (this only happens the first time you use this ref DB, it may take several hours to build)..
CMD failed:
pigz -p -1 lotus2//DB//SLV_138.1_SSU.fasta.lba
see Lotus2_SLV_Alex120124/LotuSLogS/LotuS_progout.log for error log
(base) [uloeber]$ tail Lotus2_SLV_Alex120124/LotuSLogS/LotuS_progout.log
[M::mm_idx_stat] kmer size: 21; skip: 11; is_hpc: 0; #seq: 1
[M::mm_idx_stat::0.010*4.92] distinct minimizers: 900 (100.00% are singletons); average occurrences: 1.000; average spacing: 5.984
[M::worker_pipeline::0.016*4.51] mapped 526 sequences
[M::main] Version: 2.17-r941
[M::main] CMD:lotus2//bin//minimap2-2.17_x64-linux/minimap2 -x sr --sr -u both --secondary=no -N 30 -c -t 40 -o Lotus2_SLV_Alex120124/tmpFiles//otu_seeds.fna.phiX.0.cont_hit.paf lotus2//DB//phiX.fasta Lotus2_SLV_Alex120124/tmpFiles//otu_seeds.fna
[M::main] Real time: 0.017 sec; CPU: 0.074 sec; Peak RSS: 0.003 GB
Loading Subject Sequences and Ids... done.
Generating Index... done.
Writing Index to disk... done.
pigz: abort: missing parameter after -p
Thanks in advance! Cheers,
Ulrike
Question: Is lOTUs using the HMM database shipped in http://software.microbiome.ch/vxtractor.zip (linked from https://www.microbiome.ch/software)? It seems the main ones in http://lotus2.earlham.ac.uk/lotus/packs/VXtractor/HMMs.zip are neither the LSU nor the SSU ones.
I am considering what would be best for the lOTUs conda package: do we need a conda package for V-Xtractor? Just the Perl script, or also the HMM database(s)?
I have tried to run the example and got errors every time. I managed to troubleshoot all of those related to DADA2 ASV clustering, but now I get an error when building the LAMBDA index:
'Building LAMBDA index anew (may take up to an hour first time)..
CMD failed: /home/assem/Downloads/lotus2/lotus2//bin//lambda/lambda_indexer -p blastn -t 1 -d /home/assem/Downloads/lotus2/lotus2//DB//SLV_138.1_SSU.fasta
see myTestRun2/LotuSLogS/LotuS_progout.log for error log'
Attached is the progout.log and that's its last part:
'Writing OTU matrix to myTestRun2/OTU.txt
Recruited 214 reads in OTU matrix
Done
Time taken: : 3ms
[M::mm_idx_gen::0.002*1.02] collected minimizers
[M::mm_idx_gen::0.003*1.02] sorted minimizers
[M::main::0.003*1.02] loaded/built the index for 1 target sequence(s)
[M::mm_mapopt_update::0.003*1.02] mid_occ = 1000
[M::mm_idx_stat] kmer size: 21; skip: 11; is_hpc: 0; #seq: 1
[M::mm_idx_stat::0.003*1.02] distinct minimizers: 900 (100.00% are singletons); average occurrences: 1.000; average spacing: 5.984
[M::worker_pipeline::0.003*1.02] mapped 15 sequences
[M::main] Version: 2.17-r941
[M::main] CMD: /home/assem/Downloads/lotus2/lotus2//bin//minimap2-2.17_x64-linux/minimap2 -x sr --sr -u both --secondary=no -N 30 -c -t 1 -o myTestRun2/tmpFiles//otu_seeds.fna.phiX.0.cont_hit.paf /home/assem/Downloads/lotus2/lotus2//DB//phiX.fasta myTestRun2/tmpFiles//otu_seeds.fna
[M::main] Real time: 0.004 sec; CPU: 0.004 sec; Peak RSS: 0.003 GB
Loading Subject Sequences and Ids... done.
Dumping Subj Ids... done.
No Seg-File specified, no masking will take place.
Dumping binary seqan mask file... done.
Dumping unreduced Subj Sequences... done.
Generating 0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
(1) SuffixArray |Killed'
Dear all,
We are trying to use Lotus2, with dada2, to analyse some Illumina 16S data. We have two questions:
Thanks for any help,
Ramiro
are not correctly set by autoInstall.pl (they are left to paths starting with /hpc-home/hildebra/dev/lotus//bin/ ).
Lotus2 is a great tool!
I have recently noticed that the table generated by Lotus2 shows more shared ASVs between samples than other tools like QIIME2. Have you noticed this phenomenon?
Are they false positives, or really shared between samples? If the former, changing minimap2 parameters may alleviate this problem. I guess the threshold needs to be tuned for different Lotus2 settings (mapping reads to ASVs and to 97% OTUs intuitively requires different parameters).
Sorry for the redundancy, this is a shorter report related to #31.
I can analyse the V1-V9 amplicons using lotus2, but when I try to get lotus2 to look only at the central part (V3-V4) of the same reads, it fails.
I was hoping that specifying primer sequences for V3 and V4 (rev-comp) would take care of trimming the first ~300bp (up to the forward primer in V3) and the long back tail (after the V4 primer), and analyse the resulting 444bp fragment of the 1500bp original amplicon.
The sequences all end up in the (Mid qual) class, while for the full-amplicon run they were all in (High qual).
Is it possible that trimming such long ends is not supported in lotus2 the way I tried it?
If not, this also answers ticket #31, and I need to trim my reads externally before running lotus2.
Best regards,
Stephane
$ mapping_file_V3V4.tsv
#SampleID fastqFile ForwardPrimer ReversePrimer
4170_bc1005--bc1096 4170_bc1005--bc1096.fastq.gz CCTACGGGNGGCWGCAG GGATTAGATACCCBDGTAGTC
4356_bc1005--bc1112 4356_bc1005--bc1112.fastq.gz CCTACGGGNGGCWGCAG GGATTAGATACCCBDGTAGTC
4285_bc1022--bc1107 4285_bc1022--bc1107.fastq.gz CCTACGGGNGGCWGCAG GGATTAGATACCCBDGTAGTC
4296_bc1022--bc1060 4296_bc1022--bc1060.fastq.gz CCTACGGGNGGCWGCAG GGATTAGATACCCBDGTAGTC
4356_bc1012--bc1098 4356_bc1012--bc1098.fastq.gz CCTACGGGNGGCWGCAG GGATTAGATACCCBDGTAGTC
4112_bc1008--bc1075 4112_bc1008--bc1075.fastq.gz CCTACGGGNGGCWGCAG GGATTAGATACCCBDGTAGTC
4128_bc1005--bc1107 4128_bc1005--bc1107.fastq.gz CCTACGGGNGGCWGCAG GGATTAGATACCCBDGTAGTC
Using Silva SSU ref seq database.
--------------------------------------------------------------------------------
00:00:00 LotuS 2.23
COMMAND
perl /opt/miniconda3/envs/lotus2.23/bin/lotus2 -i /data/analyses/Zymo-SequelIIe-Hifi-V3V4/reads
-m mapping_file_V3V4.tsv -o lotus2_pacbio_V3V4 -tmp /data/analyses/Zymo-SequelIIe-Hifi-V3V4/tmp
-s sdm_PacBio_LSSU_V3V4.txt -p PacBio -t 80 -amplicon_type
SSU -CL cdhit -refDB SLV -taxAligner lambda -useVsearch 1
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
00:00:00 Reading mapping file
Sequence files are indicated in mapping file.
--------------------------------------------------------------------------------
------------ I/O configuration --------------
Input /data/analyses/Zymo-SequelIIe-Hifi-V3V4/reads
Output lotus2_pacbio_V3V4
SDM options sdm_PacBio_LSSU_V3V4.txt
TempDir /data/analyses/Zymo-SequelIIe-Hifi-V3V4/tmp
------------ Configuration LotuS --------------
de novo sequence clustering with CD-HIT into OTU's
Sequencing platform pacbio
Amplicon target bacteria, SSU
Dereplication filter 0
Clustering algorithm CD-HIT into OTU's
Read mapping (non tax) minimap2
OTU nt id 0.97
Precluster read merging No
Ref Chimera checking Yes (DB=/opt/miniconda3/envs/lotus2.23/share/lotus2-2.23-0//DB//rdp_gold.fa, -chim_skew 2)
deNovo Chimera check Yes
Tax assignment Lambda (-LCA_frac 0.8, -LCA_cover 0.5, -LCA_idthresh 97,95,93,91,88,78,0)
ReferenceDatabase SILVA
RefDB location /opt/miniconda3/envs/lotus2.23/share/lotus2-2.23-0//DB//SLV_138.1_SSU.fasta
OTU phylogeny Yes (mafft, fasttree2)
Unclassified OTU's Kept in matrix
--------------------------------------------
--------------------------------------------------------------------------------
00:00:00 Demultiplexing, filtering, dereplicating input files, this
might take some time..
check progress at lotus2_pacbio_V3V4/LotuSLogS/LotuS_progout.log
00:00:12 Finished primary read processing with sdm:
Reads processed: 255,918
Accepted (High qual): 0 (4,953 end-trimmed)
Accepted (Mid qual): 252,175
Rejected: 3,743
Dereplication block 0: 0 unique sequences (avg size -nan; 0 counts)
For an extensive report see lotus2_pacbio_V3V4/LotuSLogS//demulti.log
--------------------------------------------------------------------------------
The sdm dereplicated output file was either empty or not existing, aborting lotus.
/data/analyses/Zymo-SequelIIe-Hifi-V3V4/tmp/derep.fas
%@#%@#%@#%@%@#@%#@%#@#%@#%@#%@#@%#@%#@%#@#%@#%@#%@##
LotuS2 encounterend an error:
The sdm dereplicated output file was either empty or not existing, aborting lotus.
/data/analyses/Zymo-SequelIIe-Hifi-V3V4/tmp/derep.fas
First check if the last error occurred in a program called by LotuS2
"tail lotus2_pacbio_V3V4/LotuSLogS/LotuS_progout.log"
, if there is an obvious solution (e.g. external program breaking, this we can't fix). To see (and execute) the last commands by the pipeline, run
"tail lotus2_pacbio_V3V4/LotuSLogS/LotuS_cmds.log".
In case you decide to contact us on "https://github.com/hildebra/lotus2/", please try to include information from these approaches in your message, this will increase our response time. Thank you.
%@#%@#%@#%@%@#@%#@%#@#%@#%@#%@#@%#@%#@%#@#%@#%@#%@##
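If this kind of long-flank trimming has to happen outside the pipeline, the external pre-trim could be sketched as below (a minimal Python illustration, not part of LotuS2; it assumes the reverse primer is supplied already reverse-complemented to match the read orientation, as in the mapping file above, and uses IUPAC-aware regex matching with no mismatches tolerated):

```python
import re

# IUPAC ambiguity codes -> regex character classes
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
         "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
         "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]"}

def primer_regex(primer):
    # Compile a degenerate primer into an exact-match regex.
    return re.compile("".join(IUPAC[b] for b in primer.upper()))

def extract_subamplicon(read, fwd, rev_on_read):
    """Return the fragment between the end of the forward primer and the
    start of the (already reverse-complemented) reverse primer, or None
    if either primer is not found. No mismatches are tolerated here."""
    m_f = primer_regex(fwd).search(read)
    if m_f is None:
        return None
    m_r = primer_regex(rev_on_read).search(read, m_f.end())
    if m_r is None:
        return None
    return read[m_f.end():m_r.start()]

# V3-V4 primers as given in the mapping file
FWD = "CCTACGGGNGGCWGCAG"
REV_RC = "GGATTAGATACCCBDGTAGTC"
```

A real pre-trim would additionally tolerate a few primer mismatches (as maxPrimerErrs does in sdm) and discard reads where either primer is missing.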
Dr Falk Hildebrand,
Thank you for Lotus2 - as a great addition to lotus.
I have a question -- after installation (I used a Docker image, "penanalytics/r-base", as the Ubuntu basis for the lotus2 installation) and running an example via
"perl ./lotus2.pl -i Example/ -m Example/miSeqMap.sm.txt -s configs/sdm_miSeq.txt -p miSeq -o /mydir/myTestRun"
the script does not perform the analysis and stops with an error --
"Can't open /mydir/myTestRun/tmpFiles//RDPotus.tax :
No such file or directory at ./lotus2.pl line 4078."
I tried reinstalling several times, and also used the desktop UBUNTU version image. But the error occurred every time.
Could you help with this issue?
Sincerely, Dmitry
run log:
Can't open /mydir/myTestRun/tmpFiles//RDPotus.tax :
No such file or directory at ./lotus2.pl line 4078.
Hi,
I tried to use lotus2 for my data (18S sequences targeting protists, not bacteria or fungi) and I noticed that there are two options for -tax_group.
My question is: what is the difference between the "bacteria|fungi" options for -tax_group? Does -tax_group only influence annotation, or does it also affect the upstream ASV-generating steps? The results (ASV/OTU numbers and annotation results) using different tax_group settings seem to differ.
Hi,
I see you use UNITE v8 as the database for ITS/fungi. When comparing results with the latest version of the UNITE database, v9, taxonomic identification results are very different for many OTUs, and fewer OTUs get an identification at species or genus level. Is that the reason you work with UNITE v8 in Lotus2?
Best,
Sam
I wonder how I can install dada2 for use with Lotus2. I tried to install it via autoInstall.pl, but it did not work. Thanks!
I am getting the below error while using SILVA as the database. However, it earlier worked well with a custom database.
CMD failed: /home/rjain/miniconda3/envs/lotus2/bin/vsearch --usearch_global /zfs/omics/personal/rjain/AMF/output_SLV_lotus2//OTU.fna --db /home/rjain/miniconda3/envs/lotus2/share/lotus2-2.25-0//DB//SLV_138.1_SSU.fasta.vudb --id 0.75 --query_cov 0.5 -userfields query+target+id+alnlen+mism+opens+qlo+qhi+tlo+thi+ql -userout /zfs/omics/personal/rjain/AMF/output_SLV_lotus2//tmpFiles//tax.0.blast --maxaccepts 100 --maxrejects 100 -strand both --threads 24
These are the last few lines of LotuS_progout.log:
[M::main] CMD: /home/rjain/miniconda3/envs/lotus2/bin/minimap2 -x sr --sr -u both --secondary=no -N 30 -c -t 24 -o /zfs/omics/personal/rjain/AMF/output_SLV_lotus2//tmpFiles//otu_seeds.fna.phiX.0.cont_hit.paf /home/rjain/miniconda3/envs/lotus2/share/lotus2-2.25-0//DB//phiX.fasta /zfs/omics/personal/rjain/AMF/output_SLV_lotus2//tmpFiles//otu_seeds.fna
[M::main] Real time: 0.072 sec; CPU: 0.043 sec; Peak RSS: 0.003 GB
vsearch v2.23.0_linux_x86_64, 251.8GB RAM, 64 cores
https://github.com/torognes/vsearch
Reading UDB file /home/rjain/miniconda3/envs/lotus2/share/lotus2-2.25-0//DB//SLV_138.1_SSU.fasta.vudb
Fatal error: Unable to read from UDB file or invalid UDB file
Any suggestions are much appreciated. Thanks in advance!
Hi, I have been using Lotus2 with no issues for some time now, clustering via OTUs (standard 16S microbiome data). I have tried to change from UPARSE to DADA2. I have followed the suggestions on the lotus website, but it doesn't seem to run. Is there a clear error in my input? (code below)
./lotus2 -i Will -o output_SLV_DADA2_2024 -m WillMAP.txt -refDB SLV -CL dada2 -taxAligner lambda
all help is appreciated
-Will
I created a fresh conda env for lotus2, but attempting installation yields the error:
~# conda install -c bioconda lotus2
Collecting package metadata (current_repodata.json): done
Solving environment: failed with initial frozen solve. Retrying with flexible solve.
Solving environment: failed with repodata from current_repodata.json, will retry with next repodata source.
Collecting package metadata (repodata.json): done
Solving environment: failed with initial frozen solve. Retrying with flexible solve.
Solving environment: |
Found conflicts! Looking for incompatible packages.
This can take several minutes. Press CTRL-C to abort. failed UnsatisfiableError:
Installing through git and perl works ok, though
Hi @hildebra,
Thank you and your team for developing the LotuS2 pipeline. I recently came across this pipeline as it has many options for clustering and denoising algorithms, and I am interested in the DADA2 denoising algorithm. I read the manuscript and have a conceptual question about the midQual reads that, after quality filtering, are used for 'backmapping onto ASVs'. My question is: doesn't this inflate the abundances of ASVs, since such reads might carry PCR or sequencing errors? Further, this would affect downstream microbial ecology metrics. Two of our collaborators used DADA2 denoising on the same data but with different pipelines, one with standalone DADA2 and the other with LotuS2. The results differ: different numbers of sequences and ASVs, and finally different diversity estimates. Some of us are confused about which pipeline to go for, and which one is actually correct. I would appreciate your reply and suggestions. Thank you!
Best regards,
BalaVeera
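For intuition, the backmapping step being asked about can be caricatured as follows (a toy sketch only, not LotuS2's actual implementation: the real pipeline aligns reads with minimap2 rather than computing positional identity on strings). Mid-quality reads never seed ASVs, but any of them within the identity threshold of an existing ASV adds to that ASV's count, which is why LotuS2 counts can exceed those of a standalone DADA2 run on the same data:

```python
def identity(read, ref):
    # Crude positional identity for the toy example (real backmapping aligns).
    matches = sum(a == b for a, b in zip(read, ref))
    return matches / max(len(read), len(ref))

def backmap(mid_qual_reads, asvs, min_id=0.97):
    """Add each mid-quality read to the count of its best-matching ASV,
    but only if that match is at or above min_id; otherwise discard it."""
    counts = {a: 0 for a in asvs}
    for r in mid_qual_reads:
        best = max(asvs, key=lambda a: identity(r, a))
        if identity(r, best) >= min_id:
            counts[best] += 1
    return counts
```

Under this scheme a mid-quality read with a few errors still lands on the correct ASV, while reads too divergent from any ASV are dropped rather than creating spurious new ASVs.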
Dear,
I am trying to simulate what analysing a shorter amplicon (V3V4, 444bp) from Sequel IIe HiFi reads would give.
All files are replicates of the Zymo mock community V1V9 full length amplicon.
For that I produced a new file from the provided (and working) sdm_PacBio_LSSU.txt
In both cases I use the V1V9 HiFi reads as input, which include the V3V4 region as well; is this a problem here?
For the complete V1V9 I used the following mapping file
#SampleID fastqFile ForwardPrimer ReversePrimer
4170_bc1005--bc1096 4170_bc1005--bc1096.fastq.gz AGRGTTYGATYMTGGCTCAG RGYTACCTTGTTACGACTT
4356_bc1005--bc1112 4356_bc1005--bc1112.fastq.gz AGRGTTYGATYMTGGCTCAG RGYTACCTTGTTACGACTT
4285_bc1022--bc1107 4285_bc1022--bc1107.fastq.gz AGRGTTYGATYMTGGCTCAG RGYTACCTTGTTACGACTT
4296_bc1022--bc1060 4296_bc1022--bc1060.fastq.gz AGRGTTYGATYMTGGCTCAG RGYTACCTTGTTACGACTT
4356_bc1012--bc1098 4356_bc1012--bc1098.fastq.gz AGRGTTYGATYMTGGCTCAG RGYTACCTTGTTACGACTT
4112_bc1008--bc1075 4112_bc1008--bc1075.fastq.gz AGRGTTYGATYMTGGCTCAG RGYTACCTTGTTACGACTT
4128_bc1005--bc1107 4128_bc1005--bc1107.fastq.gz AGRGTTYGATYMTGGCTCAG RGYTACCTTGTTACGACTT
For the V3V4 analysis I adapted it to
#SampleID fastqFile ForwardPrimer ReversePrimer
4170_bc1005--bc1096 4170_bc1005--bc1096.fastq.gz CCTACGGGNGGCWGCAG GGATTAGATACCCBDGTAGTC
4356_bc1005--bc1112 4356_bc1005--bc1112.fastq.gz CCTACGGGNGGCWGCAG GGATTAGATACCCBDGTAGTC
4285_bc1022--bc1107 4285_bc1022--bc1107.fastq.gz CCTACGGGNGGCWGCAG GGATTAGATACCCBDGTAGTC
4296_bc1022--bc1060 4296_bc1022--bc1060.fastq.gz CCTACGGGNGGCWGCAG GGATTAGATACCCBDGTAGTC
4356_bc1012--bc1098 4356_bc1012--bc1098.fastq.gz CCTACGGGNGGCWGCAG GGATTAGATACCCBDGTAGTC
4112_bc1008--bc1075 4112_bc1008--bc1075.fastq.gz CCTACGGGNGGCWGCAG GGATTAGATACCCBDGTAGTC
4128_bc1005--bc1107 4128_bc1005--bc1107.fastq.gz CCTACGGGNGGCWGCAG GGATTAGATACCCBDGTAGTC
In both cases, the reverse primer is reverse-complemented to match the read sequence directly, since we work here with single-end reads (correct, right?).
The V1V9 run succeeds
lotus2 -i /data/analyses/Zymo-SequelIIe-Hifi-V3V4/reads -o runV1V9 -s sdm_PacBio_LSSU_V1V9.txt -t 40 -m mapping_file_V1V9.tsv
RefDB SLV requested, but -taxAligner set to "0": therefore RDP classification of reads will be done
--------------------------------------------------------------------------------
00:00:00 LotuS 2.22
COMMAND
perl /opt/miniconda3/envs/lotus2/bin/lotus2 -i /data/analyses/Zymo-SequelIIe-Hifi-V3V4/reads
-o runV1V9 -s sdm_PacBio_LSSU_V1V9.txt -t 40 -m mapping_file_V1V9.tsv
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
00:00:00 Reading mapping file
Sequence files are indicated in mapping file.
--------------------------------------------------------------------------------
------------ I/O configuration --------------
Input /data/analyses/Zymo-SequelIIe-Hifi-V3V4/reads
Output runV1V9
SDM options sdm_PacBio_LSSU_V1V9.txt
TempDir runV1V9/tmpFiles/
------------ Configuration LotuS --------------
de novo sequence clustering with UPARSE into OTU's
Sequencing platform miseq
Amplicon target bacteria, SSU
Dereplication filter 8:1,4:2,3:3
Clustering algorithm UPARSE into OTU's
Read mapping (non tax) minimap2
Precluster read merging No
Ref Chimera checking Yes (DB=/opt/miniconda3/envs/lotus2/share/lotus2-2.22-0//DB//rdp_gold.fa, -chim_skew 2)
deNovo Chimera check Yes
Tax assignment RDPclassifier (-rdp_thr 0.8)
OTU phylogeny Yes (mafft, fasttree2)
Unclassified OTU's Kept in matrix
--------------------------------------------
--------------------------------------------------------------------------------
00:00:00 Demultiplexing, filtering, dereplicating input files, this
might take some time..
check progress at runV1V9/LotuSLogS/LotuS_progout.log
00:00:14 Finished primary read processing with sdm:
Reads processed: 255,918
Accepted (High qual): 191,016 (5,449 end-trimmed)
Accepted (Mid qual): 61,667
Rejected: 3,235
Dereplication block 0: 2,386 unique sequences (avg size
43; 101,408 counts)
For an extensive report see runV1V9/LotuSLogS//demulti.log
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
00:00:14 UPARSE OTU clustering
Cluster at 97
00:00:14 Finished
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
00:00:14 Starting backmapping of
- low-abundant dereplicated Reads
- mid-quality reads
to OTU's using minimap2
00:00:55 Backmapping mid qual reads:
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
00:00:55 Extending and merging pairs of OTU Seeds
--------------------------------------------------------------------------------
No ref based chimera detection
--------------------------------------------------------------------------------
00:00:56 Found 0 OTU's using minimap2 (phiX.0: /opt/miniconda3/envs/lotus2/share/lotus2-2.22-0//DB//phiX.fasta)
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
00:00:56 Postfilter:
Extended logs active, contaminant and chimeric matrix will be created.
After filtering 8 OTU's (229130 reads) remaining in matrix.
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
00:00:56 Assigning taxonomy with RDP
--------------------------------------------------------------------------------
Removed 0 tax annotations ("")
--------------------------------------------------------------------------------
00:01:00 Calculating Taxonomic Abundance Tables from RDP
classifier assignments, Confidence 0.8
--------------------------------------------------------------------------------
Calculating higher abundance levels
Adding 0 unclassified OTU's to output matrices
Total reads in matrix: 229130
TaxLvl %Assigned_Reads %Assigned_OTUs
Phylum 100 100
Class 100 100
Order 100 100
Family 100 100
Genus 91 87
Species 0 0
--------------------------------------------------------------------------------
00:01:00 Building tree (fasttree) and aligning (mafft) OTUs
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
00:01:06 LotuS2 finished. Output in:
runV1V9
Next steps:
- Phyloseq: load runV1V9/phyloseq.Rdata directly with the
phyloseq package in R
- Phylogeny: OTU phylogentic tree available in runV1V9/OTUphylo.nwk
- .biom: runV1V9/OTU.biom contains biom formatted output
- Alpha diversity/rarefaction curves: rtk (available as
R package or in bin/rtk)
- LotuSLogS/ contains run statistics (useful for describing
data/amount of reads/quality and citations to programs used
- Tutorial: Visit http://lotus2.earlham.ac.uk for a numerical
ecology tutorial
--------------------------------------------------------------------------------
But the V3V4 run fails
Could you please help me fix this?
Thanks in advance
Stephane
lotus2 -i /data/analyses/Zymo-SequelIIe-Hifi-V3V4/reads -o runV3V4 -s sdm_PacBio_LSSU_V3V4.txt -t 40 -m mapping_file_V3V4.tsv
RefDB SLV requested, but -taxAligner set to "0": therefore RDP classification of reads will be done
--------------------------------------------------------------------------------
00:00:00 LotuS 2.22
COMMAND
perl /opt/miniconda3/envs/lotus2/bin/lotus2 -i /data/analyses/Zymo-SequelIIe-Hifi-V3V4/reads
-o runV3V4 -s sdm_PacBio_LSSU_V3V4.txt -t 40 -m mapping_file_V3V4.tsv
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
00:00:00 Reading mapping file
Sequence files are indicated in mapping file.
--------------------------------------------------------------------------------
------------ I/O configuration --------------
Input /data/analyses/Zymo-SequelIIe-Hifi-V3V4/reads
Output runV3V4
SDM options sdm_PacBio_LSSU_V3V4.txt
TempDir runV3V4/tmpFiles/
------------ Configuration LotuS --------------
de novo sequence clustering with UPARSE into OTU's
Sequencing platform miseq
Amplicon target bacteria, SSU
Dereplication filter 8:1,4:2,3:3
Clustering algorithm UPARSE into OTU's
Read mapping (non tax) minimap2
Precluster read merging No
Ref Chimera checking Yes (DB=/opt/miniconda3/envs/lotus2/share/lotus2-2.22-0//DB//rdp_gold.fa, -chim_skew 2)
deNovo Chimera check Yes
Tax assignment RDPclassifier (-rdp_thr 0.8)
OTU phylogeny Yes (mafft, fasttree2)
Unclassified OTU's Kept in matrix
--------------------------------------------
--------------------------------------------------------------------------------
00:00:00 Demultiplexing, filtering, dereplicating input files, this
might take some time..
check progress at runV3V4/LotuSLogS/LotuS_progout.log
00:00:12 Finished primary read processing with sdm:
Reads processed: 255,918
Accepted (High qual): 0 (4,953 end-trimmed)
Accepted (Mid qual): 252,175
Rejected: 3,743
Dereplication block 0: 0 unique sequences (avg size -nan; 0 counts)
For an extensive report see runV3V4/LotuSLogS//demulti.log
--------------------------------------------------------------------------------
The sdm dereplicated output file was either empty or not existing, aborting lotus.
runV3V4/tmpFiles//derep.fas
%@#%@#%@#%@%@#@%#@%#@#%@#%@#%@#@%#@%#@%#@#%@#%@#%@##
LotuS2 encounterend an error:
The sdm dereplicated output file was either empty or not existing, aborting lotus.
runV3V4/tmpFiles//derep.fas
First check if the last error occurred in a program called by LotuS2
"tail runV3V4/LotuSLogS/LotuS_progout.log"
, if there is an obvious solution (e.g. external program breaking, this we can't fix). To see (and execute) the last commands by the pipeline, run
"tail runV3V4/LotuSLogS/LotuS_cmds.log".
In case you decide to contact us on "https://github.com/hildebra/lotus2/", please try to include information from these approaches in your message, this will increase our response time. Thank you.
%@#%@#%@#%@%@#@%#@%#@#%@#%@#%@#@%#@%#@%#@#%@#%@#%@##
tail runV3V4/LotuSLogS/LotuS_cmds.log
[cmd] rm -f -r runV3V4/tmpFiles/
[cmd] mkdir -p runV3V4/tmpFiles/
[cmd] cp sdm_PacBio_LSSU_V3V4.txt runV3V4/primary
[cmd] /opt/miniconda3/envs/lotus2/bin/sdm -i_path /data/analyses/Zymo-SequelIIe-Hifi-V3V4/reads -o_fna runV3V4/tmpFiles//demulti.fna -o_fna2 runV3V4/tmpFiles//demulti.add.fna -sample_sep ___ -log runV3V4/LotuSLogS//demulti.log -map runV3V4/primary/in.map -options sdm_PacBio_LSSU_V3V4.txt -o_dereplicate runV3V4/tmpFiles//derep.fas -dere_size_fmt 0 -min_derep_copies 8:1,4:2,3:3 -suppressOutput 1 -o_qual_offset 33 -paired 1 -oneLineFastaFormat 1 -threads 6
[cmd] mkdir -p runV3V4/LotuSLogS//SDMperFile/
[cmd] mv runV3V4/LotuSLogS//demulti.log0* runV3V4/LotuSLogS//SDMperFile/
[cmd] rm -f runV3V4/tmpFiles//finalOTU.uc runV3V4/tmpFiles//finalOTU.ADD.paf runV3V4/tmpFiles//finalOTU.ADDREF.paf runV3V4/tmpFiles//finalOTU.REST.paf runV3V4/tmpFiles//finalOTU.RESTREF.paf
cat runV3V4/LotuSLogS//demulti.log
sdm 2.05 beta
Input File: several
Output File: runV3V4/tmpFiles//demulti.fna
Reads processed: 255,918
13 reads reverse-translated
Rejected: 3,743
Accepted (High qual): 0 (4,953 end-trimmed)
Accepted (Mid qual): 252,175
Bad Reads recovered with dereplication: 0
Short amplicon mode.
Min/Avg/Max stats Pair 1
- sequence Length : 0/-nan/0
- Quality : 0/-nan/0
- Median sequence Length : 0, Quality : 0
- Accum. Error -nan
Trimmed due to:
> 25 avg qual_ in 20 bp windows : 0
Rejected due to:
< min Sequence length (250) : 0
< avg Quality (27) : 3,737
< window (50 nt) avg. Quality (25) : 3,275
> max Sequence length (550) : 0
> (16) homo-nt run : 6
> (2) amb. Bases : 0
Specific sequence searches:
-With fwd Primer remaining (<= 0 mismatches, required) : 0
-With rev Primer remaining (<= 0 mismatches) : 0
-Barcode unidentified (max 0 errors) : 0
SampleID Barcode Instances
4112_bc1008--bc1075 0
4128_bc1005--bc1107 0
4170_bc1005--bc1096 0
4285_bc1022--bc1107 0
4296_bc1022--bc1060 0
4356_bc1005--bc1112 0
4356_bc1012--bc1098 0
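When every primer and barcode count in the demultiplexing log is zero, it can help to first check whether the primer occurs in the raw reads at all. A minimal sketch (the filename, read content, and primer sequence below are hypothetical placeholders; substitute your own):

```shell
# Count raw reads containing the forward primer (hypothetical data).
primer="CCTACGGG"
printf '@r1\nAACCTACGGGTT\n+\nIIIIIIIIIIII\n@r2\nAAAATTTTGGGG\n+\nIIIIIIIIIIII\n' > reads.fastq
# Select sequence lines (every 4th line starting at line 2) and count matches.
awk 'NR%4==2' reads.fastq | grep -c "$primer"   # prints 1
```

A count near zero on real data would suggest the primers were already removed upstream, or that the reads need reverse-complementing first.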
Dear lotus2 team,
after you uploaded v2.07 with the long-read fix, I tried to process some PacBio data using cd-hit for clustering:
perl ~/lotus2/lotus2 -i ./ -m lotus2_cleaned.map -s ~/lotus2/configs/sdm_PacBio.txt -o lotus2_SLV138/ -threads 30 -refDB SLV -CL
and unfortunately get the following error:
685605 finished 108831 clusters
Apprixmated maximum memory consumption: 1896M
writing new database
writing clustering information
program completed !
Total CPU time 40411.72
This is sdm (simple demultiplexer) 1.85 beta.
sdm run in No Map Mode.
Could not open uc file
lotus2_SLV138//tmpFiles//finalOTU.uc
Do you have any suggestions what might have caused this issue?
Best,
Ulrike
p.s. minor typos in the log: "Apprixmated"; also remove the space after "completed" :)
I have installed lotus2 with bioconda and cloned the github repo. When I try to run the example with the following command
lotus2 -i Example/ -m Example/miSeqMap.sm.txt -o myTestRun
It doesn't generate any OTU table in the output directory; instead I get several warnings and errors.
but is now required to run lotus2.pl
Providing an option for hashed OTU IDs may help integrate ASV/OTU tables from different studies or datasets.
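One common convention for such hashed IDs (used, for example, by QIIME 2 for feature IDs) is the MD5 digest of the ASV sequence itself, which is deterministic across datasets. A minimal sketch:

```shell
# Derive a stable ID from an ASV sequence by hashing it (MD5 here; any
# digest works as long as it is applied consistently across studies).
asv_id() { printf '%s' "$1" | md5sum | cut -d' ' -f1; }

a=$(asv_id "ACGTACGT")
b=$(asv_id "ACGTACGT")   # same sequence -> same ID
c=$(asv_id "ACGTACGA")   # different sequence -> different ID
[ "$a" = "$b" ] && [ "$a" != "$c" ] && echo "deterministic"
```

Note that identical hashes require byte-identical sequences, so differing trimming or orientation between datasets will still produce distinct IDs, which may explain apparent "duplicates" after integration.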
I am trying lotus2 on a set of simulated reads. For this, I have a FASTA file with my sequences (fragments of 16S, depending on the primer pair) that I use to simulate reads:
for file in *fa; do art_illumina -amp -p -l 250 -f 500 -ss MSv3 -i $file -o $(basename $file .fa). -m 300 -s 10; done && rm *aln && pigz *fq
My mapping file looks like this:
#SampleID fastqFile ForwardPrimer ReversePrimer
314F-806R 314F-806R.1.fq.gz,314F-806R.2.fq.gz CCTAYGGGRBGCASCAG GGACTACNNGGGTATCTAAT
515F-806R 515F-806R.1.fq.gz,515F-806R.2.fq.gz GTGCCAGCMGCCGCGGTAA GGACTACNNGGGTATCTAAT
515F-907R 515F-907R.1.fq.gz,515F-907R.2.fq.gz GTGCCAGCMGCCGCGGTAA CCGTCAATTCCTTTGAGTTT
799F-1193R 799F-1193R.1.fq.gz,799F-1193R.2.fq.gz AACMGGATTAGATACCCKG ACGTCATCCCCACCTTCC
And I am running lotus as
./lotus2/lotus2 -i . -m map.txt -o mytest -amplicon_type SSU -CL dada2 -refDB SLV
However, I am getting the following error in the dada2 step:
SampleID Barcode Instances
314F-806R 2
515F-806R 17
515F-907R 21
799F-1193R 17
Time taken: : 502ms
Error: Can't find files for block a expected files such as
Aborting dada2 run
Execution halted
Running it without dada2 works fine:
./lotus2/lotus2 -i . -m map.txt -o mytest
./lotus2/lotus2 -i . -m map.txt -o mytest2 -amplicon_type SSU -refDB SLV
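When only the dada2 step fails, one cheap sanity check is that every data row of the mapping file has the same number of tab-separated fields as the header. A sketch (the example row mirrors the map shown above; `map.txt` is a stand-in name):

```shell
# Build a toy mapping file with tab-separated columns.
printf '#SampleID\tfastqFile\tForwardPrimer\tReversePrimer\n' > map.txt
printf '515F-806R\t515F-806R.1.fq.gz,515F-806R.2.fq.gz\tGTGCCAGCMGCCGCGGTAA\tGGACTACNNGGGTATCTAAT\n' >> map.txt

# Flag any row whose field count differs from the expected 4.
awk -F'\t' 'NF!=4 {print "bad row " NR; bad=1} END {exit bad}' map.txt && echo "map OK"
```

Rows padded with spaces instead of tabs are a frequent cause of downstream steps not finding the per-sample files.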
It would be amazing if you could add a COI reference DB in future updates.
Is there a workaround for now, or more detailed information on how I can use a custom DB?
E.g. a tool to create the tax4refDB file?
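The tax4refDB layout described below is an assumption (modeled on the SILVA-style files shipped with LotuS2): a tab-separated file mapping each FASTA ID to a semicolon-separated lineage. Whatever the exact taxonomy syntax, the IDs must match the FASTA headers exactly; a quick cross-check with toy data:

```shell
# Toy reference FASTA and a matching tab-separated taxonomy file
# (the lineage format here is an illustrative assumption).
printf '>seq1\nACGT\n' > ref.fasta
printf 'seq1\tEukaryota;Arthropoda;Insecta\n' > ref.tax

# Extract IDs from both files and report any that do not pair up.
grep '^>' ref.fasta | sed 's/^>//;s/ .*//' | sort > fa_ids
cut -f1 ref.tax | sort > tax_ids
comm -3 fa_ids tax_ids | wc -l   # 0 means every ID is matched
```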
/home/miniconda3/envs/lotus2/share/lotus2-2.21-0/lotus2 -create_map 01_map
Use of uninitialized value $first in concatenation (.) or string at /home/miniconda3/envs/lotus2/share/lotus2-2.21-0/lotus2 line 5667.
common prefix
Map is 01_map
Please check that all files required are present in map 01_map.
Use of uninitialized value $pathPre in concatenation (.) or string at /home/miniconda3/envs/lotus2/share/lotus2-2.21-0/lotus2 line 5756.
5667: print "$first common prefix\n";
5756: print "==========================\nTo start analysis:\nlotus2 -m $ofile -i $pathPre/ -o [outdir] [further parameters if desired]\n==========================\n";
I am not familiar enough with Perl to fully understand what is going on, but I am curious as to why there would be any issue with the script itself when I did not edit it in any way. Am I missing something in terms of my input? I have run this from the input directory containing 100 files (R1 and R2 of 50 samples) in FASTQ format.
Hi,
When I run:
lotus2 -taxOnly OTUs.fasta -o lotustax -refDB KSGP_v1.0.fasta -tax4refDB KSGP_v1.0.tax -taxAligner usearch -ITSx 0 -t 48
I get:
TaxOnly option specified, but not an output dir. Assumming:
mkdir: cannot create directory ‘/primary/’: Permission denied
mkdir: missing operand
Try 'mkdir --help' for more information.
mkdir: cannot create directory ‘/LotuSLogS/’: Permission denied
mkdir: cannot create directory ‘/ExtraFiles/’: Permission denied
Can't open Logfile /LotuSLogS/LotuS_run.log
I also tried with the absolute path to the output directory etc, but I keep getting the same error message.
Any idea what I am doing wrong?
Dear Falk, dear Joachim,
could you check the autoInstall.pl re sdm path?
Do you want to
(1) install utax taxonomic classification databases (16S, ITS)?
(0) no utax related databases
Answer:1
Can't open sdm file lotus2//sdm_src/IO.h
Linux xxx 5.4.0-56-generic #62-Ubuntu SMP Mon Nov 23 19:20:19 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
Thanks in advance! I used the linear version provided on the lotus2 webpage.
Best,
Ulrike
I would like to use the BLAST output in MEGAN, but the conversion from BLAST to RMA fails.
The BLAST output should be tab-separated with 12 columns, but I get only 11 columns (without the bitscore) from Lotus2?
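As a possible stopgap (an assumption, not an official fix), one could append a placeholder bitscore as a 12th column so MEGAN accepts the table; note that a constant bitscore may affect MEGAN's LCA weighting. A sketch with a fabricated 11-column hit line:

```shell
# Fabricate one 11-column tabular BLAST line (outfmt-6 style, bitscore missing).
printf 'q1\ts1\t99.0\t250\t1\t0\t1\t250\t1\t250\t1e-50\n' > hits11.blast

# Append a constant placeholder bitscore as column 12.
awk 'BEGIN{FS=OFS="\t"} {print $0, 100}' hits11.blast > hits12.blast

awk -F'\t' '{print NF}' hits12.blast   # prints 12
```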
Dear lotus2 team,
I have 16S data that was already demultiplexed & cleaned (i.e., no primer parts left).
I created a basic sample map using lotus -create_map and then ran lotus2.
Running lotus2 this way creates a warning: "No forward PCR primer for amplicon found in mapping file (column header "ForwardPrimer"). This might invalidate chimera checks."
However, when I add the primer sequences (i.e., using the -forwardPrimer and -reversePrimer arguments), demultiplexing fails with an empty output file.
Is it safe to just run lotus2 without specifying the original primers? If not: is there a way to allow reads to pass QC even if the primer is not present anymore?
Hi,
Unless I missed it, we currently only have files in the config folder that set a number of default parameters for several platforms (used via '-c'; great already, but not very easy to parse).
Would it be possible to also allow yaml import of complex parameter sets, made of your >60 available flags, so that users can easily reuse complex commands and share them in papers or protocols?
Another great addition would be to have a file created after submitting a manual command at CLI and that would include all user-defined AND default parameters for that command. This file could be used as template for the next command to increase reproducibility as in my previous paragraph.
If I had the choice, I would favor YAML over XML or JSON for readability, but the latter two would also be OK, as they can be converted with yq-like tools.
Thanks
Stephane
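Until structured parameter import exists, a flat parameter file can already be expanded into CLI flags in the shell. A minimal sketch (this parses simple `key: value` lines, not real YAML; the file name and keys are illustrative):

```shell
# A flat "key: value" parameter file (illustrative values).
printf 'refDB: SLV\nCL: dada2\nt: 6\n' > params.txt

# Expand each line into a "-key value" flag pair.
awk -F': ' '{printf "-%s %s ", $1, $2}' params.txt
```

The expansion could then be spliced into a command, e.g. `lotus2 $(awk -F': ' '{printf "-%s %s ", $1, $2}' params.txt) -i ... -o ...`, so a parameter set can be versioned and shared alongside a paper.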
Hello,
I am trying to use Lotus2 to process ITS2 reads. I have tried to set the taxAligner to blast, but it does not seem to be read; have you encountered this issue? It seems that Lotus2 is not reading several of my options. My code is below.
Error: "RefDB UNITE requested, but -taxAligner set to "0": therefore RDP classification of reads will be done"
It also puts out this error sometimes:
zsh: command not found: -amplicon_type
zsh: command not found: -tax_group
zsh: command not found: -taxAligner
zsh: command not found: -clustering
My Code:
lotus2 -i Seqs
-o lotus2
-m TestMap.txt
-refDB UNITE \
-amplicon_type ITS2 \
-tax_group fungi \
-taxAligner blast
-clustering vsearch
-id 0.97
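The "zsh: command not found: -amplicon_type" messages are what zsh prints when a line continuation is missing: every unescaped newline starts a new command, so only the options on the first line reach lotus2. A stub illustrating the difference (`lotus2_stub` is a hypothetical stand-in that just counts its arguments):

```shell
# Stand-in for lotus2: prints how many arguments one invocation received.
lotus2_stub() { echo "$#"; }

# With a trailing backslash on every continued line, all options reach
# a single invocation.
lotus2_stub -i Seqs \
  -o lotus2 \
  -m TestMap.txt \
  -refDB UNITE \
  -taxAligner blast   # prints 10
```

In the command above, the lines after `-m TestMap.txt` and after `-taxAligner blast` lack the trailing `\`, so zsh tried to run `-amplicon_type`, `-clustering`, etc. as commands.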
Hi,
If I run lotus2 with -saveDemultiplex 3
it results in an error; I can't say exactly what is happening.
I can't find the log file where the error should be described.
The lotus2 version I use was installed with conda (installed today, 05.01.24)
Presently the autoInstall.pl script requires user input from the console.
To create a conda package for LotuS2 we can skip calling autoInstall.pl for the software dependencies (since these will be installed via package requirements), but the required databases need to be installed/uninstalled as part of the lotus2 package installation/uninstallation, i.e. via post-link.sh/pre-unlink.sh scripts distributed with the conda package. In these scripts we could just run autoInstall.pl with some combination of parameters, but the script must be updated to be callable non-interactively.
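The non-interactive pattern being asked for can be sketched by piping the answers into the script's stdin; shown here with a stub prompt standing in for autoInstall.pl (whether autoInstall.pl itself tolerates piped answers is exactly what the issue asks to be made possible):

```shell
# Feed one pre-recorded answer to a script that reads from stdin
# (the sh -c one-liner is a stub for an interactive prompt).
printf '1\n' | sh -c 'read -r answer; echo "Answer:$answer"'   # prints Answer:1
```

A cleaner long-term fix would be accepting the same choices as command-line flags, so post-link.sh can call the installer deterministically.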
Hi Falk Hildebrand,
I wonder if I can process single-end read sequences using lotus2? If so, how should I set this up? Thanks,
Junhui
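A guess based on the paired-end mapping files shown elsewhere in this thread (unverified against the LotuS2 docs): single-end data might simply be listed as one fastq per sample instead of a comma-separated read pair, e.g.

```
#SampleID fastqFile
sampleA sampleA.fq.gz
sampleB sampleB.fq.gz
```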
Hi, I'm very new to Lotus2, so this might be trivial. I have been trying to use the AF_full_region database produced by the Anaerobic Fungi Network (https://anaerobicfungi.org/databases/), but I got the error below:
[M::main] CMD: /root/lotus2//bin//minimap2-2.17_x64-linux/minimap2 -x sr --sr -u both --secondary=no -N 30 -c -t 1 -o Will_AF/output_AFN/tmpFiles//otu_seeds.fna.phiX.0.cont_hit.paf /root/lotus2//DB//phiX.fasta Will_AF/output_AFN/tmpFiles//otu_seeds.fna
[M::main] Real time: 0.005 sec; CPU: 0.004 sec; Peak RSS: 0.004 GB
Loading Subject Sequences and Ids...
ParseError thrown: Unexpected character '-' found.
Make sure that the file is standards compliant. If you get an unexpected character warning make sure you have set the right program parameter (-p), i.e. Lambda expected nucleic acid alphabet, maybe the file was protein?
In response, I substituted the '-' characters with '', thinking Lotus2 couldn't understand '-'. This still didn't work; the same error appeared, but with:
ParseError thrown: Unexpected character '' found.
Is the problem the characters in my DB files? If so, what character can I substitute that Lotus2 can read?
Many thanks :)
(I can't submit the AF_full_region FASTA file to GitHub, apologies.)
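'-' characters in a reference FASTA usually mean the file is an alignment; stripping the gap characters from the sequence lines (leaving headers untouched) may make it acceptable to lambda. A sketch with toy data:

```shell
# Toy aligned FASTA with '-' and '.' gap characters.
printf '>seq1\nAC-GT-.A\n' > aligned.fasta

# Remove gap characters from non-header lines only.
sed '/^>/!s/[-.]//g' aligned.fasta > degapped.fasta

cat degapped.fasta
```

Deleting the characters outright (rather than replacing them with nothing mid-header, or with another letter) keeps the sequences valid nucleotide strings.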
I am testing lotus2 on my amplicon sequencing data. I used lima for demultiplexing, but I don't like its demultiplexing report. I read your tutorial but didn't find anywhere whether mismatches in barcodes can be accommodated.
Would you please help with this question?
Thanks.
Following info on the related issue 24, I tried the following:
conda create -c conda-forge -c bioconda --strict-channel-priority -n lotus2
Collecting package metadata (current_repodata.json): done
Solving environment: done
## Package Plan ##
environment location: /opt/miniconda3/envs/lotus2
Proceed ([y]/n)? y
Preparing transaction: done
Verifying transaction: done
Executing transaction: done
#
# To activate this environment, use
#
# $ conda activate lotus2
#
# To deactivate an active environment, use
#
# $ conda deactivate
Retrieving notices: ...working... done
I tried installing with conda but it failed
$ conda activate lotus2
$ conda install -c bioconda lotus2
Collecting package metadata (current_repodata.json): done
Solving environment: failed with initial frozen solve. Retrying with flexible solve.
Solving environment: failed with repodata from current_repodata.json, will retry with next repodata source.
Collecting package metadata (repodata.json): done
Solving environment: failed with initial frozen solve. Retrying with flexible solve.
Solving environment: |
Found conflicts! Looking for incompatible packages.
This can take several minutes. Press CTRL-C to abort.
failed
UnsatisfiableError:
I then tried with mamba
$ mamba install -c conda-forge -c bioconda lotus2
mamba (1.0.0) supported by @QuantStack
Looking for: ['lotus2']
pkgs/main/osx-64 No change
bioconda/noarch No change
bioconda/osx-64 No change
pkgs/r/noarch No change
pkgs/main/noarch No change
pkgs/r/osx-64 No change
conda-forge/noarch 10.2MB @ 3.8MB/s 3.2s
conda-forge/osx-64 25.2MB @ 2.3MB/s 12.3s
Encountered problems while solving:
- nothing provides lambda <2 needed by lotus2-2.01-0
The error is not very verbose, any idea how to fix this?
Thanks in advance
I found that Lotus2 (v2.23), or the RDP classifier, fails to identify mitochondrial contamination: for example, some OTUs are actually mitochondrial sequences, but the taxonomy assignment for these OTUs is still bacteria.
Below is the BLAST result for one such OTU:
The command line I used: lotus2 -t 50 -i data -m 16s_map.txt -s sdm_miSeq.txt -o uparse -CL uparse
sdm_miSeq.txt
Hi,
I was wondering which version of the PR2 database is used in Lotus2? I noticed that the taxonomy file of the PR2 DB shipped with the latest GitHub version of Lotus2 has 7 taxonomic levels, while the PR2 website says:
Version 5.0 and above
9 levels : Domain / Supergroup / Division / Subdivision / Class / Order / Family / Genus / Species
Version 4.14.1 and below
8 levels : Kingdom / Supergroup / Division / Class / Order / Family / Genus / Species
Just wondering how results compare to taxonomic identifications I previously did of the same sequences with other tools.
Best,
Sam
Hi,
When I run lotus2 -taxOnly
with
lotus2 -taxOnly /kyukon/scratch/gent/vo/001/gvo00123/vsc46214/CRABS/otus92.fa -o lotustax -refDB Olig01_Annelida_crabs_db.fasta -tax4refDB Olig01_Annelida_crabs_db.tax -taxAligner blast -ITSx 0 -LCA_idthresh 94,80,75,70,65,60 -lulu 0 -t 64
I only get 99 of 152 OTUs back in the resulting otus92.fa.hier file; the other ones are completely missing from the file.
Do you have any idea why this is happening?
Best,
Sam
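For what it's worth, listing exactly which OTUs are missing can help narrow this down. A sketch with toy data, comparing FASTA IDs against the first column of the .hier file (that the first column holds the OTU ID is an assumption; check your file, and skip any header row):

```shell
# Toy FASTA with two OTUs, and a .hier file covering only one of them.
printf '>OTU1\nACGT\n>OTU2\nTGCA\n' > otus.fa
printf 'OTU1\tBacteria\n' > otus.fa.hier

# IDs present in the FASTA vs. IDs classified in the .hier file.
grep '^>' otus.fa | sed 's/^>//' | sort > all_ids
cut -f1 otus.fa.hier | sort > have_ids

# Lines unique to all_ids are the unclassified/missing OTUs.
comm -23 all_ids have_ids   # prints OTU2
```

If the missing IDs all share a property (very short, high-gap, or no BLAST hit above the LCA thresholds), that pattern usually points at the cause.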
Hi All,
It seems that the step "Building tree (fasttree) and aligning (mafft) OTUs" (run by default) has been running on a single thread for quite some time, while I have plenty of free cores available.
ps shows:
/opt/miniconda3/envs/lotus2.23/bin/FastTreeMP -nt -gtr -no2nd -spr 4 -log lotus2_pacbio_V1V9/LotuSLogS//fasttree.log -quiet -out lotus2_pacbio_V1V9/OTUphylo.nwk lotus2_pacbio_V1V9/ExtraFiles//OTU.MSA.fna
The name FastTreeMP suggests that this step could be multithreaded. Is it, and if so, can we speed it up?
Thanks
Stephane
lotus2.23
FastTree 2.1.11 Double precision (No SSE3), OpenMP (88 threads)
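FastTree built with OpenMP (the "MP" binary, as the version banner above confirms) honors the standard OpenMP thread-count variable, so setting it in the environment before invoking lotus2 should reach the fasttree step (whether lotus2 itself overrides it is not verified here):

```shell
# Cap (or raise) the OpenMP thread count for FastTreeMP; the value 16
# is an arbitrary example.
export OMP_NUM_THREADS=16
echo "$OMP_NUM_THREADS"   # prints 16
```

Note that FastTree parallelizes only some phases, so even a correctly threaded run spends stretches on a single core.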