zheminzhou / etoki
all methods related to Enterobase
Home Page: https://enterobase.warwick.ac.uk
License: GNU General Public License v3.0
It would be much nicer if all the components were "sub commands" of a single tool.
Having understandable names would be even better.
Thanks
Hi, could you clarify in the documentation which things I can put into --path? For example, if I have the entire SPAdes package downloaded, do I just run it with --path spades.py=$(which spades.py)? For blast, do I need to consider multiple executables like makeblastdb and blastn?
Thank you!
I am taking some notes on how I ran cgMLST, and I hope you can add documentation for it.
Create database (this took a very long time):
# Downloaded the cgMLST scheme from the EnteroBase FTP into Salmonella.cgMLSTv2.enterobase (undocumented)
\ls -f1 Salmonella.cgMLSTv2.enterobase/*.fasta | \
  grep -v cgMLST_v2_ref.fasta `# ignore already-established reference file` | \
  xargs -n 1 seqtk seq -l 0 `# one file per call; write every record in two-line fasta format` | \
  perl -lane '
    # Get the id (with its leading >) and the seq on the next line, since the input is two-line fasta.
    # Note: no apostrophes may appear inside this single-quoted Perl program.
    $id=$F[0];
    $seq=<>;
    chomp($seq);
    # This probably will not matter, but avoid any infinite loops by quitting if we see the same id twice.
    # %seen is left as a package variable so that it persists across input lines.
    if($seen{$id}++){print STDERR "Already seen $id. Done."; last;}
    # Avoid deflines that might be problematic
    if($id =~ /[^_>0-9a-zA-Z]/){
      print STDERR "Skipping ".$id;
      next;
    }
    print "$id\n$seq";
  ' > enterobase.filtered.fasta
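For readers who prefer Python, the same filtering idea can be sketched as below. This is only an illustration of the steps in the notes above, not part of EToKi: the function names `read_fasta` and `filter_records` and the demo records are made up, and unlike the Perl one-liner it skips a repeated id instead of stopping at it.

```python
import re
from typing import Iterable, Iterator, Tuple

# Assumption: an id is "safe" if it holds only letters, digits, and underscores,
# mirroring the character class used in the Perl filter.
SAFE_ID = re.compile(r"^[0-9A-Za-z_]+$")


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (id, sequence) pairs from FASTA lines, joining wrapped sequence lines."""
    seq_id, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if seq_id is not None:
                yield seq_id, "".join(chunks)
            seq_id, chunks = line[1:], []
        else:
            chunks.append(line)
    if seq_id is not None:
        yield seq_id, "".join(chunks)


def filter_records(records: Iterable[Tuple[str, str]]) -> Iterator[Tuple[str, str]]:
    """Drop records whose id was already seen or contains unsafe characters."""
    seen = set()
    for seq_id, seq in records:
        if seq_id in seen or not SAFE_ID.match(seq_id):
            continue
        seen.add(seq_id)
        yield seq_id, seq


if __name__ == "__main__":
    # Made-up records: one wrapped sequence, one unsafe defline, one repeated id.
    demo = [">aroC_1", "ACGT", "ACGT", ">bad id|x", "TTTT", ">aroC_1", "GGGG"]
    for seq_id, seq in filter_records(read_fasta(demo)):
        print(f">{seq_id}\n{seq}")
```

Because `read_fasta` joins wrapped lines itself, this version would not need the seqtk step to produce two-line FASTA first.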
Good morning,
I hope it is relevant here.
Could EToKi be used to search for an AA motif in the EnteroBase db?
Thank you in advance
El
Hi, I am wondering if it will work if I download vsearch and rename it to usearch in my PATH. I want to package EToKi into a container, but the usearch license is prohibitive.
The http://cab.spbu.ru/ website, from which the "SPAdes-3.13.0-Linux.tar.gz" file is downloaded, is down.
Hello,
I installed EToKi and tried the example as shown in the README file, but I got the following error:
./EToKi.py MLSTdb -i examples/Escherichia.Achtman.alleles.fasta -r examples/Escherichia.Achtman.references.fasta -d examples/Escherichia.Achtman.convert.tab
2020-07-29 20:45:04.997470 Exemplar sequences in ./NS_2085r4ap/clsFna.clust.exemplar
2020-07-29 20:45:04.997569 Clusters in ./NS_2085r4ap/clsFna.clust.tab
2020-07-29 20:45:05.030599 Run BLASTn starts
2020-07-29 20:45:05.384751 Run BLASTn finishes. Got 971 alignments
2020-07-29 20:45:05.384888 Run diamond starts
2020-07-29 20:45:06.047658 Run diamond finishes. Got 966 alignments
2020-07-29 20:45:06.375628 removed 0 paralogous sites.
2020-07-29 20:45:06.375682 obtained 5530 alleles and 39 references alleles
2020-07-29 20:45:06.378881 A file of reference alleles has been generated: examples/Escherichia.Achtman.references.fasta
Traceback (most recent call last):
File "EToKi.py", line 47, in <module>
etoki()
File "EToKi.py", line 41, in etoki
eval(arg.cmd)(sys.argv[2:])
File "/mnt/data/disk1/biotools/EToKi/modules/MLSTdb.py", line 164, in MLSTdb
conversion[0].append(get_md5(allele['value']))
TypeError: string indices must be integers
I installed it in a conda env with Python 3.6.
Any ideas on how to fix it?
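As a general illustration of this class of error (not a diagnosis of MLSTdb.py), a `TypeError` like the one in the traceback appears whenever a plain string is indexed with a key such as `'value'`, for example when a loop yields strings where dict-like records were expected. In the sketch below, `get_md5` is a hypothetical stand-in for the helper named in the traceback, and `allele_digest` is a made-up wrapper that tolerates both shapes.

```python
import hashlib


def get_md5(value: str) -> str:
    """Hypothetical stand-in for the helper named in the traceback."""
    return hashlib.md5(value.encode()).hexdigest()


def allele_digest(allele) -> str:
    """Hash an allele given either as {"value": "ACGT"} or as a bare string."""
    sequence = allele["value"] if isinstance(allele, dict) else allele
    return get_md5(sequence)


if __name__ == "__main__":
    try:
        "ACGT"["value"]  # indexing a str with a str key raises TypeError
    except TypeError as err:
        print(f"TypeError: {err}")

    # Both shapes produce the same digest once the record type is checked.
    print(allele_digest({"value": "ACGT"}) == allele_digest("ACGT"))
```

Whether the alleles in the real code are meant to be dicts or strings is an assumption here; printing `type(allele)` just before line 164 would show which one is actually arriving.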
Hi @zheminzhou,
I am trying to recreate the SNP analysis that would be performed on EnteroBase, given a directory of assemblies.
Would it be possible to add a note to the README about what programs correspond to the steps listed in the EnteroBase docs?
From what I can tell, it seems like:
refMasker ~ RecHMM
refMapper ~ align
refMapper_matrix ~ RecFilter
matrix_phylogeny ~ phylo
... but if that's the case, I'm not sure where the SNP matrix required by RecHMM would come from; the docs describe refMasker as identifying recombination regions from a reference genome, while RecHMM identifies regions from a SNP matrix.
Any help would be appreciated!
Thanks,
~Nick
It would be very helpful to be able to pip3 install EnSuit.
The requests package is required by EToKi but is not mentioned in the README.
python EToKi/EToKi.py -h
The -h option is not handled when no module is given:
Traceback (most recent call last):
File "EToKi/EToKi.py", line 55, in <module>
etoki()
File "EToKi/EToKi.py", line 31, in etoki
exec('from modules.{0} import {0}'.format(sys.argv[1]))
File "<string>", line 1
from modules.-h import -h
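One possible way to avoid this is to validate the first argument before it is used to build an import statement. The sketch below uses argparse subparsers for that; it is a minimal illustration, not EToKi's dispatcher. The `MODULES` table is an assumption (only the names MLSTdb and phylo come from this page, and the summaries are placeholders), and `dispatch` returns a string rather than importing anything.

```python
import argparse

# Assumption: a lookup table of module names; the summaries are placeholders.
MODULES = {
    "MLSTdb": "placeholder summary for MLSTdb",
    "phylo": "placeholder summary for phylo",
}


def build_parser() -> argparse.ArgumentParser:
    """Create a top-level parser that knows every module as a subcommand."""
    parser = argparse.ArgumentParser(prog="EToKi.py")
    sub = parser.add_subparsers(dest="cmd", metavar="module")
    for name, summary in MODULES.items():
        # add_help=False leaves -h after a module name for that module to handle.
        sub.add_parser(name, help=summary, add_help=False)
    return parser


def dispatch(argv):
    """Return the help text, or a description of what would be run."""
    parser = build_parser()
    if not argv or argv[0] in ("-h", "--help"):
        return parser.format_help()
    # Only the module name is parsed here; an unknown name exits with a usage error.
    args = parser.parse_args(argv[:1])
    return f"would run {args.cmd} with {argv[1:]}"


if __name__ == "__main__":
    print(dispatch(["-h"]))
    print(dispatch(["phylo", "-x"]))
```

With this shape, `-h` without a module prints the list of modules, and an unknown first argument produces a usage error instead of a syntax error from a generated import statement.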
With phylo.py, the core genome output doesn't match expectations, assuming I'm correct in thinking that the .matrix file is the right output.
With my alignment and a threshold of 0.95, 4,800,000 of 5,000,000 sites should be retained. Instead, 4,781,411 are output.
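To make the expected count reproducible, the calculation can be written out explicitly. The sketch below counts alignment columns in which at least a given fraction of sequences has a called base. It is an assumption about what "core" means, not a description of how phylo computes it: the function name `core_site_count`, the treatment of `-` and `N` as missing, and the use of `>=` are all choices made for illustration.

```python
def core_site_count(alignment, threshold=0.95, missing=frozenset("-N")):
    """Count columns where at least `threshold` of the sequences have a called base.

    `alignment` is a list of equal-length strings. Which characters count as
    missing, and whether the comparison is >= or >, are assumptions.
    """
    n_seqs = len(alignment)
    count = 0
    for column in zip(*alignment):
        present = sum(base.upper() not in missing for base in column)
        if present >= threshold * n_seqs:
            count += 1
    return count


if __name__ == "__main__":
    # Made-up toy alignment: columns 1 and 4 are complete, columns 2 and 3 are half missing.
    toy = ["ACGT", "AC-T", "A-NT", "ANGT"]
    print(core_site_count(toy, threshold=0.75))
    print(core_site_count(toy, threshold=0.5))
```

Small differences in these choices, such as counting ambiguous bases as present or using a strict inequality, change how many sites pass, so comparing them against the tool's output may help narrow down a discrepancy like the one reported.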