miwipe / ngslca
License: GNU General Public License v3.0
I think we should add examples of all the different types of plots the Rscript produces.
Print the query sequence to stdout.
Make ngsLCA output follow the Kaiju output file style (https://github.com/bioinformatics-centre/kaiju), which gives counts only for the lowest ranks, rather than the Kraken2 or MetaPhlAn style, which sums the counts up to the highest ranks.
The Kaiju style is compatible with phyloseq (https://joey711.github.io/phyloseq/) and mia (https://microbiome.github.io/OMA/).
We still need an output format description for the .lca files and for the files produced by the Rscript.
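To make the difference between the two count styles concrete, here is a minimal Python sketch. It reads "Kaiju style" as counts assigned directly to each taxon and "Kraken2 style" as counts summed over each taxon's whole clade. That reading is an assumption for illustration. The function names, the `parent` dictionary layout, and the toy taxa are hypothetical and are not part of ngsLCA.

```python
from collections import defaultdict


def direct_to_cumulative(parent, direct):
    """Sum each taxon's direct count into itself and all of its ancestors.

    parent: dict mapping taxon -> parent taxon (the root maps to None)
    direct: dict mapping taxon -> reads assigned exactly to that taxon
    """
    cumulative = defaultdict(int)
    for taxon, count in direct.items():
        node = taxon
        while node is not None:  # walk up to the root
            cumulative[node] += count
            node = parent.get(node)
    return dict(cumulative)


def cumulative_to_direct(parent, cumulative):
    """Recover direct counts by subtracting the children's clade totals."""
    children_total = defaultdict(int)
    for taxon, count in cumulative.items():
        p = parent.get(taxon)
        if p is not None:
            children_total[p] += count
    return {t: c - children_total[t] for t, c in cumulative.items()}


# Hypothetical toy taxonomy and counts.
parent = {"root": None, "Salicaceae": "root",
          "Salix": "Salicaceae", "Populus": "Salicaceae"}
direct = {"Salicaceae": 3, "Salix": 5, "Populus": 2}

clade_counts = direct_to_cumulative(parent, direct)
print(clade_counts)
print(cumulative_to_direct(parent, clade_counts))
```

Direct counts can be summed per sample without counting a read twice, which is why tables in this style are easier to load into downstream abundance tools.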
See below: parsed sequences that do not match the reference sequence 100%.
samtools view CHL_155_12485.sort.bam | grep -e 'HISEQ:50:C6NJJANXX:1:1108:14286:4677' -e 'HISEQ:50:C6NJJANXX:1:1108:14285:32643' -e 'HISEQ:50:C6NJJANXX:1:1108:14287:23673' -e 'HISEQ:50:C6NJJANXX:1:1108:14289:31539' -e 'HISEQ:50:C6NJJANXX:1:1108:14289:54817' | less -S
Only use the mappings with the lowest edit distance.
As of f118505, after successfully compiling ngsLCA on Ubuntu 20.04, I get a segmentation fault and empty output files when running sam2lca on one of the example files in the bam_files directory.
See log below:
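The request above can be illustrated with a small Python sketch: for each read, keep only the alignments whose edit distance equals that read's minimum. It works on plain tuples rather than a BAM file. The function name and the toy records are hypothetical, and this is not how ngsLCA implements its filtering. In a real BAM file the edit distance is usually stored in the NM tag.

```python
def keep_lowest_edit_distance(alignments):
    """Keep, per read, only the alignments with the smallest edit distance.

    alignments: list of (read_name, reference_name, edit_distance) tuples
    """
    # First pass: find the minimum edit distance seen for each read.
    best = {}
    for read, _ref, dist in alignments:
        if read not in best or dist < best[read]:
            best[read] = dist
    # Second pass: keep every alignment that ties for that minimum.
    return [(read, ref, dist) for read, ref, dist in alignments
            if dist == best[read]]


# Hypothetical toy records: (read, reference, edit distance).
records = [
    ("read1", "refA", 0),
    ("read1", "refB", 2),
    ("read1", "refC", 0),
    ("read2", "refA", 1),
    ("read2", "refD", 3),
]
for record in keep_lowest_edit_distance(records):
    print(record)
```

Ties are kept on purpose, so a read that maps equally well to several references still contributes all of them to the lowest common ancestor step.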
$ ./ngsLCA -names ncbi_tax_dmp/names.dmp -nodes ncbi_tax_dmp/nodes.dmp -acc2tax ncbi_tax_dmp/nucl_gb.accession2taxid -bam bam_files/SPL_015_1444.fq.plastids.sorted.bam -outnames SPL_015_1444
-> Will output lca results in file: 'SPL_015_1444.lca'
-> [thread1] Will read header
-> Will output lca weight in file: 'SPL_015_1444.wlca'
-> Will output log info (problems) in file: 'SPL_015_1444.log'
-> [thread1] Done reading header: 0.00 sec, header contains: 4322
Segmentation fault (core dumped)
Hi,
Is it possible to combine the per-rank OTU tables into a single table, rather than having them separate (one for species, one for genera, etc.)?
I can see that the program creates files containing all the ranks together (Kraken style), such as the "complete profile" file or the one in the taxa_groups folder, but in my case the counts do not match the counts in the separate files, so I am confused.
Thanks
Hi,
I'm wondering whether it's OK to run ngsLCA with a BAM file produced by bwa-mem. I see that the tutorial uses bowtie2 with some specific parameters. If we use bwa-mem, are there any recommended options?
Thanks,
Hien
I've been having problems running ngsLCA on my own data, so I tried running the test data first, but I can't get that to work either. I downloaded SPL_015_1444.fq.plastids.bam from ERDA, downloaded ncbi_tax_dmp, and then ran the following command:
ngsLCA -editdistmin 0 -editdistmax 0 -names ncbi_tax_dmp/names.dmp -nodes ncbi_tax_dmp/nodes.dmp -acc2tax ncbi_tax_dmp/nucl_gb.accession2taxid.gz -bam SPL_015_1444.fq.plastids.bam -outnames outfile.ed0
It needs to be possible to specify filenames.
Hi,
Thanks for ngsLCA.
I get an error when installing the package "ngsLCA" with the command devtools::install_github("wyc661217/ngsLCA"):
ERROR: dependencies ‘ComplexHeatmap’, ‘ggpubr’, ‘vegan’ are not available for package ‘ngsLCA’
My R version is the latest, 4.2.0.
Can we add the accession number and the query sequence length to the printed output?