Comments (2)
Some more information: below are (1) the exact error message I see when I try to run GraphBin, (2) the first 10 lines of the contigs.paths file, (3) the first 10 headers of the contigs file, and (4) the first 10 lines of the assembly graph file. The assembly graph is the only file that was constructed with the original FASTA deflines, but it doesn't look like they are part of the file content anyway.
(1)
2021-12-16 10:50:52,790 - INFO - Welcome to GraphBin: Refined Binning of Metagenomic Contigs using Assembly Graphs.
2021-12-16 10:50:52,790 - INFO - This version of GraphBin makes use of the assembly graph produced by SPAdes which is based on the de Bruijn graph approach.
2021-12-16 10:50:52,790 - INFO - Input arguments:
2021-12-16 10:50:52,790 - INFO - Assembly graph file: assembly_graph_with_scaffolds.gfa
2021-12-16 10:50:52,790 - INFO - Contig paths file: contigs-v3.paths
2021-12-16 10:50:52,790 - INFO - Existing binning output file: MaxBin2_graphbin_map.csv
2021-12-16 10:50:52,790 - INFO - Final binning output file: graphbin/
2021-12-16 10:50:52,790 - INFO - Maximum number of iterations: 100
2021-12-16 10:50:52,790 - INFO - Difference threshold: 0.1
2021-12-16 10:50:52,790 - INFO - GraphBin started
2021-12-16 10:50:52,801 - INFO - Number of bins available in the initial binning result: 23
2021-12-16 10:50:52,801 - INFO - Constructing the assembly graph
2021-12-16 10:50:52,801 - ERROR - Please make sure that the correct path to the contig paths file is provided.
2021-12-16 10:50:52,802 - INFO - Exiting GraphBin... Bye...!
(2) contigs.paths
c_000000000001
93572836-,92654567+,92654571+,92654567+,92654573+,93571458-
c_000000000001'
93571458+,92654573-,92654567-,92654571-,92654567-,93572836+
c_000000000002
91213823-,93572872-,93549998-,93054327-,25436494-,93565676-,93574710-,93461762-,84512918+,93553746-,93560474+,93464329+,92657131+,92657087+
c_000000000002'
92657087-,92657131-,93464329-,93560474-,93553746+,84512918-,93461762+,93574710+,93565676+,25436494+,93054327+,93549998+,93572872+,91213823+
c_000000000003
84786808+,93050927-;
(3) contigs file
c_000000000001
c_000000000002
c_000000000003
c_000000000004
c_000000000005
c_000000000006
c_000000000007
c_000000000008
c_000000000009
c_000000000010
(4) assembly graph
S 1562096 GACAAGAGCTCATGTCTGTGCTGCAATACTGGCGGGGCCAGGATGTGACGCTCATCGGCGCCACTACACTATGTGATCATTATTCCAAAGCGCTCGCCCTCGCTGGTGCCGCTGCATCAATCCTTGATGTGACCGCAATGACCCTAGCCGGGCTTGCGGCGGCACATCAGCAAGGGCAGCATGCCGATGCATAGAGAAATCATCGCAATTCTGCGGGGCGT KC:i:938
S 43432 CAGTTGTCCAGAGGCCTGCAACATATTTTGAACCAGCAACAAAAACATTGCCCAGAACAGCCATAAAAATACCAATTATGGTGCTGAACAGAATGACCATTACTGACGAAAAGATTAAACCAAACATTCACATTACCTTCTTCGAAGGTCACTTGTTTACAATACATTAACTTAAAGCATAAACATTATAAATTGCGGATCTTTGCTCTTTTCCCCCTTTCATTTCTT KC:i:848
S 85384538 GCCTCCTCCTCCTCCGCCGTCGCCTCCGCAGCTCGTCGGGCGGTCCGCGAATGGCGGTCGGCCGCGAGACATCGAGGCGTCTGCGTCCACGCGCGCCGCTCGGGACCACGCCGTACGGAGCTCCCCCATCTGCGCCCGGCTCTGGGGCACCAGCTGGTGGATGTGCAGCCCGTTGGTGCCGAGGGTTGCGAAGGAAGGTTGCTGCCAGCCGGAGGCCACCCCCCCCGCTCCACCAGAAGGGGTCGCGG KC:i:836
S 92795225 TTATTGATTGATGATGGATCAATATCAATATGTACTTTTTTTGACTTTGGAGAGAACTCATCAATTTTACCAGTTATACGATCATCAAATCTTGCACCAATATTAATTAGTAAATCACAATCGTGCATGGCATTATTGGCTTCATAAGTTCCATGCATTCCAAGCATTCCTAAAAATTGATTATCATCACCAGGATAAGAGCCAAGACCTTGCAGTGTTGAAGTAATTGGAAATCCAGTTAATGCAACTAGCCCTCTCAAGAGTTCACTTGCTTTTGGCCCAGAATTTACAACACCACCACCTGTATAAAAAATTGGTTTAGAAGCTTTATTCATTAACTCAATCAATTGATCAATATCTTTTTGATTAAATTTATTTTTGTTTTTAATTTCAATTTTTTTTTCTTTTTTAAAAGAAGCATATTTTGTTTTTGCAAACTGGATATCTTTTGGAATATCTACAAGCACTGGCCCCGGTCTTCCTGTGGTTGCAACCTCAAATGCTTCATTCATAATTCTTTGGAGATCATTAATATTTTTCACTAACCAATTGTGTTTTGTGCATGGTCTTGTAATTCCAGTTGTGTCACATTCCTGAAAAGCATCTGTTCCTATTAAATGAGTGGGTACTTGTCCTGAAATACAAACTAATGGTACCGAGTCCATGTAAGCATCTGTTAATGCAGTAACAACATTAGTTGCTCCCGGCCCAGAAGTAACTAAAACAACTCCTGGTTTACCAGAAGATCTTGCATAACCCTCTGCTGCATGACCTGCGCCCTGTTCATGTCTAACTAAAATATGTTTAATCGAAGTGTGATTTTTAAGTTCATCATAAATCGGAAGAACTGCTCCTCCTGGGTAACCAAAAATATGATTTACTTTTTGGTCTTCTAGACACTTAAATACTATCTCTGCTCCAGTATAT KC:i:4363
S 952260 GGGCCGTCTTCGACCCCGCGGACATGCCGCAGGGGCTCGAACAGATAGAGCCACATCTTTGCCAACCAAATCGCGTTTCACGCTCTCCTTTTTCCGGTCGGAGGGGCATTTGATTATCTTGGGAAGTGATAAACTTGTTGAGAAAGTTGAAACTGATATTCAATCAGAACCACTGCGAGTTTTTTTGGATCTAGGCTCGACCTCCTCGGCCGCCGTATACGCC KC:i:384
S 777246 GCCCAGCGCTCCCCCTCCGCCGTTCATATGTGCCCCCAAATACGCCGGTCACAGGGACGGCATGGTCTTCAAAAATGACCACAGTGGCCTGGGCTACTACCGCGATGTTGGGCATCAAGTCGCGAGCGTGCTGGCTGACGCCGCCGCCGTCAGCCAGTACATGCCGGTGGTGCAAATCCCTCTGCACCAGCTTCTTCTTGAGCAGCTCCGCCTTGAGGGCAGAGCTGAGGACCTGTCT KC:i:192
S 739802 GCGGCGCCCGGAACCCGGACGAGGTGAGTTTATATTTTATTTCTCGTACGGGCAATTGAACTGACGGCGTGTTTTATTTACTGTATAGATTCCGTTTGACGATCAAGCCGCCTTTGCCAAAGATTTAGCCGCGCTCTTCGAAGAGCCCGGCGCGTTCGCGGACATCGTCATCACCGCGGAGGTGATCCAGACGGTGCTCGCGCTCGGCGTGGTCACCGCCGTCGC KC:i:761
S 81177015 CTTCGACGCCGGCTCGATCGGTGGAGGAGGGCTGCTCGACCTCGGACCCCAGCCCGCGGACGAGCCGGCCCAGCTGCGTGCTAGTGGCGGCGGAGGCGGGGGCGTGCCAGAAAGCGTGCGAGACGCCGCCCGCGCCTCGAGGGAGCGACGGAGCGGCGAGCGGGTGCGCATGAGCAACGATCGCTCGGTCGACTCGCTGACGCCGCGGGGCGACGGCGCGGACGAGGCGTTCCTGACAGC KC:i:384
S 85844838 GTCAATACCAGCAGAATATTCATCAAAGCTCTTAACTTCTGGTGGGAATGACGAAATACGGGGCTGTACCCGCAACATTGGGAGAAAAACAGAACGAGCAAACCGGGCAAAACGTTCGTTAATCATACGAAGCGCATAGTAGTCACCAAGTAGCGAAAGATCATCCGAACCAAACTTGAATGGACGAACATCTGCCATATCACCAACATCTAGAGATGCAACTGATTCGGAAGTTTCTGACCCAAGCCCTTCAATAAGGGCATTAACTTCATCACTACTTAATTTGCGGGTGGATGCCAAAATAACACCTATTATGTTCTAAAAGTTTATTGTAGTACAAAAGATGTGAAATGTACTTCTGCCACACCACCAAAGTCTTCAAGAGATATAAGTTTAGCATTCATGGCATCTTTCATTGTAGTAGCAAGCTTCTCACGTCCTGCAATACCCTTAATATCCTCTTCACCAAACTCACTTAAAACTCCAAGGACGATTGAACGAAGCGCAAGCTGATGAGACTCAACCATTTCCATTATGCTTTCATCATATTGCGTTGATACACCAACACCAACCTGAAGCATTTTACGTGATCCCATTAGGTTTGCCGTAAATGTGCCTGGAAATTCATAATAGATGGTCACAAAGTTCTCAACTTCAGGTGTTTCTTTGGAA KC:i:2253
S 1347984 CTTCTAAAGCAGGCAGTACTTTAGGAAGGACACAAGATCTACTAGAGCAGGCAGTTCTCTAGCAGAGGCTTAGGCAGCAGTAATGGATAACTTGGTATACTGTAAGGTTGATTCCTTTATTTCAGAATAAGTTTTAACATAATATAAAATGATTATACCTAAGGTTCACCACTATAACCACTGTCAGAATGTCTAGGAGCAAGTAATCACTTATGAGT KC:i:861
from graphbin.
Hello @nvpatin,
Thanks for the issue. All versions of GraphBin were originally designed to rely on the standard contig-naming conventions of the respective assemblers (e.g., NODE_* for SPAdes) to keep track of the contig numbers. This is why you are getting the above error. I would suggest using the original SPAdes naming convention and giving it a try.
Supporting custom naming formats in GraphBin would take a really long time to implement and test for all the assemblers. Let me know how it goes.
Thank you!
Best regards,
Vijini
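For anyone hitting the same error, here is a rough sketch of how one might check a contigs.paths file for headers that do not follow the SPAdes convention before running GraphBin. It is not GraphBin's own parser. The regular expression assumes names of the form `NODE_<number>_length_<length>_cov_<coverage>` with an optional trailing `'` for the reverse-complement entry, and the helper names (`is_segment_line`, `nonstandard_headers`) are made up for illustration. The test for path lines is a heuristic that assumes numeric segment IDs, as in the excerpt above.

```python
import re

# Assumed SPAdes-style contig name, e.g. NODE_1_length_1000_cov_12.5,
# with an optional trailing apostrophe marking the reverse-complement path.
SPADES_NAME = re.compile(r"^NODE_(\d+)_length_(\d+)_cov_([\d.]+)'?$")


def is_segment_line(line):
    """Heuristic: a path line starts with a numeric segment id followed by + or -."""
    first = line.rstrip(";").split(",")[0]
    return bool(re.fullmatch(r"\d+[+-]", first))


def nonstandard_headers(lines):
    """Return header lines that do not match the assumed SPAdes naming pattern."""
    bad = []
    for raw in lines:
        line = raw.strip()
        if not line or is_segment_line(line):
            continue  # skip blank lines and segment-path lines
        if not SPADES_NAME.match(line):
            bad.append(line)
    return bad


sample = [
    "c_000000000001",
    "93572836-,92654567+",
    "c_000000000001'",
    "93571458+,92654573-",
    "NODE_1_length_1000_cov_12.5",
    "84786808+,93050927-;",
]
print(nonstandard_headers(sample))
```

On the sample above, only the two `c_...` headers are reported. An empty result would suggest the headers already follow the expected convention.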