Comments (9)

rchikhi commented on August 18, 2024

Hi Nick!
Input looks like what BCALM can take, and command line looks good. Variable-length reads are fine.

  1. can you please pastebin the full output?
  2. did bcalm work for you on another dataset, or does that bug occur for every dataset you try?

from gatb-core.

NickSto commented on August 18, 2024

Hi Ryan, thanks for the reply! Paul pointed me to this tool and it looks like it could really help me out of a bind.
I should've mentioned, I did try it on the example/tiny_read.fa dataset (using example/run-tiny.sh) and it finished fine, without this bug.
The full stdout is here, and the stderr is here.

rchikhi commented on August 18, 2024

I see nothing out of the ordinary in the log files, except maybe:

  1. 48 threads is more than I generally test with. Please try adding "-nb-cores 1" to the command line.
  2. If that doesn't work, would you mind sending me the file? I'm on brubeck.

NickSto commented on August 18, 2024

Oh, weird, I don't know how it ended up using 48 threads. I set it to 1 and it didn't fail with that error anymore! However, it failed with a different one:

[...]
Done with all compactions   --  memory [current, maximum (maxRSS)]: [ 661, 2177] MB 
Nb bglue threads: 1
unitigs graph destructor called
EXCEPTION: Unable to open bank 'k21a10n1/SC8C1.bcalm.unitigs.fa.glue'

And the output directory is no longer filled with dozens of files like SC8C1.bcalm.unitigs.fa.glue.44 and SC8C1.bcalm.unitigs.fa.doubledKmers.29, but instead with only SC8C1.bcalm.unitigs.fa.glue, SC8C1.bcalm.unitigs.fa.glue.0, and SC8C1.bcalm.h5.
The input file is in scratch5, at scratch5/nick/family/graph/bcalm/SC8C1/SC8C1.fq.gz. I'll also try with some different inputs, just to check.

NickSto commented on August 18, 2024

Update: I've narrowed it down to the choice of input.
I tried going back to the raw data, before quality filtering, and it worked! Somehow, the difference in the FASTQ files before and after filtering did it.
I filtered using sickle (v1.33), which trims bad ends off reads and filters them out if there's not enough good sequence left. Here's the specific sickle command I used:
$ sickle pe -t sanger -q 20 -l 100 -f $fastq1in -r $fastq2in -o $fastq1out -p $fastq2out -s $fastqs
You can find the raw data on brubeck at:
scratch1/boris/pnas_2014/mt.data/SC8C1-bl_[12].fq
and the filtered data at:
scratch5/nick/family/all4/fastq/filt/SC8C1-bl_[12].fq
Update to the update: And now I can't reproduce the case where it works. No matter how I manipulate the input, I keep getting EXCEPTION: Unable to open bank 'filename.bcalm.unitigs.fa.glue'.
But that bank file exists and contains the name of another file, filename.bcalm.unitigs.fa.glue.0, which contains valid FASTA sequences.
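
One way to look into this is to check where the names listed inside the bank file actually resolve from. The sketch below assumes the .glue bank file is plain text with one file name per line (the thread only says it "contains the name of another file"); check_bank_entries is a hypothetical helper for illustration, not part of BCALM or GATB-Core.

```python
from pathlib import Path


def check_bank_entries(bank_path, cwd=None):
    """Report where each file name listed in a bank file can be found.

    Assumption: the bank file is plain text, one file name per line.
    Returns a list of (entry, found_from_cwd, found_next_to_bank) tuples.
    """
    bank = Path(bank_path)
    cwd = Path(cwd) if cwd is not None else Path.cwd()
    results = []
    for line in bank.read_text().splitlines():
        entry = line.strip()
        if not entry:
            continue  # skip blank lines
        # Does the name resolve from the working directory?
        found_from_cwd = (cwd / entry).is_file()
        # Does it resolve from the directory that holds the bank file?
        found_next_to_bank = (bank.parent / entry).is_file()
        results.append((entry, found_from_cwd, found_next_to_bank))
    return results
```

Calling check_bank_entries on the .glue file from the directory where bcalm was launched shows whether each listed name is reachable from there. An entry that is found next to the bank file but not from the working directory would suggest that a relative name is being resolved against the wrong directory.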

rchikhi commented on August 18, 2024

Thanks for the data. I can reproduce the issue now. I think it is caused by the "-out" argument pointing to another folder: if you remove this argument, or ask for output in the same folder, it doesn't crash. I'll work on a fix.
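
Until a fix is available, the workaround described above can be sketched as follows: leave out -out and launch bcalm from inside the intended output directory. This is only a sketch under stated assumptions. bcalm_command_in_output_dir is a hypothetical helper; the -in, -kmer-size and -abundance-min flags and any values passed for them are not stated in this thread; and it assumes bcalm writes its output to the working directory when -out is omitted.

```python
from pathlib import Path


def bcalm_command_in_output_dir(reads, out_dir, kmer_size, abundance_min, nb_cores=1):
    """Build a bcalm argument list and working directory that avoid -out.

    Assumption: without -out, bcalm writes its files to the working directory.
    """
    # Use absolute paths, because the process will run from out_dir.
    reads = Path(reads).resolve()
    out_dir = Path(out_dir).resolve()
    argv = [
        "bcalm",
        "-in", str(reads),
        "-kmer-size", str(kmer_size),          # assumed flag name
        "-abundance-min", str(abundance_min),  # assumed flag name
        "-nb-cores", str(nb_cores),            # flag mentioned in this thread
    ]
    return argv, out_dir
```

The returned pair can be passed to subprocess.run(argv, cwd=out_dir, check=True) once the output directory exists, so that bcalm runs from that directory rather than being pointed at it with -out.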

NickSto commented on August 18, 2024

Argh! Yep, that was it. Thanks so much for helping to figure this out!

rchikhi commented on August 18, 2024

You're welcome, and I'm a bit embarrassed about that bug in such basic functionality :) Let me know if you stumble on other issues with bcalm, especially if the original "wtf traveller kmer" error resurfaces.

rchikhi commented on August 18, 2024

The -out bug has been fixed by commit 11a2aa1.
