
Comments (16)

infphilo avatar infphilo commented on July 22, 2024

I'm copying @yaschenk and @kwrodarmer on this issue. Many users run HISAT2 with the --sra-acc option successfully, but some occasionally encounter the problem above (i.e., VCursorCellDataDirect failed). The HISAT2 binaries were built using ngs-sdk.1.1.1 (I just noticed that a newer version, 1.2.2, was released about a month ago). Has this been fixed, or is it a problem inherent to transferring data over the internet? BTW, I'll modify HISAT2 to output a warning instead of terminating silently on this runtime error.

from hisat2.

kwrodarmer avatar kwrodarmer commented on July 22, 2024

There have been no changes to address an error of this sort. We'll need to debug. Could you provide more information, such as the command line that failed?


michael-imbeault avatar michael-imbeault commented on July 22, 2024

I'll run it again tonight, but last time it was just a straight-up --sra-acc with no other parameters and the standard hg19 genome, piping to samtools to make a BAM. I'll test whether it still happens and whether it's somehow specific to some SRAs. I can tell you that it was not an internet connection problem on my side (it might have been server side).

On 19.10.2015 19:32, kwrodarmer wrote:

> There have been no changes to address an error of this sort. We'll need to debug. Could you provide more information, such as the command line that failed?

Reply to this email directly or view it on GitHub
https://github.com/infphilo/hisat2/issues/5#issuecomment-149290561.


kwrodarmer avatar kwrodarmer commented on July 22, 2024

It is exactly the specific input accessions that I'd like in order to be able to duplicate the problem on our end. Without it, there is no way we can assist in debugging the problem.


michael-imbeault avatar michael-imbeault commented on July 22, 2024

SRR1203781 was the one I tested with. I did not try others once I ran into problems.


kwrodarmer avatar kwrodarmer commented on July 22, 2024

Thanks - that's exactly what I needed. Okay, we'll start debugging.


michael-imbeault avatar michael-imbeault commented on July 22, 2024

On my end a run has now completed successfully, so maybe the servers at the SRA end sometimes fail and cause a premature termination. If that happens, it would be nice if the program terminated with an error (not a warning) and, ideally, tried to resume the connection before that.


kwrodarmer avatar kwrodarmer commented on July 22, 2024

We can investigate the network issues with our systems group. Please send network and execution time information to [email protected] and we will check our logs for anything specific. Please note, however, that we already checked for errors at or around the time you reported the problem and did not find anything suspicious. Still, with more accurate information we may be able to do more.

Meanwhile, we're trying to duplicate your results.


michael-imbeault avatar michael-imbeault commented on July 22, 2024

I think it might be related to our computing cluster environment: the runs finish from my desktop, but terminate at a random point with the error above when I run them from the cluster. Running more tests today, will report later.


michael-imbeault avatar michael-imbeault commented on July 22, 2024

As it (still) fails only on the cluster, I'm wondering if it could be related to a disk space issue for the cache of sra-tools. I think it defaults to the current user's home directory, which in my case is limited (I'm running the script on our /scratch space, where disk space is not an issue). Looking for a way to change it right now, will report if it fixes the problem.
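A quick way to test the disk-space hypothesis is to compare the free space on the filesystem holding the home directory with the one holding /scratch. This is only a minimal sketch: the helper name `free_gib` is made up for illustration, and the two paths are simply the ones mentioned in this comment.

```python
import os
import shutil

def free_gib(path):
    """Return the free space, in GiB, on the filesystem that holds `path`."""
    path = os.path.abspath(os.path.expanduser(path))
    # Walk up to the nearest existing ancestor so that a directory
    # that has not been created yet can still be checked.
    while not os.path.exists(path):
        path = os.path.dirname(path)
    return shutil.disk_usage(path).free / 2**30

if __name__ == "__main__":
    # Paths taken from the comment above; adjust for your own cluster.
    for label, path in [("home", "~"), ("scratch", "/scratch")]:
        print(f"{label:8s} {path:10s} {free_gib(path):8.1f} GiB free")
```

If the home filesystem has only a few GiB free, a large SRA run could plausibly fill it partway through an alignment.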


michael-imbeault avatar michael-imbeault commented on July 22, 2024

The hypothesis was right: the default sra-tools cache was set to $HOME/ncbi, which is space-limited on the cluster. I made a symbolic link to our storage space and now it completes fully.
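For anyone who wants to script the same workaround, here is a rough sketch of the symlink approach. The function name `relocate_with_symlink` and the example target directory are hypothetical; only the $HOME/ncbi location comes from this thread.

```python
import os
import shutil

def relocate_with_symlink(cache_dir, new_root):
    """Move cache_dir under new_root and leave a symlink at the old path."""
    cache_dir = os.path.abspath(os.path.expanduser(cache_dir))
    if os.path.islink(cache_dir):
        return os.path.realpath(cache_dir)  # already relocated, nothing to do
    target = os.path.join(new_root, os.path.basename(cache_dir))
    os.makedirs(new_root, exist_ok=True)
    if os.path.isdir(cache_dir):
        if os.path.exists(target):
            raise FileExistsError(target)   # refuse to merge two caches
        shutil.move(cache_dir, target)      # keep anything already cached
    else:
        os.makedirs(target, exist_ok=True)
    os.symlink(target, cache_dir)
    return target

if __name__ == "__main__":
    # Hypothetical target directory; replace with your own storage space.
    print(relocate_with_symlink("~/ncbi", "/scratch/my_user"))
```

The tools keep reading and writing $HOME/ncbi, but the data lands on the larger filesystem.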


kwrodarmer avatar kwrodarmer commented on July 22, 2024

By default, our configuration places its cache in the user's home directory. This can be changed using the vdb-config tool from the SRA toolkit. But if you are running with a good internet connection (and especially if you are running on a compute cluster), it is probably a better idea to disable user caching altogether.

Caching helps with random access patterns by avoiding repeated retrieval of the same portion of the SRA file, and it helps if you are running multiple passes over the same file. Using a cluster can cause a small but important access conflict with some of the reference sequences if the cache is shared, which is the case whenever you use the default location in $HOME.

So far, we have not been able to reproduce your results, but they would be consistent with running out of disk space, since quality values take up the bulk of SRA storage.


kwrodarmer avatar kwrodarmer commented on July 22, 2024

Here is a link to our configuration page: https://github.com/ncbi/sra-tools/wiki/Toolkit-Configuration


infphilo avatar infphilo commented on July 22, 2024

This is quite useful, thanks guys! I'll add some additional description of the --sra-acc option to the HISAT2 website (the manual page) so that people know how to disable the cache, especially when they use a cluster. I'll also modify HISAT2 to terminate with an error message in case of a connection error or a similar failure. I may modify HISAT2 to retry one more time before giving up and terminating.
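To make the intended behavior concrete, the "retry once, then terminate with an error" idea might look roughly like the sketch below. It is in Python purely for illustration and is not HISAT2 code; `ReadFailed`, `read_with_retry` and `main` are hypothetical names, and `read_chunk` stands in for whatever call actually reads from the SRA.

```python
import sys
import time

class ReadFailed(Exception):
    """Hypothetical stand-in for a failed SRA read (not a real HISAT2 or NGS type)."""

def read_with_retry(read_chunk, retries=1, delay=0.0):
    """Call read_chunk(); on ReadFailed, retry up to `retries` more times, then re-raise."""
    for attempt in range(retries + 1):
        try:
            return read_chunk()
        except ReadFailed as err:
            if attempt == retries:
                raise  # out of retries: let the caller report the error
            print(f"warning: read failed ({err}); retrying", file=sys.stderr)
            time.sleep(delay)

def main(read_chunk):
    """Return a process exit status: 0 on success, 1 if the read keeps failing."""
    try:
        read_with_retry(read_chunk)
    except ReadFailed as err:
        print(f"error: {err}", file=sys.stderr)
        return 1  # non-zero status so shell pipelines notice the failure
    return 0
```

The important part is the non-zero exit status: a run that stops early should not look like a successful one to a downstream samtools pipe or a job scheduler.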


kwrodarmer avatar kwrodarmer commented on July 22, 2024

We're going to make some improvements to work harder to continue in the face of this type of error. While we clearly detect the problem when it occurs, it may not be the best decision in this case to throw it back to HISAT2. Instead, we could just invalidate the cache and continue serving directly from the network. So we'll try to do more on our end, too.
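As a rough illustration of that fallback (invalidate the cache and keep serving from the network), here is a minimal Python sketch. It is not how the SRA toolkit is implemented; `CachedReader`, `fetch_remote` and the dict-like `cache` are hypothetical, and a cache write failure is modeled as an `OSError`.

```python
class CachedReader:
    """Serve blocks from a cache when possible and drop the cache if writing to it fails."""

    def __init__(self, fetch_remote, cache):
        self.fetch_remote = fetch_remote  # callable: block_id -> bytes (network read)
        self.cache = cache                # dict-like store; writes may raise OSError

    def read(self, block_id):
        if self.cache is not None and block_id in self.cache:
            return self.cache[block_id]
        data = self.fetch_remote(block_id)
        if self.cache is not None:
            try:
                self.cache[block_id] = data
            except OSError:
                # For example a full disk: invalidate the cache and carry on,
                # serving every later block directly from the network.
                self.cache = None
        return data
```

With this shape, a full cache disk costs some repeated downloads but does not abort the run.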


michael-imbeault avatar michael-imbeault commented on July 22, 2024

Thanks everyone, consider this fixed on my end, so closing the issue.

