sheinasim / hifiadapterfilt

Remove CCS reads with remnant PacBio adapter sequences and convert outputs to a compressed .fastq (.fastq.gz).

License: GNU General Public License v3.0

Shell 99.38% TeX 0.62%

hifiadapterfilt's People

Contributors

sheinasim

hifiadapterfilt's Issues

Blast version

Hi!

I am trying to run this tool on my PacBio reads, but I get a generic error from BLAST (as if some options were not compatible), which makes me think there may be a problem with my BLAST version.
Which BLAST version do you recommend?

Thanks!!

add --cite or --version

Would it be possible to add a --cite or --version option to the script? It would make integration into our pipelines much easier.

performance question

Thanks for this nice tool. Just a quick question: does HiFiAdapterFilt perform better if I run it on one large input file? Or would it be better to split my input file into several smaller fastq.gz files (for example with fastqsplitter) and store them in one directory, on which I then run HiFiAdapterFilt?

fasta file type

Dear @sheinasim
Thanks for your adapter-filtering tool.
Is there any way to accept FASTA files for HiFi reads?

Looking forward to your reply.

Best
Johnson

.temp_file_list: No such file or directory

Hi,
I am trying to run HiFiAdapterFilt on a raw PacBio HiFi .bam file using the following command:

bash pbadapterfilt.sh -p /path/to/bam/220209_hifi_reads.bam -o HiFiAdaptFilt_220209

and I keep getting the following error:
path/to/bam/220209_hifi_reads.bam.temp_file_list: No such file or directory

It seems to be making the output directory, but cuts out immediately. Do you know what could be going on here?

expected time

What are the typical run times for the pipeline? I have a 40GB fastq and I expect it could be anywhere from 5-15 hours, based on transfer rates.

pacbio_vectors_db error

I encountered an error:

BLAST Database error: No alias or index file found for nucleotide database [/pacbio_vectors_db] in search path [/home/liangc/Downloads/PacBio:/scratch/ncbi_blast+/db:]

To solve this problem, I have tried two approaches:
(1) Inside my /Downloads/PacBio folder, I created a folder named "pacbio_vectors_db" and put all the relevant files in it.
(2) Inside my /Downloads/PacBio folder, I placed all the individual files that I copied from HiFiAdapterFilt-master/DB/.

Neither approach works.

In my .bashrc file, I have put these commands:

export BLASTDB=/scratch/ncbi_blast+/db/
export PATH=$PATH:/home/liangc/Software/HiFiAdapterFilt-master
export PATH=$PATH:/home/liangc/Software/HiFiAdapterFilt-master/DB
export PATH=$PATH:/home/liangc/Downloads/PacBio/pacbio_vectors_db
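
A note on the error above: BLAST reports that it found no alias or index file for the database name it was given. For a nucleotide database these are normally files named after the database, such as pacbio_vectors_db.nin (index) or pacbio_vectors_db.nal (alias), because the database name is a file prefix and not a folder. A minimal sketch for checking whether a directory holds such a file follows. The helper name has_nucl_db is hypothetical, and the directory is an assumption taken from the PATH entries in the report.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of HiFiAdapterFilt): succeed if DIR holds an
# index (.nin) or alias (.nal) file for the nucleotide database NAME.
has_nucl_db() {
    local dir=$1 name=$2
    [ -f "$dir/$name.nin" ] || [ -f "$dir/$name.nal" ]
}

# Assumed location, taken from the PATH entries in the report above.
db_dir="/home/liangc/Software/HiFiAdapterFilt-master/DB"

if has_nucl_db "$db_dir" pacbio_vectors_db; then
    echo "found pacbio_vectors_db in $db_dir"
else
    echo "no pacbio_vectors_db index or alias file in $db_dir"
fi
```

If the check fails for the directory that is supposed to hold the database, the files were probably copied somewhere else or under a different prefix.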

Blast database issue

Hi Sheina,

I was eager to try your adapter trimming tool but I experienced some issues I thought you might want to know about.

I noticed in the 2.0.0 release that line 6 is:
DBpath=$(echo $PATH | sed 's/:/\n/g' | grep "HiFiAdapterFilt/DB" | head -n 1)

However, I think it should be:
DBpath=$(echo $PATH | sed 's/:/\n/g' | grep "HiFiAdapterFilt-2.0.0/DB" | head -n 1)

That still isn't solving a problem I'm having, though. I'm consistently getting the following error unless I hard-code the path to the blast database:

BLAST Database error: No alias or index file found for nucleotide database [/pacbio_vectors_db] in search path [/grps2/kmk/2022-04-20_Ap212/PB744_AP212_5C_ULTP/r64069_20220330_183728/C1/outputs/adapter_filter_test::]

Also, do you know if the adapters used in the ultra-low PacBio library prep are the same as the ones in the pacbio_vectors_db file?

Thanks!
Kevin
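
A note on the lookup quoted from line 6: it keeps the first PATH entry that contains the literal text HiFiAdapterFilt/DB. When the install directory carries a suffix such as -master or -2.0.0, nothing matches, DBpath stays empty, and the database name collapses to /pacbio_vectors_db, which is consistent with the errors reported here. The sketch below reproduces that behaviour. The wrapper find_db_path, the sample PATH strings, and the looser pattern are all hypothetical illustrations, not the author's fix, and tr stands in for the sed call so the sketch does not depend on GNU sed.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper around the quoted lookup. It takes a PATH-style string
# and a grep pattern as arguments instead of reading $PATH directly.
find_db_path() {
    local path_string=$1 pattern=$2
    # One PATH entry per line, keep entries matching the pattern, take the first.
    echo "$path_string" | tr ':' '\n' | grep "$pattern" | head -n 1
}

# Made-up PATH strings for illustration only.
exact="/usr/bin:/home/user/HiFiAdapterFilt/DB"
renamed="/usr/bin:/home/user/HiFiAdapterFilt-master/DB"

# Pattern from the quoted line: matches only the unsuffixed directory name.
echo "exact:   [$(find_db_path "$exact" "HiFiAdapterFilt/DB")/pacbio_vectors_db]"
echo "renamed: [$(find_db_path "$renamed" "HiFiAdapterFilt/DB")/pacbio_vectors_db]"

# Hypothetical looser pattern that tolerates a suffix before /DB.
echo "looser:  [$(find_db_path "$renamed" 'HiFiAdapterFilt[^/]*/DB')/pacbio_vectors_db]"
```

The second line prints an empty directory part, mirroring the bracketed database name in the error messages. Renaming the install directory or hard-coding the path, as the report describes, avoids the empty match.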
