sheinasim / hifiadapterfilt
Remove CCS reads with remnant PacBio adapter sequences and convert outputs to a compressed .fastq (.fastq.gz).
License: GNU General Public License v3.0
Hi!
I am trying to run this tool on my PacBio reads, but I get a general error from blast (as if some options were incompatible), which makes me think there may be a problem with my blast version.
What is the recommended version for blast to use?
Thanks!!
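When reporting a problem like this, it helps to state which BLAST+ build is installed. A minimal sketch, assuming the `blastn` executable from BLAST+ is the one the tool calls; the function name `report_blast_version` is made up for illustration:

```shell
# Print the installed BLAST+ version, or a note if blastn is not on PATH.
report_blast_version() {
    if command -v blastn >/dev/null 2>&1; then
        # Keep only the first line of the version report
        blastn -version | head -n 1
    else
        echo "blastn not found on PATH"
    fi
}

report_blast_version
```

Pasting that line into the issue alongside the full error message makes version mismatches much easier to spot.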
fastq.gz and fasta.gz are the default outputs for the IIe system; please add an option to work directly from them.
Would it be possible to add a --cite or --version option to the script? It would make integration into our pipelines that much easier.
Hey @sheinasim, I can't make sense of the error the tool throws at me.
This is my command:
sh hifiadapterfilt.sh -p /data01/mlandi/TME117-assembly/filtered.data/filtered.cell1.fastq.gz -t 32 -o filtered.adapter
Kindly assist. Am I missing parameters?
Thanks for this nice tool. Just a quick question. Does HiFiAdapterFilt perform better if I run it on one large input file? Or would it be better to split my input file in several smaller fastq.gz files (for example with fastqsplitter), which I then store in one directory on which I let HiFiAdapterFilt run?
Dear @sheinasim
Thanks for your solution for adapters.
Is there any way to accept the fasta file type for HiFi reads?
Looking forward to your reply.
Best
Johnson
Hi,
I am trying to run HiFiAdapterFilt on a raw PacBio hifi .bam file using the following command:
bash pbadapterfilt.sh -p /path/to/bam/220209_hifi_reads.bam -o HiFiAdaptFilt_220209
and I keep getting the following error:
path/to/bam/220209_hifi_reads.bam.temp_file_list: No such file or directory
It seems to be making the output directory, but cuts out immediately. Do you know what could be going on here?
What are the typical run times for the pipeline? I have a 40 GB fastq file and I expect it could take anywhere from 5 to 15 hours, based on transfer rates.
I encountered an error:
BLAST Database error: No alias or index file found for nucleotide database [/pacbio_vectors_db] in search path [/home/liangc/Downloads/PacBio:/scratch/ncbi_blast+/db:]
To solve this problem, I have tried two approaches:
(1): Inside my /Downloads/PacBio folder, I created a folder named "pacbio_vectors_db" and put all the relevant files in it.
(2): Inside my /Downloads/PacBio folder, I placed all the individual files that I copied from HiFiAdapterFilt-master/DB/.
Neither approach works.
In my .bashrc file, I have put these commands:
export BLASTDB=/scratch/ncbi_blast+/db/
export PATH=$PATH:/home/liangc/Software/HiFiAdapterFilt-master
export PATH=$PATH:/home/liangc/Software/HiFiAdapterFilt-master/DB
export PATH=$PATH:/home/liangc/Downloads/PacBio/pacbio_vectors_db
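One possible explanation, offered as a sketch rather than a confirmed diagnosis: another report on this page quotes a line from the 2.0.0 release that finds the database directory by searching `PATH` for an entry containing `HiFiAdapterFilt/DB`. If that is how the lookup works, a directory named `HiFiAdapterFilt-master/DB` would not match, the directory variable would be empty, and BLAST would be asked for `/pacbio_vectors_db`, which is what the error shows. The helper name `find_db_dir` is hypothetical, and `tr` is used here in place of the quoted `sed` call for portability:

```shell
# Hypothetical helper: return the first colon-separated entry of a PATH-like
# string that contains the given pattern.
find_db_dir() {
    printf '%s\n' "$1" | tr ':' '\n' | grep -- "$2" | head -n 1
}

example_path="/usr/bin:/home/liangc/Software/HiFiAdapterFilt-master/DB"

# Pattern quoted from the 2.0.0 release: no entry matches, so the result is empty
db_dir=$(find_db_dir "$example_path" "HiFiAdapterFilt/DB")
echo "database argument: ${db_dir}/pacbio_vectors_db"

# Pattern adjusted to the actual directory name: the DB directory is found
db_dir=$(find_db_dir "$example_path" "HiFiAdapterFilt-master/DB")
echo "database argument: ${db_dir}/pacbio_vectors_db"
```

Under that assumption, either renaming the install directory to `HiFiAdapterFilt` or adjusting the pattern in the script would let the lookup succeed.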
delete big fasta and fastq files (unzipped or bam converted files)
Hi Sheina,
I was eager to try your adapter trimming tool but I experienced some issues I thought you might want to know about.
I noticed in the 2.0.0 release that line 6 is:
DBpath=$(echo $PATH | sed 's/:/\n/g' | grep "HiFiAdapterFilt/DB" | head -n 1)
However, I think it should be:
DBpath=$(echo $PATH | sed 's/:/\n/g' | grep "HiFiAdapterFilt-2.0.0/DB" | head -n 1)
That still isn't solving a problem I'm having, though. I'm consistently getting the following error unless I hard-code the path to the blast database:
BLAST Database error: No alias or index file found for nucleotide database [/pacbio_vectors_db] in search path [/grps2/kmk/2022-04-20_Ap212/PB744_AP212_5C_ULTP/r64069_20220330_183728/C1/outputs/adapter_filter_test::]
Also, do you know if the adapters used in the ultra-low PacBio library prep are the same as the ones in the pacbio_vectors_db file?
Thanks!
Kevin
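Rather than hard-coding one directory name per release, the lookup could accept any directory whose name starts with `HiFiAdapterFilt`. The following is only a sketch of that idea, not the script's actual code: the function name `find_hifi_db` and the `/opt/...` paths are invented for illustration, and it uses `tr` and `grep -E` instead of the `sed` call quoted above.

```shell
# Hypothetical variant of the lookup: accept any PATH entry of the form
# .../HiFiAdapterFilt<suffix>/DB (e.g. HiFiAdapterFilt-2.0.0, HiFiAdapterFilt-master).
find_hifi_db() {
    printf '%s\n' "$1" | tr ':' '\n' | grep -E 'HiFiAdapterFilt[^/]*/DB/?$' | head -n 1
}

# Illustrative PATH-like strings
find_hifi_db "/usr/bin:/opt/HiFiAdapterFilt-2.0.0/DB"
find_hifi_db "/usr/bin:/opt/HiFiAdapterFilt/DB/"
```

If the function prints nothing, no matching entry is on `PATH`, which would again leave BLAST looking for `/pacbio_vectors_db` at the filesystem root.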
The count of filtered files ends up in stderr; could it be written to the stats.txt output instead?
Does HiFiAdapterFilt work on HiFi data from the Revio platform?