Comments (4)

nextgenusfs commented on July 20, 2024

Are these the only two files in the 'processed' folder? Based on the log file, the script appears to be merging the paired-end reads correctly, although if these are 250 bp reads you should change the --read_length setting. In that processed folder, do you also see a single file for each of your samples, e.g. V4.fastq?

What primers did you use for amplification and sequencing? The default settings target the ITS2 region using the fITS7 and ITS4 primers. You can specify different primers using the -f and -r options. The default settings also assume that you used the Illumina TruSeq dual barcoding approach, where the primers are intact and the reads from the sequencing center look like this:
5'primer-read-3'primer
So the script will only output reads in which it can find the forward primer. If that is not your read structure, i.e. your primers have already been removed, then you need to pass the --require_primer off option at runtime.
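
One quick way to see whether your reads still carry the forward primer is to count how many sequences begin with it. The following is only a rough sketch and not part of UFITS: the function name count_primer_reads and the two-record demo file are made up for illustration, the ITS1-F sequence is used as the example primer, and the match is exact (no mismatches or degenerate bases), so it may undercount compared with whatever matching the script itself performs.

```shell
# Count reads in a 4-line-per-record FASTQ file whose sequence starts with a primer.
# Exact match only: no mismatches and no IUPAC degenerate bases are handled.
count_primer_reads() {
    fastq="$1"
    primer="$2"
    awk -v p="$primer" '
        NR % 4 == 2 {                      # the sequence is line 2 of each record
            total++
            if (index($0, p) == 1) hits++  # primer begins at position 1
        }
        END { printf "%d of %d reads start with the primer\n", hits, total }
    ' "$fastq"
}

# Tiny hypothetical FASTQ file, for demonstration only (2 records).
demo=$(mktemp)
cat > "$demo" <<'EOF'
@read1
CTTGGTCATTTAGAGGAAGTAAACGTACGT
+
IIIIIIIIIIIIIIIIIIIIIIIIIIIIII
@read2
ACGTACGTACGTACGTACGT
+
IIIIIIIIIIIIIIIIIIII
EOF

result=$(count_primer_reads "$demo" CTTGGTCATTTAGAGGAAGTAA)
echo "$result"
rm -f "$demo"
```

If almost no reads start with the primer, the primers were probably removed upstream and --require_primer off is likely the setting you want.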

You should also set --read_length 250 if you have PE 250 bp reads.
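
If you are not sure which read length your run used, the longest read in a raw FASTQ file is usually a good hint. This is a small sketch under that assumption; max_read_length is a hypothetical helper, not a UFITS command, and it reads FASTQ records from standard input.

```shell
# Print the length of the longest sequence in FASTQ data read from stdin.
max_read_length() {
    awk 'NR % 4 == 2 { n = length($0); if (n > max) max = n } END { print max + 0 }'
}

# Demonstration on two made-up records (10 bp and 12 bp).
longest=$(printf '@r1\nACGTACGTAC\n+\nIIIIIIIIII\n@r2\nACGTACGTACGT\n+\nIIIIIIIIIIII\n' | max_read_length)
echo "longest read: $longest bp"
```

For compressed files you could pipe them in with gzip -dc first. A result near 250 on untrimmed data suggests --read_length 250; reads that were already quality-trimmed will report something shorter.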

from amptk.

MycoMap commented on July 20, 2024

I do see a file for each of the samples. The issue may be that this dataset only targets ITS1.

What would the remainder of the script be for setting new forward and reverse primers?

nextgenusfs commented on July 20, 2024

You would do something like this if you used the ITS1-F and ITS2 primers. Note that you should supply the actual primer sequences that remain in the reads after Illumina trims off its adapters and index sequences (typically this is just the normal primer). If you used the custom sequencing primers that are used in the community, i.e. from Smith et al. 2014, then you need to pass the --require_primer off option. The --rescue_forward option will keep the forward reads if the paired reads cannot be merged.

ufits illumina -i rawdata -o process_ITS1 -f CTTGGTCATTTAGAGGAAGTAA \
-r GCTGCGTTCTTCATCGATGC --read_length 250 --rescue_forward
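
For the case where the primers are no longer in the reads, the same command just gains --require_primer off. The sketch below only prints the command line (a dry run) so the two variants can be compared before running anything; build_ufits_cmd is a hypothetical wrapper, and the folder names rawdata and process_ITS1 are the ones from the example above.

```shell
# Print (do not run) the ufits command for reads with or without intact primers.
build_ufits_cmd() {
    primers_intact="$1"   # "yes" if reads still start with the forward primer, "no" otherwise
    cmd="ufits illumina -i rawdata -o process_ITS1"
    cmd="$cmd -f CTTGGTCATTTAGAGGAAGTAA -r GCTGCGTTCTTCATCGATGC"
    cmd="$cmd --read_length 250 --rescue_forward"
    if [ "$primers_intact" != "yes" ]; then
        cmd="$cmd --require_primer off"   # do not discard reads lacking the forward primer
    fi
    printf '%s\n' "$cmd"
}

build_ufits_cmd yes
build_ufits_cmd no
```

Once the printed command looks right, it can be pasted into the terminal as is.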

Remember that running any of the commands in UFITS without any options will output a help menu:

ufits illumina
Usage:       ufits illumina <arguments>
version:     0.5.5

Description: Script takes a folder of Illumina MiSeq data that is already de-multiplexed and processes it for
             clustering using UFITS.  The default behavior is to: 1) merge the PE reads using USEARCH, 2) find and
             trim away primers, 3) rename reads according to sample name, 4) trim/pad reads to a set length.

Arguments:   -i, --fastq         Input folder of FASTQ files (Required)
             -o, --out           Output folder name. Default: ufits-data
             --reads             Paired-end or forward reads. Default: paired [paired, forward]
             --read_length       Illumina Read length (250 if 2 x 250 bp run). Default: 300 
             --rescue_forward    Rescue Forward Reads if PE do not merge, e.g. abnormally long amplicons
             -f, --fwd_primer    Forward primer sequence. Default: fITS7
             -r, --rev_primer    Reverse primer sequence Default: ITS4
             --require_primer    Require the Forward primer to be present. Default: on [on, off]
             -n, --name_prefix   Prefix for re-naming reads. Default: R_
             -m, --min_len       Minimum length read to keep. Default: 50
             -l, --trim_len      Length to trim/pad reads. Default: 250
             --full_length       Keep only full length sequences.
             --cpus              Number of CPUs to use. Default: all
             -u, --usearch       USEARCH executable. Default: usearch8
             --cleanup           Remove intermediate files.
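
To make step 4 of the description (trim/pad reads to a set length, the -l option) a little more concrete, here is an illustration of the idea on bare sequence lines. It is not the UFITS implementation: trim_pad is a hypothetical function, and padding with N is an assumption, since the help text does not say which character is used.

```shell
# Trim each input sequence to a fixed length, or pad it up to that length.
trim_pad() {
    len="$1"
    awk -v L="$len" '{
        s = substr($0, 1, L)                 # drop anything beyond L bases
        while (length(s) < L) s = s "N"      # pad short sequences (N is an assumed pad character)
        print s
    }'
}

# Demonstration with a target length of 5: one long and one short sequence.
out=$(printf 'ACGTACGT\nACG\n' | trim_pad 5)
echo "$out"
```

After this step every sequence has the same length, which is the property the clustering stage relies on according to the description.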

MycoMap commented on July 20, 2024

Thank you very much. I think this should get me to where I need to be.
