
Comments (4)

nextgenusfs commented on July 20, 2024

Great, happy to hear that it is useful!

Yes, you can combine the runs by concatenating the demuxed files, but you need to make sure that each sample has a unique label. The --mult_samples option adds a "base name" to the sample label of each read so that it is distinct from another run, e.g.

amptk ion -i run1.fastq.gz -o run1 --mult_samples r1 --barcode_fasta labels.fa

This will create FASTQ headers that have 'r1.' at the beginning of the sample name, e.g.

$ amptk ion -i ion.test.fastq -o run1 --mult_samples r1
-------------------------------------------------------
[Apr 04 03:04 PM]: OS: MacOSX 10.13.3, 8 cores, ~ 17 GB RAM. Python: 2.7.11
[Apr 04 03:04 PM]: AMPtk v1.1.2-c1661a0, USEARCH v9.2.64, VSEARCH v2.7.0
[Apr 04 03:04 PM]: Foward primer: AGTGARTCATCGAATCTTTG,  Rev comp'd rev primer: GCATATCAATAAGCGGAGGA
[Apr 04 03:04 PM]: Loading FASTQ Records
[Apr 04 03:04 PM]: 2,000 reads (1.6 MB)
-------------------------------------------------------
[Apr 04 03:04 PM]: Concatenating Demuxed Files
[Apr 04 03:04 PM]: 2,000 total reads
[Apr 04 03:04 PM]: 1,409 valid Barcode
[Apr 04 03:04 PM]: 1,406 Fwd Primer found, 1,151 Rev Primer found
[Apr 04 03:04 PM]: 34 discarded too short (< 100 bp)
[Apr 04 03:04 PM]: 1,372 valid output reads
[Apr 04 03:04 PM]: Found 19 barcoded samples
                Sample:  Count
              r1.BC.27:  95
              r1.BC.23:  90
              r1.BC.17:  90
              r1.BC.28:  88
              r1.BC.20:  82
              r1.BC.73:  80
              r1.BC.18:  77
              r1.BC.16:  72
              r1.BC.15:  72
              r1.BC.21:  72
              r1.BC.10:  71
              r1.BC.22:  68
              r1.BC.11:  66
              r1.BC.14:  65
               r1.BC.9:  65
              r1.BC.24:  62
              r1.BC.12:  60
              r1.BC.19:  53
               r1.BC.5:  44
[Apr 04 03:04 PM]: Output file:  run1.demux.fq.gz (242.0 KB)
[Apr 04 03:04 PM]: Mapping file: run1.mapping_file.txt

The other way is to just provide unique labels in the barcode fasta file:

>sample1
CTAAGGTAAC
>sample2
TAAGGAGAAC
>sample3
AAGAGGATTC
>sample4
TACCAAGATC
...
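If the same barcode FASTA is reused for every run, one way to get unique labels is to derive a per-run copy with a run prefix on each header. This is only a sketch: `prefix_fasta_labels` is a hypothetical helper (not part of AMPtk), and the `run2` prefix is an illustrative choice.

```shell
# Hypothetical helper: prepend "<prefix>." to every FASTA header line.
# $1 = run prefix; reads FASTA on stdin, writes FASTA to stdout.
prefix_fasta_labels() {
    awk -v prefix="$1" '
        /^>/ { print ">" prefix "." substr($0, 2); next }  # header: insert prefix after ">"
        { print }                                          # sequence lines pass through
    '
}

# Small demo using two of the labels shown above
printf '>sample1\nCTAAGGTAAC\n>sample2\nTAAGGAGAAC\n' | prefix_fasta_labels run2
```

On a real file this would be something like `prefix_fasta_labels run2 < labels.fa > labels.run2.fa`, where `labels.run2.fa` is an assumed output name.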

You can also specify unique names in a mapping file, e.g.

#SampleID	BarcodeSequence	LinkerPrimerSequence	ReversePrimer	phinchID	DemuxReads	Treatment
sample1	CAGAAGGAAC	CCATCTCATCCCTGCGTGTCTCCGACTCAGCAGAAGGAACAGTGARTCATCGAATCTTTG	TCCTCCGCTTATTGATATGC	r1.BC.5	44	no_data
sample2	TGAGCGGAAC	CCATCTCATCCCTGCGTGTCTCCGACTCAGTGAGCGGAACAGTGARTCATCGAATCTTTG	TCCTCCGCTTATTGATATGC	r1.BC.9	65	no_data
sample3	CTGACCGAAC	CCATCTCATCCCTGCGTGTCTCCGACTCAGCTGACCGAACAGTGARTCATCGAATCTTTG	TCCTCCGCTTATTGATATGC	r1.BC.10	71	no_data
sample4	TCCTCGAATC	CCATCTCATCCCTGCGTGTCTCCGACTCAGTCCTCGAATCAGTGARTCATCGAATCTTTG	TCCTCCGCTTATTGATATGC	r1.BC.11	66	no_data

Then use that mapping file to demux:

$ amptk ion -i ion.test.fastq -o run1 -m my_mappingfile.txt
-------------------------------------------------------
[Apr 04 03:09 PM]: OS: MacOSX 10.13.3, 8 cores, ~ 17 GB RAM. Python: 2.7.11
[Apr 04 03:09 PM]: AMPtk v1.1.2-c1661a0, USEARCH v9.2.64, VSEARCH v2.7.0
[Apr 04 03:09 PM]: Foward primer: AGTGARTCATCGAATCTTTG,  Rev comp'd rev primer: GCATATCAATAAGCGGAGGA
[Apr 04 03:09 PM]: Loading FASTQ Records
[Apr 04 03:09 PM]: 2,000 reads (1.6 MB)
-------------------------------------------------------
[Apr 04 03:09 PM]: Concatenating Demuxed Files
[Apr 04 03:09 PM]: 2,000 total reads
[Apr 04 03:09 PM]: 248 valid Barcode
[Apr 04 03:09 PM]: 248 Fwd Primer found, 210 Rev Primer found
[Apr 04 03:09 PM]: 2 discarded too short (< 100 bp)
[Apr 04 03:09 PM]: 246 valid output reads
[Apr 04 03:09 PM]: Found 4 barcoded samples
                Sample:  Count
               sample3:  71
               sample4:  66
               sample2:  65
               sample1:  44
[Apr 04 03:09 PM]: Output file:  run1.demux.fq.gz (42.2 KB)
[Apr 04 03:09 PM]: Mapping file: my_mappingfile.txt
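When each run has its own mapping file, it may be worth checking that no SampleID is reused across runs before combining. A minimal sketch, assuming tab-delimited mapping files with the SampleID in the first column and header lines starting with `#`; `duplicate_sample_ids` is a hypothetical helper, not an AMPtk command.

```shell
# Hypothetical helper: print every SampleID that occurs more than once.
# Reads one or more concatenated mapping files on stdin.
duplicate_sample_ids() {
    awk -F '\t' '
        !/^#/ && NF {                       # skip header and empty lines
            if (seen[$1]++ == 1) print $1   # report each duplicate only once
        }
    '
}

# Demo: two tiny mapping files that both use "sample1"
printf '#SampleID\tBarcodeSequence\nsample1\tCAGAAGGAAC\nsample2\tTGAGCGGAAC\n#SampleID\tBarcodeSequence\nsample1\tCTAAGGTAAC\n' \
    | duplicate_sample_ids
```

For real data this could be run as `cat run1.mapping_file.txt run2.mapping_file.txt | duplicate_sample_ids` (the second file name is assumed by analogy with the first); empty output means the SampleIDs are unique.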

If you have already demuxed your runs and don't want to redo them, you could also do a find/replace with sed on the uncompressed demuxed files:

sed 's/barcodelabel=/barcodelabel=run1./g' demux.fq > demux.fixed.fq
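A slightly more defensive variant of the same find/replace is sketched below. It only touches header lines (every fourth line, starting at the first), so it assumes standard four-line FASTQ records, and it reads from stdin so it can be fed from a gzipped file. `relabel_run` is a hypothetical helper, and the header layout in the demo is an assumption.

```shell
# Hypothetical helper: prefix the barcodelabel with a run name.
# $1 = run prefix (e.g. run1); reads FASTQ on stdin, writes FASTQ to stdout.
relabel_run() {
    awk -v prefix="$1" '
        NR % 4 == 1 { sub(/barcodelabel=/, "barcodelabel=" prefix ".") }  # header lines only
        { print }
    '
}

# Demo on a single inline record (header layout assumed for illustration)
printf '@R_1;barcodelabel=BC.27;\nACGT\n+\nIIII\n' | relabel_run run1
```

With a gzipped demux file this could be used as `gzip -dc run1.demux.fq.gz | relabel_run run1 | gzip > run1.relabeled.demux.fq.gz`, where the output name is an assumption.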

Finally, you can just concatenate using cat, which works even on gzipped files:

cat run1.demux.fq.gz run2.demux.fq.gz > combined.demux.fq.gz
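After concatenating, a quick sanity check is to count reads per sample label and confirm that labels from the different runs stay distinct. The sketch below assumes four-line FASTQ records and a `barcodelabel=<name>;` field in each header; `count_labels` is a hypothetical helper, and the demo builds two tiny gzipped files in a temporary directory rather than using real runs.

```shell
# Hypothetical helper: tabulate reads per barcodelabel.
# Reads FASTQ on stdin, prints "label<TAB>count" sorted by label.
count_labels() {
    awk '
        NR % 4 == 1 && match($0, /barcodelabel=[^;]*/) {
            label = substr($0, RSTART + 13, RLENGTH - 13)  # strip "barcodelabel=" (13 chars)
            n[label]++
        }
        END { for (l in n) print l "\t" n[l] }
    ' | sort
}

# Demo: concatenated gzip members decompress as a single stream
tmp="$(mktemp -d)"
printf '@R_1;barcodelabel=run1.BC.5;\nACGT\n+\nIIII\n' | gzip > "$tmp/run1.fq.gz"
printf '@R_1;barcodelabel=run2.BC.5;\nACGT\n+\nIIII\n' | gzip > "$tmp/run2.fq.gz"
cat "$tmp/run1.fq.gz" "$tmp/run2.fq.gz" > "$tmp/combined.fq.gz"
gzip -dc "$tmp/combined.fq.gz" | count_labels
rm -r "$tmp"
```

On the real output this would be `gzip -dc combined.demux.fq.gz | count_labels`.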

from amptk.

JonathanVanHamme commented on July 20, 2024

Brilliant, thanks for the quick and thorough reply, Jon! It is working like a charm.


nextgenusfs commented on July 20, 2024

Thanks Jon! Is the data quality improved on the Ion S5 XL? We are still running the PGM - no plans to upgrade at the moment, but just curious. And let me know if you run into any more problems with AMPtk or have some features that you'd like to see incorporated.


JonathanVanHamme commented on July 20, 2024

