
pullseq's People

Contributors

bcthomas, nileshpatra, tmancill

pullseq's Issues

no output

Hi,
I am trying to filter a MiSeq index file based on a set of headers using the following command:

pullseq -i HallS-pool_S1_L001_I1_001.fastq -n IDs.txt -v > barcodes.fastq

This is the readout while pullseq is running:

verbose flag is set
Input is HallS-pool_S1_L001_I1_001.fastq
Names in IDs.txt will be included
Output will be 50 columns long

done reading from input (19467839 entries)
Input is FASTQ format
Processed 0 entries
Pulled 0 entries

Why might this be happening? Here is an example of the index FASTQ file:
@M00366:86:000000000-AJP15:1:1101:18035:1000 1:N:0:1
ATAGGAATAACC
+
CCCCCGGGGGCD
@M00366:86:000000000-AJP15:1:1101:14066:1000 1:N:0:1
GTGAGGTTCGGC
+
6A--6C@,C++F
@M00366:86:000000000-AJP15:1:1101:14848:1000 1:N:0:1
ATAATTGCCGAG
+
CCCCCGGGGGGG
@M00366:86:000000000-AJP15:1:1101:18086:1000 1:N:0:1
TCTCTACAAGTA
+
8----;-,--;-
@M00366:86:000000000-AJP15:1:1101:16316:1000 1:N:0:1

Here is what the IDs.txt file looks like:
@M00366:86:000000000-AJP15:1:1101:18846:1146 1:N:0:1
@M00366:86:000000000-AJP15:1:1101:17470:1146 1:N:0:1
@M00366:86:000000000-AJP15:1:1101:18794:1147 1:N:0:1
@M00366:86:000000000-AJP15:1:1101:9220:1147 1:N:0:1
@M00366:86:000000000-AJP15:1:1101:9734:1147 1:N:0:1
@M00366:86:000000000-AJP15:1:1101:12133:1147 1:N:0:1
@M00366:86:000000000-AJP15:1:1101:17621:1148 1:N:0:1
@M00366:86:000000000-AJP15:1:1101:20761:1148 1:N:0:1
@M00366:86:000000000-AJP15:1:1101:11504:1148 1:N:0:1
@M00366:86:000000000-AJP15:1:1101:19907:1149 1:N:0:1
@M00366:86:000000000-AJP15:1:1101:17935:1149 1:N:0:1
@M00366:86:000000000-AJP15:1:1101:17274:1149 1:N:0:1
@M00366:86:000000000-AJP15:1:1101:10546:1149 1:N:0:1
@M00366:86:000000000-AJP15:1:1101:13379:1149 1:N:0:1
@M00366:86:000000000-AJP15:1:1101:16248:1149 1:N:0:1
@M00366:86:000000000-AJP15:1:1101:13417:1149 1:N:0:1
@M00366:86:000000000-AJP15:1:1101:11550:1150 1:N:0:1
@M00366:86:000000000-AJP15:1:1101:16087:1150 1:N:0:1

Is this structured okay? Why might no output be generated? I've been searching all over for a tool like this and I was hoping this would do the trick. I hope I can get this to work.

Thank you for your help,
Colleen
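
One thing worth checking, offered as a guess rather than a confirmed diagnosis: every line in IDs.txt is a full FASTQ header, including the leading "@" and the trailing " 1:N:0:1" comment. If pullseq compares against the bare read name only (an assumption, not something this thread confirms), none of those lines would match. A minimal sketch of trimming the list down to bare names, where normalize_ids is a hypothetical helper name:

```shell
# Hypothetical helper: reduce each FASTQ header line to the bare read name
# by dropping a leading "@" and everything from the first whitespace onward.
normalize_ids() {
    sed -e 's/^@//' -e 's/[[:space:]].*$//'
}

# Demo on one header taken from the IDs.txt excerpt above.
printf '%s\n' '@M00366:86:000000000-AJP15:1:1101:18846:1146 1:N:0:1' | normalize_ids
```

Running `normalize_ids < IDs.txt > IDs.clean.txt` and passing IDs.clean.txt to -n would then test whether the header format is the cause.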

Port to pcre2?

Hi @bcthomas

Thanks for your work on pullseq. I maintain pullseq in Debian, and we are in the process of removing pcre, which has been unmaintained for several years. Could you please port the code to the newer pcre2?

Regards,
Nilesh

strange behaviour in converting fastq to fasta

Hi,

I am trying to convert a FASTQ file to FASTA, but some sequences are not converted; see below.

My input file, test.fastq (14 sequences):
@M00842:73:000000000-A6TKT:1:1101:18725:1757 1:N:0:0 (2+)
HKQCQNYNSSVR_ACKNLLYQARQQYKTKYKYRTRASILCNRCHNRGYKTSIL_RQ_NRLE_DFTRG
+

@M00842:73:000000000-A6TKT:1:1101:18725:1757 1:N:0:0 (3+)
TNNAKIIIVQLDKPVRIYCTRPGNNTRQSISIGPGRAFYVTGVITGDIRQAYCNVSRTDWNKILQE
+

@M00842:73:000000000-A6TKT:1:1101:18725:1757 1:N:0:0 (3-)
LL_NLIPICSTDVTICLSYIPCYDTCYIKCSPWSYTYTLSCIVAGPGTINSYRLI_LNYYNFGIVC
+

@M00842:73:000000000-A6TKT:1:1101:17239:1665 1:N:0:0 (2+)
HKQCQNYNSTVSHPCKN_FFHARQQYKKKCYVWTRANIFCNR_HNRGYKTSTL_YLLKRLE_DFTRG
+

@M00842:73:000000000-A6TKT:1:1101:17605:1728 1:N:0:0 (2+)
HKQCQNYNSTVSHACKN_LFQAWQQYKKECKDRTRANILCNR_HNRGYKTSTL_CQ_NRLE*DFTRG
+

@M00842:73:000000000-A6TKT:1:1101:17605:1728 1:N:0:0 (3+)
TNNAKIIIVQLATPVRINCSRPGNNTRKSVRIGPGQTFYATGDIIGGIRRAHCNVSRTDWNKTLQEV
+

@M00842:73:000000000-A6TKT:1:1101:17605:1728 1:N:0:0 (2-)
YLL_SLIPICSTDITMCSSYTPYYVTCCIKCLPWSYPYTLSCIVARPGTINSYRRG_LYYYNFGIVC
+

@M00842:73:000000000-A6TKT:1:1101:18725:1757 1:N:0:0 (2+)
HKQCQNYNSSVR_ACKNLLYQARQQYKTKYKYRTRASILCNRCHNRGYKTSIL_RQ_NRLE_DFTRG
+

@M00842:73:000000000-A6TKT:1:1101:18725:1757 1:N:0:0 (3+)
TNNAKIIIVQLDKPVRIYCTRPGNNTRQSISIGPGRAFYVTGVITGDIRQAYCNVSRTDWNKILQE
+

@M00842:73:000000000-A6TKT:1:1101:18725:1757 1:N:0:0 (3-)
LL_NLIPICSTDVTICLSYIPCYDTCYIKCSPWSYTYTLSCIVAGPGTINSYRLI_LNYYNFGIVC
+

@M00842:73:000000000-A6TKT:1:1101:15135:1817 1:N:0:0 (2+)
HKQCQNYNSTVSYACKN_LFQAWQQYKKECKDRTRANILCNR_HNRGYKTSTL_CQ_NRLE*DFTRG
+

@M00842:73:000000000-A6TKT:1:1101:15135:1817 1:N:0:0 (3+)
TNNAKIIIVQLATPVRINCSRPGNNTRKSVRIGPGQTFYATGDIIGDIRRAHCNVSRTDWNKTLQEV
+

@M00842:73:000000000-A6TKT:1:1101:15135:1817 1:N:0:0 (2-)
YLL_SLIPICSTDITMCSSYIPYYVTCCIECLPWSYPYTLSCIVARPGTINSYRRS_LYYYNFGIVC
+

@M00842:73:000000000-A6TKT:1:1101:13686:1838 1:N:0:0 (2+)
HKQCQNYNSTVR_ACKN_LYQAWQQYKTKYKYRTRASILCNR_HNRGYKTSIL_CQ_NRVE_DFTRG
+

Then I run this command: ./pullseq -i test.fastq -c -l 150
and this is the output:

M00842:73:000000000-A6TKT:1:1101:18725:1757 1:N:0:0 (2+)
HKQCQNYNSSVR_ACKNLLYQARQQYKTKYKYRTRASILCNRCHNRGYKTSIL_RQ_NRLE_DFTRG
M00842:73:000000000-A6TKT:1:1101:18725:1757 1:N:0:0 (3-)
LL_NLIPICSTDVTICLSYIPCYDTCYIKCSPWSYTYTLSCIVAGPGTINSYRLI_LNYYNFGIVC
M00842:73:000000000-A6TKT:1:1101:17605:1728 1:N:0:0 (2+)
HKQCQNYNSTVSHACKN_LFQAWQQYKKECKDRTRANILCNR_HNRGYKTSTL_CQ_NRLE_DFTRG
M00842:73:000000000-A6TKT:1:1101:17605:1728 1:N:0:0 (2-)
YLL_SLIPICSTDITMCSSYTPYYVTCCIKCLPWSYPYTLSCIVARPGTINSYRRG_LYYYNFGIVC
M00842:73:000000000-A6TKT:1:1101:18725:1757 1:N:0:0 (3+)
TNNAKIIIVQLDKPVRIYCTRPGNNTRQSISIGPGRAFYVTGVITGDIRQAYCNVSRTDWNKILQE
M00842:73:000000000-A6TKT:1:1101:15135:1817 1:N:0:0 (2+)
HKQCQNYNSTVSYACKN_LFQAWQQYKKECKDRTRANILCNR_HNRGYKTSTL_CQ_NRLE_DFTRG
M00842:73:000000000-A6TKT:1:1101:15135:1817 1:N:0:0 (2-)
YLL_SLIPICSTDITMCSSYIPYYVTCCIECLPWSYPYTLSCIVARPGTINSYRRS_LYYYNFGIVC

Only 7 sequences were converted. Why?

Thank you
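
A pattern in the listing above: the 7 records in the output are exactly the 1st, 3rd, 5th, ... 13th records of the input, and every record in test.fastq has an empty quality line. One plausible explanation, assuming the parser keeps reading quality lines until the quality string is as long as the sequence (not confirmed in this thread), is that each empty quality line causes the following record to be swallowed as quality data. As a cross-check that does not depend on quality strings at all, here is a minimal sketch that converts strictly four-line FASTQ records to FASTA; fastq_to_fasta is a hypothetical helper name, and the sketch assumes sequences are never wrapped across lines:

```shell
# Hypothetical helper: FASTQ to FASTA, assuming every record is exactly
# four lines (header, sequence, "+", quality). The quality line is ignored,
# so an empty one is harmless.
fastq_to_fasta() {
    awk 'NR % 4 == 1 { print ">" substr($0, 2) }   # header: drop "@", add ">"
         NR % 4 == 2 { print }                     # sequence line as-is
        '
}

# Demo with two short records that have empty quality lines.
printf '@r1 (2+)\nHKQ\n+\n\n@r2 (3+)\nTNN\n+\n\n' | fastq_to_fasta
```

If `fastq_to_fasta < test.fastq` yields all 14 records, the empty quality lines are the likely trigger, and filling them with placeholder quality characters of the right length before running pullseq may be enough.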

error during compilation - code may need update with gcc 10.2.0?

While making pullseq with gcc 10.2.0, I got the following error during linking:
...
gcc -g -O2 -o pullseq hash.o output.o size_filter.o search_header.o file_read.o pull_by_re.o pull_by_name.o pull_by_size.o pullseq.o -lpcre -lz
/usr/bin/ld: output.o:/home/kinestetika/bin/util/pullseq/src/global.h:27: multiple definition of `QUALITY_SCORE'; hash.o:/home/kinestetika/bin/util/pullseq/src/global.h:27: first defined here
/usr/bin/ld: output.o:/home/kinestetika/bin/util/pullseq/src/global.h:28: multiple definition of `verbose_flag'; hash.o:/home/kinestetika/bin/util/pullseq/src/global.h:28: first defined here
/usr/bin/ld: output.o:/home/kinestetika/bin/util/pullseq/src/global.h:26: multiple definition of `progname'; hash.o:/home/kinestetika/bin/util/pullseq/src/global.h:26: first defined here
/usr/bin/ld: size_filter.o:/home/kinestetika/bin/util/pullseq/src/global.h:28: multiple definition of `verbose_flag'; hash.o:/home/kinestetika/bin/util/pullseq/src/global.h:28: first defined here
/usr/bin/ld: size_filter.o:/home/kinestetika/bin/util/pullseq/src/global.h:27: multiple definition of `QUALITY_SCORE'; hash.o:/home/kinestetika/bin/util/pullseq/src/global.h:27: first defined here
/usr/bin/ld: size_filter.o:/home/kinestetika/bin/util/pullseq/src/global.h:26: multiple definition of `progname'; hash.o:/home/kinestetika/bin/util/pullseq/src/global.h:26: first defined here
/usr/bin/ld: search_header.o:/home/kinestetika/bin/util/pullseq/src/global.h:28: multiple definition of `verbose_flag'; hash.o:/home/kinestetika/bin/util/pullseq/src/global.h:28: first defined here
/usr/bin/ld: search_header.o:/home/kinestetika/bin/util/pullseq/src/global.h:27: multiple definition of `QUALITY_SCORE'; hash.o:/home/kinestetika/bin/util/pullseq/src/global.h:27: first defined here
/usr/bin/ld: search_header.o:/home/kinestetika/bin/util/pullseq/src/global.h:26: multiple definition of `progname'; hash.o:/home/kinestetika/bin/util/pullseq/src/global.h:26: first defined here
/usr/bin/ld: file_read.o:/home/kinestetika/bin/util/pullseq/src/global.h:28: multiple definition of `verbose_flag'; hash.o:/home/kinestetika/bin/util/pullseq/src/global.h:28: first defined here
/usr/bin/ld: file_read.o:/home/kinestetika/bin/util/pullseq/src/global.h:27: multiple definition of `QUALITY_SCORE'; hash.o:/home/kinestetika/bin/util/pullseq/src/global.h:27: first defined here
/usr/bin/ld: file_read.o:/home/kinestetika/bin/util/pullseq/src/global.h:26: multiple definition of `progname'; hash.o:/home/kinestetika/bin/util/pullseq/src/global.h:26: first defined here
/usr/bin/ld: pull_by_re.o:/home/kinestetika/bin/util/pullseq/src/global.h:28: multiple definition of `verbose_flag'; hash.o:/home/kinestetika/bin/util/pullseq/src/global.h:28: first defined here
/usr/bin/ld: pull_by_re.o:/home/kinestetika/bin/util/pullseq/src/global.h:26: multiple definition of `progname'; hash.o:/home/kinestetika/bin/util/pullseq/src/global.h:26: first defined here
/usr/bin/ld: pull_by_re.o:/home/kinestetika/bin/util/pullseq/src/global.h:27: multiple definition of `QUALITY_SCORE'; hash.o:/home/kinestetika/bin/util/pullseq/src/global.h:27: first defined here
/usr/bin/ld: pull_by_name.o:/home/kinestetika/bin/util/pullseq/src/global.h:28: multiple definition of `verbose_flag'; hash.o:/home/kinestetika/bin/util/pullseq/src/global.h:28: first defined here
/usr/bin/ld: pull_by_name.o:/home/kinestetika/bin/util/pullseq/src/global.h:26: multiple definition of `progname'; hash.o:/home/kinestetika/bin/util/pullseq/src/global.h:26: first defined here
/usr/bin/ld: pull_by_name.o:/home/kinestetika/bin/util/pullseq/src/global.h:27: multiple definition of `QUALITY_SCORE'; hash.o:/home/kinestetika/bin/util/pullseq/src/global.h:27: first defined here
/usr/bin/ld: pull_by_size.o:/home/kinestetika/bin/util/pullseq/src/global.h:28: multiple definition of `verbose_flag'; hash.o:/home/kinestetika/bin/util/pullseq/src/global.h:28: first defined here
/usr/bin/ld: pull_by_size.o:/home/kinestetika/bin/util/pullseq/src/global.h:26: multiple definition of `progname'; hash.o:/home/kinestetika/bin/util/pullseq/src/global.h:26: first defined here
/usr/bin/ld: pull_by_size.o:/home/kinestetika/bin/util/pullseq/src/global.h:27: multiple definition of `QUALITY_SCORE'; hash.o:/home/kinestetika/bin/util/pullseq/src/global.h:27: first defined here
/usr/bin/ld: pullseq.o:/home/kinestetika/bin/util/pullseq/src/global.h:26: multiple definition of `progname'; hash.o:/home/kinestetika/bin/util/pullseq/src/global.h:26: first defined here
/usr/bin/ld: pullseq.o:/home/kinestetika/bin/util/pullseq/src/global.h:28: multiple definition of `verbose_flag'; hash.o:/home/kinestetika/bin/util/pullseq/src/global.h:28: first defined here
/usr/bin/ld: pullseq.o:/home/kinestetika/bin/util/pullseq/src/global.h:27: multiple definition of `QUALITY_SCORE'; hash.o:/home/kinestetika/bin/util/pullseq/src/global.h:27: first defined here
collect2: error: ld returned 1 exit status
make[2]: *** [Makefile:376: pullseq] Error 1
make[2]: Leaving directory '/home/kinestetika/bin/util/pullseq/src'
make[1]: *** [Makefile:283: all] Error 2
make[1]: Leaving directory '/home/kinestetika/bin/util/pullseq/src'
make: *** [Makefile:344: all-recursive] Error 1
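
For context: GCC 10 changed its default from -fcommon to -fno-common, so a global variable defined in a header without `extern` and included by several .c files now produces exactly this kind of "multiple definition" link error. The errors above all point at global.h lines 26 to 28, which fits that pattern. A possible workaround, sketched under the assumption that the build honours a CFLAGS override passed to configure, is to restore the old behaviour; CFLAGS_FIX is a hypothetical variable name, and "-g -O2" is copied from the link line above:

```shell
# Workaround sketch: restore the pre-GCC-10 default so duplicate tentative
# definitions of the same global are merged at link time.
CFLAGS_FIX="-g -O2 -fcommon"

# Assumption: this is run from the top of the pullseq source tree.
if [ -x ./configure ]; then
    ./configure CFLAGS="$CFLAGS_FIX" && make
else
    echo "no ./configure here; run this from the pullseq source tree" >&2
fi
```

The longer-term fix would be in the source itself: declare the three globals with `extern` in global.h and define each of them in exactly one .c file.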

Autoconf error

configure.ac:7: error: possibly undefined macro: AM_INIT_AUTOMAKE
If this token and others are legitimate, please use m4_pattern_allow.
See the Autoconf documentation.

When run manually, it creates the configure file, as expected; however, I am writing a script to include your software in the homebrew package management system, and this error causes the script to terminate (abort the install).

It may be worth documenting the need to run autoconf, as the 1.0.0 tarball does not include the generated configure file.

seqdiff, unexpected results

Hi Brian,

I am writing with regard to the seqdiff command, where I'm unable to produce the expected results.

For example, I duplicated (i.e., cp) a fastq file and ran the following command
seqdiff -1 file1.fq -2 file1b.fq -s
and received the following summary output:

first_file_total = 4255201
first_file_uniq = 0
second_file_total = 4255201
second_file_uniq = 0
common = 2250574

I then created test fastq files with 7 reads; the only difference being a deletion of the first 4 bases in the first read of the duplicate fastq file.
I received the following output summary (expected values in parentheses):

first_file_total = 7
first_file_uniq = 7 (1)
second_file_total = 0 (7)
second_file_uniq = 0 (1)
common = 0 (6)

Any thoughts or help would be greatly appreciated.

Updating CFLAGS variable doesn't solve error during configuration

Hi,
While running ./configure, I got a message that libpcre2 is missing. I installed it with conda:

>conda install -c anaconda pcre2

and updated the CFLAGS variable as described in the README:

> pcre-config --cflags
-I/home/sochkalova/miniconda3/envs/das_tool/include
>export CFLAGS="-I/home/sochkalova/miniconda3/envs/das_tool/include"
>./configure

That gave me the same message about libpcre2 not being installed. I don't understand what to do. Can you please help?

P.S. I work on a server where I don't have sudo access (just in case).
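
Two observations that may help, neither confirmed in this thread. First, pcre-config belongs to the original PCRE; PCRE2 ships its own pcre2-config. Second, CFLAGS with only an -I path tells the compiler where the headers are, but a configure check that links a test program also needs the library directory. A minimal sketch, assuming configure respects the standard CPPFLAGS and LDFLAGS variables and that the conda environment is the one from the path above (the `prefix` variable and the install prefix under $HOME/.local are illustrative choices):

```shell
# Use the active conda environment if there is one, otherwise fall back to
# the environment path shown in the post (an assumption).
prefix="${CONDA_PREFIX:-$HOME/miniconda3/envs/das_tool}"

# Headers for the preprocessor, libraries for the linker, plus an rpath so
# the built binary can find the shared library at run time.
export CPPFLAGS="-I${prefix}/include"
export LDFLAGS="-L${prefix}/lib -Wl,-rpath,${prefix}/lib"

# No sudo needed: install into a directory under the home directory.
if [ -x ./configure ]; then
    ./configure --prefix="$HOME/.local"
fi
```

If configure still reports libpcre2 as missing, config.log normally shows the exact compile or link command that failed.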
