Comments (8)

mdshw5 commented on August 24, 2024

Ah I see. That error message is telling you that the FASTA file cannot be gzip compressed. You can however use block-gzip compression to compress the FASTA file. See https://www.htslib.org/doc/bgzip.html
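
To tell plain gzip apart from block-gzip (BGZF) before handing a file to faidx, one option is to inspect the first bytes of the file. The sketch below is a heuristic, not part of pyfaidx: the function name `classify_compression` is hypothetical, and it assumes the `BC` extra subfield comes first in the gzip header, which is how bgzip writes it.

```python
# Heuristic sketch: classify a file as BGZF, plain gzip, or uncompressed.
# classify_compression is a hypothetical helper, not a pyfaidx function.

BGZF_MAGIC = b"\x1f\x8b\x08\x04"  # gzip magic, deflate method, FEXTRA flag set
GZIP_MAGIC = b"\x1f\x8b"          # every gzip member starts with these two bytes


def classify_compression(path):
    """Return 'bgzf', 'gzip', or 'plain' based on the file's first 14 bytes."""
    with open(path, "rb") as handle:
        header = handle.read(14)
    # A BGZF block is a gzip member whose extra field carries a 'BC' subfield.
    # Assumption: 'BC' is the first subfield, so it sits at byte offset 12.
    if header[:4] == BGZF_MAGIC and header[12:14] == b"BC":
        return "bgzf"
    if header[:2] == GZIP_MAGIC:
        return "gzip"
    return "plain"
```

A result of 'gzip' for the FASTA file would suggest decompressing it and recompressing with bgzip, as described at the link above.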

from pyfaidx.

yonniejon commented on August 24, 2024

So the problem was not the chr prefix. I edited my bed file so that it no longer contains the "chr" prefixes, removed the "chr" prefixes from my fasta reference file as well, and the problem persists.

mdshw5 commented on August 24, 2024

This means that you have a non-UTF-8 character at the beginning of your file. Did you by chance export this from MS Excel as UTF-16? If so, you need to convert your file to UTF-8 encoding. You can also export from Excel directly in UTF-8 encoding.
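
If the file does turn out to carry a byte order mark, the conversion could be sketched as follows. This is a minimal example using only the Python standard library; the helper names `sniff_encoding` and `rewrite_as_utf8` are hypothetical, and the detection only looks at a leading BOM, so a file without one is assumed to already be UTF-8.

```python
import codecs


def sniff_encoding(raw):
    """Guess an encoding from a leading byte order mark (BOM), if any."""
    if raw.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"  # decoding with utf-8-sig drops the BOM
    if raw.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return "utf-16"     # the utf-16 codec reads the BOM to pick endianness
    return "utf-8"          # assumption: no BOM means the file is already UTF-8


def rewrite_as_utf8(src, dst):
    """Re-encode src as BOM-free UTF-8 and write the result to dst."""
    with open(src, "rb") as handle:
        raw = handle.read()
    text = raw.decode(sniff_encoding(raw))
    # newline="" keeps the original line endings untouched on every platform.
    with open(dst, "w", encoding="utf-8", newline="") as handle:
        handle.write(text)
```

Running `head -c 4 tmp.bed | xxd` is a quick way to see whether a BOM is present before converting anything.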

yonniejon commented on August 24, 2024

I did not. I ran nano tmp.bed and pasted the following contents exactly:

chr6 132891948 132892108
chr10 127585142 127585221

mdshw5 commented on August 24, 2024

Just to confirm - you have said:

where tmp.bed.gz looks like:
chr6 132891948 132892108
chr10 127585142 127585221

Do you mean that the tmp.bed file contains this, and you have also gzipped it? If so, I think I understand the issue. The --bed option does not handle gzipped input. If you want to pass a gzipped file you could do:

$ faidx hg19/genome.fa.gz -b - < <(gzip -dc tmp.bed.gz)

The above uses process substitution to decompress your bed file in a subshell and send it to stdin, which can be read by the --bed argument using the "-" symbol. You could alternatively pass an uncompressed bed file.
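
The same idea can be sketched in Python for anyone calling pyfaidx as a library instead of through the CLI. The helper `read_bed_intervals` below is hypothetical and uses only the standard library; it assumes whitespace-separated BED columns and detects gzip from the two-byte magic number instead of the file extension.

```python
import gzip


def read_bed_intervals(path):
    """Yield (name, start, end) tuples from a BED file, gzipped or not."""
    with open(path, "rb") as probe:
        is_gzip = probe.read(2) == b"\x1f\x8b"  # gzip magic number
    opener = gzip.open if is_gzip else open
    with opener(path, "rt", encoding="utf-8") as handle:
        for line in handle:
            # Skip header-style lines and anything with fewer than 3 columns.
            if line.startswith(("#", "track", "browser")):
                continue
            fields = line.split()
            if len(fields) < 3:
                continue
            yield fields[0], int(fields[1]), int(fields[2])
```

Each tuple can then be used for a lookup such as `fasta[name][start:end]` on a `pyfaidx.Fasta` object.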

yonniejon commented on August 24, 2024

"Do you mean that the tmp.bed file contains this, and you have also gzipped it?"

Yes you are correct. But I only gzipped it because when I ran it without gzip/bgzip I got the following error:

faidx genome.fa.gz -b tmp.bed

Traceback (most recent call last):
  File "/cs/usr/jjj/.local/bin/faidx", line 8, in <module>
    sys.exit(main())
  File "/cs/usr/jjj/.local/lib/python3.9/site-packages/pyfaidx/cli.py", line 202, in main
    write_sequence(args)
  File "/cs/usr/jjj/.local/lib/python3.9/site-packages/pyfaidx/cli.py", line 53, in write_sequence
    for line in fetch_sequence(args, fasta, name, start, end):
  File "/cs/usr/jjj/.local/lib/python3.9/site-packages/pyfaidx/cli.py", line 70, in fetch_sequence
    sequence = fasta[name][start:end]
  File "/cs/usr/jjj/.local/lib/python3.9/site-packages/pyfaidx/__init__.py", line 920, in __getitem__
    return self._fa.get_seq(self.name, start + 1, stop)[::step]
  File "/cs/usr/jrosensk/.local/lib/python3.9/site-packages/pyfaidx/__init__.py", line 1149, in get_seq
    seq = self.faidx.fetch(name, start, end)
  File "/cs/usr/jjj/.local/lib/python3.9/site-packages/pyfaidx/__init__.py", line 727, in fetch
    seq = self.from_file(name, start, end)
  File "/cs/usr/jjj/.local/lib/python3.9/site-packages/pyfaidx/__init__.py", line 769, in from_file
    self.file.seek(i.offset)
  File "/usr/lib/python3/dist-packages/Bio/bgzf.py", line 650, in seek
    self._load_block(start_offset)
  File "/usr/lib/python3/dist-packages/Bio/bgzf.py", line 611, in _load_block
    block_size, self._buffer = _load_bgzf_block(handle, self._text)
  File "/usr/lib/python3/dist-packages/Bio/bgzf.py", line 444, in _load_bgzf_block
    raise ValueError(
ValueError: A BGZF (e.g. a BAM file) block should start with b'\x1f\x8b\x08\x04', not b'\xea^\x8b\xb0'; handle.tell() now says 16541

yonniejon commented on August 24, 2024

Got it! Thanks. Sorry about the confusion!

mdshw5 commented on August 24, 2024

No worries - glad to help!
