
andersenlab / vcf-kit


VCF-kit: Assorted utilities for the variant call format

Home Page: http://www.andersenlab.org

License: MIT License

Python 66.02% JavaScript 33.38% HTML 0.30% Dockerfile 0.30%
vcf python

vcf-kit's Introduction


VCF-kit - Documentation

VCF-kit is a command-line based collection of utilities for performing analysis on Variant Call Format (VCF) files. A summary of the commands is provided below.

Command   Description
calc      Obtain frequency/count of genotypes and alleles.
call      Compare variants identified from sequences obtained through alternative methods against a VCF.
filter    Filter variants with a minimum or maximum number of REF, HET, ALT, or missing calls.
geno      Various operations at the genotype level.
genome    Reference genome processing and management.
hmm       Hidden Markov model for use in imputing genotypes from parental genotypes in linkage studies.
phylo     Generate dendrograms from a VCF.
primer    Generate primers for variant validation.
rename    Add a prefix, suffix, or substitute a string in sample names.
tajima    Calculate Tajima’s D.
vcf2tsv   Convert a VCF to TSV.

Installation

VCF-kit has been upgraded to Python 3

VCF-kit has been tested with Python 3.6. VCF-kit makes use of additional software for a variety of tasks:

  • bwa (v 0.7.12)
  • samtools (v 1.3)
  • bcftools (v 1.3)
  • blast (v 2.2.31+)
  • muscle (v 3.8.31)
  • primer3 (v 2.5.0)

You can install these dependencies and VCF-kit using conda, or you can use a Docker image.

Conda

conda config --add channels bioconda
conda config --add channels conda-forge
conda create -n vcf-kit \
  danielecook::vcf-kit=0.2.6 \
  "bwa>=0.7.17" \
  "samtools>=1.10" \
  "bcftools>=1.10" \
  "blast>=2.2.31" \
  "muscle>=3.8.31" \
  "primer3>=2.5.0"

conda activate vcf-kit

Docker

You can also run VCF-kit with all dependencies installed using Docker:

docker run -it andersenlab/vcf-kit vk

vcf-kit's People

Contributors

danielecook, samwachspress


vcf-kit's Issues

problem with primer3 in both conda and docker

I installed VCF-kit as described on the GitHub site and downloaded the reference genome as instructed.

Run 1:

command: vk primer indel --ref=WBcel235 QX1211.indels.vcf.gz

Using reference located at /home/shanmu/.genome/WBcel235/WBcel235.fa.gz
   --size ignored; size is set dynamically when genotyping indels.

Applied 2 variants

Cannot find thermo path '/primer3_config/

Similarly, I installed VCF-kit through Docker and ran the program as below.
Run 2:

sudo docker run -v /root/.genome/:/data -v /media/shanmu/shanmu_hdd1/chickpea_minicore/merged_298/filtered_passed/docker/VCF-kit/test_data:/datapath -it andersenlab/vcf-kit vk primer indel /datapath/QX1211.indels.vcf.gz --ref=/data/WBcel235

Genome 'WBcel235' does not exist

Run 3:
sudo docker run -v /home/shanmu/.genome/:/data -v /media/shanmu/shanmu_hdd1/chickpea_minicore/merged_298/filtered_passed/docker/VCF-kit/test_data:/datapath -it andersenlab/vcf-kit vk primer indel /datapath/QX1211.indels.vcf.gz --ref=/data/WBcel235

--size ignored; size is set dynamically when genotyping indels.

[E::fai_build3_core] Failed to open the file /data/WBcel235
[faidx] Could not load fai index of /data/WBcel235
[E::fai_build3_core] Failed to open the file /data/WBcel235
[faidx] Could not load fai index of /data/WBcel235
Note: the --sample option not given, applying all records regardless of the genotype
Applied 0 variants

Cannot find thermo path '/primer3_config/

Warning --sample option not given

Hi, I am trying to understand what is causing the warning "Note: the --sample option not given, applying all records regardless of the genotype". I don't see a matching option in the tool's help section either. I tried providing the sample ID as it appears in the VCF, but the program still throws the same warning. Could someone help me with an explanation?

Got a KeyError with tajima.py

Hi, I got a KeyError with tajima.py like this:
Traceback (most recent call last):
File "/lustre/home/gaojie/miniconda3/envs/vcf-kit/lib/python3.7/site-packages/vcfkit/tajima.py", line 148, in <module>
main()
File "/lustre/home/gaojie/miniconda3/envs/vcf-kit/lib/python3.7/site-packages/vcfkit/tajima.py", line 144, in main
for i in tajima(args["<vcf>"]).calc_tajima(wz, sz, args["--sliding"], extra=args["--extra"]):
File "/lustre/home/gaojie/miniconda3/envs/vcf-kit/lib/python3.7/site-packages/vcfkit/tajima.py", line 82, in calc_tajima
AC = variant.INFO["AC"]
File "cyvcf2/cyvcf2.pyx", line 2133, in cyvcf2.cyvcf2.INFO.__getitem__
KeyError: b'AC'

Could you please tell me how to fix this. Thank you very much!
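
The traceback fails at `variant.INFO["AC"]`, which suggests the records carry no `AC` (allele count) INFO field. One quick way to check is to look for an `##INFO=<ID=AC,...>` declaration in the VCF header. The sketch below is plain Python; the helper name `header_declares_info` is hypothetical and not part of VCF-kit, and a header declaration is only a hint, since individual records can still omit the field.

```python
def header_declares_info(lines, field="AC"):
    """Return True if the VCF header lines declare ##INFO=<ID=field,...>.

    `lines` is any iterable of text lines, e.g. an open file handle.
    Scanning stops at the first line that is not a ## meta-line.
    """
    needle = "##INFO=<ID={},".format(field)
    for line in lines:
        if not line.startswith("##"):
            break  # reached the #CHROM line or the first record
        if line.startswith(needle):
            return True
    return False


example_header = [
    "##fileformat=VCFv4.2\n",
    '##INFO=<ID=PR,Number=0,Type=Flag,Description="Provisional reference allele">\n',
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n",
]
print(header_declares_info(example_header, "AC"))
```

For a file on disk, pass an open text handle, for example `header_declares_info(open("my.vcf"))`, or a handle from `gzip.open(path, "rt")` for a compressed file.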

Conda install error

conda install -c bioconda vcfkit or conda install -c bioconda vcfkit=0.1.6 gives the following error:
[image: screenshot of the error]

No reference strain in output alignment file?

The tool is very convenient and memory efficient.
I'm new to coding, but is there any way to also output the reference in the alignment FASTA file?
With the following command, I get only the strains, but not the reference strain, in the concatenated variants in alignment FASTA format.
$ vk phylo fasta my_multiple_strains.vcf > my_alignment.fasta

Thank you for your contribution.

Running Docker version: "ModuleNotFoundError: No module named 'pomegranate'"

I'm a PhD student hoping to use this software! I installed it through Docker (docker run -it andersenlab/vcf-kit vk). Then I tried to run the hmm command and got an "Error [my vcf file name] does not exist". I entered interactive mode (docker run -dit andersenlab/vcf-kit ; docker exec -it [container id] /bin/bash) to run VCF-kit interactively and see if I could get a more informative error message, and then got this error:

(base) root@f172fe56e9ab:/# vk hmm -h
Traceback (most recent call last):
  File "/opt/conda/envs/vcf-kit/lib/python3.7/site-packages/vcfkit/hmm.py", line 26, in <module>
    import pomegranate
ModuleNotFoundError: No module named 'pomegranate'
(base) root@f172fe56e9ab:/#

It seems like there is a module that is not installed? I would love any help you can offer with resolving this issue. Thanks!!

vk does not work on Mac

tests-MBP:TEST testaccount$ vk
Traceback (most recent call last):
File "/usr/local/bin/vk", line 5, in <module>
from pkg_resources import load_entry_point
File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/pkg_resources/__init__.py", line 3095, in <module>
@_call_aside
File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/pkg_resources/__init__.py", line 3081, in _call_aside
f(*args, **kwargs)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/pkg_resources/__init__.py", line 3108, in _initialize_master_working_set
working_set = WorkingSet._build_master()
File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/pkg_resources/__init__.py", line 660, in _build_master
return cls._build_from_requirements(__requires__)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/pkg_resources/__init__.py", line 673, in _build_from_requirements
dists = ws.resolve(reqs, Environment())
File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/pkg_resources/__init__.py", line 851, in resolve
raise VersionConflict(dist, req).with_context(dependent_req)
pkg_resources.ContextualVersionConflict: (scipy 0.13.0b1 (/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python), Requirement.parse('scipy>=0.13.3'), set(['yahmm']))

vk does not work with PLINK-generated VCF because AC is missing

vk tajima --no-header --extra 100000 50000 plink.vcf
Traceback (most recent call last):
File "/usr/local/lib/python2.7/site-packages/vcfkit/tajima.py", line 146, in <module>
main()
File "/usr/local/lib/python2.7/site-packages/vcfkit/tajima.py", line 142, in main
for i in tajima(args["<vcf>"]).calc_tajima(wz, sz, args["--sliding"], extra=args["--extra"]):
File "/usr/local/lib/python2.7/site-packages/vcfkit/tajima.py", line 80, in calc_tajima
AC = variant.INFO["AC"]
File "cyvcf2/cyvcf2.pyx", line 1880, in cyvcf2.cyvcf2.INFO.__getitem__ (cyvcf2/cyvcf2.c:36641)
KeyError: 'AC'

##fileformat=VCFv4.2
##fileDate=20171123
##source=PLINKv1.90
##contig=<ID=24,length=62643700>
##INFO=<ID=PR,Number=0,Type=Flag,Description="Provisional reference allele, may not be based on real reference genome">
##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT t1 t2 t3
24 430241 24_430241 G A . . PR GT 1/1 0/1 0/0 0/1 1/1 1/1 0/1 1/1 1/1 1/1 0/1 0/1 0/1 0/1 0/0 1/1 1/1 1/1 1/1 1
24 478701 24_478701 A G . . PR GT 1/1 0/1 0/0 0/1 1/1 1/1 0/1 1/1 1/1 1/1 0/1 0/0 0/1 0/1 0/0 1/1 1/1 1/1 0/1 .
24 576821 24_576821 A G . . PR GT 1/1 0/1 0/0 0/1 1/1 1/1 0/1 1/1 1/1 1/1 0/1 0/1 0/1 0/1 0/0 1/1 1/1 1/1 1/1 .
24 587710 24_587710 G A . . PR GT 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/1 0/1 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0
24 602258 24_602258 G A . . PR GT 1/1 0/1 0/0 0/1 1/1 1/1 0/1 0/0 1/1 1/1 0/1 1/1 0/1 0/1 0/0 1/1 1/1 1/1 1/1 .
24 632760 24_632760 G A . . PR GT 1/1 0/1 0/0 0/1 1/1 1/1 0/1 0/0 1/1 1/1 0/1 0/1 0/1 0/1 0/0 1/1 0/1 1/1 1/1 .
24 653401 24_653401 G A . . PR GT 1/1 0/1 0/0 0/1 1/1 1/1 0/1 0/0 1/1 1/1 0/1 0/1 0/1 0/1 0/0 1/1 0/1 1/1 1/1 .
24 679380 24_679380 A G . . PR GT 1/1 0/1 0/0 0/1 1/1 1/1 0/1 0/0 1/1 1/1 0/1 0/0 0/1 0/1 0/0 1/1 0/1 1/1 0/1 0
24 706868 24_706868 G A . . PR GT 0/0 0/1 1/1 0/1 0/0 0/0 0/1 1/1 0/0 0/0 0/1 0/1 0/1 0/1 1/1 0/0 0/1 0/0 0/1 0
24 734205 24_734205 G A . . PR GT 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 .
24 757597 24_757597 G A . . PR GT 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0
24 770828 24_770828 A G . . PR GT 0/0 0/1 1/1 0/1 0/0 0/0 0/1 1/1 0/0 0/0 0/1 0/1 0/1 0/1 1/1 0/0 0/1 0/0 0/1 0
24 785766 24_785766 G A . . PR GT 1/1 0/1 0/0 0/1 1/1 1/1 0/1 0/0 1/1 1/1 0/1 0/1 0/1 0/1 0/0 1/1 1/1 1/1 0/1 .
24 797195 24_797195 A G . . PR GT 0/0 0/1 1/1 0/1 0/0 0/0 0/1 1/1 0/0 0/0 0/1 0/1 0/1 0/1 1/1 0/0 0/0 0/0 0/1 0
24 823211 24_823211 A G . . PR GT 1/1 0/1 0/0 0/1 1/1 1/1 0/1 1/1 1/1 1/1 0/1 0/1 0/1 0/1 0/0 1/1 1/1 1/1 0/1 .
24 832216 24_832216 G A . . PR GT 1/1 0/1 0/0 0/1 1/1 1/1 0/1 1/1 1/1 1/1 0/1 0/1 0/1 0/1 0/0 1/1 1/1 1/1 0/1 .
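
The PLINK records above carry only the `PR` flag in INFO, so there is no `AC` for `tajima.py` to read. The allele count can be derived from the genotype columns. Below is a minimal sketch of that derivation in plain Python, assuming biallelic sites and treating `.` alleles as missing; `allele_counts` is a hypothetical helper name, not a VCF-kit function.

```python
import re


def allele_counts(genotypes):
    """Return (AC, AN) for a list of GT strings such as '0/1' or '1|1'.

    AC counts non-reference alleles, AN counts all called alleles.
    Assumes a biallelic site; '.' alleles are skipped as missing.
    """
    ac = 0
    an = 0
    for gt in genotypes:
        for allele in re.split(r"[/|]", gt):
            if allele == ".":
                continue
            an += 1
            if allele != "0":
                ac += 1
    return ac, an


# Genotypes of the first six samples at 24:430241 in the file above.
first_six = ["1/1", "0/1", "0/0", "0/1", "1/1", "1/1"]
print(allele_counts(first_six))
```

In practice, the `fill-tags` plugin that ships with bcftools can add `AC` and `AN` to a VCF (`bcftools +fill-tags in.vcf -- -t AN,AC`). Whether that resolves this particular report has not been tested here.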

No module named cyvcf2

Hi

After the last update, I am getting an error on imports. I tried on the machine where I had it installed before and on a new one with no previous installs. I am on macOS; here is the error:

Traceback (most recent call last):
  File "/usr/local/bin/vk", line 7, in <module>
    from vcfkit.vk import main
  File "/usr/local/lib/python2.7/site-packages/vcfkit/vk.py", line 27, in <module>
    from utils.vcf import *
  File "/usr/local/lib/python2.7/site-packages/vcfkit/utils/vcf.py", line 1, in <module>
    from cyvcf2 import VCF as cyvcf2
  File "/usr/local/lib/python2.7/site-packages/cyvcf2/__init__.py", line 1, in <module>
    from .cyvcf2 import (VCF, Variant, Writer, r_ as r_unphased, par_relatedness,
ImportError: No module named cyvcf2

error loading tabix index for building a region specific tree

Hi, I am trying to build a tree based on a specific region of a chromosome. However, I am receiving the error below, saying that there was an issue loading the tabix index. I have my .vcf file and its corresponding tabix file (.vcf.gz.tbi) in the same working folder, and I was able to build a tree from my .vcf file.

(vcf-kit) [ccastane9@andersserver-01 FKBP6_home]$ vk phylo tree nj ECA13_260.vcf 13:11230000-11700000 > ECA13_tree_11230000_11700000_260.newick
[E::idx_find_and_load] Could not retrieve index file for 'ECA13_260.vcf'
Traceback (most recent call last):
File "/home/ccastane9/miniconda3/envs/vcf-kit/lib/python3.7/site-packages/vcfkit/phylo.py", line 104, in <module>
main()
File "/home/ccastane9/miniconda3/envs/vcf-kit/lib/python3.7/site-packages/vcfkit/phylo.py", line 57, in main
for line in variant_set:
File "cyvcf2/cyvcf2.pyx", line 442, in __call__
AssertionError: error loading tabix index for b'ECA13_260.vcf'
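
Region queries through tabix need a bgzip-compressed file with an index stored next to it. The command above passes the uncompressed `ECA13_260.vcf`, while the index mentioned belongs to a `.vcf.gz` file, so one plausible reading is that the `.vcf.gz` file should be the input. The sketch below is a hypothetical pre-flight check in plain Python (neither function is part of VCF-kit or cyvcf2): it tests for the BGZF block header and for a `.tbi` or `.csi` file beside the VCF.

```python
import os


def looks_like_bgzf(first_bytes):
    """True if the bytes start with a BGZF (bgzip) block header."""
    return (
        len(first_bytes) >= 14
        and first_bytes[:4] == b"\x1f\x8b\x08\x04"  # gzip magic, deflate, FEXTRA flag
        and first_bytes[12:14] == b"BC"  # BGZF extra-field identifier
    )


def region_query_problems(vcf_path):
    """Return reasons why a tabix region query on vcf_path is likely to fail."""
    problems = []
    with open(vcf_path, "rb") as handle:
        if not looks_like_bgzf(handle.read(14)):
            problems.append("file is not bgzip-compressed")
    if not any(os.path.exists(vcf_path + ext) for ext in (".tbi", ".csi")):
        problems.append("no .tbi or .csi index next to the file")
    return problems
```

An empty list from `region_query_problems("ECA13_260.vcf.gz")` would mean the basic prerequisites are in place; it does not validate the index contents.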

issues installing vcf-kit on Mac

Hello,
I'm having an issue installing VCF-kit on my macOS High Sierra machine. It seems to be a permission issue:

~ ❯❯❯ pip install vcf-kit
Collecting vcf-kit
Requirement already satisfied: cython>=0.24.1 in /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages (from vcf-kit)
Collecting yahmm==1.1.2 (from vcf-kit)
Requirement already satisfied: setuptools in ./Library/Python/2.7/lib/python/site-packages (from vcf-kit)
Requirement already satisfied: docopt in /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages (from vcf-kit)
Requirement already satisfied: awesome-slugify in /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages (from vcf-kit)
Requirement already satisfied: cyvcf2>=0.6.5 in /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages (from vcf-kit)
Collecting requests (from vcf-kit)
  Using cached requests-2.18.4-py2.py3-none-any.whl
Collecting tabulate (from vcf-kit)
Requirement already satisfied: numpy in /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages (from vcf-kit)
Collecting clint (from vcf-kit)
Collecting intervaltree==2.1.0 (from vcf-kit)
Requirement already satisfied: biopython in /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages (from vcf-kit)
Requirement already satisfied: matplotlib in ./Library/Python/2.7/lib/python/site-packages (from vcf-kit)
Requirement already satisfied: scipy in /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages (from vcf-kit)
Requirement already satisfied: jinja2 in ./Library/Python/2.7/lib/python/site-packages (from vcf-kit)
Collecting networkx==1.11 (from vcf-kit)
  Using cached networkx-1.11-py2.py3-none-any.whl
Requirement already satisfied: Unidecode<0.05,>=0.04.14 in /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages (from awesome-slugify->vcf-kit)
Requirement already satisfied: regex in /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages (from awesome-slugify->vcf-kit)
Requirement already satisfied: certifi>=2017.4.17 in ./Library/Python/2.7/lib/python/site-packages (from requests->vcf-kit)
Collecting chardet<3.1.0,>=3.0.2 (from requests->vcf-kit)
  Using cached chardet-3.0.4-py2.py3-none-any.whl
Collecting idna<2.7,>=2.5 (from requests->vcf-kit)
  Using cached idna-2.6-py2.py3-none-any.whl
Collecting urllib3<1.23,>=1.21.1 (from requests->vcf-kit)
  Using cached urllib3-1.22-py2.py3-none-any.whl
Collecting args (from clint->vcf-kit)
Collecting sortedcontainers (from intervaltree==2.1.0->vcf-kit)
  Using cached sortedcontainers-1.5.9-py2.py3-none-any.whl
Requirement already satisfied: functools32 in ./Library/Python/2.7/lib/python/site-packages (from matplotlib->vcf-kit)
Requirement already satisfied: pyparsing!=2.0.4,!=2.1.2,!=2.1.6,>=1.5.6 in ./Library/Python/2.7/lib/python/site-packages (from matplotlib->vcf-kit)
Requirement already satisfied: python-dateutil in ./Library/Python/2.7/lib/python/site-packages (from matplotlib->vcf-kit)
Requirement already satisfied: cycler>=0.10 in ./Library/Python/2.7/lib/python/site-packages (from matplotlib->vcf-kit)
Requirement already satisfied: subprocess32 in ./Library/Python/2.7/lib/python/site-packages (from matplotlib->vcf-kit)
Requirement already satisfied: pytz in ./Library/Python/2.7/lib/python/site-packages (from matplotlib->vcf-kit)
Requirement already satisfied: six>=1.10 in ./Library/Python/2.7/lib/python/site-packages (from matplotlib->vcf-kit)
Requirement already satisfied: MarkupSafe>=0.23 in ./Library/Python/2.7/lib/python/site-packages (from jinja2->vcf-kit)
Requirement already satisfied: decorator>=3.4.0 in ./Library/Python/2.7/lib/python/site-packages (from networkx==1.11->vcf-kit)
Installing collected packages: networkx, yahmm, chardet, idna, urllib3, requests, tabulate, args, clint, sortedcontainers, intervaltree, vcf-kit
  Found existing installation: networkx 2.0
    Uninstalling networkx-2.0:
      Successfully uninstalled networkx-2.0
  Rolling back uninstall of networkx
Exception:
Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/pip/basecommand.py", line 215, in main
    status = self.run(options, args)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/pip/commands/install.py", line 342, in run
    prefix=options.prefix_path,
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/pip/req/req_set.py", line 784, in install
    **kwargs
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/pip/req/req_install.py", line 851, in install
    self.move_wheel_files(self.source_dir, root=root, prefix=prefix)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/pip/req/req_install.py", line 1064, in move_wheel_files
    isolated=self.isolated,
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/pip/wheel.py", line 377, in move_wheel_files
    clobber(source, dest, False, fixer=fixer, filter=filter)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/pip/wheel.py", line 316, in clobber
    ensure_dir(destdir)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/pip/utils/__init__.py", line 83, in ensure_dir
    os.makedirs(path)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/os.py", line 157, in makedirs
    mkdir(name, mode)
OSError: [Errno 13] Permission denied: '/Library/Frameworks/Python.framework/Versions/2.7/share/doc/networkx-1.11'

I have admin rights on this machine, but I don't know if it's OK to change the permissions here:
~ ❯❯❯ ls -lthr /Library/Frameworks/Python.framework/Versions/2.7/share/
total 0
drwxrwxr-x 3 root admin 96B Sep 16 12:02 man
drwxr-xr-x 3 root admin 96B Jan 9 10:23 doc

ImportError: No module named vcfkit

Dear Developers,

I was running vk vcf2tsv on my VCF and got the error below. Please advise.

Traceback (most recent call last):
File "/home/user/.local/lib/python3.6/site-packages/vcfkit/vcf2tsv.py", line 89, in <module>
line = line.replace("u'","") # No idea why u' is prefixed...
TypeError: a bytes-like object is required, not 'str'
ImportError: No module named vcfkit

nothing to be generated

Hi,
I have installed vcf-kit, and I run:
vk primer indel --ref=Zea_mays.AGPv4.dna_sm.toplevel.fa chr10.vcf.filter.indel.gz
I get nothing: no output and no error message.
Is something wrong?

Thanks

something wrong installing with anaconda

Hello,

I use this to install with anaconda:
conda config --add channels bioconda
conda create -n vcf-kit python=2.7 vcfkit

And the error is:

UnsatisfiableError: The following specifications were found to be incompatible with each other
Output in format: Requested package -> Available versions
Package python conflicts for:
vcfkit -> biopython -> python[version='3.4.*|3.5.*|3.6.*|>=3.8,<3.9.0a0|>=3.7,<3.8.0a0|>=3.6,<3.7.0a0|>=3.5,<3.6.0a0']
vcfkit -> python[version='2.7.*|>=2.7,<2.8.0a0']
python=2.7

vk setup Error

Hi,

$ pip install numpy
$ pip install VCF-kit
I ran these two commands.

And then,

(py27) [username@twins2 ~]$ vk setup
Traceback (most recent call last):
File "/home/username/anaconda3/envs/py27/bin/vk", line 11, in <module>
sys.exit(main())
File "/home/username/anaconda3/envs/py27/lib/python2.7/site-packages/vcfkit/vk.py", line 64, in main
check_output(["brew", "tap", "homebrew/science"])
File "/home/username/anaconda3/envs/py27/lib/python2.7/subprocess.py", line 212, in check_output
process = Popen(stdout=PIPE, *popenargs, **kwargs)
File "/home/username/anaconda3/envs/py27/lib/python2.7/subprocess.py", line 390, in __init__
errread, errwrite)
File "/home/username/anaconda3/envs/py27/lib/python2.7/subprocess.py", line 1024, in _execute_child
raise child_exception
OSError: [Errno 13] Permission denied

What happened?

Is it possible to plot the Tajima's D result?

Hi,

This is a feature request. I obtained the Tajima's D values for the sliding windows for each population, but I would like to have a dot plot. It may sound quite simple, but I am not an R expert and it would take me a long time to figure out, so I wonder if you could include this in the tool?

Thanks,
Cui

phylogenetic tree

Hello,

I am Vinay Kumar Reddy and I am working with vcfkit to build a phylogenetic tree for my SNP data. I succeeded in getting the FASTA format and the distances between the samples, but when I used the command "vk phylo tree nj |upgma --plot myvcffile", I did not get the tree diagram as shown in the documentation I read. It shows an error like this and I don't know what it is. Can you please help me get a tree diagram? The error is:
Traceback (most recent call last):
File "/storage2/active/nvinay/mthesis/ve/lib/python2.7/site-packages/vcfkit/phylo.py", line 101, in <module>
main()
File "/storage2/active/nvinay/mthesis/ve/lib/python2.7/site-packages/vcfkit/phylo.py", line 91, in main
template = open(prefix + "/tree.html",'r').read(tree)
IOError: [Errno 2] No such file or directory: '/storage2/active/nvinay/mthesis/ve/lib/python2.7/site-packages/vcfkit/static/tree.html'

I also tried the newick format, but I don't know where I can find the output file. If I want to work with R, I need an output file; how can I get it?

Thanks,
Vinay Kumar Reddy Nannuru.

The 'VCF-kit==0.1.6' distribution was not found and is required by the application

Hello,

I have been trying to install for the past couple of days. The first few errors dealt with ascii code issues, and a couple missing dev libraries. I was finally able to get cyvcf2 to install and then VCF-kit installed successfully using pip.

Trying to test the install I am getting the following output:

Traceback (most recent call last):
File "/usr/bin/vk", line 6, in <module>
from pkg_resources import load_entry_point
File "/usr/local/lib/python2.7/dist-packages/pkg_resources/__init__.py", line 3138, in <module>
@_call_aside
File "/usr/local/lib/python2.7/dist-packages/pkg_resources/__init__.py", line 3122, in _call_aside
f(*args, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/pkg_resources/__init__.py", line 3151, in _initialize_master_working_set
working_set = WorkingSet._build_master()
File "/usr/local/lib/python2.7/dist-packages/pkg_resources/__init__.py", line 664, in _build_master
ws.require(__requires__)
File "/usr/local/lib/python2.7/dist-packages/pkg_resources/__init__.py", line 981, in require
needed = self.resolve(parse_requirements(requirements))
File "/usr/local/lib/python2.7/dist-packages/pkg_resources/__init__.py", line 867, in resolve
raise DistributionNotFound(req, requirers)
pkg_resources.DistributionNotFound: The 'VCF-kit==0.1.6' distribution was not found and is required by the application

First thought would be something is not mapping properly or there was a missing sym-link. Time to take a break from this and come back with fresh eyes.

vk primer bio.alphabet error

Hi!

I'm trying to use vk primer to design some primers and am running into an ImportError with Bio.Alphabet:

 vk primer snip --ref=WBcel235 data/test.vcf.gz

Traceback (most recent call last):
  File "/anaconda3/envs/vcf-kit/lib/python3.7/site-packages/vcfkit/primer.py", line 26, in <module>
    from vcfkit.utils.primer_vcf import primer_vcf
  File "/anaconda3/envs/vcf-kit/lib/python3.7/site-packages/vcfkit/utils/primer_vcf.py", line 11, in <module>
    from vcfkit.utils.primer3 import primer3
  File "/anaconda3/envs/vcf-kit/lib/python3.7/site-packages/vcfkit/utils/primer3.py", line 9, in <module>
    from Bio.Alphabet.IUPAC import IUPACAmbiguousDNA as DNA_SET
  File "/anaconda3/envs/vcf-kit/lib/python3.7/site-packages/Bio/Alphabet/__init__.py", line 21, in <module>
    "Bio.Alphabet has been removed from Biopython. In many cases, the alphabet can simply be ignored and removed from scripts. In a few cases, you may need to specify the ``molecule_type`` as an annotation on a SeqRecord for your script to work correctly. Please see https://biopython.org/wiki/Alphabet for more information."
ImportError: Bio.Alphabet has been removed from Biopython. In many cases, the alphabet can simply be ignored and removed from scripts. In a few cases, you may need to specify the ``molecule_type`` as an annotation on a SeqRecord for your script to work correctly. Please see https://biopython.org/wiki/Alphabet for more information.

I tried simply commenting out the Bio.Alphabet line from the files, but then I get a core dump instead of an output. Do you have suggestions on a better way to get around this error? I installed vcf-kit using the instructions on the GitHub front page.

Thank you!
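
The import that fails is `from Bio.Alphabet.IUPAC import IUPACAmbiguousDNA`, and `Bio.Alphabet` was removed in Biopython 1.78, so an environment with an older Biopython is one possible workaround. Another is to spell out the IUPAC ambiguous DNA letters directly, as in the sketch below. How `primer3.py` actually uses `DNA_SET` is not shown in this report, so this is an illustration of the idea rather than a drop-in patch, and both names here are hypothetical.

```python
# IUPAC ambiguous DNA codes written out by hand, so no Bio.Alphabet import is needed.
IUPAC_AMBIGUOUS_DNA = "GATCRYWSMKHBVDN"


def is_iupac_dna(sequence):
    """True if every character of sequence is an IUPAC ambiguous DNA code."""
    return set(sequence.upper()) <= set(IUPAC_AMBIGUOUS_DNA)


print(is_iupac_dna("ACGTNRY"))
print(is_iupac_dna("ACGU"))
```

Simply deleting the import, as tried above, leaves any later use of `DNA_SET` undefined, so each use would also need to be replaced.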

TypeError Python 3

Hello,

I have an environment.yml file and I am installing vcfkit through conda. The file looks as follows:

name: myenv
channels:
  - bioconda
dependencies:
  - vcfkit

I get the following error when I am using vcfkit:

Command error:
Traceback (most recent call last):
File "/opt/conda/envs/myenv/lib/python3.6/site-packages/vcfkit/vcf2tsv.py", line 89, in <module>
line = line.replace("u'","") # No idea why u' is prefixed...
TypeError: a bytes-like object is required, not 'str'

Has anyone come across the same error?
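
This error is typical of code written for Python 2 running under Python 3: output read from a subprocess pipe arrives as `bytes`, and `bytes.replace` does not accept `str` arguments. Decoding first avoids the mismatch. A minimal sketch, assuming UTF-8 output; `clean_line` is a hypothetical helper, not VCF-kit's code:

```python
def clean_line(raw_line):
    """Decode a line read from a subprocess pipe, then strip any u' prefix."""
    if isinstance(raw_line, bytes):
        raw_line = raw_line.decode("utf-8")
    return raw_line.replace("u'", "")


print(repr(clean_line(b"u'chr1\t100\n")))
```

Opening the pipe in text mode, for example `Popen(..., universal_newlines=True)`, is another way to receive `str` instead of `bytes`.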

Calculate Tajima's D on haploid

Dear developers,
My fungal strains are haploid and I would like to calculate Tajima's D via vk tajima. The documentation says that the VCF file is required to have diploid sites, and the result I get is mysterious. I call SNPs with bcftools:
bcftools call --ploidy 1
Could you please help me how to solve it?
Thanks,
Alex
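
Tajima's D depends on the number of sampled sequences n, so ploidy matters: with haploid calls each sample contributes one sequence, with diploid calls two. The sketch below computes D from n, the number of segregating sites S, and the mean number of pairwise differences, using the standard constants from Tajima (1989). It is an illustration only: the function name `tajimas_d` is hypothetical, and how vk tajima derives n from the genotypes is not shown here.

```python
import math


def tajimas_d(n, segregating_sites, mean_pairwise_diff):
    """Tajima's D for n sequences, S segregating sites and mean pairwise difference.

    Returns NaN when there are no segregating sites (D is undefined).
    """
    if n < 4:
        raise ValueError("need at least 4 sequences for a non-zero variance")
    if segregating_sites == 0:
        return float("nan")
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / (i * i) for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / (a1 * a1)
    e1 = c1 / a1
    e2 = c2 / (a1 * a1 + a2)
    s = segregating_sites
    variance = e1 * s + e2 * s * (s - 1)
    return (mean_pairwise_diff - s / a1) / math.sqrt(variance)


# 10 haploid samples give n = 10; 10 diploid samples would give n = 20.
print(round(tajimas_d(10, 16, 3.888889), 3))
```

For the same S and mean pairwise difference, passing n = 20 instead of n = 10 changes the result, which is why a tool that assumes diploid genotypes can give unexpected values on haploid calls.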

Custom genome assembly?

Hi! I'm wondering if it is possible to use a genome assembly that cannot be downloaded with the available methods?

I'm working with the Zostera marina assembly (PRJNA41721) and it is not on the downloaded genomes list from NCBI. I want to use the primer command to develop RFLP assays. Is there a way to do this starting with the fasta file?

Thanks!

setup error

when I type "vk setup" I get this error:

Traceback (most recent call last):
File "/home/user/miniconda3/bin/vk", line 7, in <module>
from vcfkit.vk import main
File "/home/user/miniconda3/lib/python3.6/site-packages/vcfkit/vk.py", line 24, in <module>
from utils import lev, message
ModuleNotFoundError: No module named 'utils'

What can I do?

Cannot find thermo path '/primer3_config/

Hi,

I've tried running the docker image of vk primer, but so far without success. I keep getting the error in the title.
The command I use is:
docker run -it --rm -v $PWD:/data -v $PWD/home:/root/.genome andersenlab/vcf-kit vk primer snip --ref=./data/home/ref_genome_dir ./data/calls.vcf.gz

I've read multiple issues about this error, but so far none help my case. Due to the structure of my project I need to use the vcfkit docker image and can't download it with conda. Are there any other fixes I can try?
If you need additional information, please let me know.

Thanks in advance,
Thom

vcf2tsv error

Dear @danielecook ,

I'm having an error when running vcf2tsv, my command:

vk vcf2tsv wide file.vcf

Error message:

Traceback (most recent call last):
File "lib/python2.7/site-packages/vcfkit/vcf2tsv.py", line 83, in <module>
comm = Popen(comm, stdout=PIPE, stderr=PIPE)
File "lib/python2.7/subprocess.py", line 394, in __init__
errread, errwrite)
File "lib/python2.7/subprocess.py", line 1047, in _execute_child
raise child_exception
OSError: [Errno 2] No such file or directory
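
When `Popen` raises `OSError: [Errno 2]`, the usual cause is that the program it tried to launch is not on `PATH`. Which program `vcf2tsv.py` launches is not shown in this report, so the sketch below simply checks a list of external tools. The executable names are assumptions based on the dependency list in the README (for example `blastn` for blast and `primer3_core` for primer3), and `missing_tools` is a hypothetical helper. It uses `shutil.which`, which requires Python 3.

```python
import shutil

# Assumed executable names for the README's dependencies; adjust to your install.
REQUIRED_TOOLS = ["bwa", "samtools", "bcftools", "blastn", "muscle", "primer3_core"]


def missing_tools(tools=REQUIRED_TOOLS):
    """Return the subset of tools that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


print(missing_tools())
```

An empty list means every listed executable was found; anything printed is a candidate for the missing file in the error above.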

issue with primer3

I have installed VCF-kit and downloaded version WS245 of the C. elegans genome. I've also downloaded the VCF file from CeNDR. To generate primers that distinguish wild isolates based on SNPs, I've tried to run:
vk primer snip --ref=WS245 WI.20180527.impute.vcf.gz

I get the following error output:

Using reference located at /Users/amy/.genome/WS245/WS245.fa.gz

--size ignored; Set to 600-800 bp.

dyld: Library not loaded: @rpath/libcrypto.1.0.0.dylib
Referenced from: /Users/amy/anaconda/envs/py27/bin/samtools
Reason: image not found
dyld: Library not loaded: @rpath/libcrypto.1.0.0.dylib
Referenced from: /Users/amy/anaconda/envs/py27/bin/samtools
Reason: image not found
Note: the --sample option not given, applying all records regardless of the genotype

Cannot find thermo path '/primer3_config/

I've checked that primer3 is installed, but I suspect there must still be a problem with it. I've also tried uninstalling and re-installing primer3, but that doesn't seem to resolve the issue. Any insights on this issue would be appreciated! Thanks.

vk phylo bug

Hi,

vk phylo tree nj --plot all_variants.vcf
gave the following error,

Traceback (most recent call last):
File "/home/prat/vcf-env/local/lib/python2.7/site-packages/vcfkit/phylo.py", line 84, in <module>
main()
File "/home/prat/vcf-env/local/lib/python2.7/site-packages/vcfkit/phylo.py", line 74, in main
tree_template = Template(open(_ROOT + "/static/tree.html", 'r').read())
IOError: [Errno 2] No such file or directory: '/home/prat/vcf-env/lib/python2.7/site-packages/static/tree.html'

vk setup errors

Hi - trying to install VCF-kit. Things seem to go OK up until running vk setup
I am aware that there might be a problem with Homebrew and the version of Ruby which it is using.

This is what I am seeing:

$ vk setup
Traceback (most recent call last):
  File "/usr/local/bin/vk", line 5, in <module>
    from pkg_resources import load_entry_point
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/pkg_resources.py", line 2797, in <module>
    parse_requirements(__requires__), Environment()
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/pkg_resources.py", line 580, in resolve
    raise VersionConflict(dist,req) # XXX put more info here
pkg_resources.VersionConflict: (scipy 0.13.0b1 (/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python), Requirement.parse('scipy>=0.13.3'))

vk calc syntax error

Hi, I run the following command:

vk calc genotypes snps_all.vcf

But then I end up with the following error:

File "/usr/local/lib/python2.7/dist-packages/vcfkit/calc.py", line 45
print "\t".join(["sample", "freq_of_gt", "n_gt_at_freq"])
^
SyntaxError: invalid syntax

Not sure if there is a simple fix that I'm just missing?
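
The statement form `print "..."` is Python 2 syntax, and a Python 3 interpreter rejects it with exactly this `SyntaxError`. Since the file sits under a `python2.7` directory, one possible reading (an assumption, not confirmed by the report) is that `vk` was executed by a Python 3 interpreter. For comparison, the same line written as a function call, which is valid in Python 3:

```python
# Python 3 form of the failing line: print is a function, not a statement.
header = "\t".join(["sample", "freq_of_gt", "n_gt_at_freq"])
print(header)
```

Running the tool with the interpreter it was installed for, or installing a Python 3 release of VCF-kit, avoids editing the installed files by hand.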

Error parsing GATK produced VCFs

Hi,

I tried using vk calc to analyze a GATK produced VCF and got an error:

Traceback (most recent call last):
  File "/usr/local/lib/python2.7/site-packages/vcfkit/calc.py", line 103, in <module>
    main()
  File "/usr/local/lib/python2.7/site-packages/vcfkit/calc.py", line 94, in main
    vcf = freq_vcf(args["<vcf>"])
  File "/usr/local/lib/python2.7/site-packages/vcfkit/calc.py", line 30, in __init__
    vcf.__init__(self, filename)
  File "/usr/local/lib/python2.7/site-packages/vcfkit/utils/vcf.py", line 33, in __init__
    map(int, re.compile("##contig.*length=(.*?)>").findall(self.raw_header))
ValueError: invalid literal for int() with base 10: '16571,assembly=hg19'

Works fine for Freebayes generated files.
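
The traceback shows the pattern `##contig.*length=(.*?)>` capturing everything up to the closing `>`, so a header line with extra attributes after the length yields `'16571,assembly=hg19'`, which `int()` cannot parse. Capturing digits only avoids this. A minimal sketch of that change; the function name and the `ID=chrM` value in the sample header are assumptions for illustration:

```python
import re

# Capture only the digits that follow length=, ignoring any later attributes.
CONTIG_LENGTH = re.compile(r"##contig=<.*?length=(\d+)")


def contig_lengths(raw_header):
    """Return contig lengths, in header order, as a list of ints."""
    return [int(length) for length in CONTIG_LENGTH.findall(raw_header)]


sample_header = (
    "##contig=<ID=chrM,length=16571,assembly=hg19>\n"
    "##contig=<ID=24,length=62643700>\n"
)
print(contig_lengths(sample_header))
```

Both header styles parse with the digit-only pattern: the one with a trailing `assembly=` attribute and the one where `length` is the last attribute.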

problems with PyFPE

I have some problems using vk phylo software. I installed all the dependencies and I have this error:
Traceback (most recent call last):
  File "/home/luca/miniconda2/bin/vk", line 7, in <module>
    from vcfkit.vk import main
  File "/home/luca/miniconda2/lib/python2.7/site-packages/vcfkit/vk.py", line 27, in <module>
    from utils.vcf import *
  File "/home/luca/miniconda2/lib/python2.7/site-packages/vcfkit/utils/vcf.py", line 1, in <module>
    from cyvcf2 import VCF as cyvcf2
  File "/home/luca/miniconda2/lib/python2.7/cyvcf2/__init__.py", line 1, in <module>
    from .cyvcf2 import (VCF, Variant, Writer, r_ as r_unphased, par_relatedness,
ImportError: /home/luca/miniconda2/lib/python2.7/cyvcf2/cyvcf2.so: undefined symbol: PyFPE_jbuf
Can you help me?
Thank you very much in advance
Luca

Problem install VCF-kit

Hi,
I am trying to install your tool.

source activate python2env
pip install yahmm
brew install bwa samtools bcftools blast muscle
pip install VCF-kit

Then, I try this:

(python2env) ➜ ~ vk genome --help
Traceback (most recent call last):
  File "/Users/cr517/anaconda/bin/vk", line 7, in <module>
    from vcfkit.vk import main
  File "/Users/cr517/anaconda/lib/python3.6/site-packages/vcfkit/vk.py", line 24, in <module>
    from utils import lev, message
ImportError: cannot import name 'lev'

If you get the chance, I would appreciate your help.

Best,

Cristian.

TypeError: '<' not supported between instances of 'int' and 'NoneType'

Hi,

I am trying to run Tajima's D in vcf-kit using the following command:
docker run -it andersenlab/vcf-kit vk tajima 10000 --sliding file.vcf

However, I receive the following set of error messages immediately:
Traceback (most recent call last):
  File "/opt/conda/envs/vcf-kit/lib/python3.7/site-packages/vcfkit/tajima.py", line 148, in <module>
    main()
  File "/opt/conda/envs/vcf-kit/lib/python3.7/site-packages/vcfkit/tajima.py", line 129, in main
    if wz < sz:
TypeError: '<' not supported between instances of 'int' and 'NoneType'

I have tried different VCF files to see if that might have been the problem, but I get the same error message regardless. I am running Python version 3.6.13 in Terminal on macOS Big Sur. What can I do to fix this? Thanks for your help!
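
The traceback shows `wz < sz` being evaluated while one operand is `None`; in the command above only one number is given alongside `--sliding`, so the step size is presumably never set. In Python 3, comparing an `int` with `None` raises this `TypeError`. A minimal sketch of a guard, assuming the two values arrive as optional strings (the helper `resolve_window` and its fallback rule are hypothetical, not VCF-kit's actual behaviour):

```python
def resolve_window(window_size, step_size=None):
    """Hypothetical helper: return (window, step) as ints.

    If no step size is supplied, fall back to the window size so the
    comparison below never sees None.
    """
    wz = int(window_size)
    sz = wz if step_size is None else int(step_size)
    if wz < sz:
        raise ValueError("window size must be at least the step size")
    return wz, sz

print(resolve_window("10000"))          # step omitted
print(resolve_window("10000", "2000"))  # explicit step
```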

Memory Issues when running tajima module

Hi

I am running VCF-kit (version 0.2.6) from an Anaconda installation. I am trying to run the tajima module using the following command:
vk tajima 10000 10000 chromosme.vcf.gz > chromosome_td.tsv

Each time I run the command, it runs for a few minutes, quickly consumes a lot of RAM, and then quits. Do you know how to prevent this?

Thanks!

error while using docker image (andersenlab/vcf-kit:20200822175018b7b60d): TypeError: a bytes-like object is required, not 'str'

Hello - I am receiving the following error using the latest docker image (andersenlab/vcf-kit:20200822175018b7b60d). Any idea what this means or what I need to do to fix it? Any help you can provide is much appreciated.

Error:
Traceback (most recent call last):
  File "/opt/conda/envs/vcf-kit/lib/python3.7/site-packages/vcfkit/vcf2tsv.py", line 89, in <module>
    line = line.replace("u'","") # No idea why u' is prefixed...
TypeError: a bytes-like object is required, not 'str'
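
In Python 3 this error appears when `replace` is called on a `bytes` object with `str` arguments, for example when the line was read from a pipe or a file opened in binary mode. A minimal sketch of one way around it, assuming the line may be either `bytes` or `str` and is UTF-8 encoded (the function name `clean_line` is hypothetical):

```python
def clean_line(line):
    """Remove stray u' prefixes from a line that may be bytes or str."""
    # Decode first so the str arguments to replace() match the object type.
    if isinstance(line, bytes):
        line = line.decode("utf-8")
    return line.replace("u'", "")

print(clean_line(b"CHROM\tu'I\t100"))
print(clean_line("CHROM\tI\t100"))
```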

cannot get this to work

Hello,

I have been unable to install and run VCF-kit and have exhausted all the options I can think of. I have tried the various install methods and even edited the primer3.py script to include the path to primer3_config, but I keep getting the same result: Cannot find thermo path '/primer3_config/

Neither VCF-kit nor primer3 is installed with a primer3_config directory unless I clone the entire repositories, and cloning them still didn't resolve the "Cannot find thermo path '/primer3_config/" error.

Any help?

vk error on RHEL7

Hi,
Installing VCF-kit with pip went smoothly, and it looked as though I had all the dependencies installed.
However, when I run any of the vk commands, I get the following error:

  File "/usr/bin/vk", line 7, in <module>
    from vcfkit.vk import main
  File "/usr/lib/python3.4/site-packages/vcfkit/vk.py", line 24, in <module>
    from utils import lev, message
ImportError: cannot import name 'lev'

utils==0.9.0 is installed

Thanks in advance for your help!

pip install fail

I couldn't install VCF-kit with pip install. Any suggestions or alternatives? Here are some of the error messages I saw.

[at the end]
Command "/Users/hoyon/miniconda3/bin/python3 -u -c "import setuptools, tokenize;file='/private/var/folders/q3/_06d0ck94zj4c_l7nmk0r9_00000gn/T/pip-build-i96np5eh/biopython/setup.py';f=getattr(tokenize, 'open', open)(file);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, file, 'exec'))" install --record /var/folders/q3/_06d0ck94zj4c_l7nmk0r9_00000gn/T/pip-n4dr4h_m-record/install-record.txt --single-version-externally-managed --compile" failed with error code 1 in /private/var/folders/q3/_06d0ck94zj4c_l7nmk0r9_00000gn/T/pip-build-i96np5eh/biopython/

[other error messages]
Building wheels for collected packages: biopython
Running setup.py bdist_wheel for biopython ... error
Complete output from command /Users/hoyon/miniconda3/bin/python3 -u -c "import setuptools, tokenize;file='/private/var/folders/q3/_06d0ck94zj4c_l7nmk0r9_00000gn/T/pip-build-i96np5eh/biopython/setup.py';f=getattr(tokenize, 'open', open)(file);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, file, 'exec'))" bdist_wheel -d /var/folders/q3/_06d0ck94zj4c_l7nmk0r9_00000gn/T/tmpzjq85embpip-wheel- --python-tag cp34:

-- skipped --

gcc -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -I/usr/local/include -Qunused-arguments -Qunused-arguments -I/Users/hoyon/miniconda3/include/python3.4m -c Bio/cpairwise2module.c -o build/temp.macosx-10.5-x86_64-3.4/Bio/cpairwise2module.o
gcc: error: unrecognized command line option ‘-Qunused-arguments’; did you mean ‘-Wunused-argument’?
gcc: error: unrecognized command line option ‘-Qunused-arguments’; did you mean ‘-Wunused-argument’?
error: command 'gcc' failed with exit status 1


Failed building wheel for biopython

... FWIW, I recently replaced the default gcc (Clang) with gcc (Stacks 6.2) as instructed below. Could this be the problem?
http://gbs-cloud-tutorial.readthedocs.io/en/latest/03_computer_setup.html

vcf2tsv can't print header

When I use the --print-header option, I get an error saying that the variable fill_field is not defined. Looking at the code, it appears three times but is never defined, and it is not used by any downstream code. How can I fix this?

ImportError: cannot import name 'lev' from 'utils' (/Applications/miniconda3/lib/python3.7/site-packages/utils/__init__.py)

Any help with this error? Maybe a Python 3 issue?

(base) Jessicas-MacBook-Pro-2:filtering jessicaoswald$ vk setup
Traceback (most recent call last):
  File "/Applications/miniconda3/bin/vk", line 5, in <module>
    from vcfkit.vk import main
  File "/Applications/miniconda3/lib/python3.7/site-packages/vcfkit/vk.py", line 24, in <module>
    from utils import lev, message
ImportError: cannot import name 'lev' from 'utils' (/Applications/miniconda3/lib/python3.7/site-packages/utils/__init__.py)

KeyError: b'AC' in Tajima's D calculation

Hi, I tried to run the tajima option but keep getting errors.

This is the command I ran:
singularity exec vcfkit:0.2.9--pyh5bfb8f1_0 vk tajima 5000 1000 SNV.vcf.gz

And this is the error:
CHROM BIN_START BIN_END N_Sites N_SNPs TajimaD
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/site-packages/vcfkit/tajima.py", line 148, in <module>
    main()
  File "/usr/local/lib/python3.6/site-packages/vcfkit/tajima.py", line 144, in main
    for i in tajima(args[""]).calc_tajima(wz, sz, args["--sliding"], extra=args["--extra"]):
  File "/usr/local/lib/python3.6/site-packages/vcfkit/tajima.py", line 82, in calc_tajima
    AC = variant.INFO["AC"]
  File "cyvcf2/cyvcf2.pyx", line 2138, in cyvcf2.cyvcf2.INFO.__getitem__
KeyError: b'AC'

I am using Singularity, so I believe this should not be a Python module incompatibility issue.
Thank you very much for any help.
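
The `KeyError` indicates that the record being read has no `AC` (alternate allele count) entry in its INFO column. As an illustration of what that value represents, here is a minimal sketch that counts non-reference alleles from genotype strings; it assumes biallelic sites and plain GT strings such as `0/1`, and the function name `alt_allele_count` is hypothetical rather than part of VCF-kit or cyvcf2:

```python
import re

def alt_allele_count(genotypes):
    """Count non-reference alleles across GT strings like '0/1' or '1|1'."""
    count = 0
    for gt in genotypes:
        # Alleles are separated by "/" (unphased) or "|" (phased).
        for allele in re.split(r"[/|]", gt):
            # "." is a missing call and "0" is the reference allele.
            if allele not in (".", "0"):
                count += 1
    return count

print(alt_allele_count(["0/0", "0/1", "1|1", "./."]))  # 3
```

Checking whether the input VCF carries `AC` in its INFO fields before running the command would confirm whether this is the cause.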

Exception due to misuse of numpy options

The recent version of VCF-kit is unusable right after the installation:

[nknyazeva@head02 silwer]$ vk
Traceback (most recent call last):
  File "/home/nknyazeva/.local/bin/vk", line 9, in <module>
    load_entry_point('VCF-kit==0.1.6', 'console_scripts', 'vk')()
  File "/home/tools/python/Python-2.7/lib/python2.7/site-packages/pkg_resources/__init__.py", line 542, in load_entry_point
    return get_distribution(dist).load_entry_point(group, name)
  File "/home/tools/python/Python-2.7/lib/python2.7/site-packages/pkg_resources/__init__.py", line 2569, in load_entry_point
    return ep.load()
  File "/home/tools/python/Python-2.7/lib/python2.7/site-packages/pkg_resources/__init__.py", line 2229, in load
    return self.resolve()
  File "/home/tools/python/Python-2.7/lib/python2.7/site-packages/pkg_resources/__init__.py", line 2235, in resolve
    module = __import__(self.module_name, fromlist=['__name__'], level=0)
  File "/home/nknyazeva/.local/lib/python2.7/site-packages/vcfkit/vk.py", line 27, in <module>
    from utils.vcf import *
  File "/home/nknyazeva/.local/lib/python2.7/site-packages/vcfkit/utils/vcf.py", line 10, in <module>
    np.set_printoptions(threshold=np.nan)
  File "/home/tools/python/Python-2.7/lib/python2.7/site-packages/numpy/core/arrayprint.py", line 246, in set_printoptions
    floatmode, legacy)
  File "/home/tools/python/Python-2.7/lib/python2.7/site-packages/numpy/core/arrayprint.py", line 93, in _make_options_dict
    raise ValueError("threshold must be numeric and non-NAN, try "
ValueError: threshold must be numeric and non-NAN, try sys.maxsize for untruncated representation

caused by line:

np.set_printoptions(threshold=np.nan)

in vcfkit/utils/vcf.py

The issue can be fixed with:

import sys
np.set_printoptions(threshold=sys.maxsize)

Please refer to numpy/numpy#12987 for the explanation and the origin of the fix.
