konradjk / loftee
License: MIT License
WARNING: 22562 : Use of uninitialized value in split at /scratch/vep-data/Plugins/loftee/LoF.pm line 574, <__ANONIO__> line 5046.
Use of uninitialized value $number_of_exons in subtraction (-) at /scratch/vep-data/Plugins/loftee/LoF.pm line 587, <__ANONIO__> line 5046.
WARNING: 22561 : DBD::SQLite::st execute failed: database disk image is malformed at /scratch/vep-data/Plugins/loftee/gerp_dist.pl line 130, <__ANONIO__> line 5046.
WARNING: Plugin 'LoF' went wrong: MySQL ERROR: No such file or directory at /scratch/vep-data/Plugins/loftee/gerp_dist.pl line 130, <__ANONIO__> line 5046.
DBD::SQLite::st execute failed: database disk image is malformed at /scratch/vep-data/Plugins/loftee/gerp_dist.pl line 130, <__ANONIO__> line 5046.
These messages suggest a problem with MySQL or the phylocsf_gerp.sql file, but the actual cause was the --fork argument. The misleading error cost me a lot of debugging time.
See the issue "Is LOFTEE thread safe?".
Splitting all the data into batches is only a crude workaround. I believe this issue can be fixed, or at least the problem should be clearly documented.
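The batching workaround mentioned above could look roughly like the sketch below. It is only an illustration: `split_vcf`, the chunk size and the `batch` prefix are hypothetical names, and it assumes an uncompressed VCF whose header lines start with `#`. The VEP call is left as a comment and uses only options that already appear in this thread.

```shell
#!/usr/bin/env bash
# Split a VCF into chunks of N records each, repeating the header in every chunk.
# Usage: split_vcf input.vcf 5000 batch   -> batch.chunk.aa.vcf, batch.chunk.ab.vcf, ...
split_vcf() {
  local vcf="$1" per_chunk="$2" prefix="$3"
  local body suffix
  # Keep the header lines so that every chunk is a valid VCF on its own.
  grep '^#' "$vcf" > "${prefix}.header"
  # Split only the records; split names the pieces <prefix>.body.aa, .ab, ...
  grep -v '^#' "$vcf" | split -l "$per_chunk" - "${prefix}.body."
  for body in "${prefix}.body."*; do
    [ -e "$body" ] || continue          # no records at all: nothing to do
    suffix="${body##*.body.}"
    cat "${prefix}.header" "$body" > "${prefix}.chunk.${suffix}.vcf"
    rm "$body"
  done
  rm "${prefix}.header"
}

# Each chunk can then be annotated without --fork, for example (paths are placeholders):
# for chunk in batch.chunk.*.vcf; do
#   ./vep --offline --cache -i "$chunk" -o "${chunk%.vcf}.vep.txt" \
#     --plugin LoF,loftee_path:/path/to/loftee
# done
```

Whether running several such VEP processes at the same time is safe with the SQLite conservation file is not established in this thread, so running the chunks one after another is the conservative choice.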
I'm running VEP with LOFTEE on variant 1-3782413-C-T. The gnomAD browser reports a Low Confidence LoF for the stop_gained transcripts ENST00000468793 and ENST00000475969. LOFTEE does not produce any LoF annotations.
I'm running VEP like this:
./vep \
--offline \
--cache \
--dir /vep \
--assembly GRCh37 \
--plugin LoF,loftee_path:loftee,human_ancestor_fa:GRCh37/human_ancestor.fa.gz \
--input_file input.txt \
--output_file output.txt
The input file is:
> cat input.txt
1 3782413 3782413 C/T +
Neither VEP nor LOFTEE produces any errors or warnings. The stop_gained transcripts ENST00000468793 and ENST00000475969 appear in the output, but no LoF annotations are present. See the complete output below.
## ENSEMBL VARIANT EFFECT PREDICTOR v94.5
## Output produced at 2018-11-29 18:28:10
## Using cache in /vep/homo_sapiens/94_GRCh37
## Using API version 94, DB version ?
## ensembl-variation version 94.066b102
## ensembl-funcgen version 94.08b0c13
## ensembl-io version 94.8d53275
## ensembl version 94.5c08d90
## ClinVar version 201706
## ESP version 20141103
## gnomAD version 170228
## 1000genomes version phase3
## genebuild version 2011-04
## COSMIC version 81
## polyphen version 2.2.2
## assembly version GRCh37.p13
## sift version sift5.2.2
## dbSNP version 150
## regbuild version 1.0
## HGMD-PUBLIC version 20164
## gencode version GENCODE 19
## Column descriptions:
## Uploaded_variation : Identifier of uploaded variant
## Location : Location of variant in standard coordinate format (chr:start or chr:start-end)
## Allele : The variant allele used to calculate the consequence
## Gene : Stable ID of affected gene
## Feature : Stable ID of feature
## Feature_type : Type of feature - Transcript, RegulatoryFeature or MotifFeature
## Consequence : Consequence type
## cDNA_position : Relative position of base pair in cDNA sequence
## CDS_position : Relative position of base pair in coding sequence
## Protein_position : Relative position of amino acid in protein
## Amino_acids : Reference and variant amino acids
## Codons : Reference and variant codon sequence
## Existing_variation : Identifier(s) of co-located known variants
## Extra column keys:
## IMPACT : Subjective impact classification of consequence type
## DISTANCE : Shortest distance from variant to transcript
## STRAND : Strand of the feature (1/-1)
## FLAGS : Transcript quality flags
## LoF : Loss-of-function annotation (HC = High Confidence; LC = Low Confidence)
## LoF_filter : Reason for LoF not being HC
## LoF_flags : Possible warning flags for LoF
## LoF_info : Info used for LoF annotation
#Uploaded_variation Location Allele Gene Feature Feature_type Consequence cDNA_position CDS_position Protein_position Amino_acids Codons Existing_variation Extra
1_3782413_C/T 1:3782413 T ENSG00000169598 ENST00000338895 Transcript synonymous_variant 602 279 93 H caC/caT - IMPACT=LOW;STRAND=1
1_3782413_C/T 1:3782413 T ENSG00000169598 ENST00000339350 Transcript 3_prime_UTR_variant,NMD_transcript_variant 579 - - - - - IMPACT=MODIFIER;STRAND=1
1_3782413_C/T 1:3782413 T ENSG00000169598 ENST00000378206 Transcript 3_prime_UTR_variant,NMD_transcript_variant 416 - - - - - IMPACT=MODIFIER;STRAND=1
1_3782413_C/T 1:3782413 T ENSG00000169598 ENST00000378209 Transcript synonymous_variant 602 279 93 H caC/caT - IMPACT=LOW;STRAND=1
1_3782413_C/T 1:3782413 T ENSG00000169598 ENST00000378212 Transcript downstream_gene_variant - - - - - - IMPACT=MODIFIER;DISTANCE=97;STRAND=1
1_3782413_C/T 1:3782413 T ENSG00000169598 ENST00000468793 Transcript stop_gained,NMD_transcript_variant 445 301 101 R/* Cga/Tga - IMPACT=HIGH;STRAND=1
1_3782413_C/T 1:3782413 T ENSG00000169598 ENST00000475969 Transcript stop_gained,NMD_transcript_variant 305 301 101 R/* Cga/Tga - IMPACT=HIGH;STRAND=1
1_3782413_C/T 1:3782413 T ENSG00000169598 ENST00000477548 Transcript 3_prime_UTR_variant,NMD_transcript_variant 497 - - - - - IMPACT=MODIFIER;STRAND=1
1_3782413_C/T 1:3782413 T ENSG00000169598 ENST00000481945 Transcript upstream_gene_variant - - - - - - IMPACT=MODIFIER;DISTANCE=1814;STRAND=1
1_3782413_C/T 1:3782413 T ENSG00000169598 ENST00000491998 Transcript synonymous_variant,NMD_transcript_variant 602 279 93 H caC/caT - IMPACT=LOW;STRAND=1
If I use a different variant, for example 1-1565047-C-T, LOFTEE produces the correct LoF=LC annotation for the stop_gained transcript ENST00000378712.
I tried both LOFTEE v0.3-beta and master. Any pointers or workarounds on how to address this would be greatly appreciated.
Hello,
I have had this issue for a while but don't know what is wrong:
Use of uninitialized value in split at /.vep/Plugins/LoF.pm line 531, <ANONIO> line 265001.
Use of uninitialized value $number_of_exons in subtraction (-) at /.vep/Plugins/LoF.pm line 544, <ANONIO> line 265001.
I did pass in the loftee path (loftee_path:$PWD/loftee), and it works for the other libs, but these uninitialized-value warnings remain. I'm not sure how this happens.
Hi,
I have checked the previous issues that might be related to this one but the problem remains
$ ./vep --offline -i 4ensembl.vcf --tab -o output.txt --plugin LoF,loftee_path:/MY_PATH/gits/loftee --force_overwrite --dir_plugins /MY_PATH/gits/loftee
Smartmatch is experimental at /MY_PATH/gits/loftee/de_novo_donor.pl line 175.
Smartmatch is experimental at /MY_PATH/gits/loftee/de_novo_donor.pl line 214.
Smartmatch is experimental at /MY_PATH/gits/loftee/splice_site_scan.pl line 191.
Smartmatch is experimental at /MY_PATH/gits/loftee/splice_site_scan.pl line 194.
Smartmatch is experimental at /MY_PATH/gits/loftee/splice_site_scan.pl line 238.
Smartmatch is experimental at /MY_PATH/gits/loftee/splice_site_scan.pl line 241.
WARNING: Failed to compile plugin LoF: can't open!
Compilation failed in require at /MY_PATH/gits/loftee/loftee_splice_utils.pl line 4.
Compilation failed in require at /MY_PATH/gits/loftee/LoF.pm line 33.
Compilation failed in require at (eval 34) line 2.
BEGIN failed--compilation aborted at (eval 34) line 2.
Following the comments on previous issues:
$ perl -c /MY_PATH/gits/loftee/loftee_splice_utils.pl
/MY_PATH/gits/loftee/loftee_splice_utils.pl syntax OK
export PERL5LIB=${HOME}"/.vep/Plugins":${HOME}"/gits/loftee":"/PATH2VEP/ensembl-vep"
Any clues what might be going wrong?
Hi,
I pulled the docker image for VEP from DockerHub (docker pull ensemblorg/ensembl-vep), installed all plugins (docker run -t -i -v $HOME/vep_data:/opt/vep/.vep ensemblorg/ensembl-vep perl INSTALL.pl -a cfp -s homo_sapiens -y GRCh37 -g all) and tried to run LoF using the following command:
./vep -i examples/homo_sapiens_GRCh37.vcf --cache --port 3337 --plugin LoF --force_overwrite
ERROR MESSAGE:
WARNING: Failed to compile plugin LoF: Can't locate utr_splice.pl in @INC (@INC contains: /opt/vep/.vep/Plugins /opt/vep/src/ensembl-vep/modules /opt/vep/src/ensembl-vep /etc/perl /usr/local/lib/x86_64-linux-gnu/perl/5.26.1 /usr/local/share/perl/5.26.1 /usr/lib/x86_64-linux-gnu/perl5/5.26 /usr/share/perl5 /usr/lib/x86_64-linux-gnu/perl/5.26 /usr/share/perl/5.26 /usr/local/lib/site_perl /usr/lib/x86_64-linux-gnu/perl-base) at /opt/vep/.vep/Plugins/LoF.pm line 26.
Compilation failed in require at (eval 51) line 2.
BEGIN failed--compilation aborted at (eval 51) line 2.
Could you help fix that? Thanks a lot!
Is a version of ExAC or gnomAD pre-annotated by LOFTEE available?
Hello,
I'm worried that I'm bugging you with a simple error, but I just can't figure out what's going on. I am attempting to run the LOFTEE plugin with VEP, and I keep getting an error when I try to run it (please see screenshots). I am using perl > 5.10, have all the supporting files in place, etc., but continue to see:
Bareword found where operator expected at /hpc/local/CentOS7/hers_en/software/ensembl-tools-release-85/scripts/variant_effect_predictor/Plugins/splice_module.pl line 6, near ""en" class"
(Missing operator before class?)
Bareword found where operator expected at /hpc/local/CentOS7/hers_en/software/ensembl-tools-release-85/scripts/variant_effect_predictor/Plugins/splice_module.pl line 26, near "<title>loftee"
(Missing operator before loftee?)
2016-09-26 22:45:06 - Failed to compile plugin LoF: Unrecognized character \xC2; marked by <-- HERE after at master <-- HERE near column 46 at /hpc/local/CentOS7/hers_en/software/ensembl-tools-release-85/scripts/variant_effect_predictor/Plugins/splice_module.pl line 26.
Compilation failed in require at /hpc/local/CentOS7/hers_en/software/ensembl-tools-release-85/scripts/variant_effect_predictor/Plugins/LoF.pm line 24.
Compilation failed in require at (eval 142) line 2.
BEGIN failed--compilation aborted at (eval 142) line 2.
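The fragments quoted in these messages (`"en" class`, `<title>loftee`) look like HTML, which would be consistent with splice_module.pl having been saved from a web page rather than as the raw Perl file. That is an inference from the log, not something confirmed in this thread. A minimal sketch for spotting such files in a plugin directory, where `find_html_files` is a hypothetical helper name:

```shell
#!/usr/bin/env bash
# Print the .pl/.pm files in a directory that contain obvious HTML markers,
# which a Perl source file would not normally contain.
find_html_files() {
  local dir="$1"
  grep -l -i -E '<!DOCTYPE html|<html[ >]|<title>' "$dir"/*.pl "$dir"/*.pm 2>/dev/null
}

# Example (path is a placeholder):
# find_html_files /path/to/Plugins
```

Any file this prints would need to be downloaded again as plain source, for instance by cloning the repository or fetching the release archive.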
Will email you the annotation with ";" that causes this problem
Hi,
I have a question. I installed the kent source tree and Bio::DB::BigFile for LoF.pm, but when I run the vep command I get the following error:
Possible precedence issue with control flow operator at Bio/DB/IndexedBase.pm line 845.
Smartmatch is experimental at vep_cache/Plugins/de_novo_donor.pl line 214.
Smartmatch is experimental at vep_cache/Plugins/splice_site_scan.pl line 191.
Smartmatch is experimental at Plugins/splice_site_scan.pl line 194.
Smartmatch is experimental at Plugins/splice_site_scan.pl line 238.
Smartmatch is experimental at vep_cache/Plugins/splice_site_scan.pl line 241.
-------------------- EXCEPTION --------------------
MSG:
ERROR: Forked process(es) died: read-through of cross-process communication detected
STACK Bio::EnsEMBL::VEP::Runner::_forked_buffer_to_output miniconda3/share/ensembl-vep-97.3-0/modules/Bio/EnsEMBL/VEP/Runner.pm:556
STACK Bio::EnsEMBL::VEP::Runner::next_output_line miniconda3/share/ensembl-vep-97.3-0/modules/Bio/EnsEMBL/VEP/Runner.pm:361
STACK Bio::EnsEMBL::VEP::Runner::run miniconda3/share/ensembl-vep-97.3-0/modules/Bio/EnsEMBL/VEP/Runner.pm:202
STACK toplevel miniconda3/bin/vep:223
Date (localtime) = Fri Aug 30 16:39:18 2019
Ensembl API version = 97
---------------------------------------------------
LoF command:
--plugin LoF,loftee_path:vep_cache/Plugins/,gerp_bigwig:vep_cache/loftee_file/gerp_conservation_scores.homo_sapiens.GRCh38.bw,conservation_file:vep_cache/loftee_file/loftee.sql,human_ancestor_fa:vep_cache/loftee_file/human_ancestor.fa.gz
--fork 8
I have moved all loftee files into Plugins/.
Is it the case that LoF.pm cannot be used with --fork?
Can you help me solve these problems?
Thank you.
Hi Daniel
I am running LOFTEE and getting these two errors.
Is it possible to post-process a VEP-generated VCF with LOFTEE? Or is the only way to run it with the --plugin option? I have a large number of VCFs with VEP annotations, but they weren't run with LOFTEE.
Hi,
I'm struggling to set up the LoF plugin in my VEP. After installing LoF.pm and the related files described here, I ran vep using the script below:
vep98=/data/software/VEP/ensembl-vep-release-98
perl $vep98/vep \
--cache --dir_cache $vep98/cache \
-i $1.vcf -o $1.vep.vcf \
--assembly GRCh38 --symbol --vcf --exclude_predicted \
--fasta $vep98/cache/homo_sapiens/98_GRCh38/Homo_sapiens.GRCh38.dna.toplevel.fa \
--minimal --allele_number 50 --no_stats \
--pick --force_overwrite --offline --fork 20 \
--plugin LoF,loftee_path:$vep98/Plugins/loftee,human_ancestor_fa:$vep98/Plugins/loftee/human_ancestor.fa.gz \
--dir_plugins /data/software/VEP/ensembl-vep-release-98/Plugins
Unfortunately, I got some errors and I have no idea how to resolve them.
WARNING: Failed to compile plugin LoF: can't open!
Compilation failed in require at /data/software/VEP/ensembl-vep-release-98/Plugins/loftee/loftee_splice_utils.pl line 4.
Compilation failed in require at /data/software/VEP/ensembl-vep-release-98/Plugins/LoF.pm line 33.
Compilation failed in require at (eval 219) line 2.
BEGIN failed--compilation aborted at (eval 219) line 2.
Any suggestion?
Thanks in advance!
Hi,
I am getting the warnings below while using the LoF tool, and I'm not sure whether the output will be as expected:
Use of uninitialized value in numeric lt (<) at /home/veptools/.vep/Plugins/LoF.pm line 285.
Use of uninitialized value in numeric lt (<) at /home/veptools/.vep/Plugins/LoF.pm line 285.
Use of uninitialized value $min_intron_size in numeric lt (<) at /home/veptools/.vep/Plugins/LoF.pm line 322.
Use of uninitialized value $min_intron_size in numeric lt (<) at /home/veptools/.vep/Plugins/LoF.pm line 322.
I am using the tool as follows:
--plugin LoF,human_ancestor_fa:/home/veptools/.vep/human_ancestor.fa along with other vep parameters.
It appears that the $min_intron_size variable is not getting populated.
Please guide me.
Thanks in advance.
Hi all!
We have obtained VEP LOFTEE (92.1) output quite close to the results in Lek et al., 2016.
In our dataset, the numbers of LoF=HC PTVs per chromosome (table 1) and per individual (table 2) appear to be close to ExAC's.
But it looks like we have a problem with "frameshift" LoF=HC PTVs, for which the count is 0 (table 3). Here we have two questions:
Used command:
./vep --offline --assembly GRCh37 --force_overwrite --cache --dir_cache ~/.vep --no_stats --plugin LoF,human_ancestor_fa:${INPUT_DIR}/ensembl-vep/ANC_SEQ/human_ancestor.fa.gz,loftee_path:~/.vep/Plugins/loftee --cache_version 92 -i ${INPUT_DIR}/VCF_by_inds_chrom_${i}/${sample} -o ${INPUT_DIR}/ensembl-vep/VEP_OUT92_1/LOFTEE_chrom_${i}_ind_${sample}.txt --dir_plugins ~/.vep/Plugins/loftee
Output:
Table 1
CHR LoF=HC** LoF=LC***
1 4495 4526
2 2337 4439
3 1144 3575
4 2222 1751
5 1771 2873
6 4486 2010
7 2167 2681
8 1843 4810
9 1085 2593
10 767 520
11 10783 10041
12 3215 2851
13 962 298
14 1062 3186
15 1323 1875
16 3123 2621
17 5830 4493
18 432 3032
19 6107 7033
20 547 150
21 433 125
22 2147 2375
Total 58281 67858
** for i in {1..22}; do echo
*** for i in {1..22}; do echo $i; grep -E "stop_gained|frameshift|splice_donor|splice_acceptor" LOFTEE_chrom_${i}ind.vcf.gz.txt | grep LoF=LC | wc -l; done
Table 2
N INDIVID_ID LoF=HC**
1 S000014405 148
2 S000014412 129
3 S000014411 146
4 S000014410 151
5 S000014408 148
...
401 S000035685 119
Total 58281
Average 145.3
Max 208
Min 97
for sample in $(bcftools query --list-samples /mmg/ural/heterosis/singleton_MAC_filtered/SINGLETONS_removed_d_22.recode.vcf); do echo $sample; grep -E "stop_gained|frameshift|splice_donor|splice_acceptor" LOFTEE_chrom_*ind${sample}.vcf.gz.txt | grep LoF=HC | wc -l; done
Table 3
PTV_type LoF=HC** LoF=LC***
stop_gained 37829 11532
splice_acceptor 10479 18328
splice_donor 9973 37998
frameshift 0 0
Total 58281
**
grep "stop_gained" LOFTEE_chrom_ind.vcf.gz.txt | grep LoF=HC | wc -l
grep "splice_acceptor" LOFTEE_chrom_ind.vcf.gz.txt | grep LoF=HC | wc -l
grep "splice_donor" LOFTEE_chrom_ind.vcf.gz.txt | grep LoF=HC | wc -l
grep "frameshift" LOFTEE_chrom_ind.vcf.gz.txt | grep LoF=HC | wc -l
***
grep "stop_gained" LOFTEE_chrom_ind.vcf.gz.txt | grep LoF=LC | wc -l
grep "splice_acceptor" LOFTEE_chrom_ind.vcf.gz.txt | grep LoF=LC | wc -l
grep "splice_donor" LOFTEE_chrom_ind.vcf.gz.txt | grep LoF=LC | wc -l
grep "frameshift" LOFTEE_chrom_ind.vcf.gz.txt | grep LoF=LC | wc -l
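The counting commands above can be collected into small helper functions. This is a sketch under assumptions: `count_lof` and `count_ptv` are hypothetical names, and the input is assumed to be VEP tab output where comment lines start with `#`. Note that in a basic `grep` pattern a bare `|` is matched literally, so alternation needs `grep -E` (or `\|` with GNU grep).

```shell
#!/usr/bin/env bash
# Count annotation rows for one consequence term and one LOFTEE confidence (HC or LC).
count_lof() {
  local file="$1" consequence="$2" confidence="$3"
  grep -v '^#' "$file" | grep -F "$consequence" | grep -c "LoF=${confidence}"
}

# Count rows over all four PTV consequence terms at once; -E enables alternation.
count_ptv() {
  local file="$1" confidence="$2"
  grep -v '^#' "$file" \
    | grep -E 'stop_gained|frameshift|splice_donor|splice_acceptor' \
    | grep -c "LoF=${confidence}"
}

# Example (file name is a placeholder):
# count_lof output.txt frameshift HC
# count_ptv output.txt LC
```

When a category comes out as 0 for both HC and LC, it may help to first count the rows for that consequence without the `LoF=` filter, to see whether such variants are present in the output at all.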
Hello,
I have noticed that when running LOFTEE, the LoF_flags are not being properly annotated; they are always blank in the output. So, for instance, rs201677741 from the ExAC vcf is annotated by our installation of LOFTEE as follows:
14_24709750_G/T 14:24709750 T 26277 NM_001099274.1 Transcript stop_gained 1278/1852 936/1356 312/451 Y/* taC/taA rs201677741 LoF_flags=;IMPACT=HIGH;LoF_filter=;LoF=HC;LoF_info=POSITION:0.690265486725664,PHYLOCSF_TOO_SHORT;STRAND=-1;VARIANT_CLASS=SNV;SYMBOL=TINF2;BIOTYPE=protein_coding;CANONICAL=YES;ENSP=NP_001092744.1;EXON=6/9;HGVSc=NM_001099274.1:c.936C>A;HGVSp=NP_001092744.1:p.Tyr312Ter;GMAF=T:0.0002;AFR_MAF=T:0;AMR_MAF=T:0;EAS_MAF=T:0;EUR_MAF=T:0.001;SAS_MAF=T:0;AA_MAF=T:0.0002;EA_MAF=T:0.0001;ExAC_MAF=T:3.720e-04;ExAC_Adj_MAF=T:0.0003727;ExAC_AFR_MAF=T:0.0002041;ExAC_AMR_MAF=T:0;ExAC_EAS_MAF=T:0;ExAC_FIN_MAF=T:0;ExAC_NFE_MAF=T:0.0006444;ExAC_OTH_MAF=T:0;ExAC_SAS_MAF=T:0
As you can see, there is no flag after LoF_flags (bolded). However, if I look at this variant in the gnomAD browser (see attached image), it is given the flag PHYLOCSF_WEAK.
We have run a few variants and we have not seen LoF_flags where we should, even though they are annotated as such in gnomAD.
This is our code:
perl /opt/tools/ensembl/ensembl-tools/scripts/variant_effect_predictor/variant_effect_predictor.pl \
--cache --dir_cache=/opt/tools/ensembl/cache/ \
-i dummy.vcf \
-o /mnt/causes-data01/new/COUSE/test/dummy.out \
--plugin LoF,loftee_path:/opt/tools/loftee,human_ancestor_fa:/opt/tools/loftee/human_ancestor.fa,check_complete_cds,conservation_file:mysql \
--dir_plugins=/opt/tools/loftee \
--everything \
--assembly GRCh37 \
--port 3337 \
--force \
--refseq \
--total_length \
--fasta /opt/tools/ensembl/tools_data/vep/homo_sapiens/Homo_sapiens.GRCh37.dna.toplevel.fa
We have tried pointing to the downloaded conservation_file, phylocsf.sql, and alternatively, loading the source file in MySQL.
Help with this issue would be greatly appreciated!
Thanks,
Maddie
We have set up VEP and LOFTEE successfully, and our final question is: can we run the LOFTEE plugin for hg38?
I found that the following files on your FTP site are required to run LOFTEE:
https://personal.broadinstitute.org/konradk/loftee_data/GRCh37/
[ ] GERP_scores.final.sorted.txt.gz 11-Jan-2018 11:05 11G
[ ] GERP_scores.final.sorted.txt.gz.tbi 11-Jan-2018 12:39 2.8M
[ ] human_ancestor.fa.gz 13-Mar-2015 16:31 834M
[ ] human_ancestor.fa.gz.fai 13-Mar-2015 16:32 736
[ ] human_ancestor.fa.gz.gzi 13-Mar-2015 16:32 748K
[ ] phylocsf_gerp.sql 15-Jan-2018 20:25 416M
And for hg38, I found these files:
https://personal.broadinstitute.org/konradk/loftee_data/GRCh38/
[ ] human_ancestor.fa.gz 11-Sep-2018 09:46 844M
[ ] human_ancestor.fa.gz.fai 11-Sep-2018 09:49 736
[ ] human_ancestor.fa.gz.gzi 11-Sep-2018 09:49 747K
It seems that there is still no phylocsf_gerp.sql for hg38 (GRCh38). Would it be possible to provide phylocsf_gerp.sql for hg38 so that we can quickly run LOFTEE on our large variant dataset?
I'm struggling to get the LoF plugin running in VEP (88). I've specified my plugin directory and VEP is looking there for the plugins. Every time I run VEP, I get the following during initialisation:
Failed to compile plugin LoF:
2017-07-19 16:44:34 - Failed to compile plugin LoF: Can't open me2x3acc1!
I've grabbed the maxEntScan stuff from GitHub and put that in the Plugin directory, but still no luck.
Any suggestions?
Hello,
I've been trying to annotate a GRCh38 .vcf file using LOFTEE.
I have already downloaded the human_ancestor_fa files, loftee.sql.gz (for GRCh38) and a bigwig file (gerp_conservation_scores.homo_sapiens.GRCh38.bw).
This is my code:
export PERL5LIB=$PERL5LIB:/Users/dcha/.vep/Plugins
/Users/dcha/program/ensembl-vep/vep --assembly GRCh38 --offline --no_stats --tab -i /Users/dcha/program/vep_files/input/hg38_no_annot_2000.vcf -o /Users/dcha/program/vep_files/output/test_for_loftee.txt --canonical --pick --pick_order canonical --plugin LoF,loftee_path:/Users/dcha/.vep/Plugins/,human_ancestor_fa:/Users/dcha/.vep/Plugins/human_ancestor.fa.gz,conservation_file:/Users/dcha/.vep/Plugins/loftee.sql,gerp_bigwig:/Users/dcha/.vep/Plugins/gerp_conservation_scores.homo_sapiens.GRCh38.bw --force_overwrite
However, it always gives me the error message below:
WARNING: Plugin 'LoF' went wrong: Can't locate object method "prepare" via package "gerp_conservation_scores.homo_sapiens.GRCh38.bw" (perhaps you forgot to load "gerp_conservation_scores.homo_sapiens.GRCh38.bw"?) at /Users/dcha/.vep/Plugins/gerp_dist.pl line 82, <ANONIO> line 2000.
What should I do to fix this problem?
Thank you for reading this.
I have been getting this error for a while and am wondering how I can fix it.
Use of uninitialized value $number_of_exons in subtraction (-) at /n/home05/zhouhufeng/.vep/Plugins/LoF.pm line 546, <ANONIO> line
LOFTEE fails and gives empty fields in the output. Error msg:
DBD::SQLite::db prepare failed: no such table: gerp_exons at /path/to/vep/ensembl-vep-94.5-0/gerp_dist.pl line 129, <__ANONIO__> line 117945.
WARNING: Plugin 'LoF' went wrong: Can't call method "execute" on an undefined value at /path/to/vep/ensembl-vep-94.5-0/gerp_dist.pl line 130, <__ANONIO__> line 117945.
It looks like it's looking for a gerp_exons table. Where may I get a copy of that file?
Line 129 in df3d29e
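To see which tables a given conservation file actually contains, the `sqlite3` command-line tool can query the database's `sqlite_master` catalogue. A minimal sketch, assuming `sqlite3` is installed and the file is an uncompressed SQLite database; `list_sqlite_tables` and `has_table` are hypothetical helper names:

```shell
#!/usr/bin/env bash
# Print the names of all tables in an SQLite database file, one per line.
list_sqlite_tables() {
  sqlite3 "$1" "SELECT name FROM sqlite_master WHERE type='table' ORDER BY name;"
}

# Succeed only if the database contains a table with exactly the given name.
has_table() {
  list_sqlite_tables "$1" | grep -qx "$2"
}

# Example (path is a placeholder):
# list_sqlite_tables /path/to/phylocsf_gerp.sql
# has_table /path/to/phylocsf_gerp.sql gerp_exons || echo "gerp_exons is missing"
```

If the table named in the error is absent, the file is probably not the one this version of the plugin expects. That is only a likely reading of the error, and this thread does not say which file provides which table.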
Based on the posted issues from the past, it looks like you used to have some of the required annotation files here:
https://personal.broadinstitute.org/konradk/loftee_data/GRCh37/
But that folder is now empty. Could you return the gerp and similar annotation files? As far as I can tell it's not possible to get useful results out of loftee without them (the loftee fields are just blank).
Hi,
I'm unable to use LoF to annotate my variants as it seems it is missing a feature/SQL table.
Specifically, when I run VEP
./vep --plugin LoF,loftee_path:/path/to/loftee,
human_ancestor_fa:/path/to/loftee/human_ancestor.fa.gz,
conservation_file:/path/to/loftee/phylocsf.sql,
fast_length_calculation:0,get_splice_features:1,check_complete_cds:1
I get the following errors:
DBD::SQLite::db prepare failed: no such table: gerp_bases at /path/to/loftee/gerp_dist.pl line 82, line 5015.
Plugin 'LoF' went wrong: Can't call method "execute" on an undefined value at /path/to/loftee/gerp_dist.pl line 83, line 5015.
basically saying that gerp_bases is missing.
Could you please add that table to LoF?
Thank you,
Denise
Is there any score provided by LOFTEE to estimate how many transcripts of a gene are affected by a particular LoF variant?
Can I run vep with the --fork flag while using LOFTEE? I tried to run it with --fork 8 and it seemed to run into occasional problems with the SQLite databases, based on these error messages:
DBD::SQLite::st execute failed: database disk image is malformed at vep/plugins/LoF.pm line 565.
WARNING: Plugin 'LoF' went wrong: MySQL ERROR: No such file or directory at vep/plugins/LoF.pm line 565.
Use of uninitialized value $number_of_exons in subtraction (-) at vep/plugins/LoF.pm line 586, <__ANONIO__> line 60180.
Use of uninitialized value in split at vep/plugins/LoF.pm line 573, <__ANONIO__> line 60180.
When I remove the --fork parameter I don't get any such errors.
hello there!
I need your help, as I am fighting this strange error. Here is the situation.
Could someone (especially Konrad) kindly shed some light on what might be the root cause? Many thanks!
========================================
This is my command line
%sh
export PERL5LIB=$PERL5LIB:/statgen/variant_annotation/data/.vep/Plugins:/statgen/variant_annotation/data/.vep/Plugins/loftee
/statgen/variant_annotation/vep/ensembl-vep/vep
--assembly GRCh38
--dir /statgen/variant_annotation/data/.vep/
--cache
--dir_cache /statgen/variant_annotation/data/.vep
--offline
--merged
--force_overwrite
--everything
--dir_plugins /statgen/variant_annotation/data/.vep/Plugins/loftee/
--plugin LoF,loftee_path:/statgen/variant_annotation/data/.vep/Plugins/loftee/,human_ancestor_fa:/statgen/variant_annotation/data/human_ancestor.fa.gz,conservation_file:/statgen/variant_annotation/data/loftee.sql,gerp_bigwig:/statgen/variant_annotation/data/gerp_conservation_scores.homo_sapiens.GRCh38.bw
--vcf
-i /dbfs/users/jliu5/test_ukbb_data.vcf.bgz
-o /tmp/test_ukbb_data_vep_annotated.vcf
========================================
Below is the error stack trace:
Use of uninitialized value in pattern match (m//) at /statgen/variant_annotation/data/.vep/Plugins/loftee/LoF.pm line 137.
2019-05-17 22:49:24 - INFO: BAM-edited cache detected, enabling --use_transcript_ref; use --use_given_ref to override this
WARNING: Plugin 'LoF' went wrong: Can't call method "execute" on an undefined value at /statgen/variant_annotation/data/.vep/Plugins/loftee/gerp_dist.pl line 130, <$fh> line 4455.
DBD::SQLite::db prepare failed: no such table: gerp_exons at /statgen/variant_annotation/data/.vep/Plugins/loftee/gerp_dist.pl line 129, <$fh> line 4455.
DBD::SQLite::db prepare failed: no such table: gerp_exons at /statgen/variant_annotation/data/.vep/Plugins/loftee/gerp_dist.pl line 129, <$fh> line 4455.
DBD::SQLite::db prepare failed: no such table: gerp_exons at /statgen/variant_annotation/data/.vep/Plugins/loftee/gerp_dist.pl line 129, <$fh> line 4455.
DBD::SQLite::db prepare failed: no such table: gerp_exons at /statgen/variant_annotation/data/.vep/Plugins/loftee/gerp_dist.pl line 129, <$fh> line 4455.
DBD::SQLite::db prepare failed: no such table: gerp_exons at
< ... the above error message repeats many times ...>
Hi,
I am running LOFTEE, but while VEP annotates the file, LOFTEE does not.
Here is my command
/home/hpcc/tools/ensembl-vep/./vep --cache --offline -i Input.vcf --everything --force_overwrite -o Output.vcf --dir_plugins /home/hpcc/.vep/Plugins/loftee/
Is there something wrong with my command?
Here are some header lines from the output file.
## ENSEMBL VARIANT EFFECT PREDICTOR v93.3
## Output produced at 2018-09-26 00:33:37
## Using cache in /home/hpcc/.vep/homo_sapiens/93_GRCh38
## Using API version 93, DB version ?
## ensembl-io version 93
## ensembl-funcgen version 93
## ensembl version 93
## ensembl-variation version 93
## sift version sift5.2.2
## ESP version V2-SSA137
## genebuild version 2014-07
## COSMIC version 85
## regbuild version 16
## dbSNP version 150
## assembly version GRCh38.p12
## 1000genomes version phase3
## gnomAD version 170228
## polyphen version 2.2.2
## ClinVar version 201805
## gencode version GENCODE 28
## HGMD-PUBLIC version 20174
## Column descriptions:
## Uploaded_variation : Identifier of uploaded variant
## Location : Location of variant in standard coordinate format (chr:start or chr:start-end)
## Allele : The variant allele used to calculate the consequence
## Gene : Stable ID of affected gene
## Feature : Stable ID of feature
Hi, I am trying to run loftee on Linux. Below is my command and error message.
Can someone please kindly let me know what is wrong here?
Thank you & best regards,
jie
=== my command ===
vep -i all.tmp6.vcf --vcf --cache --dir /mnt/d/data/vep_cache --offline --assembly GRCh38 -o all.tmp7.vcf --force_overwrite --sift b --canonical --symbol --plugin LoF,loftee_path:/mnt/d/software_lin/loftee,human_ancestor_fa:/d/mnt/data/gatk_bundle/hg38/Homo_sapiens_assembly38.fasta.gz --dir_plugins /mnt/d/software_lin/loftee
=== the error ===
Smartmatch is experimental at /mnt/d/software_lin/loftee/de_novo_donor.pl line 175.
Smartmatch is experimental at /mnt/d/software_lin/loftee/de_novo_donor.pl line 214.
Smartmatch is experimental at /mnt/d/software_lin/loftee/splice_site_scan.pl line 191.
Smartmatch is experimental at /mnt/d/software_lin/loftee/splice_site_scan.pl line 194.
Smartmatch is experimental at /mnt/d/software_lin/loftee/splice_site_scan.pl line 238.
Smartmatch is experimental at /mnt/d/software_lin/loftee/splice_site_scan.pl line 241.
WARNING: Failed to compile plugin LoF: Can't locate List/MoreUtils.pm in @INC (you may need to install the List::MoreUtils module) (@INC contains: /mnt/d/software_lin/loftee /mnt/d/software_lin/ensembl_vep/modules /mnt/d/software_lin/ensembl_vep /etc/perl /usr/local/lib/x86_64-linux-gnu/perl/5.26.1 /usr/local/share/perl/5.26.1 /usr/lib/x86_64-linux-gnu/perl5/5.26 /usr/share/perl5 /usr/lib/x86_64-linux-gnu/perl/5.26 /usr/share/perl/5.26 /usr/local/lib/site_perl /usr/lib/x86_64-linux-gnu/perl-base) at /mnt/d/software_lin/loftee/svm.pl line 5.
BEGIN failed--compilation aborted at /mnt/d/software_lin/loftee/svm.pl line 5.
Compilation failed in require at /mnt/d/software_lin/loftee/LoF.pm line 34.
Compilation failed in require at (eval 32) line 2.
BEGIN failed--compilation aborted at (eval 32) line 2.
Hello,
When I am running LoF plugin through VEP (as shown below), I am getting warnings and no LoF annotation in the output. I have tried to google and read the online documentation but could not find the solution. Can anyone help me with this?
./ensembl-vep/vep -i /ebc_data/bayazit1/PRE_MIGRATION_ADMIX/dsets/full_seq/vcf_from_Vasily/NE_chr_2.rsnum.vcf.gz
--cache --force_overwrite --dir_cache /ebc_data/bayazit1/.vep --offline --sift b --symbol
--canonical -polyphen p
-plugin LoF,loftee_path:/ebc_data/bayazit1/.vep/Plugins/loftee,human_ancestor_fa:/ebc_data/bayazit1/PRE_MIGRATION_ADMIX/dsets/full_seq/vcf_from_Vasily/ensembl-vep/ANC_SEQ/human_ancestor.fa.gz,
conservation_file:/ebc_data/bayazit1/PRE_MIGRATION_ADMIX/dsets/full_seq/vcf_from_Vasily/ensembl-vep/PhyloCSF_SQL_database/phylocsf.sql.gz
--dir_plugins /ebc_data/bayazit1/.vep/Plugins/loftee
-o /gpfs/hpchome/bayazit1/udustorage/VEP_LoF/LoF_RS_added_VEP_output_for_dset_NE_chr_2.txt
WARNING: Plugin 'LoF' went wrong: Illegal division by zero at /ebc_data/bayazit1/.vep/Plugins/loftee/maxEntScan/score3.pl line 157, <$fh> line 5014.
WARNING: Plugin 'LoF' went wrong: Illegal division by zero at /ebc_data/bayazit1/.vep/Plugins/loftee/maxEntScan/score5.pl line 93, <$fh> line 5014.
WARNING: Plugin 'LoF' went wrong: Can't call method "execute" on an undefined value at /ebc_data/bayazit1/.vep/Plugins/loftee/LoF.pm line 509, <$fh> line 5014.
WARNING: Plugin 'LoF' went wrong: Illegal division by zero at /ebc_data/bayazit1/.vep/Plugins/loftee/maxEntScan/score3.pl line 157, <$fh> line 10014.
WARNING: Plugin 'LoF' went wrong: Illegal division by zero at /ebc_data/bayazit1/.vep/Plugins/loftee/maxEntScan/score5.pl line 93, <$fh> line 10014.
WARNING: Plugin 'LoF' went wrong: Illegal division by zero at /ebc_data/bayazit1/.vep/Plugins/loftee/maxEntScan/score5.pl line 93, <$fh> line 15014.
WARNING: Plugin 'LoF' went wrong: Can't call method "execute" on an undefined value at /ebc_data/bayazit1/.vep/Plugins/loftee/LoF.pm line 509, <$fh> line 15014.
W
... more lines
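In the command above, conservation_file points at a file ending in .sql.gz. SQLite cannot read a gzip-compressed file directly, so a database that was never decompressed is one possible cause of the "Can't call method "execute" on an undefined value" warnings. This is a guess from the file name, not something confirmed here. A minimal sketch for checking what kind of file is on disk, with hypothetical helper names, relying on the fact that SQLite database files begin with the string `SQLite format 3` and gzip files begin with the bytes 1f 8b:

```shell
#!/usr/bin/env bash
# True if the file starts with the SQLite 3 header string.
is_sqlite_db() {
  [ "$(head -c 15 "$1" 2>/dev/null)" = "SQLite format 3" ]
}

# True if the file starts with the gzip magic bytes 1f 8b.
is_gzip_file() {
  [ "$(head -c 2 "$1" 2>/dev/null | od -An -tx1 | tr -d ' \n')" = "1f8b" ]
}

# Report "sqlite", "gzip" or "unknown" for a candidate conservation file.
describe_db_file() {
  if is_sqlite_db "$1"; then
    echo "sqlite"
  elif is_gzip_file "$1"; then
    echo "gzip"
  else
    echo "unknown"
  fi
}

# Example (path is a placeholder):
# describe_db_file /path/to/phylocsf.sql.gz
```

If this reports "gzip", decompressing the file (for example with gunzip) and passing the decompressed path to conservation_file would be the natural next thing to try.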
Hello
I analysed some variants using the LOFTEE plugin and also extracted the LOFTEE annotations for the same variants from the gnomAD API.
I found discrepancies between the calls of the LOFTEE plugin and gnomAD for some variants, to the extent that some variants are classified by gnomAD as high-confidence LoF but by the plugin as low-confidence LoF.
Many of these discrepant variants are flagged by the plugin (but not by gnomAD) as END_TRUNC and are assigned GERP_DIST 0 by the plugin, whereas the GERP_DIST in the gnomAD annotations is non-zero.
For example, for the variant 1-152081684-TTCTG-T I got the following LoF info from the plugin:
GERP_DIST:0,BP_DIST:1824,PERCENTILE:0.687242798353909,DIST_FROM_LAST_EXON:-3869,50_BP_RULE:FAIL,PHYLOCSF_TOO_SHORT
And this from gnomAD:
GERP_DIST:208.392210000001,BP_DIST:1824,PERCENTILE:0.687242798353909,DIST_FROM_LAST_EXON:-3869,50_BP_RULE:FAIL,PHYLOCSF_TOO_SHORT
When inspecting the log file, I see repeated warnings such as:
WARNING: 33436 : Use of uninitialized value in split at human-vep-v14/resources/vep_plugins/Plugins/LoF.pm line 573, <__ANONIO__> line 283.
Use of uninitialized value $number_of_exons in subtraction (-) at human-vep-v14/resources/vep_plugins/Plugins/LoF.pm line 586, <__ANONIO__> line 283.
Use of uninitialized value in split at human-vep-v14/resources/vep_plugins/Plugins/LoF.pm line 573, <__ANONIO__> line 283
Can you please help me understand these discrepancies?
Best wishes
Dolev
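A discrepancy like the one above can be located field by field with a small helper. This is a minimal sketch, not part of LOFTEE: `parse_lof_info` and `diff_lof_info` are hypothetical names, and the format (comma-separated `KEY:VALUE` items plus bare flags) is inferred only from the two strings quoted in the issue. The variant ID that appears fused to the end of the plugin string is left out here.

```python
def parse_lof_info(lof_info):
    """Split a LoF_info string into KEY:VALUE pairs and bare flags.

    Assumption: items are comma-separated, as in the strings quoted above.
    """
    values, flags = {}, []
    for item in lof_info.split(","):
        if not item:
            continue
        if ":" in item:
            key, value = item.split(":", 1)
            values[key] = value
        else:
            flags.append(item)
    return values, flags


def diff_lof_info(plugin_info, reference_info):
    """Return {key: (plugin_value, reference_value)} for keys that differ."""
    plugin_values, _ = parse_lof_info(plugin_info)
    reference_values, _ = parse_lof_info(reference_info)
    keys = sorted(set(plugin_values) | set(reference_values))
    return {
        key: (plugin_values.get(key), reference_values.get(key))
        for key in keys
        if plugin_values.get(key) != reference_values.get(key)
    }


# The two strings from the issue (plugin output first, gnomAD second).
plugin = ("GERP_DIST:0,BP_DIST:1824,PERCENTILE:0.687242798353909,"
          "DIST_FROM_LAST_EXON:-3869,50_BP_RULE:FAIL,PHYLOCSF_TOO_SHORT")
gnomad = ("GERP_DIST:208.392210000001,BP_DIST:1824,PERCENTILE:0.687242798353909,"
          "DIST_FROM_LAST_EXON:-3869,50_BP_RULE:FAIL,PHYLOCSF_TOO_SHORT")

differences = diff_lof_info(plugin, gnomad)
print(differences)  # {'GERP_DIST': ('0', '208.392210000001')}
```

For this example the only differing key is GERP_DIST, which matches the observation in the issue.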
Hello, I installed VEP v94, then installed the MaxEntScan plugin and the LOFTEE plugin via:
wget https://github.com/konradjk/loftee/archive/master.zip
I set the PERL5LIB environment variable to the LOFTEE location (the same as the VEP plugin directory) and ran the VEP/LOFTEE command:
/usr/local/exports/bin/emd43/ensembl-vep/vep -i ~/ngs/54gene_unique_vars.vcf -o ~/ngs/54gene_unique_vars_out.vcf --fasta ~/ngshome/human_ref_hg19_chr.fa --cache --offline --dir_plugins /usr/local/exports/bin/emd43/ensembl-vep/.vep/Plugins/ --dir_cache /usr/local/exports/bin/emd43/ensembl-vep/.vep --dir /usr/local/exports/bin/emd43/ensembl-vep/.vep --assembly GRCh37 --force_overwrite --exclude_predicted --no_stats --ccds --canonical --refseq --vcf --plugin MaxEntScan,/usr/local/exports/bin/emd43/ensembl-vep/.vep/Plugins/maxEntScan --plugin LoF,loftee_path:/usr/local/exports/bin/emd43/ensembl-vep/.vep/Plugins
VEP runs OK and annotates variants with MaxEntScan scores, but LOFTEE does not run because it fails to compile:
WARNING: Failed to compile plugin LoF: Can't open me2x3acc1!
Compilation failed in require at /usr/local/exports/bin/emd43/ensembl-vep/.vep/Plugins/loftee_splice_utils.pl line 4.
Compilation failed in require at /usr/local/exports/bin/emd43/ensembl-vep/.vep/Plugins/LoF.pm line 33.
This looks very similar to issue #19, which was resolved at the time. I tried running the same command from the LOFTEE directory and saw a different error:
WARNING: Failed to compile plugin LoF: can't open!
but LoF.pm exists and PERL5LIB includes the plugin directory:
gen01(90) ll LoF.pm
-rw-r--r-- 1 ed ed 23363 Aug 29 14:35 LoF.pm
gen01(91) echo $PERL5LIB
.:/usr/local/exports/bin/emd43/ensembl-vep/.vep/Plugins
Can anyone help on this?
Thanks, Ed
Hi,
I'm using LOFTEE to annotate variants mapped to GRCh37.
I noticed that the VEP caches for GRCh37 only provide gene and transcript IDs from the old GENCODE release 19 (https://m.ensembl.org/info/docs/tools/vep/script/vep_cache.html#cache).
I tried supplying the latest GENCODE GFF files (e.g., release 29) with the --gff argument, but LOFTEE still used GENCODE release 19 from the VEP cache.
For instance, "ENSG00000069712" does not exist in GENCODE release 29 but was included in the output file.
In offline mode, I cannot run LOFTEE without specifying the VEP cache.
How can I annotate GRCh37 variants using a specific, recent GENCODE release?
If what I need is to rebuild the VEP cache from the latest GENCODE release, how can I build such a cache for LOFTEE?
I'm using VEP 96.0 or 97.0.
Thanks a lot.
Best,
Masaru
Hi,
Is it possible to use LOFTEE with GRCh38? Where should I source the human ancestor file from? Or can I just use the same one?
Thanks
M
The new version of LOFTEE identifies some variants with a missense_variant consequence as LoF HC.
I didn't notice this in previous LOFTEE versions, and it's not documented.
Is this something I should worry about, or is it a new LOFTEE feature?
Thanks!
I'm trying to install the LoF plugin for VEP. I already have a LoF.pm file (attached here as LoF.txt; it was already present when I downloaded VEP), but it does not correspond to the LoF.pm file at https://github.com/konradjk/loftee/blob/v0.3-beta/LoF.pm
Which one should I use?
If I have to use the one from your GitHub, do I just have to install:
splice_module.pl
ancestral.pm
samtools
and finally phylocsf.sql from
https://www.broadinstitute.org/~konradk/loftee/phylocsf.sql.gz
Is that right?
Hi,
I believe I have downloaded the files for hg38, but am unable to run LOFTEE with the below error. I've specified the LOFTEE_PATH, but it is still looking in /vep/loftee, and I do not have the hg38 GERP_scores...gz from the repo or Broad server. Tabix 0.2.6 is on my $PATH, and I have DBI::mysql. Any help would be most appreciated.
export PERL5LIB=$PERL5LIB:~/tools/loftee
VEP_PATH=~/tools/ensembl-vep
CACHE=~/db/vep94
LOFTEE_PATH=~/tools/loftee
ANCESTOR=~/db/loftee/human_ancestor.fa.gz
SQL=~/db/loftee/phylocsf.sql
perl $VEP_PATH/vep -i $IN1 -o $OUT1 --vcf --cache --dir_cache $CACHE --force_overwrite --dir_plugins $LOFTEE_PATH --plugin LoF,loftee_path:$LOFTEE_PATH,human_ancestor_fa:$ANCESTOR,conservation_file:$SQL
WARNING: Failed to instantiate plugin LoF: Cannot read /vep/loftee/GERP_scores.final.sorted.txt.gz using tabix at ~/tools/loftee/LoF.pm line 137.
Thanks,
Kris
When running VEP with LOFTEE, it issues an info message: Use of uninitialized value in split at .../LoF.pm.
Fix ancestral alleles so that when the ancestral allele does not match ref or alt, it is set to NA rather than False (indicates that we know the answer).
Hi,
I am getting the following error:
WARNING: Plugin 'LoF' went wrong: Can't locate object method "prepare" via package "false" (perhaps you forgot to load "false"?) at /home/ec2-user/.vep/Plugins/gerp_dist.pl line 129, <ANONIO> line 53522.
I have seen this issue in a previous post, but it was left unresolved.
Could anyone assist with this, please?
Thanks in advance,
Nuno
Can you add an example VCF file (just a few variants known to have an LoF effect) and the command to run it? That would make it easy to test whether the annotation works.
Thanks
Regards
Veera
When running the following command:
python /data/Install/LOFTEE/loftee-master/src/tableize_vcf.py --vcf /data/Share/nick/Paralog_Anno/data_files/test.out_paraloc --out /data/Share/nick/Paralog_Anno/data_files/test.out_paraloc_tableized --vep_info Amino_acids,Codons,Paralogue_Vars
I get this error:
WARNING: Did not find minimal_representation. Outputting raw positions.
SUCCESS: Found bgzip! Will bgzip the table.
SUCCESS: Found Amino_acids
SUCCESS: Found Codons
SUCCESS: Found Paralogue_Vars
14. FAILED ON LINE: 14 65077986 404110 CATATACTGGAT C . . ALLELEID=399858;CLNDISDB=MedGen:C1708353,Orphanet:ORPHA29072;CLNDN=Hereditary_Paraganglioma-Pheochromocytoma_Syndromes;CLNHGVS=NC_000014.9:g.65077987_65077997delATATACTGGAT;CLNREVSTAT=criteria_provided,_single_submitter;CLNSIG=Pathogenic;CLNVC=Deletion;CLNVCSO=SO:0000159;GENEINFO=MAX:4149;MC=SO:0001589|frameshift_variant,SO:0001627|intron_variant;ORIGIN=1;RS=1060500101;CSQ=-|frameshift_variant|HIGH|MAX|ENSG00000125952|Transcript|ENST00000284165|protein_coding|4/4||||360-370|211-221|71-74|IQYM/X|ATCCAGTATATg/g||1||-1||HGNC|HGNC:6913|,-|intron_variant|MODIFIER|MAX|ENSG00000125952|Transcript|ENST00000341653|protein_coding||3/3|||||||||1||-1||HGNC|HGNC:6913|,-|frameshift_variant|HIGH|MAX|ENSG00000125952|Transcript|ENST00000358402|protein_coding|3/4||||349-359|184-194|62-65|IQYM/X|ATCCAGTATATg/g||1||-1||HGNC|HGNC:6913|,-|frameshift_variant|HIGH|MAX|ENSG00000125952|Transcript|ENST00000358664|protein_coding|4/5||||342-352|211-221|71-74|IQYM/X|ATCCAGTATATg/g||1||-1||HGNC|HGNC:6913|,-|frameshift_variant&NMD_transcript_variant|HIGH|MAX|ENSG00000125952|Transcript|ENST00000394606|nonsense_mediated_decay|4/6||||391-401|211-221|71-74|IQYM/X|ATCCAGTATATg/g||1||-1||HGNC|HGNC:6913|,-|frameshift_variant&NMD_transcript_variant|HIGH|MAX|ENSG00000125952|Transcript|ENST00000553928|nonsense_mediated_decay|4/6||||232-242|211-221|71-74|IQYM/X|ATCCAGTATATg/g||1||-1||HGNC|HGNC:6913|,-|non_coding_transcript_exon_variant|MODIFIER|MAX|ENSG00000125952|Transcript|ENST00000553951|retained_intron|3/3||||288-298||||||1||-1||HGNC|HGNC:6913|,-|frameshift_variant|HIGH|MAX|ENSG00000125952|Transcript|ENST00000555419|protein_coding|3/4||||103-113|103-113|35-38|IQYM/X|ATCCAGTATATg/g||1||-1||HGNC|HGNC:6913|,-|frameshift_variant|HIGH|MAX|ENSG00000125952|Transcript|ENST00000555667|protein_coding|3/4||||362-372|184-194|62-65|IQYM/X|ATCCAGTATATg/g||1||-1||HGNC|HGNC:6913|,-|intron_variant|MODIFIER|MAX|ENSG00000125952|Transcript|ENST00000555932|protein_codi
ng||1/1|||||||||1||-1||HGNC|HGNC:6913|,-|upstream_gene_variant|MODIFIER|AL139022.1|ENSG00000259118|Transcript|ENST00000556127|antisense_RNA|||||||||||1|4037|1||Clone_based_ensembl_gene||,-|frameshift_variant|HIGH|MAX|ENSG00000125952|Transcript|ENST00000556443|protein_coding|3/3||||362-372|184-194|62-65|IQYM/X|ATCCAGTATATg/g||1||-1||HGNC|HGNC:6913|,-|start_retained_variant&5_prime_UTR_variant|LOW|MAX|ENSG00000125952|Transcript|ENST00000556892|protein_coding|3/4||||335-345|?-2|?-1||||1||-1|cds_end_NF|HGNC|HGNC:6913|,-|frameshift_variant|HIGH|MAX|ENSG00000125952|Transcript|ENST00000556979|protein_coding|4/5||||389-399|211-221|71-74|IQYM/X|ATCCAGTATATg/g||1||-1||HGNC|HGNC:6913|,-|5_prime_UTR_variant|MODIFIER|MAX|ENSG00000125952|Transcript|ENST00000557277|protein_coding|4/6||||357-367||||||1||-1||HGNC|HGNC:6913|,-|frameshift_variant|HIGH|MAX|ENSG00000125952|Transcript|ENST00000557746|protein_coding|3/5||||362-372|184-194|62-65|IQYM/X|ATCCAGTATATg/g||1||-1||HGNC|HGNC:6913|,-|frameshift_variant|HIGH|MAX|ENSG00000125952|Transcript|ENST00000618858|protein_coding|4/6||||416-426|211-221|71-74|IQYM/X|ATCCAGTATATg/g||1||-1||HGNC|HGNC:6913|
Traceback (most recent call last):
File "/data/Install/LOFTEE/loftee-master/src/tableize_vcf.py", line 396, in <module>
main(args)
File "/data/Install/LOFTEE/loftee-master/src/tableize_vcf.py", line 346, in main
raise e
KeyError: 'start_retained_variant'
The error makes tableize crash, so no variants after the one that caused it are processed.
test.out_paraloc contains the following; the first variant is parsed fine and the second one causes the error:
##fileformat=VCFv4.1
##VEP="v90" time="2018-05-15 03:04:05" cache="/data/Share/nick/Paralog_Anno/homo_sapiens/90_GRCh38" ensembl-funcgen=90.743f32b ensembl-variation=90.58bf949 ensembl=90.4a44397 ensembl-io=90.9a148ea 1000genomes="phase3" COSMIC="81" ClinVar="201706" ESP="V2-SSA137" HGMD-PUBLIC="20164" assembly="GRCh38.p10" dbSNP="150" gencode="GENCODE 27" genebuild="2014-07" gnomAD="170228" polyphen="2.2.2" regbuild="16" sift="sift5.2.2"
##INFO=<ID=CSQ,Number=.,Type=String,Description="Consequence annotations from Ensembl VEP. Format: Allele|Consequence|IMPACT|SYMBOL|Gene|Feature_type|Feature|BIOTYPE|EXON|INTRON|HGVSc|HGVSp|cDNA_position|CDS_position|Protein_position|Amino_acids|Codons|Existing_variation|ALLELE_NUM|DISTANCE|STRAND|FLAGS|SYMBOL_SOURCE|HGNC_ID|Paralogue_Vars">
##Paralogue_Vars=Equivalant variants and locations in paralogous genes
#CHROM POS ID REF ALT QUAL FILTER INFO
14 65077985 29786 G A . . ALLELEID=38741;CLNDISDB=MedGen:C0027672,SNOMED_CT:699346009|MedGen:C3149711;CLNDN=Hereditary_cancer-predisposing_syndrome|Pheochromocytoma,_susceptibility_to;CLNHGVS=NC_000014.9:g.65077985G>A;CLNREVSTAT=criteria_provided,_single_submitter;CLNSIG=Pathogenic,_risk_factor;CLNVC=single_nucleotide_variant;CLNVCSO=SO:0001483;CLNVI=OMIM_Allelic_Variant:154950.0002;GENEINFO=MAX:4149;MC=SO:0001587|nonsense,SO:0001623|5_prime_UTR_variant,SO:0001627|intron_variant;ORIGIN=1;RS=387906650;CSQ=A|stop_gained|HIGH|MAX|ENSG00000125952|Transcript|ENST00000284165|protein_coding|4/4||||372|223|75|R/*|Cga/Tga||1||-1||HGNC|HGNC:6913|,A|intron_variant|MODIFIER|MAX|ENSG00000125952|Transcript|ENST00000341653|protein_coding||3/3|||||||||1||-1||HGNC|HGNC:6913|,A|stop_gained|HIGH|MAX|ENSG00000125952|Transcript|ENST00000358402|protein_coding|3/4||||361|196|66|R/*|Cga/Tga||1||-1||HGNC|HGNC:6913|,A|stop_gained|HIGH|MAX|ENSG00000125952|Transcript|ENST00000358664|protein_coding|4/5||||354|223|75|R/*|Cga/Tga||1||-1||HGNC|HGNC:6913|,A|stop_gained&NMD_transcript_variant|HIGH|MAX|ENSG00000125952|Transcript|ENST00000394606|nonsense_mediated_decay|4/6||||403|223|75|R/*|Cga/Tga||1||-1||HGNC|HGNC:6913|,A|stop_gained&NMD_transcript_variant|HIGH|MAX|ENSG00000125952|Transcript|ENST00000553928|nonsense_mediated_decay|4/6||||244|223|75|R/*|Cga/Tga||1||-1||HGNC|HGNC:6913|,A|non_coding_transcript_exon_variant|MODIFIER|MAX|ENSG00000125952|Transcript|ENST00000553951|retained_intron|3/3||||300||||||1||-1||HGNC|HGNC:6913|,A|stop_gained|HIGH|MAX|ENSG00000125952|Transcript|ENST00000555419|protein_coding|3/4||||115|115|39|R/*|Cga/Tga||1||-1||HGNC|HGNC:6913|,A|stop_gained|HIGH|MAX|ENSG00000125952|Transcript|ENST00000555667|protein_coding|3/4||||374|196|66|R/*|Cga/Tga||1||-1||HGNC|HGNC:6913|,A|intron_variant|MODIFIER|MAX|ENSG00000125952|Transcript|ENST00000555932|protein_coding||1/1|||||||||1||-1||HGNC|HGNC:6913|,A|upstream_gene_variant|MODIFIER|AL139022.1|ENSG00000259118|Transcript|ENST0000055612
7|antisense_RNA|||||||||||1|4049|1||Clone_based_ensembl_gene||,A|stop_gained|HIGH|MAX|ENSG00000125952|Transcript|ENST00000556443|protein_coding|3/3||||374|196|66|R/*|Cga/Tga||1||-1||HGNC|HGNC:6913|,A|stop_gained|HIGH|MAX|ENSG00000125952|Transcript|ENST00000556892|protein_coding|3/4||||347|4|2|R/*|Cga/Tga||1||-1|cds_end_NF|HGNC|HGNC:6913|,A|stop_gained|HIGH|MAX|ENSG00000125952|Transcript|ENST00000556979|protein_coding|4/5||||401|223|75|R/*|Cga/Tga||1||-1||HGNC|HGNC:6913|,A|5_prime_UTR_variant|MODIFIER|MAX|ENSG00000125952|Transcript|ENST00000557277|protein_coding|4/6||||369||||||1||-1||HGNC|HGNC:6913|,A|stop_gained|HIGH|MAX|ENSG00000125952|Transcript|ENST00000557746|protein_coding|3/5||||374|196|66|R/*|Cga/Tga||1||-1||HGNC|HGNC:6913|,A|stop_gained|HIGH|MAX|ENSG00000125952|Transcript|ENST00000618858|protein_coding|4/6||||428|223|75|R/*|Cga/Tga||1||-1||HGNC|HGNC:6913|
14 65077986 404110 CATATACTGGAT C . . ALLELEID=399858;CLNDISDB=MedGen:C1708353,Orphanet:ORPHA29072;CLNDN=Hereditary_Paraganglioma-Pheochromocytoma_Syndromes;CLNHGVS=NC_000014.9:g.65077987_65077997delATATACTGGAT;CLNREVSTAT=criteria_provided,_single_submitter;CLNSIG=Pathogenic;CLNVC=Deletion;CLNVCSO=SO:0000159;GENEINFO=MAX:4149;MC=SO:0001589|frameshift_variant,SO:0001627|intron_variant;ORIGIN=1;RS=1060500101;CSQ=-|frameshift_variant|HIGH|MAX|ENSG00000125952|Transcript|ENST00000284165|protein_coding|4/4||||360-370|211-221|71-74|IQYM/X|ATCCAGTATATg/g||1||-1||HGNC|HGNC:6913|,-|intron_variant|MODIFIER|MAX|ENSG00000125952|Transcript|ENST00000341653|protein_coding||3/3|||||||||1||-1||HGNC|HGNC:6913|,-|frameshift_variant|HIGH|MAX|ENSG00000125952|Transcript|ENST00000358402|protein_coding|3/4||||349-359|184-194|62-65|IQYM/X|ATCCAGTATATg/g||1||-1||HGNC|HGNC:6913|,-|frameshift_variant|HIGH|MAX|ENSG00000125952|Transcript|ENST00000358664|protein_coding|4/5||||342-352|211-221|71-74|IQYM/X|ATCCAGTATATg/g||1||-1||HGNC|HGNC:6913|,-|frameshift_variant&NMD_transcript_variant|HIGH|MAX|ENSG00000125952|Transcript|ENST00000394606|nonsense_mediated_decay|4/6||||391-401|211-221|71-74|IQYM/X|ATCCAGTATATg/g||1||-1||HGNC|HGNC:6913|,-|frameshift_variant&NMD_transcript_variant|HIGH|MAX|ENSG00000125952|Transcript|ENST00000553928|nonsense_mediated_decay|4/6||||232-242|211-221|71-74|IQYM/X|ATCCAGTATATg/g||1||-1||HGNC|HGNC:6913|,-|non_coding_transcript_exon_variant|MODIFIER|MAX|ENSG00000125952|Transcript|ENST00000553951|retained_intron|3/3||||288-298||||||1||-1||HGNC|HGNC:6913|,-|frameshift_variant|HIGH|MAX|ENSG00000125952|Transcript|ENST00000555419|protein_coding|3/4||||103-113|103-113|35-38|IQYM/X|ATCCAGTATATg/g||1||-1||HGNC|HGNC:6913|,-|frameshift_variant|HIGH|MAX|ENSG00000125952|Transcript|ENST00000555667|protein_coding|3/4||||362-372|184-194|62-65|IQYM/X|ATCCAGTATATg/g||1||-1||HGNC|HGNC:6913|,-|intron_variant|MODIFIER|MAX|ENSG00000125952|Transcript|ENST00000555932|protein_coding||1/1|||||||||1||-
1||HGNC|HGNC:6913|,-|upstream_gene_variant|MODIFIER|AL139022.1|ENSG00000259118|Transcript|ENST00000556127|antisense_RNA|||||||||||1|4037|1||Clone_based_ensembl_gene||,-|frameshift_variant|HIGH|MAX|ENSG00000125952|Transcript|ENST00000556443|protein_coding|3/3||||362-372|184-194|62-65|IQYM/X|ATCCAGTATATg/g||1||-1||HGNC|HGNC:6913|,-|start_retained_variant&5_prime_UTR_variant|LOW|MAX|ENSG00000125952|Transcript|ENST00000556892|protein_coding|3/4||||335-345|?-2|?-1||||1||-1|cds_end_NF|HGNC|HGNC:6913|,-|frameshift_variant|HIGH|MAX|ENSG00000125952|Transcript|ENST00000556979|protein_coding|4/5||||389-399|211-221|71-74|IQYM/X|ATCCAGTATATg/g||1||-1||HGNC|HGNC:6913|,-|5_prime_UTR_variant|MODIFIER|MAX|ENSG00000125952|Transcript|ENST00000557277|protein_coding|4/6||||357-367||||||1||-1||HGNC|HGNC:6913|,-|frameshift_variant|HIGH|MAX|ENSG00000125952|Transcript|ENST00000557746|protein_coding|3/5||||362-372|184-194|62-65|IQYM/X|ATCCAGTATATg/g||1||-1||HGNC|HGNC:6913|,-|frameshift_variant|HIGH|MAX|ENSG00000125952|Transcript|ENST00000618858|protein_coding|4/6||||416-426|211-221|71-74|IQYM/X|ATCCAGTATATg/g||1||-1||HGNC|HGNC:6913|
The Python version used was 2.7.6.
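The KeyError above suggests that each consequence term is looked up in a fixed table and that start_retained_variant is missing from it. That reading is an assumption, since the script's internals are not shown here. The sketch below illustrates one defensive pattern, a ranking lookup that tolerates unknown terms instead of raising. `CSQ_ORDER`, `worst_rank` and `unknown_terms` are hypothetical names, and the abbreviated ordering is for illustration only; it is not the table used by tableize_vcf.py.

```python
# Hypothetical, abbreviated severity order (most severe first).
# Assumption: this is NOT the actual table from tableize_vcf.py.
CSQ_ORDER = [
    "stop_gained",
    "frameshift_variant",
    "missense_variant",
    "5_prime_UTR_variant",
    "intron_variant",
    "upstream_gene_variant",
]
CSQ_RANK = {term: rank for rank, term in enumerate(CSQ_ORDER)}
UNKNOWN_RANK = len(CSQ_ORDER)  # unknown terms sort after every known term


def worst_rank(consequence_field):
    """Rank of the most severe term in a '&'-joined VEP Consequence field."""
    terms = consequence_field.split("&")
    return min(CSQ_RANK.get(term, UNKNOWN_RANK) for term in terms)


def unknown_terms(consequence_field):
    """Terms that are absent from the table, so they can be logged."""
    return [t for t in consequence_field.split("&") if t not in CSQ_RANK]


# The Consequence value from the failing CSQ entry in the issue.
field = "start_retained_variant&5_prime_UTR_variant"
print(worst_rank(field))     # 3 (rank of 5_prime_UTR_variant in this table)
print(unknown_terms(field))  # ['start_retained_variant']
```

Using `dict.get` with a sentinel rank keeps one unrecognised term from aborting the whole run, while `unknown_terms` makes it possible to report which terms need adding to the table.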
Hi, can you please share the command you have been using for LOFTEE annotation? I am using the commands below and am not getting a single LoF annotation for the variants in my WGS VCF files, although the LoF entries do appear in the header. What could be the reason for this?
##LoF=Loss-of-function annotation (HC = High Confidence; LC = Low Confidence)
##LoF_filter=Reason for LoF not being HC
##LoF_flags=Possible warning flags for LoF
##LoF_info=Info used for LoF annotation
##INFO=<ID=CSQ,Number=.,Type=String,Description="Consequence annotations from Ensembl VEP. Format: Allele|Consequence|IMPACT|SYMBOL|Gene|Feature_type|Feature|BIOTYPE|EXON|INTRON|HGVSc|HGVSp|cDNA_position|CDS_position|Protein_position|Amino_acids|Codons|Existing_variation|DISTANCE|STRAND|VARIANT_CLASS|SYMBOL_SOURCE|HGNC_ID|CANONICAL|TSL|APPRIS|CCDS|ENSP|SWISSPROT|TREMBL|UNIPARC|GENE_PHENO|SIFT|PolyPhen|DOMAINS|HGVS_OFFSET|GMAF|AFR_MAF|AMR_MAF|EAS_MAF|EUR_MAF|SAS_MAF|AA_MAF|EA_MAF|ExAC_MAF|ExAC_Adj_MAF|ExAC_AFR_MAF|ExAC_AMR_MAF|ExAC_EAS_MAF|ExAC_FIN_MAF|ExAC_NFE_MAF|ExAC_OTH_MAF|ExAC_SAS_MAF|CLIN_SIG|SOMATIC|PHENO|PUBMED|MOTIF_NAME|MOTIF_POS|HIGH_INF_POS|MOTIF_SCORE_CHANGE|LoF|LoF_filter|LoF_flags|LoF_info">
perl /gpfs/software/genomics/VEP/ensembl-tools-release-83/scripts/variant_effect_predictor/variant_effect_predictor.pl -i PMC01_jointCalling.filtered.snps.indels.vcf --cache --assembly GRCh37 --offline --dir_cache /gpfs/projects/vep/ --vcf --everything --sift --polyphen --hgvs --fasta /gpfs/projects/refData/Homo_sapiens.GRCh37.dna.primary_assembly.fa --force --symbol --dir_plugins /gpfs/software/genomics/VEP/loftee -o PMC01_jointCalling.filtered.snps.indels_VEP.vcf
perl /gpfs/software/genomics/VEP/ensembl-tools-release-83/scripts/variant_effect_predictor/variant_effect_predictor.pl -i PMC01_jointCalling.filtered.snps.indels.vcf --cache --assembly GRCh37 --offline --dir_cache /gpfs/projects/vep/ --vcf --everything --sift --polyphen --hgvs --fasta /gpfs/projects/refData/Homo_sapiens.GRCh37.dna.primary_assembly.fa --force --symbol --dir_plugins /gpfs/software/genomics/VEP/loftee ----plugin LoF -o PMC01_jointCalling.filtered.snps.indels_VEP.vcf
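To check whether any CSQ entry actually carries a LoF value, the annotated VCF can be tallied directly. The following is a minimal sketch under stated assumptions: `csq_field_names` and `tally_lof` are hypothetical helpers, the input is a tab-delimited VCF whose CSQ header contains a "Format: ..." list like the one quoted above, and the two sample lines are made up for illustration.

```python
from collections import Counter


def csq_field_names(header_line):
    """Extract field names from a '##INFO=<ID=CSQ,... Format: A|B|C">' line."""
    fmt = header_line.split("Format: ", 1)[1]
    return fmt.rstrip().rstrip('">').split("|")


def tally_lof(vcf_lines):
    """Count LoF values (HC, LC, ...) over all CSQ entries; empty -> 'none'."""
    names, counts = None, Counter()
    for line in vcf_lines:
        line = line.rstrip("\n")
        if line.startswith("##INFO=<ID=CSQ,"):
            names = csq_field_names(line)
        elif line and not line.startswith("#") and names is not None:
            info = line.split("\t")[7]  # INFO is the 8th VCF column
            for item in info.split(";"):
                if not item.startswith("CSQ="):
                    continue
                for entry in item[len("CSQ="):].split(","):
                    record = dict(zip(names, entry.split("|")))
                    counts[record.get("LoF", "") or "none"] += 1
    return counts


# Made-up two-line example with a shortened CSQ format.
header = ('##INFO=<ID=CSQ,Number=.,Type=String,Description="Consequence '
          'annotations from Ensembl VEP. Format: '
          'Allele|Consequence|LoF|LoF_filter|LoF_flags|LoF_info">')
record = "1\t100\t.\tC\tT\t.\t.\tCSQ=T|stop_gained|HC|||,T|intron_variant||||"

counts = tally_lof([header, record])
print(counts["HC"], counts["none"])  # 1 1
```

If every entry falls under "none" even for stop_gained or frameshift consequences, the LoF columns were registered in the header but never filled in.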
Hi,
I am getting a warning related to slicing for some variants, which prevents the plugin from working. Could this be related to unusual input variants, or is this a potential bug? Unfortunately, VEP did not report the actual variants associated with the warnings, so I do not have a good idea of where the warning is triggered.
Excerpt from VEP's warning output below:
WARNING: Plugin 'LoF' went wrong:
-------------------- EXCEPTION --------------------
MSG: Slice start cannot be greater than slice end
STACK Bio::EnsEMBL::Slice::expand /home/vep/src/ensembl-vep/Bio/EnsEMBL/Slice.pm:1247
STACK LoF::get_upstream_acceptor_flank /home/vep/src/ensembl-vep/modules/extended_splice.pl:209
STACK LoF::get_effect_on_splice /home/vep/src/ensembl-vep/modules/extended_splice.pl:111
STACK LoF::run /home/vep/src/ensembl-vep/modules/LoF.pm:180
STACK (eval) /home/vep/src/ensembl-vep/modules/Bio/EnsEMBL/VEP/OutputFactory.pm:1898
Date (localtime) = Thu Jul 5 12:19:01 2018
Ensembl API version = 92
regards,
Sigve
Hi,
I have this vcf:
##VEP="v90" time="2018-06-06 16:36:41" cache="/data/Share/nick/Paralog_Anno/homo_sapiens/90_GRCh38" ensembl-funcgen=90.743f32b ensembl-variation=90.58bf949 ensembl=90.4a44397 ensembl-io=90.9a148ea 1000genomes="phase3" COSMIC="81" ClinVar="201706" ESP="V2-SSA137" HGMD-PUBLIC="20164" assembly="GRCh38.p10" dbSNP="150" gencode="GENCODE 27" genebuild="2014-07" gnomAD="170228" polyphen="2.2.2" regbuild="16" sift="sift5.2.2"
##INFO=<ID=CSQ,Number=.,Type=String,Description="Consequence annotations from Ensembl VEP. Format: Allele|Consequence|IMPACT|SYMBOL|Gene|Feature_type|Feature|BIOTYPE|EXON|INTRON|HGVSc|HGVSp|cDNA_position|CDS_position|Protein_position|Amino_acids|Codons|Existing_variation|ALLELE_NUM|DISTANCE|STRAND|FLAGS|SYMBOL_SOURCE|HGNC_ID|Paralogue_Vars">
##Paralogue_Vars=Equivalant variants and locations in paralogous genes
#CHROM POS ID REF ALT QUAL FILTER INFO
11 47355502 . C T . . gene=MYBPC3,HGVSc=ENST00000545968.1:c.2965G>A,HGVSp=ENSP00000442795.1:p.Glu989Lys,pathogenic=NA;CSQ=T|missense_variant|MODERATE|SPI1|ENSG00000066336|Transcript|ENST00000227163|protein_coding|5/5||||579|541|181|V/I|Gtc/Atc||1||-1||HGNC|HGNC:11241|&SPIB:chr19_50428082-50428084:L:L:REFID=1&AC020909.1:chr19_50428082-50428084:L:L:REFID=1&SPIC:chr12_101486385-101486387:L:L:REFID=1&,T|upstream_gene_variant|MODIFIER|MYBPC3|ENSG00000134571|Transcript|ENST00000256993|protein_coding|||||||||||1|2800|-1||HGNC|HGNC:7551|,T|missense_variant|MODERATE|SPI1|ENSG00000066336|Transcript|ENST00000378538|protein_coding|5/5||||761|538|180|V/I|Gtc/Atc||1||-1||HGNC|HGNC:11241|,T|upstream_gene_variant|MODIFIER|MYBPC3|ENSG00000134571|Transcript|ENST00000399249|protein_coding|||||||||||1|2800|-1||HGNC|HGNC:7551|,T|stop_gained|HIGH|SPI1|ENSG00000066336|Transcript|ENST00000533030|protein_coding|2/2||||260|90|30|W/*|tgG/tgA||1||-1||HGNC|HGNC:11241|,T|downstream_gene_variant|MODIFIER|SPI1|ENSG00000066336|Transcript|ENST00000533968|protein_coding|||||||||||1|2679|-1||HGNC|HGNC:11241|,T|upstream_gene_variant|MODIFIER|MYBPC3|ENSG00000134571|Transcript|ENST00000544791|nonsense_mediated_decay|||||||||||1|2800|-1||HGNC|HGNC:7551|,T|upstream_gene_variant|MODIFIER|MYBPC3|ENSG00000134571|Transcript|ENST00000545968|protein_coding|||||||||||1|2800|-1||HGNC|HGNC:7551|
There are multiple CSQ entries, delimited by commas. Using --all_csqs still only gives one CSQ, e.g.:
CHROM POS REF ALT ID FILTER SYMBOL Amino_acids Codons Paralogue_Vars
11 47355502 C T . . SPI1 W/* tgG/tgA NA
The full command I used was:
python /data/Share/nick/Paralog_Anno/loftee/src/tableize_vcf.py --vcf test.out_paraloc --out test.out_paraloc_tableized --include_id --vep_info SYMBOL,Amino_acids,Codons,Paralogue_Vars --all_csqs
Any reason for this?
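For reference, the expected behaviour (one output row per CSQ entry) can be sketched independently of tableize_vcf.py. This is not the script's implementation: `expand_csq` is a hypothetical helper, the field list is shortened for readability, and the INFO string is an abbreviated version of the record above.

```python
def expand_csq(info, field_names, wanted):
    """Yield one dict per CSQ entry, keeping only the wanted fields.

    Assumptions: INFO items are ';'-separated, CSQ entries are ','-separated,
    and fields within an entry are '|'-separated. Empty values become 'NA'.
    """
    for item in info.split(";"):
        if not item.startswith("CSQ="):
            continue
        for entry in item[len("CSQ="):].split(","):
            values = dict(zip(field_names, entry.split("|")))
            yield {name: values.get(name) or "NA" for name in wanted}


# Shortened, illustrative field list and INFO value (not the full record).
FIELDS = ["Allele", "Consequence", "SYMBOL", "Amino_acids", "Codons"]
info = ("gene=MYBPC3;CSQ=T|missense_variant|SPI1|V/I|Gtc/Atc,"
        "T|upstream_gene_variant|MYBPC3||,"
        "T|stop_gained|SPI1|W/*|tgG/tgA")

rows = list(expand_csq(info, FIELDS, ["SYMBOL", "Amino_acids", "Codons"]))
for row in rows:
    print(row["SYMBOL"], row["Amino_acids"], row["Codons"])
# SPI1 V/I Gtc/Atc
# MYBPC3 NA NA
# SPI1 W/* tgG/tgA
```

Splitting on ";" before "," matters for this record, because the other INFO items (gene=..., HGVSc=...) are themselves joined by commas.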
We downloaded the LOFTEE plugin from the grch38 branch and tried to use it, but encountered the error below. Could you please have a look and assist?
WARNING: Failed to compile plugin LoF: Attempt to reload Bio/DB/BigFile.pm aborted.
Compilation failed in require at /path/to/vep/96/ensembl-vep/Bio/DB/BigWig.pm line 7.
BEGIN failed--compilation aborted at /path/to/vep/96/ensembl-vep/Bio/DB/BigWig.pm line 7.
Compilation failed in require at /path/to/vep/plugin/loftee-grch38/loftee/gerp_dist.pl line 2.
BEGIN failed--compilation aborted at /path/to/vep/plugin/loftee-grch38/loftee/gerp_dist.pl line 2.
Compilation failed in require at /path/to/vep/plugin/loftee-grch38/loftee/LoF.pm line 27.
Compilation failed in require at (eval 72) line 2.
BEGIN failed--compilation aborted at (eval 72) line 2.
I downloaded BigWig.pm and BigFile.pm from https://metacpan.org/release/Bio-BigFile.
Below is the command that I executed:
vep -I /tmp/Loftee/loftee_chr1Vars_b38 -o test1_b38_loftee_out --offline --cache --dir_cache /path/to/vep/96/ensembl-vep/.vep --assembly GRCh38 --cache_version 93 --vcf --force_overwrite --plugin LoF,loftee_path:/path/to/vep/plugin/loftee-grch38/loftee,human_ancestor_fa:/tmp/vep/Build-38/human_ancestor.fa.gz,gerp_bigwig:/tmp/vep/Build-38/gerp_conservation_scores.homo_sapiens.GRCh38.bw,gerp_database:/tmp/vep/Build-38/gerp_conservation_scores.homo_sapiens.GRCh38.bw,conservation_file:/tmp/phylocsf_gerp.sql --dir_plugins /path/to/vep/plugin/loftee-grch38/loftee
Hi,
I want to use LOFTEE to filter variants, but the link to phylocsf.sql no longer works.
Could you please tell me where I can download it?
Thank you!
Hi (cc @vladsaveliev),
From what I can see, there are different branches for grch37 (master) and grch38. Yet the README of the grch38 branch contains several instructions on how to add grch37 tracks (conservation etc.). I am slightly confused as to which branch is preferred and most up to date. Is there a recommendation here that I have missed?
kind regards,
Sigve
I have a pretty simple question. I'm trying to understand what the output means after running VEP with LOFTEE. I see some variants with the annotation LoF=HC;LoF_info=...., which I understand to mean a high-confidence loss-of-function variant. However, there are other sites that have only LoF_info flags and no other LoF annotation (neither LoF=HC nor LoF=LC). Does that mean that the site is not a loss-of-function site?
Thanks