Comments (3)
It looks like it's working on my test VCF (~1000 variants annotated with VEP):
In [1]: from fuc import pyvcf, pymaf
In [2]: mf = pymaf.MafFrame.from_vcf("test_variants.vcf.gz")
In [3]: mf
Out[3]: <fuc.api.pymaf.MafFrame at 0x129285a90>
In [4]: mf.df
Out[4]:
Hugo_Symbol Entrez_Gene_Id Center ... Tumor_Seq_Allele2 Protein_Change Tumor_Sample_Barcode
0 MTOR ENSG00000198793 . ... A . 117
You are absolutely correct here. Thanks for reporting the problem. I fixed this issue in the 0.34.0-dev branch as per your suggestion (i.e. split by `;` and then look for `CSQ`). See an example below (note that the first row uses the line you provided in the original post):
>>> from fuc import pyvcf, pymaf
>>> data = {
... 'CHROM': ['chr1', 'chr2'],
... 'POS': [100, 101],
... 'ID': ['.', '.'],
... 'REF': ['G', 'T'],
... 'ALT': ['A', 'C'],
... 'QUAL': ['.', '.'],
... 'FILTER': ['.', '.'],
... 'INFO': ['AC=2;ACGTNacgtnMINUS=0,0,0,0,0,0,0,0,0,0;ACGTNacgtnPLUS=5,0,61,0,0,0,0,0,0,0;AN=4;AS_FilterStatus=SITE;AS_SB_TABLE=9,41|1,5;CALLERS=mutect2;CSQ=A|3_prime_UTR_variant|MODIFIER|MTOR|ENSG00000198793|Transcript|ENST00000361445|protein_coding|58/58||ENST00000361445.9:c.*700C>T||8471/8721|||||||-1|||SNV|HGNC|HGNC:3942|YES|1|P1|CCDS127.1|ENSP00000354558|P42345||UPI000012ABD3||Ensembl|G|G||1|||||chr1:g.11106785G>A|||||||||||||||||||||||||||;ClippingRankSum=-0.79;DKFZBias=damage;DP=85;ECNT=2;EPR=pass;FS=0;GERMQ=93;MBQ=34,33;MFRL=153,154;MMQ=60,60;MPOS=28;MQ=60;MQ0=0;MQRankSum=0;POPAF=7.3;ReadPosRankSum=0.452;TLOD=12.45', 'AC=9;CSQ=C|splice_donor_variant|HIGH|MTOR|2475|Transcript|NM_001386500.1|protein_coding||46/57||||||||||-1||EntrezGene||||||||A|A|||||||||||||||||||||||||||||'],
... 'FORMAT': ['GT:AD:DP:AF', 'GT:AD:DP:AF'],
... 'A': ['0/1:176,37:213:0.174', '0/1:966,98:1064:0.092']
... }
>>> vf = pyvcf.VcfFrame.from_dict([], data)
>>> vf.df
CHROM POS ID REF ... FILTER INFO FORMAT A
0 chr1 100 . G ... . AC=2;ACGTNacgtnMINUS=0,0,0,0,0,0,0,0,0,0;ACGTN... GT:AD:DP:AF 0/1:176,37:213:0.174
1 chr2 101 . T ... . AC=9;CSQ=C|splice_donor_variant|HIGH|MTOR|2475... GT:AD:DP:AF 0/1:966,98:1064:0.092
[2 rows x 10 columns]
>>> mf = pymaf.MafFrame.from_vcf(vf, keys=['AD', 'AF'])
>>> mf.df
Hugo_Symbol Entrez_Gene_Id Center NCBI_Build ... Protein_Change Tumor_Sample_Barcode AD AF
0 MTOR ENSG00000198793 . . ... . A 176,37 0.174
1 MTOR 2475 . . ... . A 966,98 0.092
[2 rows x 17 columns]
>>>
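For anyone curious what "split by `;` and then look for `CSQ`" amounts to, here is a minimal sketch in plain Python. It is only an illustration of the idea, not the actual code in fuc: `extract_csq` is a hypothetical helper, the INFO string is a shortened version of the first row above, and the position of the gene symbol within the CSQ annotation is an assumption that depends on how VEP was configured.

```python
def extract_csq(info):
    """Return the raw CSQ value from a VCF INFO string, or None if absent.

    Hypothetical helper for illustration; not part of the fuc API.
    """
    for field in info.split(';'):
        # Compare the key exactly, so CSQ is found wherever it sits in INFO
        # and keys that merely contain the letters 'CSQ' are ignored.
        key, sep, value = field.partition('=')
        if sep and key == 'CSQ':
            return value
    return None


# Shortened INFO string in which CSQ is neither the first nor the last key.
info = ('AC=2;CALLERS=mutect2;'
        'CSQ=A|3_prime_UTR_variant|MODIFIER|MTOR|ENSG00000198793;'
        'DP=85')

csq = extract_csq(info)

# VEP separates multiple annotations with ',' and fields within one with '|'.
first_annotation = csq.split(',')[0].split('|')

# Index 3 holds the gene symbol in this example (assumed field order).
print(first_annotation[3])  # prints MTOR
```

Matching the key exactly after splitting means the lookup does not depend on where `CSQ` appears among the other INFO keys.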
You can install the development branch:
$ git clone https://github.com/sbslee/fuc
$ cd fuc
$ git checkout 0.34.0-dev
$ pip install -e .
Please let me know if this doesn't solve the issue.
Great to hear! Thanks again for reporting. The official release for 0.34.0 will be made some time next month. Please feel free to reopen this issue if necessary.