Hi,
I am using methphaser as a Singularity image on an HPC cluster. The image runs correctly on the test data, so there shouldn't be any technical issues with the image itself.
I am doing adaptive sequencing of a 1.2 Mbp region on chromosome 14 (the IGH locus). I assemble the reads into a personal reference of this region and then re-map the ROI reads to that personal reference. This works very well for my purpose, but when I add methphaser to the pipeline it yields only empty VCF and BAM files (headers only). Shouldn't the output at least contain the information from the input BAM and VCF files? I do not get any errors when running methphaser.
I would be extremely grateful if you could help me troubleshoot.
Below are the truncated input BAM file (first two reads), the truncated input VCF file, the GTF file and the truncated FASTA reference:
Truncated BAM file:
750c6e3f-9e8c-44b9-9e31-fc66c1b1fc2c 16 contig_1 1 60 1189S416M1I1717M1I1167M5D438M1D86M1I193M2D234M2D285M1I59M1D36M2D23M1I365M1D1397M1I793M1D8M1I4M1D198M2D12M1D1249M1D5M1I11M102S * 0 0 CCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAAC&')+069:78A?DHBEDFJGD>C>G=:8400.---,'')255676;<;:<5>?@BFBCIDCBABCDGEBBKCED@=CBECB>DCF@@DCCFCA;@@C=?>CAD>><A@CB?BAE>A?ECFB?@A>CA?>B@EB?D-5640...0>?@>D:9*)'&&%$$$$$$$$(()*,/8...-.03656=>>=>??ESGJMFMGIBEEGHEHKPIG<=6/.4..-,+))-2DGIFGD:'&)%$$$%&&'+++'%%&%&(+,2223-,,*((&$$$%&''+++**) NM:i:42 ms:i:17172 AS:i:17172 nn:i:0 tp:A:P cm:i:1027 s1:i:5636 s2:i:324 de:f:0.0039 SA:Z:contig_1,1578,-,2107M7D7888S,1,83; rl:i:4350 CO:Z:MM:Z:C+h?,4,2,4,1,0,0,1,0,1,0,0,0,1,13,13,59,20,42,26,28,24,11,90,4,28,0,38,11,34,1,32,14,23,48,27,17,68,2,5,0,4,126,0,0,3,1,3,1,4,2,9,0,2,2,4,0,5,2,13,0,1,0,0,5,0,8,2,0,1,0,1,0,8,2,0,1,0,1,0,4,2,0,1,0,8,2,0,1,0,1,0,4,2,0,1,0,8,2,0,1,0,1,0,3,2,0,1,0,9,2,0,1,0,1,0,4,2,0,1,0,9,2,0,0,0,9,2,0,0,0,9,2,0,0,0,1,0,4,2,0,0,0,1,0,4,2,0,0,0,1,0,3,2,0,0,0,1,0,4,2,0,0,0,1,0,4,2,0,0,0,1,0,4,2,0,0,0,1,0,4,2,0,0,0,1,0,4,2,0,0,0,1,0,4,2,0,0,0,2,0,4,2,0,0,0,1,0,4,2,0,0,0,1,0,4,2,0,0,0,0,0,4,2,0,0,0,1,2,15,0,0,0,0,0; ML:B:C,36,8,165,2,1,16,3,3,2,2,3,264,92,253,254,253,235,248,253,254,255,253,254,190,240,248,250,206,239,9,182,209,4,221,183,65,185,15,242,0,0,241,254,255,254,244,245,227,250,250,254,254,196,230,252,65,248,254,223,248,129,252,204,250,254,253,208,123,242,253,247,254,250,13,60,196,217,255,255,255,240,116,250,254,255,68,92,0,86,119,249,255,255,254,234,252,251,248,254,253,33,238,224,254,252,253,20,1,225,171,22,182,252,253,254,247,0,242,245,254,149,52,242,239,243,54,50,62,58,123,107,134,177,248,254,156,215,205,224,202,201,241 HP:i:1 PC:i:210 PS:i:2459
40fce1e4-96a0-4a00-8695-fa9590857b1d 16 contig_1 1 60 1363S1821M1D653M24D119M1I10M1D113M6I391M1D171M1I342M1I149M1D168M9M1D41M3D760M1D45M2D395M1D2713M2D1363M2D207M1D18M2D16M2D15M32S * 0 0 AACCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTA&)135@AABBBF?G=A:EBDGCBDBHCA@D@EDA>DEDED>BCDEDDEBDCB>CBFFB>C@C???A@DB@@@>C>>>DADDA?CAEBA:AAD;=BC@CA?;C?CCA9CADB@BC@FB?;?>?>>=@>BBA?B=?=;;4AA46FIBAGMSLSIJKIIIJJSIISSJHSHKGEGJGSJLQKSHBBJ2?FFIIIGIISLQSIKJOIHSPSNHSFSKSSSEGJSBLEFA??=999;>SLKLSIILSJHSS??@BBBBAEDABHISMSSSLOLMLSGNHJQKJISLJSRNSLJHJIMSSOIOLFISMMMJIKILHEHFSPSKKSSSGGLOJJJLHIKSSJMIKMLMKJSJKHHSJPIJSLJND0<;=AB66666LSSIIIJEC6-,,,2015>DIE###$$$%&')&(.,+'%%%+356/..-.56((((()**')),,,,+'&%&%$$%')))'&&&%$$ NM:i:320 ms:i:112130 AS:i:112134 nn:i:0 tp:A:P cm:i:9642 s1:i:52735 s2:i:2852 de:f:0.0037 rl:i:5199 CO:Z:MM:Z:C+h?,16,0,1,3,2,1,1,5,1,2,2,7,3,0,4,1,37,0,3,1488,2,0,0,0,9,2,0,0,0,1,0,4,2,0,0,0,1,0,4,2,0,0,0,1,0,4,2,0,0,0,1,0,4,2,0,0,0,1,0,2,2,0,0,0,1,0,4,2,0,0,0,1,0,4,2,0,0,0,1,0,4,2,0,0,0,1,0,4,2,0,0,0,1,0,4,2,0,0,0,1,0,3,2,0,0,0,1,0,4,2,0,0,0,1,0,4,2,0,0,0,1,2,15,0,0,0; ML:B:C,29,104,69,23,0,2,62,2,2,2,1,1,1,1,3,0,0,1,0,11,29,1,9,7,3,6,8,22,2,4,1,2,49,1,2,2,68,1,28,1,0,1,92,1,5,2,224,254,8,8,3,5,1,9,2,9,3,2,1,1,1,3,144,2,18,30,9,5,0,5,1,14,1,4,4,1,3,4,0,5,0,1,4,0,1,42,0,53,2,4,16,24,2,0,1,4,1,2,26,41,115,1,2,0,1,3,2,3,9,1,1,19,1,13,7,7,0,1,3,30,7,10,1,149,29,230,252,251,189,252,255,254,247,245,253,202,0,243,254,197,236,240,253,249,252,251,241,133,120,54,240,249,235,130,196,36,249,253,252,216,228,212,233,1 HP:i:2 PC:i:210 PS:i:2459
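One thing that stands out in the two records above is that the modification string appears as `CO:Z:MM:Z:C+h?,...`, i.e. nested inside a `CO` comment tag rather than as a standalone `MM` tag. Whether this is related to the empty output is not established here. The following is a minimal sketch for checking it on a real, tab-separated SAM line (for example one line printed by `samtools view`). The function names are illustrative and are not part of methphaser.

```python
def optional_tags(sam_line):
    """Return {tag: (type, value)} for the optional fields of one SAM record.

    Assumes a tab-separated record as written by samtools, where the
    optional TAG:TYPE:VALUE fields start at the 12th column.
    """
    fields = sam_line.rstrip("\n").split("\t")
    tags = {}
    for field in fields[11:]:
        tag, typ, value = field.split(":", 2)
        tags[tag] = (typ, value)
    return tags


def modification_tag_report(sam_line):
    """Report whether MM/ML exist as real tags or only inside a CO comment."""
    tags = optional_tags(sam_line)
    comment = tags.get("CO", ("Z", ""))[1]
    return {
        # Older drafts of the SAM tag specification spelled these Mm/Ml.
        "has_MM": "MM" in tags or "Mm" in tags,
        "has_ML": "ML" in tags or "Ml" in tags,
        "MM_hidden_in_CO": comment.startswith(("MM:Z:", "Mm:Z:")),
    }


# Small synthetic record that mimics the tag layout seen above.
example = "\t".join([
    "read1", "16", "contig_1", "1", "60", "4M", "*", "0", "0",
    "ACGT", "IIII", "HP:i:1", "CO:Z:MM:Z:C+h?,0;", "ML:B:C,200",
])
report = modification_tag_report(example)
print(report)
```

If `MM_hidden_in_CO` is true while `has_MM` is false, tools that look up the `MM` tag directly would not see any modification calls on that read.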
Truncated VCF file:
##fileformat=VCFv4.2
##FILTER=<ID=PASS,Description="All filters passed">
##source=Clair3
##clair3_version=1.0.0
##FILTER=<ID=LowQual,Description="Low quality variant">
##FILTER=<ID=RefCall,Description="Reference call">
##INFO=<ID=P,Number=0,Type=Flag,Description="Result from pileup calling">
##INFO=<ID=F,Number=0,Type=Flag,Description="Result from full-alignment calling">
##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
##FORMAT=<ID=GQ,Number=1,Type=Integer,Description="Genotype Quality">
##FORMAT=<ID=DP,Number=1,Type=Integer,Description="Read Depth">
##FORMAT=<ID=AD,Number=R,Type=Integer,Description="Read depth for each allele">
##FORMAT=<ID=PL,Number=G,Type=Integer,Description="Phred-scaled genotype likelihoods rounded to the closest integer">
##FORMAT=<ID=AF,Number=1,Type=Float,Description="Estimated allele frequency in the range of [0,1]">
##contig=<ID=contig_1,length=2568895>
##FORMAT=<ID=PS,Number=1,Type=Integer,Description="Phase set identifier">
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT SAMPLE
contig_1 2459 . C A 10.87 PASS F GT:GQ:DP:AF:PS 0|1:10:24:0.2917:2459
contig_1 2465 . C A 10.99 PASS F GT:GQ:DP:AF:PS 0|1:10:24:0.2917:2459
contig_1 2475 . TACCCCAACCCCAACCCCAACCCCA T 9.78 PASS P GT:GQ:DP:AF 0/1:9:24:0.2083
contig_1 2588 . C T 9.03 PASS P GT:GQ:DP:AF:PS 0|1:9:24:0.375:2459
contig_1 2594 . C T 14.76 PASS F GT:GQ:DP:AF:PS 0|1:14:24:0.375:2459
contig_1 2600 . C T 16.54 PASS F GT:GQ:DP:AF:PS 0|1:16:24:0.4167:2459
contig_1 2618 . T TA 15.35 PASS F GT:GQ:DP:AF 0/1:15:24:0.375
contig_1 2628 . TA T 14.22 PASS F GT:GQ:DP:AF 0/1:14:24:0.3333
contig_1 2804 . G T 19.19 PASS F GT:GQ:DP:AF:PS 0|1:19:24:0.375:2459
contig_1 2894 . G T 23.33 PASS F GT:GQ:DP:AF:PS 0|1:23:24:0.3333:2459
contig_1 3300 . AACCCT A 17.11 PASS F GT:GQ:DP:AF 0/1:17:24:0.5833
contig_1 3305 . T TA 8.56 PASS F GT:GQ:DP:AF 0/1:8:24:0.375
contig_1 66619 . AT A 0 LowQual F GT:GQ:DP:AF 1/1:0:23:0.1739
GTF file:
contig_1 Phasing exon 2459 400591 . + . gene_id "2459"; transcript_id "2459.1";
contig_1 Phasing exon 641089 1001211 . + . gene_id "641089"; transcript_id "641089.1";
contig_1 Phasing exon 1230613 1244186 . + . gene_id "1230613"; transcript_id "1230613.1";
contig_1 Phasing exon 1582658 1670184 . + . gene_id "1582658"; transcript_id "1582658.1";
contig_1 Phasing exon 1858102 1886017 . + . gene_id "1858102"; transcript_id "1858102.1";
contig_1 Phasing exon 2054129 2503491 . + . gene_id "2054129"; transcript_id "2054129.1";
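As a further sanity check on the inputs, the sketch below compares the phase blocks in the GTF with the phase sets (`PS`) in the VCF. It assumes, based only on the excerpts above, that each GTF block starts at the position used as the `PS` identifier (e.g. block `2459` and `PS` 2459), and that both files are tab-separated. The helper names are illustrative.

```python
def phase_sets_from_vcf(lines):
    """Count phased genotypes per (contig, PS) in single-sample VCF lines."""
    counts = {}
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue
        cols = line.rstrip("\n").split("\t")
        sample = dict(zip(cols[8].split(":"), cols[9].split(":")))
        # Only phased genotypes (e.g. 0|1) that carry a PS field are counted.
        if "|" in sample.get("GT", "") and "PS" in sample:
            key = (cols[0], int(sample["PS"]))
            counts[key] = counts.get(key, 0) + 1
    return counts


def blocks_from_gtf(lines):
    """Return (contig, start, end) for every GTF record."""
    blocks = []
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue
        cols = line.rstrip("\n").split("\t")
        blocks.append((cols[0], int(cols[3]), int(cols[4])))
    return blocks


def blocks_without_variants(vcf_lines, gtf_lines):
    """List GTF blocks whose start does not match any PS value in the VCF."""
    phase_sets = phase_sets_from_vcf(vcf_lines)
    return [b for b in blocks_from_gtf(gtf_lines) if (b[0], b[1]) not in phase_sets]


vcf_example = [
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tSAMPLE",
    "contig_1\t2459\t.\tC\tA\t10.87\tPASS\tF\tGT:GQ:DP:AF:PS\t0|1:10:24:0.2917:2459",
    "contig_1\t2618\t.\tT\tTA\t15.35\tPASS\tF\tGT:GQ:DP:AF\t0/1:15:24:0.375",
]
gtf_example = [
    'contig_1\tPhasing\texon\t2459\t400591\t.\t+\t.\tgene_id "2459"; transcript_id "2459.1";',
    'contig_1\tPhasing\texon\t641089\t1001211\t.\t+\t.\tgene_id "641089"; transcript_id "641089.1";',
]
unmatched = blocks_without_variants(vcf_example, gtf_example)
print(unmatched)
```

On the full (untruncated) files, an empty list would indicate that every GTF block has at least one phased variant with a matching `PS` value.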
Truncated FASTA reference:
>contig_1
CCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAAC
CCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAAC
CCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAAC
CCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAAC
CCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAAC
CCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAAC
CCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAAC
CCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAAC
CCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAAC
Thanks a lot in advance!!
Best,
Andreas