poisonalien / maftools

Summarize, Analyze and Visualize MAF files from TCGA or in-house studies.

Home Page: http://bioconductor.org/packages/release/bioc/html/maftools.html

License: MIT License

R 95.60% C 4.34% CSS 0.06%
maf-files cancer-genomics cancer-genome-atlas tcga genomics bioinformatics r

maftools's Introduction

maftools - An R package to summarize, analyze and visualize MAF files

Introduction

maftools provides a comprehensive set of functions for processing MAF files and performing the most commonly used analyses in cancer genomics. See the package vignette for detailed usage and a case study.

Installation

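# Install BiocManager first if it is not already available
if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")

# Install from Bioconductor repository
BiocManager::install("maftools")

# Install from GitHub repository
BiocManager::install("PoisonAlien/maftools")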

Getting started: Vignette and a case study

Complete documentation of maftools, using TCGA LAML as a case study, is available in the package vignette.

Besides MAF files, maftools also facilitates processing of BAM files. Please refer to the vignettes and sections below to learn more.

Citation

Mayakonda A, Lin DC, Assenov Y, Plass C, Koeffler HP. 2018. Maftools: efficient and comprehensive analysis of somatic variants in cancer. Genome Research. PMID: 30341162

Useful links

| File formats | Data portals | Annotation tools |
| Mutation Annotation Format | TCGA | vcf2maf - for converting your VCF files to MAF |
| Variant Call Format | ICGC | annovar2maf - for converting annovar output files to MAF |
| ICGC Simple Somatic Mutation Format | Broad Firehose | bcftools csq - Rapid annotations of VCF files with variant consequences |
| | cBioPortal | Annovar |
| | PeCan | Funcotator |
| | CIViC - Clinical interpretation of variants in cancer | |
| | DGIdb - Information on drug-gene interactions and the druggable genome | |

Useful packages/tools

Below are some more useful software packages for somatic variant analysis:

  • TRONCO - Repository of the TRanslational ONCOlogy library (R)
  • dndscv - dN/dS methods to quantify selection in cancer and somatic evolution (R)
  • clonevol - Inferring and visualizing clonal evolution in multi-sample cancer sequencing (R)
  • sigminer - Primarily for signature analysis and visualization in R. Supports maftools output (R)
  • GenVisR - Primarily for visualization (R)
  • comut - Primarily for visualization (Python)
  • TCGAmutations - pre-compiled curated somatic mutations from TCGA cohorts (from Broad Firehose and TCGA MC3 Project) that can be loaded into maftools (R)
  • somaticfreq - rapid genotyping of known somatic hotspot variants from the tumor BAM files. Generates a browsable/sharable HTML report. (C)

maftools's People

Contributors

b-niu, biosunsci, hpages, igordot, ishida-md, johnmcma, jokergoo, jwokaty, kaigu1990, link-ny, lshep, mattdowle, moxgreen, nturaga, omarashkar, poisonalien, qins, rdmorin, shixiangwang, vobencha, zhangyz1997, zmiimz, zwael

maftools's Issues

hg19

I was wondering which hg19 FASTA is needed for the mutational plots.

Thanks a lot!

Zayni

Unable to load MAF file using maftools::read.maf (without loading the entire package)

I have developed the habit of calling package functions directly using the :: notation in R. This way, I don't need to worry about function name conflicts. When doing this to load a MAF file using maftools::read.maf(), I run into an error (could not find function "vcr"). I can circumvent this error by loading the entire package, but I would prefer not to have to do this. Can you update the package such that the vcr function is internally available without me having to load the package? Thanks!

I've included below simple steps to reproduce the error and workaround.

> laml.maf = system.file('extdata', 'tcga_laml.maf.gz', package = 'maftools')
> laml = maftools::read.maf(maf = laml.maf, removeSilent = T, useAll = F)
reading maf..
Mutation_Status not found. Assuming all variants are Somatic and validated.
Excluding 475 silent variants.
[...]
Creating oncomatrix (this might take a while)..
Error in ifelse(test = length(as.character(x)) > 1, no = as.character(x),  : 
  could not find function "vcr"


> library(maftools)
> laml = maftools::read.maf(maf = laml.maf, removeSilent = T, useAll = F)
reading maf..
Mutation_Status not found. Assuming all variants are Somatic and validated.
Excluding 475 silent variants.
[...]
Creating oncomatrix (this might take a while)..
Sorting..
Summarizing..
[...]
Frequently mutated genes..
[...]
Done !

use to target-seq or WEX

Hi,
This tool is really very helpful.
I wonder whether gene length correction is needed when I use oncoplot with my targeted sequencing or exome sequencing data.
Thanks
Shirley

error in R studio

Hi

I recently installed the latest version of maftools in RStudio on a Mac. However, loading the library produces an error:

library("maftools")
Error: package or namespace load failed for ‘maftools’ in dyn.load(file, DLLpath = DLLpath, ...):
unable to load shared object '/Library/Frameworks/R.framework/Versions/3.4/Resources/library/rtracklayer/libs/rtracklayer.so':
dlopen(/Library/Frameworks/R.framework/Versions/3.4/Resources/library/rtracklayer/libs/rtracklayer.so, 6): Library not loaded: /usr/local/opt/openssl/lib/libssl.1.0.0.dylib
Referenced from: /Library/Frameworks/R.framework/Versions/3.4/Resources/library/rtracklayer/libs/rtracklayer.so
Reason: image not found

Any advice will be deeply appreciated.

order annotation covariate

Hello,
Is there a way to order the levels of a covariate in the annotation file?
For example, when drawing an oncoplot, I would like FAB_classification ordered as M1, M2, M3, and so on.
Thanks
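
One general R approach, offered here as a sketch and not as a documented maftools feature, is to store the covariate as a factor with explicit levels so that any sorting follows the declared order. The table, sample names and level order below are made up for illustration; whether oncoplot honours the factor order depends on the installed maftools version.

```r
# Hypothetical clinical annotation table; sample names and values are made up.
anno <- data.frame(
  Tumor_Sample_Barcode = c("S1", "S2", "S3", "S4"),
  FAB_classification   = c("M3", "M1", "M2", "M1"),
  stringsAsFactors     = FALSE
)

# Declare the desired order explicitly instead of relying on the default order.
fab_levels <- c("M1", "M2", "M3")
anno$FAB_classification <- factor(anno$FAB_classification, levels = fab_levels)

# Sorting by the factor follows the declared level order; ties keep input order.
anno_sorted <- anno[order(anno$FAB_classification), ]
print(anno_sorted)
```

Any other order (for example a clinically motivated one) can be used by changing `fab_levels`.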

Sorting breaks annotation order in oncostrip

I believe I have found a bug in the oncostrip function. The effect of this bug is that the annotation of samples is not reordered when the sample mutation matrix is sorted (i.e. the default behavior). I think adding the following code to the function to replace the single line setting "bot.anno" (now in the else block) fixes this.

if(sort){
  sorted.order = colnames(mat)
  anno.df.sorted = as.data.frame(anno.df[sorted.order,])
  rownames(anno.df.sorted) = sorted.order
  colnames(anno.df.sorted) = colnames(anno.df)
  bot.anno = ComplexHeatmap::HeatmapAnnotation(anno.df.sorted)
}else{
  bot.anno = ComplexHeatmap::HeatmapAnnotation(anno.df)
}
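
The reordering step in the proposed fix can be illustrated on its own in base R. This is a sketch with a made-up matrix and annotation table, not the package's code. It uses `drop = FALSE`, which keeps a single-column annotation as a data frame and so avoids the `as.data.frame` and renaming steps.

```r
# Hypothetical sorted mutation matrix: columns are samples in plotting order.
mat <- matrix(0, nrow = 2, ncol = 3,
              dimnames = list(c("TP53", "FLT3"), c("S3", "S1", "S2")))

# Hypothetical single-column annotation; row names are sample IDs.
anno.df <- data.frame(FAB = c("M1", "M2", "M3"),
                      row.names = c("S1", "S2", "S3"))

# Reorder annotation rows to match the matrix columns.
# drop = FALSE keeps the result a data.frame even with a single column.
anno.df.sorted <- anno.df[colnames(mat), , drop = FALSE]
print(anno.df.sorted)
```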

ExtractSignatures Error

Dear Author,

I used a MAF file from Oncotator. Every step was fine, including trinucleotideMatrix. But when I tried extractSignatures, I saw the following error. Could you help me with this? Thank you.

AP1.hg19.tnm = trinucleotideMatrix(maf = AP1, ref_genome = '/Users/hg19_chromosome.fa', prefix = 'chr', add = TRUE, ignoreChr = 'chrM', useSyn = TRUE)
AP1.hg19.sign = extractSignatures(mat = AP1.hg19.tnm)
Estimating best rank..
Timing stopped at: 0.002 0 0.002
Timing stopped at: 0.002 0 0.002
Timing stopped at: 0.001 0 0.001
Timing stopped at: 0.002 0 0.003
Timing stopped at: 0.002 0 0.003
Error in (function (...) : All the runs produced an error:
-#1 [r=2] -> none of the packages are loaded [in call to 'path.package']
-#2 [r=3] -> none of the packages are loaded [in call to 'path.package']
-#3 [r=4] -> none of the packages are loaded [in call to 'path.package']
-#4 [r=5] -> none of the packages are loaded [in call to 'path.package']
-#5 [r=6] -> none of the packages are loaded [in call to 'path.package']

Cheers,
James

can not read maf file in maftools

I have a problem reading MAF (Mutation Annotation Format) files in R:

tmp.maf=read.maf(maf = "tmp.maf", removeSilent = TRUE, useAll = FALSE)
reading maf..
Mutation_Status not found. Assuming all variants are Somatic and validated.
Excluding 0 silent variants.

Creating oncomatrix (this might take a while)..
Sorting..
Error in oncomat.copy[, colnames(mdf)]
The contents of tmp.maf are:

Hugo_Symbol Entrez_Gene_Id NCBI_Build Chromosome Start_Position End_Position Variant_Classification Variant_Type Reference_Allele Tumor_Seq_Allele2 Protein_Change Gene dbSNP_RS Tumor_Sample_Barcode
SAMD11 148398 GRCh37 1 865545 865545 Missense_Mutation SNP G A p.R28Q ENSG00000187634 rs201186828 XH0152
SAMD11 148398 GRCh37 1 865545 865545 Missense_Mutation SNP G A p.R28Q ENSG00000187634 rs201186828 XY0039
SAMD11 148398 GRCh37 1 865545 865545 Missense_Mutation SNP G A p.R28Q ENSG00000187634 rs201186828 XY0056
SAMD11 148398 GRCh37 1 865545 865545 Missense_Mutation SNP G A p.R28Q ENSG00000187634 rs201186828 XJ0007
SAMD11 148398 GRCh37 1 865545 865545 Missense_Mutation SNP G A p.R28Q ENSG00000187634 rs201186828 SS0012
SAMD11 148398 GRCh37 1 865545 865545 Missense_Mutation SNP G A p.R28Q ENSG00000187634 rs201186828 SS0177
SAMD11 148398 GRCh37 1 874762 874762 Missense_Mutation SNP C T p.R210C ENSG00000187634 rs139437968 PRB0454
NOC2L 26155 GRCh37 1 880502 880502 Missense_Mutation SNP C T p.R693Q ENSG00000188976 rs74047418 XY0033
NOC2L 26155 GRCh37 1 880922 880922 Missense_Mutation SNP C T p.D677N ENSG00000188976 rs187444884 CC_HS0268
NOC2L 26155 GRCh37 1 881784 881784 Missense_Mutation SNP C T p.V601M ENSG00000188976 rs199697037 CH0128

However, when I delete the "NOC2L" line, it works, which is very confusing!

Would you help? Thanks!

Best,
John

Reported kataegis foci don't always correspond to regions of hypermutation

I've verified this manually: the rainfallPlot function appears to return exactly the foci identified by the cpt.mean function in the changepoint package. However, the points that cpt.mean returns are only that: the points where the distribution changes. In some cases, a returned point corresponds to an increase in the distance between foci. For example, this may happen if there is a region of kataegis between foci 51 and 75. In this case both points are returned, although what we care about is the region rather than the specific points. In cases like this, a user will be able to infer the regions.

However, I've also observed cases where the reported focus appears to represent an increase in the distance between SNPs but lacks a 'leading focus' that corresponds to the start of a region of hypermutation.

I've also observed cases where the 'regions of hypermutation' don't represent a tight clustering of foci, i.e. the distance between SNPs is still quite large (1000+ nt between SNPs).
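
To make the distinction between change points and regions concrete, here is a minimal base R sketch that reports regions directly: it looks for runs of consecutive mutations whose neighbour distance stays at or below a threshold. The helper name `find_dense_runs`, the 1000 bp threshold and the minimum of 6 mutations are assumptions chosen for illustration; this is not how rainfallPlot is implemented.

```r
# Hypothetical helper: report runs of closely spaced mutations on one chromosome.
# pos       : mutation start positions (any order)
# max_dist  : largest allowed gap between neighbours (assumed: 1000 bp)
# min_count : smallest number of mutations in a reported run (assumed: 6)
find_dense_runs <- function(pos, max_dist = 1000, min_count = 6) {
  pos <- sort(unique(pos))
  empty <- data.frame(start = numeric(0), end = numeric(0), n = integer(0))
  if (length(pos) < min_count) return(empty)

  close <- diff(pos) <= max_dist            # TRUE where neighbours are close
  runs <- rle(close)
  run_end <- cumsum(runs$lengths)           # index of the last gap in each run
  run_start <- run_end - runs$lengths + 1   # index of the first gap in each run

  # A run of k close gaps spans k + 1 mutations.
  keep <- runs$values & (runs$lengths + 1) >= min_count
  if (!any(keep)) return(empty)

  data.frame(start = pos[run_start[keep]],
             end   = pos[run_end[keep] + 1],
             n     = runs$lengths[keep] + 1L)
}

# Made-up positions: six mutations packed between 500000 and 501800.
pos <- c(10000, 250000, 500000, 500200, 500450, 500900, 501300, 501800,
         900000, 1500000)
regions <- find_dense_runs(pos)
print(regions)
```

Each reported row has both a start and an end, so a region can never be missing its leading boundary, and the threshold guarantees that neighbours inside a region are close together.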

error in strsplit when plot "coOncoplot"

Hello,
I used the subsetMaf function to split my dataset by gender. When I plotted a coOncoplot with these two datasets, an error occurred:

genes=c("USH2A","ASXL3","ARID1A","DCHS1","FMN2","GNAS")
coOncoplot(m1 = male.maf, m2 = female.maf, m1Name = 'male', m2Name = 'female', genes=genes)
Error in strsplit(x = variant.classes, split = ";", fixed = TRUE) :
non character parameter

Is there a problem with my gene list?
Thanks

Error in "plotmafSummary"

I just updated the package via github

#Install maftools from github repository.
library("devtools")
install_github(repo = "PoisonAlien/maftools")

Then I tried to use plotmafSummary on the demo data:

require(maftools)
#read TCGA maf file for LAML
laml.maf = system.file('extdata', 'tcga_laml.maf.gz', package = 'maftools')
laml = read.maf(maf = laml.maf, removeSilent = T, useAll = F)
plotmafSummary(maf = laml, rmOutlier = T, addStat = 'median', dashboard = TRUE)

It returned errors

Error in zero_range(from) : x must be length 1 or 2
In addition: Warning messages:
1: `legend.margin` must be specified using `margin()`. For the old behavior use legend.spacing 
2: `legend.margin` must be specified using `margin()`. For the old behavior use legend.spacing 
3: `legend.margin` must be specified using `margin()`. For the old behavior use legend.spacing 
4: `legend.margin` must be specified using `margin()`. For the old behavior use legend.spacing 
5: `legend.margin` must be specified using `margin()`. For the old behavior use legend.spacing 
6: `legend.margin` must be specified using `margin()`. For the old behavior use legend.spacing 

Here is my session information

R version 3.3.1 (2016-06-21)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252   
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C                          
[5] LC_TIME=English_United States.1252    

attached base packages:
[1] parallel  stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] maftools_0.99.55    Biobase_2.33.2      BiocGenerics_0.19.2 devtools_1.12.0    

loaded via a namespace (and not attached):
  [1] nlme_3.1-128                bitops_1.0-6                doParallel_1.0.10          
  [4] RColorBrewer_1.1-2          httr_1.2.1                  prabclus_2.2-6             
  [7] GenomeInfoDb_1.9.14         tools_3.3.1                 R6_2.2.0                   
 [10] KernSmooth_2.23-15          DBI_0.5-1                   lazyeval_0.2.0             
 [13] colorspace_1.2-6            trimcluster_0.1-2           nnet_7.3-12                
 [16] GetoptLong_0.1.5            withr_1.0.2                 curl_2.1                   
 [19] git2r_0.15.0                chron_2.3-47                pkgmaker_0.22              
 [22] labeling_0.3                slam_0.1-38                 rtracklayer_1.33.12        
 [25] caTools_1.17.1              diptest_0.75-7              scales_0.4.0.9003          
 [28] DEoptimR_1.0-6              mvtnorm_1.0-5               robustbase_0.92-6          
 [31] NMF_0.20.6                  stringr_1.1.0               digest_0.6.10              
 [34] Rsamtools_1.25.2            cometExactTest_0.1.3        XVector_0.13.7             
 [37] changepoint_2.2.2           BSgenome_1.41.2             GlobalOptions_0.0.10       
 [40] rstudioapi_0.6              RSQLite_1.0.0               shape_1.4.2                
 [43] zoo_1.7-13                  gtools_3.5.0                mclust_5.2                 
 [46] BiocParallel_1.7.9          DPpackage_1.1-6             dendextend_1.3.0           
 [49] dplyr_0.5.0                 VariantAnnotation_1.19.11   RCurl_1.95-4.8             
 [52] magrittr_1.5                modeltools_0.2-21           wordcloud_2.5              
 [55] Matrix_1.2-7.1              Rcpp_0.12.7                 munsell_0.4.3              
 [58] S4Vectors_0.11.13           stringi_1.1.2               whisker_0.3-2              
 [61] MASS_7.3-45                 SummarizedExperiment_1.3.82 zlibbioc_1.19.0            
 [64] flexmix_2.3-13              gplots_3.0.1                plyr_1.8.4                 
 [67] grid_3.3.1                  gdata_2.17.0                ggrepel_0.5                
 [70] lattice_0.20-34             Biostrings_2.41.4           cowplot_0.6.3              
 [73] splines_3.3.1               GenomicFeatures_1.25.20     circlize_0.3.9             
 [76] ComplexHeatmap_1.11.7       GenomicRanges_1.25.94       rjson_0.2.15               
 [79] fpc_2.1-10                  rngtools_1.2.4              reshape2_1.4.1             
 [82] codetools_0.2-15            biomaRt_2.29.2              stats4_3.3.1               
 [85] XML_3.98-1.4                data.table_1.9.6            foreach_1.4.3              
 [88] gtable_0.2.0                kernlab_0.9-25              assertthat_0.1             
 [91] ggplot2_2.1.0.9001          gridBase_0.4-7              xtable_1.8-2               
 [94] class_7.3-14                survival_2.39-5             tibble_1.2                 
 [97] iterators_1.0.8             GenomicAlignments_1.9.6     AnnotationDbi_1.35.4       
[100] registry_0.3                memoise_1.0.0               IRanges_2.7.14             
[103] cluster_2.0.4          

Can you help?
Thanks,

Error thrown in oncodrive function

I am getting this error sometimes when I run oncodrive:
my.sig = oncodrive(maf =mymaf, AACol = aacol, pvalMethod = 'zscore',minMut = 5)

Estimating background scores from synonymous variants..
| | 1%
Error in matrix(unlist(value, recursive = FALSE, use.names = FALSE), nrow = nr, :
length of 'dimnames' [2] not equal to array extent
Calls: oncodrive ... Ops.data.table -> NextMethod -> Ops.data.frame -> matrix
Execution halted

I can't seem to find the cause of this issue.

Convert Dataframe to MAF object

Hello,
I have a txt file with all the MAF required columns (plus others) that I would like to use maftools with. Is there a way to convert this dataframe to a MAF object?
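
As a sketch of one possible route: the read.maf documentation states that the `maf` argument may also be an already-read data frame, so assuming the installed version supports that, the table can be passed in directly after checking the mandatory columns. The rows below are made up, and `required_cols` lists the fields maftools documents as mandatory.

```r
# Columns maftools documents as mandatory; extra columns are allowed.
required_cols <- c("Hugo_Symbol", "Chromosome", "Start_Position", "End_Position",
                   "Reference_Allele", "Tumor_Seq_Allele2",
                   "Variant_Classification", "Variant_Type",
                   "Tumor_Sample_Barcode")

# Made-up stand-in for a table loaded with, e.g., read.delim("variants.txt").
maf_df <- data.frame(
  Hugo_Symbol            = c("GENE1", "GENE2"),
  Chromosome             = c("1", "2"),
  Start_Position         = c(1000L, 2000L),
  End_Position           = c(1000L, 2000L),
  Reference_Allele       = c("G", "C"),
  Tumor_Seq_Allele2      = c("A", "T"),
  Variant_Classification = c("Missense_Mutation", "Nonsense_Mutation"),
  Variant_Type           = c("SNP", "SNP"),
  Tumor_Sample_Barcode   = c("Sample1", "Sample2"),
  stringsAsFactors       = FALSE
)

# Fail early with a clear message if a mandatory column is absent.
missing_cols <- setdiff(required_cols, colnames(maf_df))
if (length(missing_cols) > 0) {
  stop("Missing MAF columns: ", paste(missing_cols, collapse = ", "))
}

# Only attempt the conversion when maftools is installed.
if (requireNamespace("maftools", quietly = TRUE)) {
  maf_obj <- maftools::read.maf(maf = maf_df)
}
```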

Double-counting in gene summary calculation

I have been running oncodriveClust in maftools and found that genes with total mutation counts below minMut were sometimes found in the result. I traced this to an issue in the function that calculates the gene-level summary of mutations.

Unless I'm mistaken, the fix is an easy one:

===================================
if(any(colnames(hs.cast) %in% c('Amp', 'Del'))){
  hs.cast.cnv = hs.cast[,colnames(hs.cast)[colnames(hs.cast) %in% c('Amp', 'Del')], with =FALSE]
  hs.cast.cnv$CNV_total = rowSums(x = hs.cast.cnv)

  hs.cast = hs.cast[,!colnames(hs.cast)[colnames(hs.cast) %in% c('Amp', 'Del')], with =FALSE]
  hs.cast[,total:=rowSums(hs.cast[,2:ncol(hs.cast), with = FALSE])]

  hs.cast = cbind(hs.cast, hs.cast.cnv)
  hs.cast = hs.cast[order(total, CNV_total, decreasing = TRUE)]

}else{
  hs.cast[,total:=rowSums(hs.cast[,2:ncol(hs.cast), with = FALSE])]
  hs.cast = hs.cast[order(total, decreasing = TRUE)]
  #the line below needs to be deleted
  hs.cast[,total:=rowSums(hs.cast[,2:ncol(hs.cast), with = FALSE])]
  hs.cast[order(total, decreasing = TRUE)]
}

==================================

#normal functionality

which(s.old[,Hugo_Symbol]=="ADCYAP1R1")
[1] 2259
s.old[2259,]
Hugo_Symbol Frame_Shift_Del Frame_Shift_Ins In_Frame_Del In_Frame_Ins Missense_Mutation Nonsense_Mutation
1: ADCYAP1R1 0 0 0 0 2 0
Nonstop_Mutation Splice_Site Translation_Start_Site total MutatedSamples
1: 0 0 0 4 1

#after change to code

s.f=getGeneSummary(laml.f)
which(s.f[,Hugo_Symbol]=="ADCYAP1R1")
[1] 2259
s.f[2259,]
Hugo_Symbol Frame_Shift_Del Frame_Shift_Ins In_Frame_Del In_Frame_Ins Missense_Mutation Nonsense_Mutation
1: ADCYAP1R1 0 0 0 0 2 0
Nonstop_Mutation Splice_Site Translation_Start_Site total MutatedSamples
1: 0 0 0 2 1

#actual MAF input (I pruned out the latter columns):

grep ADCYAP1R1 in.maf
ADCYAP1R1 0 . GRCh37 chr7 31132339 31132339 + Missense_Mutation SNP A A G novel
ADCYAP1R1 0 . GRCh37 chr7 31132340 31132340 + Missense_Mutation SNP G G C novel
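
The effect described above can be reproduced in a few lines of base R. This is a toy illustration using a plain data frame in place of the data.table inside maftools: once `total` exists, summing "all columns from the second onward" a second time includes `total` itself, which turns 2 into 4 exactly as in the ADCYAP1R1 row.

```r
# Toy gene summary with the same counts as the ADCYAP1R1 example.
hs <- data.frame(Hugo_Symbol       = "ADCYAP1R1",
                 Missense_Mutation = 2,
                 Nonsense_Mutation = 0)

# First pass: the sum covers the count columns only.
hs$total <- rowSums(hs[, 2:ncol(hs)])
first_total <- as.numeric(hs$total)

# Second pass with the same expression: 'total' is now one of the summed columns.
hs$total <- rowSums(hs[, 2:ncol(hs)])
second_total <- as.numeric(hs$total)

cat("after first pass: ", first_total, "\n")
cat("after second pass:", second_total, "\n")
```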

lollipopPlot error

I just installed the latest maftools and got an error:

Error in `[.data.table`(prot.dat, , .N, .(Variant_Classification, conv,  : 
  column or expression 2 of 'by' or 'keyby' is type list. Do not quote column names. Usage: DT[,sum(colC),by=list(colA,month(colB))]
In addition: Warning message:
In lollipopPlot(maf = SCLC.maf, gene = "TP53", AACol = "HGVSp_Short",  :
  NAs introduced by coercion

I think it has to do with the data.table package. It was working before.

Thanks,
Tommy

oncoplot - incorporation of GISTIC region with no SNV

Hi,
I just recently found this package and it's great.
I was wondering, though: the incorporation of GISTIC data in the oncoplot is really useful, as these sorts of sequencing studies will increasingly use tools like GISTIC. However, as it works right now, the oncoplot won't show a CNV in a gene unless a sample has an SNV in it.
For instance, 95% of my samples have a homozygous loss of a particular gene, which then obviously has no SNVs, but I cannot show it on the oncoplot along with the other genes. The genes that have het/hom losses and SNVs will show up.
Is this possible to implement somehow?
Thanks for your great work.
Sophie

plotmafSummary error ( j out of bounds )

Hi, All

When I use maftools with a MAF file converted from a MuTect VCF (via vcf2maf), I get the error below.

laml2 = read.maf(maf = '~/Downloads/TN1508R0520-1_TN1508R0519-1.maf', removeSilent = T, useAll = T)

TN1508R0520-1_TN1508R0519-1.maf.zip

plotmafSummary(maf = laml2,  addStat = 'median', dashboard = TRUE)
Error in `[.data.table`(vt.plot.dat, , c(2:4), with = FALSE) : 
  j out of bounds
> laml2
An object of class  MAF 
                       ID summary Mean Median
1:             NCBI_Build  GRCh37   NA     NA
2:                 Center       .   NA     NA
3:                Samples       1   NA     NA
4:                 nGenes     493   NA     NA
5:      Missense_Mutation     507  507    507
6:      Nonsense_Mutation      22   22     22
7:            Splice_Site       1    1      1
8: Translation_Start_Site       2    2      2
9:                  total     532  532    532

Can you check it please?

rainfallPlot ignore mutations in X and Y chromosomes?

Hi,

I used rainfallPlot(), but the resulting figure does not show any mutation points on the X and Y chromosomes. Do you filter those out first?

[screenshot: rainfall plot]

Although 4 change points were detected, I do not see any clustered mutations on the four arrows.

Thanks,
Tommy

plotSignatures issue

Hi,
I tried the extractSignatures function and found that the result is not consistent with the plotted figure.
The output shows that Signature_3 and Signature_4 are both most similar to validated Signature_4; however, in the picture they look different. The same applies to Signature_1 and Signature_2.
Could you help check whether I am doing something wrong?

> laml.tnm = trinucleotideMatrix(maf = laml, ref_genome = 'E:/ucsc.hg19.fasta', 
+                                prefix = 'chr', add = TRUE, ignoreChr = 'chr23', useSyn = TRUE)
reading fasta (this might take few minutes)..
Extracting adjacent bases..
matrix of dimension 87x96
> laml.sign = extractSignatures(mat = laml.tnm, nTry = 6, plotBestFitRes = FALSE)
Estimating best rank..
  method   seed rng metric rank sparseness.basis sparseness.coef      rss      evar silhouette.coef silhouette.basis residuals niter cpu
2 brunet random   2     KL    2        0.4802616       0.5718631 18204.83 0.9573528       1.0000000        1.0000000  4308.167   440  NA
3 brunet random   4     KL    3        0.4520756       0.4341126 14054.08 0.9670764       0.7536191        0.8272047  4035.479   790  NA
4 brunet random   3     KL    4        0.4284171       0.4341540 13504.15 0.9683647       0.5846974        0.7104043  3846.275  1310  NA
5 brunet random   3     KL    5        0.4386150       0.4470788 12955.52 0.9696500       0.4601231        0.6525860  3687.892  1760  NA
6 brunet random   1     KL    6        0.4405360       0.4818809 12114.70 0.9716197       0.3969700        0.5699692  3546.226  2000  NA
  cpu.all nrun cophenetic dispersion silhouette.consensus
2      NA   10  0.9731426  0.9330321            0.9612811
3      NA   10  0.9806881  0.8537614            0.9163375
4      NA   10  0.9710437  0.8224865            0.8598038
5      NA   10  0.9127233  0.6518430            0.5939005
6      NA   10  0.8393772  0.6170696            0.4315838
Using 4 as a best-fit rank based on decreasing cophenetic correlation coefficient.
Comparing against experimentally validated 21 signatures.. (See Alexandrov et.al Nature 2013 for details.)
Found Signature_1 most similar to validated Signature_1B. Correlation coeff: 0.775910702869783 
Found Signature_2 most similar to validated Signature_1B. Correlation coeff: 0.353931858288915 
Found Signature_3 most similar to validated Signature_4. Correlation coeff: 0.488376357569433 
Found Signature_4 most similar to validated Signature_4. Correlation coeff: 0.149722296925071 
> plotSignatures(laml.sign)

[image: plotSignatures output]

hg19 fasta

I was wondering which hg19 FASTA is needed for the mutational plots.

Thanks a lot!

Zayni

lollipopPlot fails when called with a MAF file containing only 1 protein change

Error in createOncoMatrix(maf, chatty = verbose) : object 'gene' not found

I'm calling lollipopPlot via:
laml = read.maf(maf = paste0("../../li_",site,".maf"), removeSilent = T, useAll = F)
l = lollipopPlot(maf = laml, gene = 'TP53', AACol = 'Protein_Change', labelPos = 'all')

The MAF file has only 3 rows: the header, a single data row, and a blank row. Duplicating the data row does the trick but messes up the diagram because it plots the fictitious point.

Gene symbols are converted to date

Great package. However, I found that somehow—I can't figure out where in the code—the read.maf function converts certain gene symbols to dates à la Excel.

url not working in oncotate.R

Hi, fyi, I get an error running the oncotate() function,

fetching annotation from Oncotator. This might take a while..
|========================== | 25%
Error in file(con, "r") : cannot open the connection

and changing line 49 to rec.url = paste('http://portals.broadinstitute.org/oncotator/mutation',rec,sep = '/') worked.

Rainfall plot assumes MAF rows are ordered on start position

I have found that Rainfall plot gets a lot of NA values from the diff function when the chromosomal positions in the original MAF are not ordered. Is it possible that the MAF data could be ordered this way before running the function? I am not sure it is safe to assume the users will always have sorted MAFs.
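
A minimal base R sketch of the suggested ordering step, using a made-up table: positions are sorted within each chromosome before the neighbour distances are taken. The unsorted case shows one way missing values can arise, presumably because negative differences cannot be log-transformed; this is an illustration and not the code inside rainfallPlot.

```r
# Toy MAF-like table with unsorted positions (values are made up).
maf <- data.frame(
  Chromosome       = c("1", "1", "1", "2", "2"),
  Start_Position   = c(5000, 1000, 3000, 800, 200),
  stringsAsFactors = FALSE
)

# Unsorted input: the first difference is negative, so its log10 is NaN.
chr1 <- maf$Start_Position[maf$Chromosome == "1"]
unsorted_gap <- suppressWarnings(log10(diff(chr1)))

# Sort by chromosome, then position, and take distances within each chromosome.
maf <- maf[order(maf$Chromosome, maf$Start_Position), ]
maf$dist <- ave(maf$Start_Position, maf$Chromosome,
                FUN = function(x) c(NA, diff(x)))
print(maf)
```

The first mutation on each chromosome has no predecessor, so its distance is `NA` by construction; all other distances are non-negative after sorting.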

laml.plus.gistic function throw an error

Hi
When I try integrating MAF and GISTIC files I get this error. The command, error and session info are below.

command:
laml.plus.gistic = read.maf(maf = laml, removeSilent = TRUE, useAll = FALSE,
  gisticAllLesionsFile = "C:/Users/akrishan/Desktop/gistic/qvaluerelaxed/1421851/segmentation.all_lesions.conf_90.txt",
  gisticAmpGenesFile = "C:/Users/akrishan/Desktop/gistic/qvaluerelaxed/1421851/segmentation.amp_genes.conf_90.txt",
  gisticDelGenesFile = "C:/Users/akrishan/Desktop/gistic/qvaluerelaxed/1421851/segmentation.del_genes.conf_90.txt",
  isTCGA = TRUE)

error:
reading maf..
Error in as.character.default(x) :
no method for coercing this S4 class to a vector

sessionInfo()
R version 3.3.2 (2016-10-31)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1

locale:
[1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C
[5] LC_TIME=English_United States.1252

attached base packages:
[1] parallel stats graphics grDevices utils datasets methods base

other attached packages:
[1] maftools_1.0.30 Biobase_2.34.0 BiocGenerics_0.20.0 mvtnorm_1.0-5
[5] plyr_1.8.4 BiocInstaller_1.24.0 lazyeval_0.2.0

loaded via a namespace (and not attached):
[1] nlme_3.1-128 bitops_1.0-6 doParallel_1.0.10
[4] RColorBrewer_1.1-2 prabclus_2.2-6 GenomeInfoDb_1.10.1
[7] tools_3.3.2 R6_2.2.0 KernSmooth_2.23-15
[10] DBI_0.5-1 colorspace_1.3-2 trimcluster_0.1-2
[13] nnet_7.3-12 GetoptLong_0.1.5 pkgmaker_0.22
[16] labeling_0.3 rtracklayer_1.34.1 slam_0.1-40
[19] diptest_0.75-7 caTools_1.17.1 scales_0.4.1
[22] DEoptimR_1.0-8 robustbase_0.92-7 NMF_0.20.6
[25] stringr_1.1.0 digest_0.6.10 Rsamtools_1.26.1
[28] cometExactTest_0.1.3 XVector_0.14.0 changepoint_2.2.2
[31] BSgenome_1.42.0 GlobalOptions_0.0.10 RSQLite_1.1-1
[34] shape_1.4.2 zoo_1.7-14 mclust_5.2
[37] BiocParallel_1.8.1 DPpackage_1.1-6 gtools_3.5.0
[40] dendextend_1.3.0 dplyr_0.5.0 VariantAnnotation_1.20.2
[43] RCurl_1.95-4.8 magrittr_1.5 modeltools_0.2-21
[46] wordcloud_2.5 Matrix_1.2-7.1 Rcpp_0.12.8
[49] munsell_0.4.3 S4Vectors_0.12.1 stringi_1.1.2
[52] whisker_0.3-2 MASS_7.3-45 SummarizedExperiment_1.4.0
[55] zlibbioc_1.20.0 flexmix_2.3-13 gplots_3.0.1
[58] grid_3.3.2 gdata_2.17.0 ggrepel_0.6.5
[61] lattice_0.20-34 Biostrings_2.42.1 cowplot_0.7.0
[64] splines_3.3.2 GenomicFeatures_1.26.2 circlize_0.3.9
[67] ComplexHeatmap_1.12.0 GenomicRanges_1.26.1 rjson_0.2.15
[70] fpc_2.1-10 rngtools_1.2.4 reshape2_1.4.2
[73] codetools_0.2-15 biomaRt_2.30.0 stats4_3.3.2
[76] XML_3.98-1.5 data.table_1.10.0 foreach_1.4.3
[79] gtable_0.2.0 kernlab_0.9-25 assertthat_0.1
[82] ggplot2_2.2.0 gridBase_0.4-7 xtable_1.8-2
[85] class_7.3-14 survival_2.40-1 tibble_1.2
[88] iterators_1.0.8 GenomicAlignments_1.10.0 AnnotationDbi_1.36.0
[91] registry_0.3 memoise_1.0.0 IRanges_2.8.1
[94] cluster_2.0.5

Gistic amplifications are consistently dropped

I was having a hard time pulling in Gistic results. Once I found a file that would successfully be parsed, I noticed that none of the expected amplifications were being shown. I came across this line in the readGistic function:

all.lesions.melt = all.lesions.melt[value %in% 1]

I believe that this removes all of the gains. I changed it to:

all.lesions.melt = all.lesions.melt[value %in% c(1,2)]

Now it seems to work.
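
The difference between the two filters can be seen on a toy table in base R. The coding assumed here (0 = no event, 1 = above the low-level threshold, 2 = above the high-level threshold) follows the usual GISTIC2 all_lesions convention; the sample names and values are made up.

```r
# Toy long-format lesion calls (made-up values).
# Assumed coding: 0 = none, 1 = low-level event, 2 = high-level event.
lesions <- data.frame(sample = c("S1", "S2", "S3", "S4"),
                      value  = c(0, 1, 2, 2))

# Original filter: keeps only low-level calls.
kept_old <- lesions[lesions$value %in% 1, ]

# Proposed filter: keeps low-level and high-level calls.
kept_new <- lesions[lesions$value %in% c(1, 2), ]

print(kept_old)
print(kept_new)
```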

oncoplot writes on existing plot

Using maftools version 1.0.40.

When I call oncoplot, it does not clear an existing plot window, but instead draws on top of the existing material. I suspect that the problem arises somewhere in the use of ComplexHeatmap, but I haven't been able to track down exactly where.

writeup/DOI

This is a great toolkit you have assembled. Are you planning to write it up, deposit a preprint, submit to BioC, or get a DOI for your repository? It is not a trivial amount of work and I would like to cite you for it.

How to use subsetMaf to subset to Variant_Classification 5'UTR? (i.e. escape single quote)

I am not sure how to escape the single quote after the number 5 in the term 5'UTR correctly, so that it works in the function subsetMaf. This same problem will also apply to 3'UTR, 3'Flank and 5'Flank.

Here is what I tried so far:

laml.maf = system.file('extdata', 'tcga_laml.maf.gz', package = 'maftools')
laml = read.maf(maf = laml.maf, removeSilent = TRUE, useAll = FALSE)
subsetMaf(maf = laml, includeSyn = T, query='''Variant_Classification == "5'UTR"''')
subsetMaf(maf = laml, includeSyn = T, query='Variant_Classification == "5''UTR"')
subsetMaf(maf = laml, includeSyn = T, query='Variant_Classification == "5'''UTR"')

However, these lines all result in:
Error: unexpected string constant in "subsetMaf(maf = laml, includeSyn = T, query='''Variant_Classification == "5'"

When I instead remove the single quote from 5'UTR,
subsetMaf(maf = laml, includeSyn = T, query='Variant_Classification == "5UTR"')
I do not get an error message, but an empty output (as no Variant classification 5UTR is known).

I created a small test file, where I replaced all 5'UTR by 5UTR, and now the previous line of code works fine, resulting in the expected number of observations.

However, I am sure that there is some smart way to escape the quote character?
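
In R source code a quote character inside a string is escaped with a backslash, so either the apostrophe or the surrounding double quotes can be escaped. The sketch below builds the query both ways and checks it against a made-up table in base R. It assumes that subsetMaf evaluates the query text as an R expression on the MAF table, which is why the toy evaluation uses `eval(parse(...))`.

```r
# Two equivalent ways to write the query string in R source code.
q1 <- 'Variant_Classification == "5\'UTR"'   # escape the apostrophe
q2 <- "Variant_Classification == \"5'UTR\""  # or escape the double quotes
stopifnot(identical(q1, q2))
cat(q1, "\n")

# Toy table standing in for the MAF data (values are made up).
toy <- data.frame(
  Hugo_Symbol            = c("TP53", "FLT3", "DNMT3A"),
  Variant_Classification = c("5'UTR", "Missense_Mutation", "5'UTR"),
  stringsAsFactors       = FALSE
)

# Evaluate the query text as an R expression against the table's columns.
hits <- toy[eval(parse(text = q1), envir = toy), ]
print(hits)

# With maftools loaded, the same string would be passed as, for example:
# subsetMaf(maf = laml, includeSyn = TRUE, query = q1)
```

The same escaping applies to 3'UTR, 3'Flank and 5'Flank.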

Segfault when loading Gistic results

I am unable to successfully load a set of Gistic results along with a MAF file for a cohort of ~100 patients. Are there any known bugs or things I may not be doing correctly that could cause this?

reading maf..
NOTE: Removed 1 duplicated variants
Using all variants.
Excluding 18821 silent variants.
ID N
1: Samples 95
2: 3'Flank 720
3: 3'UTR 615
4: 5'Flank 1256
5: 5'UTR 494
6: IGR 182
7: Intron 9459
8: RNA 1212
9: Silent 4882
10: Targeted_Region 1
Processing Gistic files..

*** caught segfault ***
address 0x7c1, cause 'memory not mapped'

Traceback:
1: rbindlist(l, use.names, fill, idcol)
2: data.table::.rbind.data.table(...)
3: rbind(deparse.level, ...)
4: rbind(ampGenes, delGenes)
5: readGistic(gisticAllLesionsFile = gisticAllLesionsFile, gisticAmpGenesFile = gisticAmpGenesFile, gisticDelGenesFile = gisticDelGenesFile, isTCGA = isTCGA)
6: read.maf(maf = "tcga_lohr_namefix_cut.maf", useAll = TRUE, gisticAmpGenesFile ="amp_genes.conf_75.txt", gisticDelGenesFile = "del_genes.conf_75.txt", gisticAllLesionsFile = "all_lesions.conf_75.txt")

Possible actions:
1: abort (with core dump, if enabled)
2: normal R exit
3: exit R without saving workspace
4: exit R saving workspace

subsetMaf (and depending functions?) not working on data imported with annovarToMaf

Description
subsetMaf() behaves oddly or throws an error when applied to a maf object created using annovarToMaf(). Other functions like oncoplot, rainfallplot etc. behave oddly too, probably because the Tumor_Sample_Barcode column is populated with gene names.

Example

generate maf object from provided annovar file

var.annovar <- system.file("extdata", "variants.hg19_multianno.txt", package = "maftools")
var.annovar.maf <- annovarToMaf(annovar = var.annovar, Center = 'CSI-NUS', refBuild = 'hg19', tsbCol = 'Tumor_Sample_Barcode', table = 'ensGene', header = TRUE)

Subsetting results in gene names as Tumor_Sample_Barcode; gives error if mafObj=TRUE:

subsetMaf(maf = var.annovar.maf, query = "Variant_Classification == 'Missense_Mutation'")
#the above works, but gene names appear in Tumor_Sample_Barcode column

subsetMaf(maf = var.annovar.maf, query = "Variant_Classification == 'Missense_Mutation'", mafObj=TRUE)

Creating oncomatrix (this might take a while)..
Error in createOncoMatrix(maf.dat) : object 'gene' not found

When maf object is generated from a maf file, everything seems fine:

laml.input <- system.file("extdata", "tcga_laml.maf.gz", package = "maftools")
laml <- read.maf(maf = laml.input, useAll = FALSE)

Subsetting works in this case:

subsetMaf(maf = laml, query = "Variant_Classification == 'Missense_Mutation'")
subsetMaf(maf = laml, query = "Variant_Classification == 'Missense_Mutation'", mafObj=TRUE)

Further Info:

mafTools was installed from git.

sessionInfo()
R Under development (unstable) (2016-04-26 r70550)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 14.04.4 LTS

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] parallel  stats     graphics  grDevices utils     datasets  methods  
[8] base     

other attached packages:
 [1] RColorBrewer_1.1-2   maftools_0.99.45     devtools_1.12.0     
 [4] BiocInstaller_1.23.9 NMF_0.20.6           cluster_2.0.4       
 [7] rngtools_1.2.4       pkgmaker_0.22        registry_0.3        
[10] Biobase_2.33.3       BiocGenerics_0.19.2 

loaded via a namespace (and not attached):
 [1] nlme_3.1-128                bitops_1.0-6               
 [3] httr_1.2.1                  doParallel_1.0.10          
 [5] prabclus_2.2-6              GenomeInfoDb_1.9.8         
 [7] tools_3.4.0                 R6_2.1.3                   
 [9] KernSmooth_2.23-15          DBI_0.5                    
[11] colorspace_1.2-6            trimcluster_0.1-2          
[13] nnet_7.3-12                 GetoptLong_0.1.4           
[15] withr_1.0.2                 curl_1.2                   
[17] git2r_0.15.0                chron_2.3-47
[19] rtracklayer_1.33.12         labeling_0.3               
[21] slam_0.1-38                 diptest_0.75-7             
[23] caTools_1.17.1              scales_0.4.0               
[25] DEoptimR_1.0-6              mvtnorm_1.0-5              
[27] robustbase_0.92-6           stringr_1.1.0              
[29] digest_0.6.10               Rsamtools_1.25.1           
[31] cometExactTest_0.1.3        XVector_0.13.7             
[33] changepoint_2.2.1           BSgenome_1.41.2            
[35] GlobalOptions_0.0.10        RSQLite_1.0.0              
[37] shape_1.4.2                 zoo_1.7-13                 
[39] mclust_5.2                  BiocParallel_1.7.8         
[41] DPpackage_1.1-6             gtools_3.5.0               
[43] dendextend_1.3.0            dplyr_0.5.0                
[45] VariantAnnotation_1.19.10   RCurl_1.95-4.8             
[47] magrittr_1.5                modeltools_0.2-21          
[49] wordcloud_2.5               Matrix_1.2-7.1             
[51] Rcpp_0.12.7                 munsell_0.4.3              
[53] S4Vectors_0.11.13           stringi_1.1.1              
[55] whisker_0.3-2               MASS_7.3-45                
[57] SummarizedExperiment_1.3.82 zlibbioc_1.19.0            
[59] flexmix_2.3-13              gplots_3.0.1               
[61] plyr_1.8.4                  grid_3.4.0                 
[63] gdata_2.17.0                ggrepel_0.5                
[65] lattice_0.20-33             Biostrings_2.41.4          
[67] cowplot_0.6.2               splines_3.4.0              
[69] GenomicFeatures_1.25.16     circlize_0.3.8             
[71] ComplexHeatmap_1.11.6       GenomicRanges_1.25.93      
[73] rjson_0.2.15                fpc_2.1-10                 
[75] reshape2_1.4.1              codetools_0.2-14           
[77] biomaRt_2.29.2              stats4_3.4.0               
[79] XML_3.98-1.4                data.table_1.9.6           
[81] foreach_1.4.3               gtable_0.2.0               
[83] kernlab_0.9-24              assertthat_0.1             
[85] ggplot2_2.1.0               gridBase_0.4-7             
[87] xtable_1.8-2                class_7.3-14               
[89] survival_2.39-5             tibble_1.2                 
[91] iterators_1.0.8             memoise_1.0.0              
[93] GenomicAlignments_1.9.6     AnnotationDbi_1.35.4       
[95] IRanges_2.7.15

Thanks for looking into it!

Can't plot certain genes only

I only want to plot the significantly mutated genes with oncoplot.

This is what I type:
oncoplot(maf = mm, genes = c("NRAS", "KRAS", "BRAF", "DIS3", "TP53", "ACTG1", "TRAF3"))

And this is the error message I get:
Error in numMat[geneOrder[geneOrder %in% rownames(numMat)], ] :
subscript out of bounds

File uploading

Hi,
I am getting errors this morning. I am a Mac user.

reading maf..
Error in data.table::fread(input = maf, sep = "\t", stringsAsFactors = FALSE, :
Input is either empty or fully whitespace after the skip or autostart. Run again with verbose=TRUE.
It was working well yesterday, and I am not sure what is going on.
Can you please help?

SKCM MAF from GDC can not be read

Hi,

Looks like a really nice package. With the example file it works like a charm. My only problem is that when I try to look at the SKCM data available from GDC, I'm not able to read the maf file.
The following error occurs:
reading maf..
Error in read.table(file = file, header = header, sep = sep, quote = quote, :
more columns than column names

Maybe it is a trivial problem, but being a novice to R I wasn't able to solve it. Any suggestions on how to load the file?
Thanks
David

plotmafSummary insists on reusing old plot windows

Using maftools 1.0.40 with R 3.3.0 and Bioconductor 3.4 on a Windows machine. (All of Bioconductor has just been upgraded to the latest version.)

I have two different MAF files, and want to compare the results of plotmafSummary in side-by-side windows. However, even if I call "windows()" to initialize a new device, it insists on plotting the second version in the first window (even though it was marked as "inactive" when the command was issued). That is, you get this behavior:

plotmafSummary(maf2) # appears in dev 2
windows() # creates new dev 3
plotmafSummary(maf3) # still appears in dev 2

You can work around this by creating a bunch of windows and killing the lower-numbered ones, which may hint at the source of the problem. In other words, if you use the following commands, you can display three different MAF files:

windows() # open dev 2
windows() # dev 3
windows() # dev 4
# manually kill devices 2 and 3
plotmafSummary(maf4) # shows in dev4
windows() # new dev 2
windows() # new dev 3
# manually kill dev 2
plotmafSummary(maf3) # shows in dev 3
windows() # new dev 2
plotmafSummary(maf2) # shows in dev 2

Anything else, however, and subsequent calls to plotmafSummary appear in the window with the lowest device number.

lollipopPlot list error

maftools has been working great, but I just tried to recreate a previous lollipopPlot using the same code and got this error:

Error: (list) object cannot be coerced to type 'double'

Any suggestions? Thanks!

Error in readGistic

library(maftools)
r2=readGistic(gisticAllLesionsFile='all_lesions.conf_99.txt', gisticAmpGenesFile='amp_genes.conf_99.txt', gisticDelGenesFile='del_genes.conf_99.txt')
Processing Gistic files..

*** caught segfault ***
address 0x0, cause 'memory not mapped'

Traceback:
1: rbindlist(l, use.names, fill, idcol)
2: data.table::.rbind.data.table(...)
3: rbind(deparse.level, ...)
4: rbind(ampGenes, delGenes)
5: readGistic(gisticAllLesionsFile = "all_lesions.conf_99.txt", gisticAmpGenesFile = "amp_genes.conf_99.txt", gisticDelGenesFile = "del_genes.conf_99.txt")

VCF to MAF conversion?

Hi,

This is a great tool and I found it very useful.
Most of my variant calls are in VCF format, so to use this tool I have to convert them to MAF first.
Is it possible to do this conversion inside the tool?
For now, I am thinking of using vcf2maf from Cyriac.

Thanks,
Ming

Error when plotting "plotmafSummary"

Hi,
I can successfully plot maf summary plots using your demo data. But when I created my own plot, I got an error:
> plotmafSummary(maf = laml, rmOutlier = TRUE, addStat = 'median', dashboard = TRUE)

Error in `[.data.table`(vt.plot.dat, , c(2:4), with = FALSE) :
j out of bounds
Can you figure out what's the problem?
Thanks!

After subsetMaf, error in oncoplot

laml = read.maf(maf = maf_file, removeSilent = T, useAll = T)
oncoplot(maf = laml, top = 10)
dev.off()
null device 
          1 

Then I tried subsetting:

laml = subsetMaf(maf = laml, genes = target_gene_list, mafObj = TRUE)
Creating oncomatrix (this might take a while)..
Sorting..
oncoplot(maf = laml, top = 10)
Error in check.length("fill") : 
  'gpar' element 'fill' must not be length 0

Can you check this please?

Problem converting annovar format to maf format using annovarToMaf

Hi,

I have annovar format files and would like to convert them to maf format. I'm having trouble converting them. It says Tumor_Sample_Barcode not found, though I have added it in my file.

Here is the code I used and the error I get.

# Reading the annovar file
ann.file <- read.delim("Annovar/sample.hg19_multianno.txt")
ann.file$Tumor_Sample_Barcode <- "Sample_Test" # add tumor sample column
write.table(ann.file, file="ann_test.txt", row.names = FALSE, 
            sep = "\t", quote = FALSE)

> colnames(ann.file)
 [1] "Chr"                  "Start"                "End"                 
 [4] "Ref"                  "Alt"                  "Func.refGene"        
 [7] "Gene.refGene"         "GeneDetail.refGene"   "ExonicFunc.refGene"  
[10] "AAChange.refGene"     "X1000g2015aug_all"    "SIFT_score"          
[13] "SIFT_pred"            "Polyphen2_HVAR_score" "Polyphen2_HVAR_pred" 
[16] "cosmic70"             "esp6500siv2_all"      "snp138"              
[19] "Tumor_Sample_Barcode"

# The column 'Tumor_Sample_Barcode' exists
> 'Tumor_Sample_Barcode' %in% colnames(ann.file) 
[1] TRUE

> var.annovar.maf = annovarToMaf(annovar = "ann_test.txt", 
                               Center = 'Pitt', refBuild = 'hg19', 
                               tsbCol = "Tumor_Sample_Barcode",
                               header = TRUE)
Error in `[.data.table`(ann, , ann.mand, with = FALSE) : 
  column(s) not found: Tumor_Sample_Barcode

Can you give me suggestions on how to proceed?

Thanks,
Anish.

Package not loaded error in function extractSignatures()

Hi.
I have installed the package from github, and I get the following error:

maf_sig = extractSignatures(mat = maf.tnm, nTry = 6, plotBestFitRes = FALSE)
Estimating best rank..
Timing stopped at: 0.001 0 0.001
Timing stopped at: 0.001 0.001 0.002
Timing stopped at: 0.001 0 0.002
Timing stopped at: 0.001 0.001 0.001
Timing stopped at: 0.003 0.001 0.005
Error in (function (...) : All the runs produced an error:
-#1 [r=2] -> none of the packages are loaded [in call to 'path.package']
-#2 [r=3] -> none of the packages are loaded [in call to 'path.package']
-#3 [r=4] -> none of the packages are loaded [in call to 'path.package']
-#4 [r=5] -> none of the packages are loaded [in call to 'path.package']
-#5 [r=6] -> none of the packages are loaded [in call to 'path.package']

Error in lollipopPlot

Hi,
I am getting an error when trying to use the lollipopPlot function for certain genes. It is working well for some genes.
Here is the command :
FOXO1.lpop = lollipopPlot(maf = laml, gene = 'FOXO1', AACol = 'amino_acid_change',labelPos="all")
Error in FUN(X[[i]], ...) : subscript out of bounds

Thanks for your help,
Valentine

error in extractSignatures: none of the packages are loaded

Hi,

I am using the latest version of maftools on github.

SCLC.sig = extractSignatures(mat = SCLC.tnm, nTry = 6, plotBestFitRes = FALSE)
Estimating best rank..
Timing stopped at: 0.002 0 0.003 
Timing stopped at: 0.001 0 0.002 
Timing stopped at: 0.001 0.001 0.001 
Timing stopped at: 0.002 0 0.001 
Timing stopped at: 0.001 0 0.002 
Error in (function (...)  : All the runs produced an error:
	-#1 [r=2] -> none of the packages are loaded [in call to 'path.package']
	-#2 [r=3] -> none of the packages are loaded [in call to 'path.package']
	-#3 [r=4] -> none of the packages are loaded [in call to 'path.package']
	-#4 [r=5] -> none of the packages are loaded [in call to 'path.package']
	-#5 [r=6] -> none of the packages are loaded [in call to 'path.package']

What packages am I missing?

Thanks.
Tommy

A little bug in line 239 in summarizeMaf.R

In your code: colnames(mdf) = gsub(pattern = "^X", replacement = "", colnames(mdf))
This assumes that sample names are numeric, in which case R prepends an "X" to each sample name.
But if a sample name genuinely begins with "X", that "X" is removed as well, which causes an error like the one below:
oncomat.copy[, colnames(mdf)] : subscript out of bounds

For now, the only workaround is to rename our samples.
Could you please fix this bug?
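
To illustrate the problem, here is a minimal sketch in base R. The sample names are hypothetical, and both alternatives are only suggestions: how mdf is actually built in summarizeMaf.R is not shown here, so whether check.names applies is an assumption.

```r
# Hypothetical sample names: two produced by R's name mangling of numeric IDs,
# and two that genuinely start with "X".
sample_ids <- c("X1001", "X2002", "XP_tumor", "X_ray_01")

# Pattern quoted in the report: strips every leading "X".
stripped_all <- gsub(pattern = "^X", replacement = "", sample_ids)
print(stripped_all)

# Narrower pattern: strip "X" only when a digit follows (Perl lookahead).
# A sample truly named "X1001" would still be altered by this.
stripped_digit <- gsub(pattern = "^X(?=[0-9])", replacement = "",
                       sample_ids, perl = TRUE)
print(stripped_digit)

# Avoiding the mangling in the first place: data.frame() only prepends "X"
# to numeric-looking column names when check.names = TRUE (the default).
m <- matrix(0, nrow = 1, ncol = 2, dimnames = list(NULL, c("1001", "X77")))
mangled <- colnames(data.frame(m))
kept <- colnames(data.frame(m, check.names = FALSE))
print(mangled)
print(kept)
```

With check.names = FALSE no "X" is added, so there would be nothing to strip afterwards and names that start with "X" would stay intact.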
