Project-specific workflow for EPIC methylation analysis included in the publication LINK.
All analysis was performed in R [1] (version 3.6.3). R functions are referenced in the format packageName::functionName().
Methylation data were generated using the Infinium MethylationEPIC array. The samples included in this study were a subset of a larger cohort of 96 samples that were processed collectively. EPIC methylation array IDAT files for all 96 samples were loaded jointly using the minfi package (version 1.32.0) [2]. Sample quality and probe position quality were both assessed using the minfi::detectionP() function, and samples or positions with estimated detection p-values equal to or greater than 0.05 were excluded. The full set of 96 samples was normalised collectively using the minfi::preprocessFunnorm() method, which regresses out known variability detected in the control probes. Probes overlapping SNP positions (MAF >= 0.05) were excluded using the minfi::dropLociWithSnps() function. Probes on non-autosomal chromosomes were also excluded, as they were not considered necessary for the analysis and may have led to unintended stratification of samples. Known cross-reactive probes were excluded by probe name using the list generated by Pidsley et al. [3]. After sample and probe exclusion, the subset of samples corresponding to the PCC/PGL samples described in the publication was extracted for analysis.
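The detection p-value filter described above can be illustrated with a toy matrix. This is a minimal base R sketch, not the project's code. The matrix detP is a made-up stand-in for the output of minfi::detectionP() (probes in rows, samples in columns). The two rules used here are assumptions, because the text does not state how p-values were summarised: a sample is dropped when its mean detection p-value is >= 0.05, and a probe is dropped when it fails in any retained sample.

```r
# Toy stand-in for the matrix returned by minfi::detectionP():
# rows are probes, columns are samples (all values are invented).
detP <- matrix(
  c(0.001, 0.002, 0.200,
    0.010, 0.060, 0.300,
    0.003, 0.001, 0.150,
    0.020, 0.004, 0.400),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("cg", 1:4), paste0("sample", 1:3))
)

threshold <- 0.05

# Assumed sample rule: keep samples whose mean detection p-value
# is below the threshold.
keep_samples <- colMeans(detP) < threshold

# Assumed probe rule: keep probes that pass in every retained sample.
keep_probes <- rowSums(detP[, keep_samples, drop = FALSE] >= threshold) == 0

detP_filtered <- detP[keep_probes, keep_samples, drop = FALSE]
print(dimnames(detP_filtered))
```

The same logical vectors could then be used to subset the loaded array object before normalisation.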
Probe beta values were calculated using the minfi::getBeta() function, which extracts beta values as Beta = Meth / (Meth + Unmeth + offset), where the offset is used to avoid division by small numerical values. Mean beta values for each sample were then computed across all probes included in a given method (full probe set, top 20% by variance, etc.) and compared between wild-type and mutant PCC/PPGL samples using a Mann-Whitney U test (stats::wilcox.test() function).
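As a worked illustration of the beta formula and the group comparison, the sketch below uses invented intensities in base R. The matrices meth and unmeth, the group labels, and offset = 100 are all assumptions made for the example; the offset used in the actual analysis is not stated here.

```r
# Toy methylated / unmethylated intensities (probes x samples); values are invented.
sample_ids <- paste0("s", 1:4)
probe_ids  <- paste0("cg", 1:3)

meth <- matrix(c(900, 800, 100, 200,
                 700, 850, 300, 150,
                 850, 750, 150, 250),
               nrow = 3, byrow = TRUE,
               dimnames = list(probe_ids, sample_ids))
unmeth <- matrix(c(100, 200, 900, 800,
                   300, 150, 700, 850,
                   150, 250, 850, 750),
                 nrow = 3, byrow = TRUE,
                 dimnames = list(probe_ids, sample_ids))

offset <- 100  # assumed value, for illustration only

# Beta = Meth / (Meth + Unmeth + offset), computed element-wise.
beta <- meth / (meth + unmeth + offset)

# Mean beta per sample across all probes in the chosen probe set.
mean_beta <- colMeans(beta)

# Hypothetical group labels: two mutant and two wild-type samples.
group <- factor(c("mutant", "mutant", "wildtype", "wildtype"))

# Mann-Whitney U test (Wilcoxon rank-sum test) on per-sample mean beta.
res <- wilcox.test(mean_beta ~ group)
print(mean_beta)
print(res$p.value)
```

With only two samples per group the test cannot reach conventional significance; the example is meant to show the mechanics only.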
Gene locus and promoter CpG island data for the KDM4A, KDM4B, and KDM4C loci were downloaded from the UCSC genome track browser [4]. Regions were intersected with the methylation data using the GenomicRanges package (version 1.38.0) [6], and beta values were compared between wild-type and mutant PCC/PPGL samples at CpG islands spanning the gene loci and at CpG islands attributable to the promoter region (2 kb upstream of the start codon), using a Mann-Whitney U test (stats::wilcox.test() function).
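The intersection step can be sketched with GenomicRanges::findOverlaps(). The coordinates, probe identifiers, and island names below are hypothetical and do not correspond to the real KDM4A/B/C annotation; the sketch only shows how probes might be assigned to regions.

```r
library(GenomicRanges)

# Hypothetical probe positions (single-base ranges; coordinates are invented).
probes <- GRanges(
  seqnames = "chr1",
  ranges   = IRanges(start = c(1000, 1500, 5000, 9000), width = 1),
  probe_id = paste0("cg", 1:4)
)

# Hypothetical CpG island regions.
islands <- GRanges(
  seqnames  = "chr1",
  ranges    = IRanges(start = c(900, 4800), end = c(1600, 5200)),
  island_id = c("island_A", "island_B")
)

# Find which probes fall inside which island.
hits <- findOverlaps(probes, islands)

probe_to_island <- data.frame(
  probe  = probes$probe_id[queryHits(hits)],
  island = islands$island_id[subjectHits(hits)],
  stringsAsFactors = FALSE
)
print(probe_to_island)
```

Beta values for the probes mapped to each island could then be passed to stats::wilcox.test() as in the whole-array comparison.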
Differentially methylated region analysis was performed using the bumphunter package (version 1.28.0) [5]. Differential methylation was tested between wild-type and mutant PCC/PPGL samples using a methylation cutoff of 0.2 and 100 permutations. Identified bumps were intersected with annotated gene loci using the bumphunter::matchGenes() function, with gene annotation data extracted from the TxDb.Hsapiens.UCSC.hg19.knownGene package (version 3.2.2) [7].
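A bump-hunting call with the cutoff and permutation count mentioned above might look like the following. This sketch runs on the simulated data returned by bumphunter::dummyData(), not on the study's methylation data, so the cutoff of 0.2 applies to simulated coefficients rather than to beta-value differences. The other arguments (coef = 2, nullMethod = "permutation") are assumptions, since the text does not list them.

```r
library(bumphunter)

set.seed(1)

# Simulated example data shipped with bumphunter (not real methylation data).
dat <- dummyData()

# cutoff = 0.2 and B = 100 mirror the values quoted in the text;
# coef = 2 assumes the group effect is the second design-matrix column.
bumps <- bumphunter(dat$mat,
                    design     = dat$design,
                    chr        = dat$chr,
                    pos        = dat$pos,
                    cluster    = dat$cluster,
                    coef       = 2,
                    cutoff     = 0.2,
                    B          = 100,
                    nullMethod = "permutation",
                    verbose    = FALSE)

# Candidate regions with their permutation-based p-values.
print(head(bumps$table))
```

The resulting table of regions is what would be annotated against gene loci in the next step.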
Correlation between mean beta values and empirically determined succinate concentrations was assessed using Spearman's rank correlation test (stats::cor.test() function).
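A minimal sketch of the correlation test follows, using invented numbers in place of the measured values. The vectors mean_beta and succinate are hypothetical and carry no relation to the study's data.

```r
# Invented per-sample values, for illustration only.
mean_beta <- c(0.42, 0.47, 0.51, 0.55, 0.61, 0.58)
succinate <- c(1.2, 2.4, 1.9, 3.8, 5.1, 4.0)

# Spearman's rank correlation test between the two variables.
res <- cor.test(mean_beta, succinate, method = "spearman")

print(res$estimate)  # Spearman's rho
print(res$p.value)
```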
To be added on data archive submission.
- [1] R Project for Statistical Computing
- [2] Minfi: a flexible and comprehensive Bioconductor package for the analysis of Infinium DNA methylation microarrays
- [3] Critical evaluation of the Illumina MethylationEPIC BeadChip microarray for whole-genome DNA methylation profiling
- [4] Track data hubs enable visualization of user-defined genome-wide annotations on the UCSC Genome Browser
- [5] Bump hunting to identify differentially methylated regions in epigenetic epidemiology studies
- [6] Software for Computing and Annotating Genomic Ranges
- [7] Annotation package for TxDb object(s)