Tumor shape is a key factor that affects tumor growth and metastasis. We propose a topological feature computed by persistent homology to characterize tumor progression from digital pathology and radiology images, and we examine its effect on time-to-event data. Two case studies are conducted using lung cancer pathology imaging data from the National Lung Screening Trial (NLST) and brain tumor radiology imaging data from The Cancer Imaging Archive (TCIA). The results of both studies show that the topological features predict survival prognosis after adjusting for clinical variables, and the predicted high-risk groups have worse survival outcomes than the low-risk groups. In addition, the topological shape features found to be positively associated with survival hazards are irregular and heterogeneous shape patterns, which are known to be related to tumor progression.

Persistent homology over the cubical complex is computed with GUDHI. The persistence diagram txt files have three columns: dimension, birth, and death.
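As an illustration of the three-column layout, the following is a minimal sketch of how such a persistence diagram file might be read in Python. It assumes the columns are separated by whitespace or commas and that an infinite death time is written as `inf`; neither detail is confirmed by the repository. The function names `parse_persistence_diagram` and `persistence` are hypothetical helpers, not part of the provided scripts.

```python
from collections import defaultdict


def parse_persistence_diagram(text):
    """Parse 'dimension birth death' rows into {dimension: [(birth, death), ...]}.

    Assumption: columns are separated by whitespace or commas.
    """
    diagram = defaultdict(list)
    for line in text.splitlines():
        fields = line.replace(",", " ").split()
        if len(fields) != 3:
            continue  # skip blank or malformed lines
        try:
            dim = int(float(fields[0]))
            birth, death = float(fields[1]), float(fields[2])
        except ValueError:
            continue  # skip a header row, if the file has one
        diagram[dim].append((birth, death))
    return dict(diagram)


def persistence(pairs):
    """Lifetime (death - birth) of each feature in a list of (birth, death) pairs."""
    return [death - birth for birth, death in pairs]


# Made-up example content, not taken from the repository's data.
example = """\
0 -3.0 inf
0 -2.0 -1.0
1 -1.0 1.5
"""
diagram = parse_persistence_diagram(example)
print(diagram[1], persistence(diagram[1]))
```

For a file on disk, the same function could be applied to the file's contents, for example `parse_persistence_diagram(open(path).read())` with `path` pointing at one of the txt files.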
simulation_1_data_generation.ipynb
: Generate binary tumor images, apply the SEDT-2 transform, and compute persistent homology for scenario 1
simulation_1_result.R
: Fit the Cox and the functional Cox proportional-hazards models and draw summary plots for scenario 1
simulation_2_data_generation.ipynb
: Generate binary tumor images, apply the SEDT-2 transform, and compute persistent homology for scenario 2
simulation_2_result.R
: Fit the Cox and the functional Cox proportional-hazards models and draw summary plots for scenario 2
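To make the distance-transform step concrete, here is a minimal brute-force sketch of a signed Euclidean distance transform (SEDT) on a tiny binary image. The sign convention (negative inside the tumor, positive outside) is an assumption for illustration, and the notebooks may implement the transform differently. The helper `signed_edt` is hypothetical; for real images, a library routine such as `scipy.ndimage.distance_transform_edt` would be far more efficient than this quadratic-time loop.

```python
import math


def signed_edt(mask):
    """Brute-force signed Euclidean distance transform of a small binary image.

    mask: list of rows, with 1 for a tumor pixel and 0 for a background pixel.
    Tumor pixels receive minus the distance to the nearest background pixel;
    background pixels receive plus the distance to the nearest tumor pixel.
    The sign convention is an assumption made for this sketch.
    """
    rows, cols = len(mask), len(mask[0])
    tumor = [(r, c) for r in range(rows) for c in range(cols) if mask[r][c]]
    background = [(r, c) for r in range(rows) for c in range(cols) if not mask[r][c]]
    if not tumor or not background:
        raise ValueError("mask must contain both tumor and background pixels")

    def nearest(r, c, targets):
        # Euclidean distance from pixel (r, c) to the closest target pixel
        return min(math.hypot(r - tr, c - tc) for tr, tc in targets)

    return [
        [
            -nearest(r, c, background) if mask[r][c] else nearest(r, c, tumor)
            for c in range(cols)
        ]
        for r in range(rows)
    ]


# A 3x3 square "tumor" surrounded by one ring of background pixels.
mask = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
sedt = signed_edt(mask)
for row in sedt:
    print(["%.2f" % value for value in row])
```

The resulting grid of signed distances is the kind of grayscale image over which a cubical-complex filtration can then be built.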
- For the NLST dataset, the raw imaging data are available from NLST. Users must fill out a data request form and apply for permission to download the data. We provide only three example datasets.
- The distance transform data are too large to include in full, so only three examples from the NLST dataset are included in the image subfolder.
lung_sedt3_persistent_homology.ipynb
: Generate binary tumor images, apply the SEDT-3 transform, and compute persistent homology for the NLST lung cancer pathology images
lung_fcoxph_functions.R
: Functions for the Cox and the functional Cox proportional-hazards models for the NLST lung cancer pathology images
lung_fcoxph_main.R
: Fit the Cox and the functional Cox proportional-hazards models and draw summary plots for the NLST lung cancer pathology images
lung_cv_mdw_function.R
: Functions for finding smoothing parameters for the maximum distance weight using cross-validation
lung_cv_linear_function.R
: Functions for finding smoothing parameters for the linear weight using cross-validation
lung_cv_pwgk_function.R
: Functions for finding smoothing parameters for the persistence weighted Gaussian kernel weight using cross-validation
lung_cv_main.R
: Find smoothing parameters under the three weights
clinical_info_lung.Rdata
: Clinical data for the NLST lung cancer patients
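For readers unfamiliar with weighting persistence diagram points, the sketch below shows two weight forms that are common in the literature: a piecewise-linear weight in persistence (as used for persistence images) and the arctan weight of the persistence weighted Gaussian kernel, `arctan(C * pers**p)`. These are illustrative assumptions and are not necessarily the exact parameterizations used in the R scripts; the maximum distance weight is not shown because its definition is not given here. The function names and the parameters `C` and `p` are hypothetical.

```python
import math


def linear_weight(persistence, max_persistence):
    """Piecewise-linear weight: 0 at zero persistence, 1 at max_persistence and beyond."""
    if max_persistence <= 0:
        raise ValueError("max_persistence must be positive")
    return min(max(persistence / max_persistence, 0.0), 1.0)


def pwgk_weight(persistence, C=1.0, p=1.0):
    """Arctan weight arctan(C * persistence**p); C and p are tuning parameters."""
    return math.atan(C * persistence ** p)


# Made-up (birth, death) pairs for illustration only.
pairs = [(-2.0, -1.0), (-1.0, 1.5), (0.0, 0.1)]
pers = [death - birth for birth, death in pairs]
longest = max(pers)

linear = [linear_weight(x, longest) for x in pers]
arctan = [pwgk_weight(x, C=1.0, p=1.0) for x in pers]
print(linear)
print(arctan)
```

Both weights down-weight short-lived features, which are more likely to reflect noise, while the arctan form saturates for long-lived features.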
- The simulated pixel-rearranged images are not provided because of their size.
lung_simulation_data_generation.ipynb
: Generate the lung cancer pathology images with false shape information, apply the SEDT-3 transform, and compute persistent homology
lung_simulation_functions.R
: Functions for the Cox and the functional Cox proportional-hazards models for the pixel-rearranged lung cancer pathology images
lung_simulation_main.R
: Fit the Cox and the functional Cox proportional-hazards models and draw summary plots for the pixel-rearranged lung cancer pathology images
clinical_info_lung.Rdata
: Clinical data for the NLST lung cancer patients
- For the TCIA dataset, the imaging data are available from the public TCIA repository. Users need to download the image data from the repository.
- The distance transform data are too large to include in full, so only three distance transform examples from the TCIA dataset are included in the sedt subfolder. The remaining distance transform images can be recomputed from the downloaded images.
brain_sedt2_persistent_homology.ipynb
: Generate binary tumor images, apply the SEDT-2 transform, and compute persistent homology for the TCIA brain tumor images
brain_fcoxph_functions.R
: Functions for the Cox and the functional Cox proportional-hazards models for the TCIA brain tumor images
brain_fcoxph_main.R
: Fit the Cox and the functional Cox proportional-hazards models and draw summary plots for the TCIA brain tumor images
brain_fcoxph_sect.R
: Fit the functional Cox proportional-hazards model and draw summary plots for the TCIA brain tumor images using the Smooth Euler Characteristic Transform (SECT) of Crawford et al. (2019)
brain_cv_mdw_function.R
: Functions for finding smoothing parameters for the maximum distance weight using cross-validation
brain_cv_linear_function.R
: Functions for finding smoothing parameters for the linear weight using cross-validation
brain_cv_pwgk_function.R
: Functions for finding smoothing parameters for the persistence weighted Gaussian kernel weight using cross-validation
brain_cv_main.R
: Find smoothing parameters under the three weights
clinical_data_brain.csv
: Clinical data for the TCIA brain tumor patients