The gene_expression_preprocess from nieshw

gene_expression_preprocess's Introduction

Gene Expression Data Pre-processing

Because the raw gene expression data contain missing and repeated gene Entrez IDs, the dataset was pre-processed before analysis. Missing and repeated data can degrade survival analysis and lead to incorrect interpretation of predictors such as the diagnosis stage [1]. Following [2,3], entries with missing gene Entrez IDs were removed, and the gene expression values of repeated gene Entrez IDs were averaged across all of the samples. Table 1 presents the details of the gene expression data used in this research.
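The two steps described above (dropping entries with no Entrez ID, then averaging rows that share an ID) can be sketched with pandas. This is only an illustrative sketch, not the repository's actual code: the function name `preprocess_expression`, the column name `entrez_id`, and the toy values are all hypothetical. It also assumes one row per probe, one numeric column per sample, and that the averaging is done separately within each sample column.

```python
import pandas as pd


def preprocess_expression(df: pd.DataFrame, id_col: str = "entrez_id") -> pd.DataFrame:
    """Drop rows without an Entrez ID and average rows that share one.

    Assumes every column other than `id_col` is a numeric sample column.
    """
    # Step 1: remove entries whose gene Entrez ID is missing.
    cleaned = df.dropna(subset=[id_col])
    # Step 2: collapse repeated IDs by taking the mean within each sample column.
    return cleaned.groupby(id_col).mean()


# Toy input (made-up values): ID "7157" appears twice and one row has no ID.
raw = pd.DataFrame(
    {
        "entrez_id": ["7157", "7157", "672", None, "1956"],
        "sample_a": [2.0, 4.0, 5.0, 9.0, 1.0],
        "sample_b": [1.0, 3.0, 6.0, 9.0, 2.0],
    }
)

processed = preprocess_expression(raw)
print(processed)
```

The result is indexed by Entrez ID with one row per unique gene. If the input held non-numeric annotation columns (gene symbols, for example), they would need to be dropped or handled separately before averaging.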

References

1. Nur, U.; Shack, L.G.; Rachet, B.; Carpenter, J.R.; Coleman, M.P. Modelling relative survival in the presence of incomplete data: A tutorial. Int. J. Epidemiol. 2009, 39, 118–128.
2. Liu, W.; Wang, W.; Tian, G.; Xie, W.; Lei, L.; Liu, J.; Huang, W.; Xu, L.; Li, E. Topologically inferring pathway activity for precise survival outcome prediction: Breast cancer as a case. Mol. Biosyst. 2017, 13, 537–548.
3. Mohammed, A.; Biegert, G.; Adamec, J.; Helikar, T. Identification of potential tissue-specific cancer biomarkers and development of cancer versus normal genomic classifiers. Oncotarget 2017, 8, 85692–85715, doi:10.18632/oncotarget.21127.
