Comments (3)

ChristopherEeles commented on August 16, 2024

Hi @calisirkubra,

Without a reproducible example demonstrating the change in centroid values or cluster assignments, it is difficult for me to help you with that issue. We have not changed the included package data in the last year, nor have we changed any of the code implemented in the package. I would also need information about your R environment, as returned by the sessionInfo function.
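As a rough illustration of what such a report could contain, the base R sketch below captures the sessionInfo output as text and uses dput to turn a small object into pasteable code. The matrix `m` is a hypothetical stand-in built from a few values quoted later in this thread, not the real package data.

```r
# Capture the environment report as plain text that can be pasted into an issue.
session_lines <- capture.output(print(sessionInfo()))

# A tiny stand-in matrix (values copied from this thread; `m` is a hypothetical name).
m <- matrix(
  c(0.718331891, 0.537372301, -0.481665675, 0.266931609),
  nrow = 2,
  dimnames = list(c("ACTR3B", "ANLN"), c("Basal", "Her2"))
)

# dput() emits R code that recreates the object, so others can rebuild it.
dput_text <- paste(capture.output(dput(m)), collapse = "\n")
rebuilt <- eval(parse(text = dput_text))

cat(session_lines, sep = "\n")
cat(dput_text, "\n")
```

Pasting both pieces of output into the issue lets a maintainer see the package versions and recreate the object without access to the original session.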

Regarding TCGAbiolinks, I am not familiar with that package or the methods it employs for its subtyping analysis. If you think there is an issue with their implementation, I would recommend opening an issue on their repository. I suspect they will also request a minimal reproducible example to investigate any bugs you may have found.

As noted in #22 and #26, the genefu package was designed primarily for subtyping Affymetrix microarray data. While it may produce useful results from RNA sequencing data, doing so requires justification and careful consideration of the differences between the microarray platform used to derive the original signature and the RNA sequencing experiment used for your specific data. We do not recommend simply plugging data into the package and expecting useful results. You will need to apply your best scientific judgment as to whether the results are correct or useful outside the original microarray context this package was designed for.

Best,
Christopher Eeles
Software Developer
Haibe-Kains Lab | PM-Research
University Health Network

from genefu.

calisirkubra commented on August 16, 2024

Dear @ChristopherEeles,

You can find the centroids and session info below. I also have two other questions.

  1. Shouldn't these centroids be the same whatever data I use, since they are the official centroids from Parker 2009?
  2. What are your suggestions for adjusting RNA-seq data for the genefu package?

pam50.robust$centroids (now)
Basal Her2 LumA LumB Normal
ACTR3B 0.71833189 -0.48166568 0.009981070 -0.19055133 0.46572287
ANLN 0.53737230 0.26693161 -0.579245716 0.09880418 -0.83693959
BAG1 -0.57450687 -0.47607287 0.758221161 -0.40545862 0.31655297
BCL2 -0.11876043 -0.15791396 0.287487440 -0.44133950 0.53397887
BIRC5 0.30048864 0.40573310 -0.881434366 0.60385078 -0.87663642
BLVRA -0.64267751 0.33533604 0.042042017 0.69120496 -0.16341281
CCNB1 0.19120814 0.13547665 -0.491662114 0.50317636 -0.54526931
CCNE1 0.56027103 0.06687223 -0.430291227 -0.01666143 -0.25547606
CDC20 0.39969524 0.00835552 -0.469044010 -0.07041247 -0.04550481
CDC6 0.15941828 0.58900682 -0.612824305 0.51089597 -0.59575217
CDCA1 0.47240017 -0.02381921 -0.712520819 0.58962688 -0.37053337
CDH3 0.50836201 0.21088969 -0.513649344 -1.41913444 0.75792062
CENPF 0.48297629 -0.02926616 -0.543740234 0.27822856 -0.07058307
CEP55 0.56774889 0.27638102 -0.746721735 0.46001576 -1.16237419
CXXC5 -0.92038581 -0.24155061 0.467411571 0.32133502 0.05090144
EGFR -0.03041685 -0.09638262 0.009162963 -0.41240126 0.34163708
ERBB2 -0.80835398 1.75984423 0.608191264 0.15965187 -0.87023846
ESR1 -2.74651309 -1.51311125 2.161411882 1.60589991 -0.41828235
EXO1 0.42809036 0.04929719 -0.567474505 0.14124128 -0.45078054
FGFR4 -0.27123802 0.82177815 0.170811925 -0.24703604 0.85747278
FOXA1 -2.62694672 0.02282715 1.017457421 0.36075779 -0.78281211
FOXC1 1.49045147 -0.94717419 -0.174957960 -1.56485496 1.11154786
GPR160 -1.05497467 0.58319483 0.685489973 0.71440760 -0.42356847
GRB7 -0.27612859 1.03065778 0.041568986 0.08775089 0.24171099
KIF2C 0.20357258 -0.16510205 -0.505394668 -0.18289071 -0.39001448
KNTC2 0.60035617 0.04254679 -0.588220989 0.38670684 -1.06962886
KRT14 0.09682672 -0.44364614 0.368375943 -0.63944697 1.73568631
KRT17 0.48256553 -0.33783710 0.014209862 -1.46374293 1.75959844
KRT5 0.50664042 -0.42826178 0.215320068 -0.91160727 1.78511690
MAPT -0.42582927 -0.35750654 0.700622718 -0.19034057 0.11782850
MDM2 -0.25136621 -0.10672868 0.141957430 -0.13377904 0.27421401
MELK 0.52303387 0.19801311 -0.582088108 0.44793463 -0.74376468
MIA 1.57827637 -0.90489862 -0.165258584 -1.42292627 2.03885956
MKI67 0.47653745 0.06566236 -0.501871622 -0.14521787 -0.16600406
MLPH -0.33997246 -0.19522866 0.339304418 -0.45614992 0.75075837
MMP11 -0.55603767 0.50675876 -0.006255090 0.33419931 -2.32698512
MYBL2 0.38989345 0.20526358 -0.843569931 0.46728199 -0.60170475
MYC 0.17876381 -1.04683283 -0.090830821 0.01526440 1.02917620
NAT1 -0.93684895 -0.08998849 2.922786792 0.47078804 -0.36327376
ORC6L 0.21630480 0.20440245 -0.352220667 0.11062765 -0.25587949
PGR -0.42913339 -0.27940992 0.445785003 -0.44883984 0.12601148
PHGDH 0.63451887 -0.18662586 -0.398682234 -1.03013932 0.66043775
PTTG1 0.26413189 0.05580989 -0.634468270 0.24972528 -0.54978126
RRM2 0.15620468 0.68272489 -0.950760200 0.35066384 -1.12105493
SFRP1 0.98798846 -1.04820267 0.131566364 -1.72045826 2.43628867
SLC39A6 -1.05112505 -0.69573646 2.061459075 1.65330302 0.11688969
TMEM45B -1.10945818 1.33063617 0.446242045 0.37568823 0.03620891
TYMS 0.44980090 0.05294490 -0.644602075 0.49260652 -0.72698945
UBE2C 0.21853415 0.06108060 -0.519818399 0.29279931 -0.40889468
UBE2T 0.38990890 0.28453681 -0.539259391 0.73895213 -0.95238101

pam50.robust$centroids (one month ago)
  | Basal | Her2 | LumA | LumB | Normal
ACTR3B | 0.718331891 | -0.481665675 | 0.00998107 | -0.190551328 | 0.465722871
ANLN | 0.537372301 | 0.266931609 | -0.579245716 | 0.098804179 | -0.836939593
BAG1 | -0.574506867 | -0.476072868 | 0.758221161 | -0.405458622 | 0.316552973
BCL2 | -0.11876043 | -0.157913959 | 0.28748744 | -0.441339498 | 0.533978871
BIRC5 | 0.300488641 | 0.405733099 | -0.881434366 | 0.603850777 | -0.876636424
BLVRA | -0.642677513 | 0.335336041 | 0.042042017 | 0.691204962 | -0.163412812
CCNB1 | 0.191208143 | 0.135476652 | -0.491662114 | 0.503176358 | -0.545269312
CCNE1 | 0.560271028 | 0.066872232 | -0.430291227 | -0.01666143 | -0.255476058
CDC20 | 0.399695242 | 0.00835552 | -0.46904401 | -0.070412466 | -0.04550481
CDC6 | 0.159418279 | 0.58900682 | -0.612824305 | 0.510895969 | -0.595752175
CDCA1 | 0.472400168 | -0.023819207 | -0.712520819 | 0.589626883 | -0.370533365
CDH3 | 0.508362012 | 0.210889692 | -0.513649344 | -1.419134437 | 0.757920624
CENPF | 0.482976288 | -0.02926616 | -0.543740234 | 0.278228556 | -0.070583075
CEP55 | 0.567748894 | 0.276381022 | -0.746721735 | 0.460015762 | -1.162374186
CXXC5 | -0.920385813 | -0.241550612 | 0.467411571 | 0.32133502 | 0.050901436
EGFR | -0.030416849 | -0.096382621 | 0.009162963 | -0.412401259 | 0.341637082
ERBB2 | -0.80835398 | 1.759844231 | 0.608191264 | 0.159651874 | -0.870238456
ESR1 | -2.746513086 | -1.513111253 | 2.161411882 | 1.605899914 | -0.418282349
EXO1 | 0.428090356 | 0.049297194 | -0.567474505 | 0.141241282 | -0.450780537
FGFR4 | -0.271238025 | 0.821778152 | 0.170811925 | -0.247036038 | 0.857472777
FOXA1 | -2.626946721 | 0.022827151 | 1.017457421 | 0.360757795 | -0.782812106
FOXC1 | 1.490451472 | -0.947174192 | -0.17495796 | -1.564854964 | 1.111547864
GPR160 | -1.054974672 | 0.583194826 | 0.685489973 | 0.714407601 | -0.423568467
GRB7 | -0.276128586 | 1.03065778 | 0.041568986 | 0.087750887 | 0.241710991
KIF2C | 0.20357258 | -0.165102048 | -0.505394668 | -0.182890713 | -0.390014484
KNTC2 | 0.600356167 | 0.042546792 | -0.588220989 | 0.386706844 | -1.06962886
KRT14 | 0.096826723 | -0.443646142 | 0.368375943 | -0.639446966 | 1.73568631
KRT17 | 0.482565528 | -0.337837101 | 0.014209862 | -1.463742934 | 1.759598437
KRT5 | 0.506640416 | -0.428261778 | 0.215320068 | -0.91160727 | 1.785116895
MAPT | -0.425829273 | -0.357506541 | 0.700622718 | -0.190340574 | 0.117828498
MDM2 | -0.251366205 | -0.106728681 | 0.14195743 | -0.133779037 | 0.274214011
MELK | 0.523033872 | 0.198013115 | -0.582088108 | 0.44793463 | -0.743764676
MIA | 1.578276368 | -0.904898621 | -0.165258584 | -1.422926267 | 2.038859556
MKI67 | 0.476537448 | 0.06566236 | -0.501871622 | -0.145217868 | -0.166004063
MLPH | -0.339972459 | -0.195228658 | 0.339304418 | -0.456149915 | 0.750758365
MMP11 | -0.556037672 | 0.50675876 | -0.00625509 | 0.334199309 | -2.326985119
MYBL2 | 0.389893452 | 0.20526358 | -0.843569931 | 0.46728199 | -0.601704754
MYC | 0.178763812 | -1.046832832 | -0.090830821 | 0.015264397 | 1.029176203
NAT1 | -0.936848946 | -0.089988492 | 2.922786792 | 0.470788042 | -0.363273764
ORC6L | 0.216304797 | 0.204402449 | -0.352220667 | 0.11062765 | -0.255879493
PGR | -0.429133389 | -0.279409916 | 0.445785003 | -0.448839844 | 0.126011482
PHGDH | 0.63451887 | -0.186625862 | -0.398682234 | -1.030139318 | 0.660437753
PTTG1 | 0.264131894 | 0.055809895 | -0.63446827 | 0.249725281 | -0.54978126
RRM2 | 0.156204676 | 0.682724889 | -0.9507602 | 0.35066384 | -1.12105493
SFRP1 | 0.987988459 | -1.048202667 | 0.131566364 | -1.720458262 | 2.436288668
SLC39A6 | -1.051125052 | -0.695736457 | 2.061459075 | 1.653303022 | 0.116889694
TMEM45B | -1.109458181 | 1.330636172 | 0.446242045 | 0.375688226 | 0.036208914
TYMS | 0.449800897 | 0.052944897 | -0.644602075 | 0.492606521 | -0.726989454
UBE2C | 0.218534147 | 0.061080598 | -0.519818399 | 0.292799306 | -0.408894685
UBE2T | 0.389908898 | 0.284536813 | -0.539259391 | 0.738952133 | -0.952381005

sessionInfo()
R version 4.1.2 (2021-11-01)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Monterey 12.0.1

Matrix products: default
LAPACK: /Library/Frameworks/R.framework/Versions/4.1/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats4 grid stats graphics grDevices utils datasets methods base

other attached packages:
[1] htmlwidgets_1.5.4 randomForest_4.7-1 png_0.1-7
[4] heatmaply_1.3.0 viridis_0.6.2 viridisLite_0.4.0
[7] forcats_0.5.1 stringr_1.4.0 purrr_0.3.4
[10] readr_2.1.2 tidyr_1.2.0 tibble_3.1.7
[13] tidyverse_1.3.1 plotly_4.10.0 gplots_3.1.3
[16] edgeR_3.36.0 limma_3.50.3 DESeq2_1.34.0
[19] SummarizedExperiment_1.24.0 MatrixGenerics_1.6.0 matrixStats_0.62.0
[22] GenomicRanges_1.46.1 GenomeInfoDb_1.30.1 IRanges_2.28.0
[25] S4Vectors_0.32.4 InteractiveComplexHeatmap_1.2.0 shinycssloaders_1.0.0
[28] dplyr_1.0.9 ComplexHeatmap_2.10.0 RColorBrewer_1.1-3
[31] DT_0.23 shinydashboard_0.7.2 shiny_1.7.1
[34] TCGAbiolinks_2.22.4 caret_6.0-92 lattice_0.20-45
[37] ggplot2_3.3.6 rmeta_3.0 xtable_1.8-4
[40] genefu_2.26.0 AIMS_1.26.0 Biobase_2.54.0
[43] BiocGenerics_0.40.0 e1071_1.7-9 iC10_1.5
[46] iC10TrainingData_1.3.1 impute_1.68.0 pamr_1.56.1
[49] cluster_2.1.3 biomaRt_2.50.3 survcomp_1.44.1
[52] prodlim_2019.11.13 survival_3.3-1

loaded via a namespace (and not attached):
[1] utf8_1.2.2 R.utils_2.11.0 tidyselect_1.1.2 RSQLite_2.2.14
[5] AnnotationDbi_1.56.2 TSP_1.2-0 BiocParallel_1.28.3 pROC_1.18.0
[9] munsell_0.5.0 codetools_0.2-18 future_1.25.0 withr_2.5.0
[13] colorspace_2.0-3 filelock_1.0.2 knitr_1.39 rstudioapi_0.13
[17] listenv_0.8.0 labeling_0.4.2 GenomeInfoDbData_1.2.7 farver_2.1.0
[21] bit64_4.0.5 downloader_0.4 parallelly_1.31.1 vctrs_0.4.1
[25] generics_0.1.2 ipred_0.9-12 xfun_0.31 BiocFileCache_2.2.1
[29] R6_2.5.1 doParallel_1.0.17 clue_0.3-60 seriation_1.3.5
[33] locfit_1.5-9.5 bitops_1.0-7 cachem_1.0.6 DelayedArray_0.20.0
[37] assertthat_0.2.1 promises_1.2.0.1 scales_1.2.0 nnet_7.3-17
[41] gtable_0.3.0 globals_0.15.0 timeDate_3043.102 rlang_1.0.2
[45] clisymbols_1.2.0 genefilter_1.76.0 systemfonts_1.0.4 GlobalOptions_0.1.2
[49] splines_4.1.2 lazyeval_0.2.2 ModelMetrics_1.2.2.2 broom_0.8.0
[53] yaml_2.3.5 modelr_0.1.8 BiocManager_1.30.17 reshape2_1.4.4
[57] crosstalk_1.2.0 backports_1.4.1 httpuv_1.6.5 rsconnect_0.8.25
[61] tools_4.1.2 lava_1.6.10 ellipsis_0.3.2 kableExtra_1.3.4
[65] jquerylib_0.1.4 proxy_0.4-26 Rcpp_1.0.8.3 plyr_1.8.7
[69] progress_1.2.2 zlibbioc_1.40.0 RCurl_1.98-1.6 prettyunits_1.1.1
[73] rpart_4.1.16 GetoptLong_1.0.5 fontawesome_0.2.2 haven_2.5.0
[77] fs_1.5.2 magrittr_2.0.3 data.table_1.14.2 circlize_0.4.15
[81] reprex_2.0.1 hms_1.1.1 TCGAbiolinksGUI.data_1.14.1 mime_0.12
[85] evaluate_0.15 XML_3.99-0.9 readxl_1.4.0 mclust_5.4.9
[89] gridExtra_2.3 shape_1.4.6 compiler_4.1.2 KernSmooth_2.23-20
[93] crayon_1.5.1 R.oo_1.24.0 htmltools_0.5.2 later_1.3.0
[97] tzdb_0.3.0 geneplotter_1.72.0 lubridate_1.8.0 DBI_1.1.2
[101] SuppDists_1.1-9.7 dbplyr_2.1.1 MASS_7.3-57 rappdirs_0.3.3
[105] Matrix_1.4-1 cli_3.3.0 R.methodsS3_1.8.1 parallel_4.1.2
[109] gower_1.0.0 pkgconfig_2.0.3 registry_0.5-1 recipes_0.2.0
[113] xml2_1.3.3 foreach_1.5.2 svglite_2.1.0 annotate_1.72.0
[117] bslib_0.3.1 hardhat_0.2.0 webshot_0.5.3 XVector_0.34.0
[121] rvest_1.0.2 digest_0.6.29 Biostrings_2.62.0 cellranger_1.1.0
[125] rmarkdown_2.14 dendextend_1.15.2 curl_4.3.2 gtools_3.9.2
[129] rjson_0.2.21 lifecycle_1.0.1 nlme_3.1-157 jsonlite_1.8.0
[133] survivalROC_1.0.3 fansi_1.0.3 pillar_1.7.0 KEGGREST_1.34.0
[137] fastmap_1.1.0 httr_1.4.3 glue_1.6.2 iterators_1.0.14
[141] bit_4.0.4 sass_0.4.1 class_7.3-20 stringi_1.7.6
[145] bootstrap_2019.6 blob_1.2.3 caTools_1.18.2 memoise_2.0.1
[149] future.apply_1.9.0

ChristopherEeles commented on August 16, 2024

Hi @calisirkubra,

The differences between those two centroid matrices look like rounding error to me. It's possible that the rounding is due to a change in the print method and the actual data are the same. You could check by testing the equality of the two matrices in memory.
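A minimal sketch of such a check, using only base R and four values copied from each printout in this thread (the object names are hypothetical). With real data one would compare the actual objects loaded from each installation rather than retyped numbers.

```r
# Rows ACTR3B and ANLN, columns Basal and Her2, as shown in the "now" printout.
printed_now <- matrix(
  c(0.71833189, 0.53737230, -0.48166568, 0.26693161),
  nrow = 2,
  dimnames = list(c("ACTR3B", "ANLN"), c("Basal", "Her2"))
)
# The same cells as shown in the "one month ago" printout.
printed_before <- matrix(
  c(0.718331891, 0.537372301, -0.481665675, 0.266931609),
  nrow = 2,
  dimnames = list(c("ACTR3B", "ANLN"), c("Basal", "Her2"))
)

# Largest absolute difference between the two sets of values.
max_diff <- max(abs(printed_now - printed_before))

# all.equal() compares within a numeric tolerance; identical() demands an exact match.
same_within_tol <- isTRUE(all.equal(printed_now, printed_before, tolerance = 1e-7))
exactly_same <- identical(printed_now, printed_before)

print(max_diff)
print(same_within_tol)
print(exactly_same)
```

For these copied values the largest gap is on the order of 1e-9, which is what one would expect if the shorter numbers are simply the longer ones printed with fewer digits.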

This Stack Overflow question talks a bit more about the options(digits) setting: https://stackoverflow.com/questions/4540649/retain-numerical-precision-in-an-r-data-frame.
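To illustrate how the display setting alone can change what is shown, here is a small base R sketch. The value is one centroid entry copied from this thread, and the variable names are hypothetical.

```r
# One stored value; only the number of significant digits requested changes.
x <- 0.718331891

short <- format(x, digits = 7)   # 7 significant digits, R's default for printing
long  <- format(x, digits = 10)  # up to 10 significant digits

print(short)
print(long)

# The stored number is untouched by formatting; print() accepts digits as well.
print(x, digits = 10)
```

Setting `options(digits = 10)` would apply the wider display to all subsequent printing in the session, again without altering the stored values.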

Either way, the difference, if it is a real one, is in the deep decimal places and is unlikely to affect the classification results.

Regarding the use of RNA-sequencing data, we generally do not recommend this. If you plan to do so, please use your best scientific judgment. See #22 for a discussion of some of the issues with using microarray-derived signatures on RNA-sequencing data.

Best,
Christopher Eeles
Software Developer
Haibe-Kains Lab | PM-Research
University Health Network
