cisTopic: Probabilistic modelling of cis-regulatory topics from single cell epigenomics data
Hi,
This may well be user error, but I'm struggling to locate the likelihoods generated by runModels when returnType = 'selectedModel'.
[email protected]
Object of class "data.frame"
data frame with 0 columns and 0 rows
I'd love to be able to create a plot similar to what is generated by default with selectModel, but this only seems to work when returnType has the default value, and it sure would be handy to have for a larger set of tested models.
Thanks!
Hi!
The tutorial writes "For initializing the cisTopic object:
Starting from the bam files and predefined regions [Reference running time: 0.4 sec/cell]
pathToBams <- 'data/bamfiles/'
bamFiles <- paste(pathToBams, list.files(pathToBams), sep='')
regions <- 'data/regions.bed' "
and your paper says "a BED file with candidate regulatory regions (for example, from peak calling on the aggregate or the bulk profile)."
So, if my single-cell data is for the H3K36me3 mark, should I use bulk H3K36me3 WT data to call peaks for the regions BED file, or should I use the aggregated single-cell data?
Hi,
One more question, also from the 5k PBMC tutorial:
cisTopicObject <- topicsRcisTarget(cisTopicObject, genome='hg19', pathToFeather, reduced_database=FALSE, nesThreshold=3, rocthr=0.005, maxRank=20000, nCores=24)
gives me an error:
Error in openFeather(path) : IO error: lseek failed
Help would be appreciated!
Hi,
I ran into the following issue:
library(feather)
cisTopicObject_d0 <- topicsRcisTarget(cisTopicObject_d0, genome='mm9', pathToFeather, reduced_database=FALSE, nesThreshold=3, rocthr=0.005, maxRank=20000, nCores=24)
[1] "Exporting data to clusters..."
Error in checkForRemoteErrors(lapply(cl, recvResult)) :
24 nodes produced errors; first error: package or namespace load failed for ‘RcisTarget’ in loadNamespace(i, c(lib.loc, .libPaths()), versionCheck = vI[[i]]):
there is no package called ‘feather’
As you can see, the 'feather' package loads without issues. However, topicsRcisTarget produces this error. I vaguely recall a conversation with the authors of the 'scenic' package about feather v0.3.3 (which is what I have installed) not being compatible, and about having to roll it back to v0.3.1, as far as I remember. Is this the case with cisTopic too?
Thank you!
Joe
sessionInfo()
R version 3.5.1 (2018-07-02)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: CentOS Linux 7 (Core)
Matrix products: default
BLAS: /sc/wo/app/R/v3.5.1/lib64/R/lib/libRblas.so
LAPACK: /sc/wo/app/R/v3.5.1/lib64/R/lib/libRlapack.so
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C LC_ADDRESS=C LC_TELEPHONE=C LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] parallel stats4 grid stats graphics grDevices utils datasets methods base
other attached packages:
[1] rtracklayer_1.40.6 R.utils_2.9.0 R.oo_1.22.0 R.methodsS3_1.7.1
[5] Seurat_3.0.2 ggplot2_3.2.0 RcisTarget_1.5.0 feather_0.3.3
[9] cisTopic_0.2.1 BiocParallel_1.14.2 doParallel_1.0.15 iterators_1.0.12
[13] foreach_1.4.7 densityClust_0.3 org.Mm.eg.db_3.7.0 TxDb.Mmusculus.UCSC.mm10.knownGene_3.4.4
[17] GenomicFeatures_1.34.8 AnnotationDbi_1.44.0 Biobase_2.42.0 ChIPseeker_1.18.0
[21] rGREAT_1.14.0 GenomicRanges_1.34.0 GenomeInfoDb_1.18.2 IRanges_2.16.0
[25] S4Vectors_0.20.1 BiocGenerics_0.28.0 data.table_1.12.2 fastcluster_1.1.25
[29] ComplexHeatmap_1.20.0 Rtsne_0.15 umap_0.2.2.0 Rsubread_1.32.4
[33] httpuv_1.5.1
loaded via a namespace (and not attached):
[1] reticulate_1.10 tidyselect_0.2.5 htmlwidgets_1.3 RSQLite_2.1.1
[5] munsell_0.5.0 codetools_0.2-16 ica_1.0-2 DT_0.8
[9] future_1.14.0 withr_2.1.2 colorspace_1.4-1 GOSemSim_2.8.0
[13] rstudioapi_0.10 ROCR_1.0-7 DOSE_3.8.2 gbRd_0.4-11
[17] listenv_0.7.0 Rdpack_0.11-0 urltools_1.7.3 GenomeInfoDbData_1.2.0
[21] polyclip_1.10-0 bit64_0.9-7 farver_1.1.0 vctrs_0.2.0
[25] R6_2.4.0 rsvd_1.0.2 bitops_1.0-6 fgsea_1.8.0
[29] gridGraphics_0.4-1 DelayedArray_0.8.0 assertthat_0.2.1 promises_1.0.1
[33] SDMTools_1.1-221.1 scales_1.0.0 ggraph_1.0.2 enrichplot_1.2.0
[37] gtable_0.3.0 npsurv_0.4-0 globals_0.12.4 rlang_0.4.0
[41] zeallot_0.1.0 GlobalOptions_0.1.0 splines_3.5.1 lazyeval_0.2.2
[45] europepmc_0.3 yaml_2.2.0 reshape2_1.4.3 backports_1.1.4
[49] qvalue_2.14.1 tools_3.5.1 ggplotify_0.0.4 gridBase_0.4-7
[53] gplots_3.0.1.1 RColorBrewer_1.1-2 ggridges_0.5.1 Rcpp_1.0.1
[57] plyr_1.8.4 progress_1.2.2 zlibbioc_1.28.0 purrr_0.3.2
[61] RCurl_1.95-4.12 prettyunits_1.0.2 pbapply_1.4-1 GetoptLong_0.1.7
[65] viridis_0.5.1 cowplot_1.0.0 zoo_1.8-6 SummarizedExperiment_1.10.1
[69] ggrepel_0.8.1 cluster_2.1.0 magrittr_1.5 DO.db_2.9
[73] circlize_0.4.6 triebeard_0.3.0 lmtest_0.9-37 RANN_2.6
[77] fitdistrplus_1.0-14 matrixStats_0.54.0 hms_0.5.0 lsei_1.2-0
[81] mime_0.7 xtable_1.8-4 XML_3.98-1.20 AUCell_1.7.1
[85] gridExtra_2.3 shape_1.4.4 compiler_3.5.1 biomaRt_2.36.1
[89] tibble_2.1.3 KernSmooth_2.23-15 crayon_1.3.4 htmltools_0.3.6
[93] later_0.8.0 snow_0.4-3 tidyr_0.8.2 DBI_1.0.0
[97] tweenr_1.0.1 MASS_7.3-51.4 boot_1.3-23 Matrix_1.2-17
[101] gdata_2.18.0 metap_1.1 igraph_1.2.2 pkgconfig_2.0.2
[105] TxDb.Hsapiens.UCSC.hg19.knownGene_3.2.2 rvcheck_0.1.3 GenomicAlignments_1.18.1 plotly_4.9.0
[109] xml2_1.2.0 annotate_1.60.1 lda_1.4.2 XVector_0.22.0
[113] bibtex_0.4.2 stringr_1.4.0 digest_0.6.19 tsne_0.1-3
[117] sctransform_0.2.0 graph_1.60.0 Biostrings_2.48.0 fastmatch_1.1-0
[121] GSEABase_1.42.0 shiny_1.3.2 Rsamtools_1.32.3 gtools_3.8.1
[125] rjson_0.2.20 nlme_3.1-141 jsonlite_1.6 viridisLite_0.3.0
[129] pillar_1.4.2 lattice_0.20-38 httr_1.4.1 plotrix_3.7-6
[133] survival_2.44-1.1 GO.db_3.6.0 glue_1.3.1 FNN_1.1.2.1
[137] UpSetR_1.4.0 png_0.1-7 bit_1.1-14 ggforce_0.2.2
[141] stringi_1.4.3 blob_1.2.0 doSNOW_1.0.18 caTools_1.17.1.1
[145] memoise_1.1.0 dplyr_0.8.1 irlba_2.3.3 future.apply_1.3.0
[149] ape_5.2
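One thing that may be worth checking for the `there is no package called 'feather'` error above: the message comes from the 24 worker nodes, not from the main session, so the workers may be searching different library paths. The sketch below is only a diagnostic under that assumption (it is not a confirmed cause, and the two-worker cluster is purely illustrative). It compares `.libPaths()` in the main session with what freshly started workers see.

```r
library(parallel)

# Start two local worker processes (illustrative; the report above used 24).
cl <- makeCluster(2)

# clusterEvalQ() evaluates the expression on each worker, so these are the
# workers' own library paths and their own view of the 'feather' package.
worker_lib_paths <- clusterEvalQ(cl, .libPaths())
worker_has_feather <- clusterEvalQ(cl, requireNamespace("feather", quietly = TRUE))

stopCluster(cl)

print(.libPaths())                 # library paths of the main session
print(worker_lib_paths)            # library paths of each worker
print(unlist(worker_has_feather))  # TRUE/FALSE per worker
```

If the workers report different paths, or `FALSE` for feather, making the library location visible to them (for example through the `R_LIBS` or `R_LIBS_USER` environment variables) would be the next thing to try.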
I'm experiencing a lot of issues with installing the dependencies for this package from Bioconductor and CRAN, so I would like to make a conda package for it.
According to this GitHub issue, that can only be done if the repository has a release tag. Could you please add one?
Hi, I am new to cisTopic. Is there any way to subset a list of cells?
Hi @cbravo93
Is it appropriate to run cisTopic on single cell ATAC dataset without aggregation in Cell Ranger? Alternatively, would you recommend only running it on aggregated datasets?
Thanks!
Hi,
I was wondering if you have a proposed way to analyse multiple 10X runs at once. I'm not very familiar with the R data structure that cisTopic uses; however, it would be good to have something like an equivalent of the concatenate function from anndata.
Any suggestions? :)
Hi cisTopic team,
Thank you for developing the cisTopic software. We’re trying it on our scATAC data and find some promising results. But we have several concerns about our results. It will be great if you could provide some suggestions.
We tested different numbers of topics, but the results showed that the more topics we use, the more stable the model is on our data (attached figure 1). Do you have any idea why this is?
We also noticed that some of the topics are similar to each other. Is there a good way to merge similar topics? Is it OK to average the z-scores or probabilities for these topics? Or do you think we had better manually select a lower number of topics in the selectModel() step?
Do you have any idea how many times each peak/region typically makes a meaningful contribution to topics? I noticed that when the algorithm builds the region scores, almost all peaks seem to be used, yet some peaks contribute very little. After running binarizecisTopics() to binarize the topics, only about 20% of the peaks passed the cutoff, were saved in the results [email protected], and were used for downstream functional and pathway analysis. The remaining 80% of the peaks do not contribute meaningfully to any topic (attached figure 2). Also, some peaks are used more than 15 times, contributing to different topics. Is that normal? How should we interpret this result?
Thank you so much!!
Dear cisTopic team / aertslab,
First of all, thank you for the great package. It's been working really well for me.
I am using cisTopic in a jupyter notebook. In this context, I had to modify cellTopicHeatmap to get the annotations to display well:
I changed
annotation <- ComplexHeatmap::HeatmapAnnotation(df = object.cell.data[,colorBy,drop=FALSE], col = colVars, which='column', width = unit(5, "mm"))
to either
annotation <- ComplexHeatmap::HeatmapAnnotation(df = object.cell.data[,colorBy,drop=FALSE], col = colVars, which='column')
or
annotation <- ComplexHeatmap::HeatmapAnnotation(df = object.cell.data[,colorBy,drop=FALSE], col = colVars, which='column', height = unit(5, "mm"))
(I am guessing it should be height?! )
I ran the same code just from the console with png() and pdf() and had the same issue. I might be missing something, but I am guessing 'width' should either be 'height' or removed? I haven't tried running cisTopic in RStudio. It might not be a problem there since your vignette displays everything just fine? Could possibly also be due to a different ComplexHeatmap version (sessionInfo below).
Thanks,
Christoph
sessionInfo():
R version 3.6.1 (2019-07-05)
Platform: x86_64-conda_cos6-linux-gnu (64-bit)
Running under: Red Hat Enterprise Linux
Matrix products: default
BLAS/LAPACK: <path to conda env>/libopenblasp-r0.3.7.so
locale:
[1] C
attached base packages:
[1] grid stats graphics grDevices utils datasets methods
[8] base
other attached packages:
[1] scatterplot3d_0.3-41 plotly_4.9.1 ggplot2_3.2.1
[4] ComplexHeatmap_2.2.0 fastcluster_1.1.25 cisTopic_0.3.0
loaded via a namespace (and not attached):
[1] reticulate_1.14 R.utils_2.9.2
[3] tidyselect_1.0.0 RSQLite_2.2.0
[5] AnnotationDbi_1.48.0 htmlwidgets_1.5.1
[7] BiocParallel_1.20.1 Rtsne_0.15
[9] munsell_0.5.0 codetools_0.2-16
[11] ica_1.0-2 pbdZMQ_0.3-3
[13] future_1.16.0 withr_2.1.2
[15] RcisTarget_1.4.0 colorspace_1.4-1
[17] Biobase_2.46.0 uuid_0.1-2
[19] Seurat_3.1.2 stats4_3.6.1
[21] ROCR_1.0-7 gbRd_0.4-11
[23] listenv_0.8.0 Rdpack_0.11-0
[25] repr_1.1.0 GenomeInfoDbData_1.2.2
[27] lgr_0.3.3 farver_2.0.3
[29] bit64_0.9-7 vctrs_0.2.1
[31] float_0.2-3 BiocFileCache_1.10.2
[33] R6_2.4.1 GenomeInfoDb_1.22.0
[35] clue_0.3-57 rsvd_1.0.2
[37] AnnotationFilter_1.10.0 bitops_1.0-6
[39] DelayedArray_0.12.2 assertthat_0.2.1
[41] promises_1.1.0 SDMTools_1.1-221.2
[43] scales_1.1.0 gtable_0.3.0
[45] npsurv_0.4-0 Cairo_1.5-10
[47] globals_0.12.5 seqLogo_1.52.0
[49] rlang_0.4.4 zeallot_0.1.0
[51] GlobalOptions_0.1.1 text2vec_0.6
[53] splines_3.6.1 rtracklayer_1.46.0
[55] lazyeval_0.2.2 reshape2_1.4.3
[57] GenomicFeatures_1.38.2 backports_1.1.5
[59] httpuv_1.5.2 tools_3.6.1
[61] feather_0.3.5 gplots_3.0.1.2
[63] RColorBrewer_1.1-2 BiocGenerics_0.32.0
[65] ggridges_0.5.2 Rcpp_1.0.3
[67] plyr_1.8.5 base64enc_0.1-3
[69] progress_1.2.2 zlibbioc_1.32.0
[71] purrr_0.3.3 RCurl_1.98-1.1
[73] prettyunits_1.1.1 openssl_1.4.1
[75] GetoptLong_0.1.8 pbapply_1.4-2
[77] cowplot_1.0.0 S4Vectors_0.24.3
[79] zoo_1.8-7 SummarizedExperiment_1.16.1
[81] ggrepel_0.8.1 cluster_2.1.0
[83] magrittr_1.5 data.table_1.12.8
[85] circlize_0.4.6 lmtest_0.9-37
[87] RANN_2.6.1 mlapi_0.1.0
[89] fitdistrplus_1.0-14 matrixStats_0.55.0
[91] hms_0.5.3 lsei_1.2-0
[93] mime_0.9 evaluate_0.14
[95] xtable_1.8-4 RhpcBLASctl_0.20-17
[97] XML_3.99-0.3 AUCell_1.6.1
[99] shape_1.4.4 IRanges_2.20.2
[101] gridExtra_2.3 compiler_3.6.1
[103] biomaRt_2.42.0 tibble_2.1.3
[105] KernSmooth_2.23-16 crayon_1.3.4
[107] R.oo_1.23.0 htmltools_0.4.0
[109] later_1.0.0 snow_0.4-3
[111] tidyr_1.0.0 RcppParallel_4.4.4
[113] DBI_1.1.0 dbplyr_1.4.2
[115] MASS_7.3-51.5 rappdirs_0.3.1
[117] Matrix_1.2-18 R.methodsS3_1.8.0
[119] gdata_2.18.0 parallel_3.6.1
[121] metap_1.1 igraph_1.2.4.2
[123] GenomicRanges_1.38.0 pkgconfig_2.0.3
[125] getPass_0.2-2 GenomicAlignments_1.22.1
[127] rsparse_0.3.3.4 IRdisplay_0.7.0
[129] foreach_1.4.8 annotate_1.64.0
[131] lda_1.4.2 XVector_0.26.0
[133] bibtex_0.4.2 stringr_1.4.0
[135] digest_0.6.24 sctransform_0.2.0
[137] RcppAnnoy_0.0.14 tsne_0.1-3
[139] graph_1.64.0 Biostrings_2.54.0
[141] leiden_0.3.3 uwot_0.1.5
[143] GSEABase_1.46.0 curl_4.3
[145] shiny_1.4.0 Rsamtools_2.2.1
[147] gtools_3.8.1 rjson_0.2.20
[149] lifecycle_0.1.0 nlme_3.1-143
[151] jsonlite_1.6.1 viridisLite_0.3.0
[153] askpass_1.1 BSgenome_1.54.0
[155] pillar_1.4.3 lattice_0.20-38
[157] fastmap_1.0.1 httr_1.4.1
[159] survival_3.1-8 glue_1.3.1
[161] png_0.1-7 iterators_1.0.12
[163] bit_1.1-15.1 stringi_1.4.5
[165] blob_1.2.1 doSNOW_1.0.18
[167] caTools_1.18.0 memoise_1.1.0
[169] IRkernel_1.1 dplyr_0.8.3
[171] irlba_2.3.3 future.apply_1.4.0
[173] ape_5.3
Hi,
I have an ATAC-seq experiment with four different time points, where the samples at the four time points come from different subjects. All the subjects were treated the same at baseline, and some of the subjects were used for sample collection at three later time points, so I have a total of four time points including the baseline samples. I analyzed the four time points separately and created separate cisTopic objects.
My question is: can the four predMatSumByGene matrices generated in this process, as in the PBMC tutorials, be 'combined' to estimate gene accessibility changes over time? Or would you recommend a different method?
Thank you!
Sorry, one more issue I keep running into:
> TObject <- GREAT(TObject, genome='hg19', fold_enrichment=2, geneHits=1, sign=0.05, request_interval=10)
Error in download.file(url, destfile = file, quiet = TRUE) :
cannot open URL 'http://great.stanford.edu/public/cgi-bin/readJsFromFile.php?path=/scratch/great/tmp/results/20190604-public-3.0.0-hUvmbp.d/EnsemblGenes.js'
In addition: Warning message:
In download.file(url, destfile = file, quiet = TRUE) :
cannot open URL 'http://great.stanford.edu/public/cgi-bin/readJsFromFile.php?path=/scratch/great/tmp/results/20190604-public-3.0.0-hUvmbp.d/EnsemblGenes.js': HTTP status was '403 Forbidden'
failed to download, try after 30s
> sessionInfo()
R version 3.5.1 (2018-07-02)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: CentOS Linux 7 (Core)
Matrix products: default
BLAS: /sc/wo/app/R/v3.5.1/lib64/R/lib/libRblas.so
LAPACK: /sc/wo/app/R/v3.5.1/lib64/R/lib/libRlapack.so
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 LC_MONETARY=en_US.UTF-8
[6] LC_MESSAGES=en_US.UTF-8 LC_PAPER=en_US.UTF-8 LC_NAME=C LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] parallel stats4 grid stats graphics grDevices utils datasets methods base
other attached packages:
[1] RcisTarget_1.5.0 feather_0.3.3 cisTopic_0.2.1
[4] BiocParallel_1.14.2 doParallel_1.0.14 iterators_1.0.10
[7] foreach_1.4.4 densityClust_0.3 org.Hs.eg.db_3.6.0
[10] TxDb.Hsapiens.UCSC.hg19.knownGene_3.2.2 GenomicFeatures_1.34.8 AnnotationDbi_1.44.0
[13] Biobase_2.42.0 ChIPseeker_1.18.0 rGREAT_1.14.0
[16] GenomicRanges_1.34.0 GenomeInfoDb_1.18.2 IRanges_2.16.0
[19] S4Vectors_0.20.1 BiocGenerics_0.28.0 data.table_1.12.2
[22] fastcluster_1.1.25 ComplexHeatmap_1.20.0 Rtsne_0.15
[25] umap_0.2.2.0 Rsubread_1.32.4
loaded via a namespace (and not attached):
[1] snow_0.4-3 circlize_0.4.6 fastmatch_1.1-0 plyr_1.8.4 igraph_1.2.2
[6] lazyeval_0.2.2 GSEABase_1.42.0 splines_3.5.1 ggplot2_3.1.1 gridBase_0.4-7
[11] urltools_1.7.3 digest_0.6.18 htmltools_0.3.6 GOSemSim_2.8.0 viridis_0.5.1
[16] GO.db_3.6.0 gdata_2.18.0 lda_1.4.2 magrittr_1.5 memoise_1.1.0
[21] Biostrings_2.48.0 annotate_1.60.1 matrixStats_0.54.0 R.utils_2.8.0 enrichplot_1.2.0
[26] prettyunits_1.0.2 colorspace_1.4-1 blob_1.1.1 ggrepel_0.8.0 dplyr_0.7.8
[31] crayon_1.3.4 RCurl_1.95-4.12 jsonlite_1.6 graph_1.60.0 bindr_0.1.1
[36] survival_2.44-1.1 glue_1.3.1 polyclip_1.10-0 gtable_0.3.0 zlibbioc_1.28.0
[41] XVector_0.22.0 UpSetR_1.4.0 GetoptLong_0.1.7 DelayedArray_0.8.0 shape_1.4.4
[46] scales_1.0.0 DOSE_3.8.2 DBI_1.0.0 Rcpp_1.0.0 plotrix_3.7-5
[51] viridisLite_0.3.0 xtable_1.8-4 progress_1.2.2 gridGraphics_0.4-1 reticulate_1.10
[56] bit_1.1-14 europepmc_0.3 httr_1.4.0 fgsea_1.8.0 FNN_1.1.2.1
[61] gplots_3.0.1.1 RColorBrewer_1.1-2 R.methodsS3_1.7.1 pkgconfig_2.0.2 XML_3.98-1.19
[66] farver_1.1.0 later_0.7.5 ggplotify_0.0.3 tidyselect_0.2.5 rlang_0.3.4
[71] reshape2_1.4.3 munsell_0.5.0 tools_3.5.1 RSQLite_2.1.1 doMC_1.3.5
[76] ggridges_0.5.1 stringr_1.4.0 yaml_2.2.0 npsurv_0.4-0 bit64_0.9-7
[81] fitdistrplus_1.0-14 caTools_1.17.1.1 purrr_0.3.2 ggraph_1.0.2 bindrcpp_0.2.2
[86] mime_0.6 R.oo_1.22.0 DO.db_2.9 xml2_1.2.0 biomaRt_2.36.1
[91] compiler_3.5.1 rstudioapi_0.10 lsei_1.2-0 tibble_2.1.2 tweenr_1.0.1
[96] stringi_1.2.4 lattice_0.20-38 Matrix_1.2-17 pillar_1.4.1 triebeard_0.3.0
[101] GlobalOptions_0.1.0 cowplot_0.9.4 bitops_1.0-6 httpuv_1.4.5 AUCell_1.7.1
[106] rtracklayer_1.40.6 qvalue_2.14.1 R6_2.4.0 promises_1.0.1 KernSmooth_2.23-15
[111] gridExtra_2.3 codetools_0.2-16 boot_1.3-22 MASS_7.3-51.4 gtools_3.8.1
[116] assertthat_0.2.1 SummarizedExperiment_1.10.1 rjson_0.2.20 GenomicAlignments_1.18.1 Rsamtools_1.32.3
[121] GenomeInfoDbData_1.2.0 doSNOW_1.0.16 hms_0.4.2 rvcheck_0.1.3 ggforce_0.2.2
[126] shiny_1.2.0
Hello @cbravo93 !
I have gained new insights into my dataset using cisTopic! Great package. This time I would like to know whether it is possible to use signaturesHeatmap to display the cluster info (e.g. 'densityClust'), similar to cellTopicHeatmap.
Thanks in advance!
Hello, is it possible to share counts_Lake.Rds file?
Thanks!
Hi,
Following the PBMC data analysis vignette, I ran into the error below:
dclust_d3 <- densityClust(DRdist_d3,gaussian=T)
Distance cutoff calculated to 2.378223
dclust_d3 <- findClusters(dclust_d3, rho = 50, delta = 2.5)
Error in cluster[i] <- cluster[higherDensity[which.min(findDistValueByRowColInd(x$distance, :
replacement has length zero
Any ideas why this may be happening? If I look at the dclust_d3 object, it has 'NA' in the 'clusters' slot. I assume this is why the error is produced; however, I am not sure why there is an 'NA' in the 'clusters' slot in the first place.
Thanks,
Joe
sessionInfo()
R version 3.5.1 (2018-07-02)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: CentOS Linux 7 (Core)
Matrix products: default
BLAS: /sc/wo/app/R/v3.5.1/lib64/R/lib/libRblas.so
LAPACK: /sc/wo/app/R/v3.5.1/lib64/R/lib/libRlapack.so
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8 LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] parallel stats4 grid stats graphics grDevices utils datasets methods base
other attached packages:
[1] RcisTarget_1.5.0 feather_0.3.3
[3] cisTopic_0.2.1 BiocParallel_1.14.2
[5] doParallel_1.0.14 iterators_1.0.10
[7] foreach_1.4.4 densityClust_0.3
[9] org.Mm.eg.db_3.7.0 TxDb.Mmusculus.UCSC.mm10.knownGene_3.4.4
[11] GenomicFeatures_1.34.8 AnnotationDbi_1.44.0
[13] Biobase_2.42.0 ChIPseeker_1.18.0
[15] rGREAT_1.14.0 GenomicRanges_1.34.0
[17] GenomeInfoDb_1.18.2 IRanges_2.16.0
[19] S4Vectors_0.20.1 BiocGenerics_0.28.0
[21] data.table_1.12.2 fastcluster_1.1.25
[23] ComplexHeatmap_1.20.0 Rtsne_0.15
[25] umap_0.2.2.0 Rsubread_1.32.4
[27] httpuv_1.5.1
loaded via a namespace (and not attached):
[1] snow_0.4-3 backports_1.1.4
[3] circlize_0.4.6 fastmatch_1.1-0
[5] plyr_1.8.4 igraph_1.2.2
[7] lazyeval_0.2.2 GSEABase_1.42.0
[9] splines_3.5.1 ggplot2_3.2.0
[11] gridBase_0.4-7 urltools_1.7.3
[13] digest_0.6.19 htmltools_0.3.6
[15] GOSemSim_2.8.0 viridis_0.5.1
[17] GO.db_3.6.0 gdata_2.18.0
[19] lda_1.4.2 magrittr_1.5
[21] memoise_1.1.0 Biostrings_2.48.0
[23] annotate_1.60.1 matrixStats_0.54.0
[25] R.utils_2.9.0 enrichplot_1.2.0
[27] prettyunits_1.0.2 colorspace_1.4-1
[29] blob_1.2.0 ggrepel_0.8.1
[31] dplyr_0.8.1 crayon_1.3.4
[33] RCurl_1.95-4.12 jsonlite_1.6
[35] graph_1.60.0 TxDb.Hsapiens.UCSC.hg19.knownGene_3.2.2
[37] zeallot_0.1.0 survival_2.44-1.1
[39] glue_1.3.1 polyclip_1.10-0
[41] gtable_0.3.0 zlibbioc_1.28.0
[43] XVector_0.22.0 UpSetR_1.4.0
[45] GetoptLong_0.1.7 DelayedArray_0.8.0
[47] shape_1.4.4 scales_1.0.0
[49] DOSE_3.8.2 DBI_1.0.0
[51] Rcpp_1.0.1 plotrix_3.7-6
[53] xtable_1.8-4 viridisLite_0.3.0
[55] progress_1.2.2 gridGraphics_0.4-1
[57] reticulate_1.10 bit_1.1-14
[59] europepmc_0.3 DT_0.7
[61] htmlwidgets_1.3 httr_1.4.0
[63] fgsea_1.8.0 FNN_1.1.2.1
[65] gplots_3.0.1.1 RColorBrewer_1.1-2
[67] R.methodsS3_1.7.1 pkgconfig_2.0.2
[69] XML_3.98-1.20 farver_1.1.0
[71] ggplotify_0.0.3 tidyselect_0.2.5
[73] rlang_0.4.0 reshape2_1.4.3
[75] later_0.8.0 munsell_0.5.0
[77] tools_3.5.1 RSQLite_2.1.1
[79] ggridges_0.5.1 stringr_1.4.0
[81] yaml_2.2.0 npsurv_0.4-0
[83] bit64_0.9-7 fitdistrplus_1.0-14
[85] caTools_1.17.1.1 purrr_0.3.2
[87] ggraph_1.0.2 mime_0.7
[89] R.oo_1.22.0 DO.db_2.9
[91] xml2_1.2.0 biomaRt_2.36.1
[93] compiler_3.5.1 rstudioapi_0.10
[95] lsei_1.2-0 tibble_2.1.3
[97] tweenr_1.0.1 stringi_1.4.3
[99] lattice_0.20-38 Matrix_1.2-17
[101] vctrs_0.2.0 pillar_1.4.2
[103] triebeard_0.3.0 GlobalOptions_0.1.0
[105] cowplot_1.0.0 bitops_1.0-6
[107] AUCell_1.7.1 rtracklayer_1.40.6
[109] qvalue_2.14.1 R6_2.4.0
[111] promises_1.0.1 KernSmooth_2.23-15
[113] gridExtra_2.3 codetools_0.2-16
[115] boot_1.3-23 MASS_7.3-51.4
[117] gtools_3.8.1 assertthat_0.2.1
[119] SummarizedExperiment_1.10.1 rjson_0.2.20
[121] GenomicAlignments_1.18.1 Rsamtools_1.32.3
[123] GenomeInfoDbData_1.2.0 doSNOW_1.0.16
[125] hms_0.5.0 rvcheck_0.1.3
[127] ggforce_0.2.2 shiny_1.3.2
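A possible explanation for the `findClusters` error reported above, offered as a hypothesis rather than a confirmed diagnosis: the thresholds `rho = 50` and `delta = 2.5` come from the tutorial data, and on a different dataset no cell may exceed both of them, which would leave the algorithm without any cluster centre. The sketch below uses made-up `rho` and `delta` vectors (hypothetical values, not taken from the report) to show how one could count the candidate centres for a given pair of thresholds before calling `findClusters`.

```r
# Hypothetical per-cell density (rho) and distance (delta) values.
# With a real densityClust object these would be read from the object itself.
rho   <- c(12.1, 30.5, 8.2, 41.0, 25.3)
delta <- c(0.4, 3.1, 0.2, 2.9, 0.7)

# Number of cells that exceed both thresholds, i.e. candidate cluster centres.
count_centres <- function(rho, delta, rho_min, delta_min) {
  sum(rho > rho_min & delta > delta_min)
}

centres_tutorial <- count_centres(rho, delta, rho_min = 50, delta_min = 2.5)
centres_relaxed  <- count_centres(rho, delta, rho_min = 25, delta_min = 2.5)

print(centres_tutorial)  # no cell passes rho > 50 in this toy data
print(centres_relaxed)   # two cells pass once the rho threshold is lowered
```

If the count is zero for the real data, choosing thresholds from the decision plot of that dataset (rho against delta) instead of reusing the tutorial values would be a reasonable next step.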
Hi @cbravo93,
Not so much about the library, but about the analysis itself: to give the topics meaning, in the tutorial you used 3 ChIP-seq peak files from (I assume) separate bulk ChIP-seq experiments. Are there any publicly available ChIP-seq files for a range of transcription factors that you would recommend?
On reaching the later stages of the tutorial mentioned above, I realised my BAM files were not UCSC-style (chromosomes were named 1,2,3,... instead of chr1,chr2,chr3,...). I fixed that and now have new BAM files, but I can't read them to create a cisTopicObject.
Some relevant info: the same code used to work with the non-UCSC BAM files, and I also made the aggregate pseudo-bulk UCSC-style for this.
Code:
pathToBams <- '/blabla/picard_bam_files_UCSC_style_test/'
bamFiles <- paste(pathToBams, list.files(pathToBams), sep='')
regions <- '/blabla/UCSC_style_peak_aggregated_scATAC_individual.narrowPeak'
cisTopicObject <- createcisTopicObjectFromBAM(bamFiles, regions, project.name='bla')
ERROR: invalid parameter: '/blabla/picard_bam_files_UCSC_style_test/sample_1.bam'
Error in Rsubread::featureCounts(bamfiles, annot.ext = regions_frame, :
No counts were generated.
(While trying to fix this I also ran this (vignette packages):
source("https://bioconductor.org/biocLite.R")
biocLite(c('Rsubread', 'umap', 'Rtsne', 'ComplexHeatmap', 'fastcluster', 'data.table', 'rGREAT', 'ChIPseeker', 'TxDb.Hsapiens.UCSC.hg19.knownGene', 'org.Hs.eg.db'))
but that didn't help)
I'd be happy if you could help.
Thanks
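One thing that might be worth ruling out for the `invalid parameter` error above (a guess, not a confirmed cause): `list.files(pathToBams)` returns every file in the directory, so if the new BAM files were indexed, the `.bai` files would be passed along as if they were BAM files. The sketch below builds a throwaway directory with hypothetical file names to show the difference between listing everything and listing only files that end in `.bam`.

```r
# Throwaway directory with hypothetical file names, for illustration only.
pathToBams <- file.path(tempdir(), "bams_demo")
dir.create(pathToBams, showWarnings = FALSE)
invisible(file.create(file.path(
  pathToBams, c("sample_1.bam", "sample_1.bam.bai", "sample_2.bam")
)))

# Listing everything also picks up the index file.
all_files <- list.files(pathToBams, full.names = TRUE)

# Restricting to names that end in ".bam" keeps only the alignments;
# full.names = TRUE avoids building the paths by hand with paste().
bamFiles <- list.files(pathToBams, pattern = "\\.bam$", full.names = TRUE)

stopifnot(all(file.exists(bamFiles)))
print(basename(all_files))
print(basename(bamFiles))
```

Using `full.names = TRUE` also sidesteps a missing trailing slash in `pathToBams`, which is another way a path can end up invalid.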
Hi @cbravo93 !
I have a conceptual question: having read the paper, I still struggle to understand how the algorithm groups regions into topics. Based on what? Is it based on co-occurrence? Similar patterns of other regions? Basically, how can you interpret this relationship?
Thank you.
Hi!
I tried to generate BigWig files after getting the region scores.
library(TxDb.Mmusculus.UCSC.mm9.knownGene)
txdb<-TxDb.Mmusculus.UCSC.mm9.knownGene
getBigwigFiles(cisTopicObject, path='output/cisTopics_asBW', seqlengths=seqlengths(txdb))
However, I get the error message below:
Error in FUN(extractROWS(unlisted_X, IRanges(X_elt_start[i], X_elt_end[i])), :
BigWig ranges cannot overlap
Looking forward to your suggestions!
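The `BigWig ranges cannot overlap` message suggests that some regions share bases with a neighbouring region. The base-R sketch below finds such regions in a BED-like table. It assumes half-open BED coordinates (a region that starts exactly where the previous one ends does not overlap), and the `regions` data frame and its column names are made up for illustration; they are not read from a cisTopic object.

```r
# Hypothetical BED-like regions; the second chr1 region overlaps the first.
regions <- data.frame(
  chr   = c("chr1", "chr1", "chr1", "chr2"),
  start = c(100, 150, 300, 100),
  end   = c(200, 250, 400, 200),
  stringsAsFactors = FALSE
)

find_overlapping <- function(regions) {
  # Sort by chromosome and position so overlaps are with earlier rows only.
  regions <- regions[order(regions$chr, regions$start, regions$end), ]
  # For each row, the largest end seen so far on the same chromosome
  # (-Inf for the first region of each chromosome).
  prev_max_end <- ave(
    as.numeric(regions$end), regions$chr,
    FUN = function(e) c(-Inf, head(cummax(e), -1))
  )
  # A region overlaps if it starts before an earlier region has ended.
  regions[regions$start < prev_max_end, ]
}

overlapping <- find_overlapping(regions)
print(overlapping)
```

Whether overlapping regions should then be merged or dropped before export depends on how the regions were defined, so that choice is left open here.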
Hi,
I am getting an error in the 5k PBMC tutorial with the line below:
cisTopicObject <- getCistromeEnrichment(cisTopicObject, topic=1, TFname='SPI1', aucellRankings = aucellRankings,
aucMaxRank = 0.05*nrow(aucellRankings), plot=FALSE)
Using 0 cores.
Error in mclapply(argsList, FUN, mc.preschedule = preschedule, mc.set.seed = set.seed, :
'mc.cores' must be >= 1
I tried a couple of possible solutions to this but could not get it to work.
Suggestions would be appreciated!
Thank you!
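The `'mc.cores' must be >= 1` message together with `Using 0 cores.` indicates that a core count of zero reached `mclapply`. A small guard like the one below, written as a sketch with a hypothetical helper name (`safe_core_count`), makes sure the value handed to any core-count argument is at least 1. Whether `getCistromeEnrichment` exposes such an argument is an assumption to verify in its help page.

```r
library(parallel)

# Hypothetical helper: turn a requested core count into a usable one.
# detectCores() can return NA, and "cores minus one" can be 0 on a
# single-core machine, so both cases fall back to 1.
safe_core_count <- function(requested = detectCores() - 1L) {
  if (is.na(requested) || requested < 1L) 1L else as.integer(requested)
}

nCores <- safe_core_count()
print(nCores)

print(safe_core_count(0))   # falls back to 1
print(safe_core_count(NA))  # falls back to 1
print(safe_core_count(4))   # kept as 4
```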
Hi,
I noticed that the cisTarget database only has ranking by regions for hg19. If my accessibility data is mapped to mm9, is the best option to liftover to hg19 to do motif enrichment analysis?
Thank you!
E
Looking forward to trying this impressive package! Might you have a workflow for inputting 10X Chromium scATAC data?
Hello,
First of all, thanks for publishing this work. It's really useful code and very helpful.
I was wondering about this line:
Line 316 in cfcf509
This looks a little different from what's in the Methods section of the paper, which has the sum of the logarithms in denominator.
I also wondered if there was an explanation available for how to interpret region scores, perhaps a reference where this score is introduced? I haven't seen it before.
Thanks again.
Hi,
it seems that several functions are currently not defined in the package, including runModels and runWarpLDAModels.
Best,
Wolfgang
[in]:
devtools::install_github("aertslab/cisTopic")
[out]:
Downloading GitHub repo aertslab/cisTopic@master
Skipping 15 packages ahead of CRAN: GenomicRanges, S4Vectors, SummarizedExperiment, BiocGenerics, IRanges, XVector, GenomeInfoDb, zlibbioc, GenomicAlignments, Biobase, annotate, AnnotationDbi, DelayedArray, BiocParallel, GenomeInfoDbData
✔ checking for file ‘/tmp/RtmpktT8Ub/remotes2f89111f2a0/aertslab-cisTopic-8fd1432/DESCRIPTION’
─ preparing ‘cisTopic’:
✔ checking DESCRIPTION meta-information
─ checking for LF line-endings in source and make files and shell scripts
─ checking for empty or unneeded directories
─ looking to see if a ‘data/datalist’ file should be added
─ building ‘cisTopic_0.3.0.tar.gz’ (7.7s)
Installing package into ‘/home/mvinyard/R/x86_64-pc-linux-gnu-library/3.6’
(as ‘lib’ is unspecified)
Error: Failed to install 'cisTopic' from GitHub:
(converted from warning) installation of package ‘/tmp/RtmpktT8Ub/file2f892524899b/cisTopic_0.3.0.tar.gz’ had non-zero exit status
Traceback:
1. devtools::install_github("aertslab/cisTopic")
2. pkgbuild::with_build_tools({
. ellipsis::check_dots_used(action = getOption("devtools.ellipsis_action",
. rlang::warn))
. {
. remotes <- lapply(repo, github_remote, ref = ref, subdir = subdir,
. auth_token = auth_token, host = host)
. install_remotes(remotes, auth_token = auth_token, host = host,
. dependencies = dependencies, upgrade = upgrade, force = force,
. quiet = quiet, build = build, build_opts = build_opts,
. build_manual = build_manual, build_vignettes = build_vignettes,
. repos = repos, type = type, ...)
. }
. }, required = FALSE)
3. install_remotes(remotes, auth_token = auth_token, host = host,
. dependencies = dependencies, upgrade = upgrade, force = force,
. quiet = quiet, build = build, build_opts = build_opts, build_manual = build_manual,
. build_vignettes = build_vignettes, repos = repos, type = type,
. ...)
4. tryCatch(res[[i]] <- install_remote(remotes[[i]], ...), error = function(e) {
. stop(remote_install_error(remotes[[i]], e))
. })
5. tryCatchList(expr, classes, parentenv, handlers)
6. tryCatchOne(expr, names, parentenv, handlers[[1L]])
7. value[[3L]](cond)
Hi!
I would like to report a bug when running runWarpLDAModels() with exactly one topic.
The error message is
Error in alpha/t : non-numeric argument to binary operator
After checking the R code, I finally located the error, which is in line 644 of RunModels.R. The t in this line should be topic, I think.
Best
Van1yu3
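The error message in this report is consistent with the proposed fix: when no variable named `t` exists, R resolves `t` to the base transpose function, and dividing a number by a function fails with exactly this message. The sketch below reproduces that in isolation; `alpha` and `topic` are stand-in values, not the ones used inside RunModels.R.

```r
alpha <- 50  # stand-in value for illustration
topic <- 1   # stand-in value for illustration

# No variable called `t` is defined here, so `t` is base::t (transpose).
stopifnot(is.function(t))

# Dividing by a function raises an error; capture its message.
msg <- tryCatch(alpha / t, error = function(e) conditionMessage(e))
print(msg)  # "non-numeric argument to binary operator" in an English locale

# Using the intended variable works as expected.
result <- alpha / topic
print(result)
```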
Hi @cbravo93 ,
Is it possible to export the topic-signature matrix similar to how modelMatSelection exports either topic-cell or topic-region matrices?
Thank you.
Sincerely,
Anna Arutyunyan
The LDA algorithm is a creative and effective method for handling the high dimensionality of scATAC data. Unlike scRNA data, which is usually a genes x cells matrix, ATAC data has hundreds of times more peaks than there are genes. However, in my own analyses I got separate clusters (and, further, different developmental trajectories) of cells that often differed in sequencing depth. I wonder: 1) Does the cisTopic LDA algorithm treat low-depth cells properly? Is it possible that these clusters were separated only because of sequencing depth? 2) Should I filter out these low-depth cells (in fact, not so low-depth at first glance)?
Please help; any suggestions would be welcome. I just want to avoid false-positive results without missing true findings.
p.s. The results of Signac (Seurat extension package), episcanpy & scanpy, and cicero look similar.
The package is very intuitive and provides great insight.
I've noticed that the GREAT function gives an error with mm9 but not with mm10:
cisTopicObject <- GREAT(cisTopicObject, genome='mm9', fold_enrichment=2, geneHits=1, sign=0.05, request_interval=10)
Error in submitGreatJob(coord, species = genome, request_interval = request_interval, :
GREAT encountered a user error (message from GREAT web server)
Will I run into issues switching back and forth between mm9 and mm10 in the workflow as the feather file is mm9? i.e. can I run the other analysis with mm10 up until GREAT and then switch to mm9 for TF motif enrichment and formation of cistromes?
Alternatively, is there any reason why GREAT isn't accepting the above command with mm9 that can be fixed?
Hi,
I am following the tutorial below and get the following error in step 24. I got the same results up to step 23, with no problems there. Please help. I am using R 3.5.1 and also checked with R 3.6.1, and got the same error message.
https://nbviewer.jupyter.org/github/pinellolab/scATAC-benchmarking/blob/master/Real_Data/Buenrostro_2018/run_methods/cisTopic/cisTopic_buenrostro2018.ipynb?flush_cache=true
step 24
logLikelihoodByIter(cisTopicObject, select=c(10, 20, 25, 30, 35, 40))
error
Error in scales::hue_pal(l = 60:100) : length(l) == 1 is not TRUE
Calls: logLikelihoodByIter ... unique -> col2rgb -> %in% -> -> stopifnot
Hi,
When running topicsRcisTarget on hg38 data lifted over to hg19, I get this error.
Error in .column_indexes_feather(x, i) : undefined columns: chr1-reg496, chr1-reg497, chr1-reg498, chr1-reg500, chr1-reg976, chr1-reg977, chr1-reg978, chr1-reg979, chr1-reg980, chr1-reg1117, chr1-reg1119, chr1-reg1120, chr1-reg1121, chr1-reg2014, chr1-reg2016, chr1-reg2017, chr1-reg2018, chr1-reg2310, chr1-reg2312, chr1-reg2314, chr1-reg2315, chr1-reg2316, chr1-reg2317, chr1-reg2318, chr1-reg6260, chr1-reg6261, chr1-reg6262, chr1-reg6467, chr1-reg6468, chr1-reg6618, chr1-reg6619, chr1-reg6620, chr1-reg6621, chr1-reg6622, chr1-reg6623, chr1-reg6624, chr1-reg6634, chr1-reg6635, chr1-reg6637, chr1-reg6638, chr1-reg6639, chr1-reg6641, chr1-reg6642, chr1-reg6643, chr1-reg6644, chr1-reg6645, chr1-reg6891, chr1-reg6892, chr1-reg6893, chr1-reg6972, chr1-reg7710, chr1-reg7711, chr1-reg7712, chr1-reg8470, chr1-reg8776, chr1-reg8778, chr1-reg8779, chr1-reg8967, chr1-reg9123, chr1-reg9125, chr1-reg9127, chr1-reg9128, chr1-reg9609, chr1-reg9783, chr1-reg9784, chr1-reg9785, chr1-reg9787, chr1-reg9788, chr1-reg9789, chr1-reg9790, Calls: topicsRcisTarget ... as_tibble -> [ -> [.feather -> .column_indexes_feather Execution halted
Any guess on why this is happening?
Thank you!
``
sessionInfo()
R version 3.6.1 (2019-07-05)
Platform: x86_64-conda_cos6-linux-gnu (64-bit)
Running under: Ubuntu 18.04.2 LTS
Matrix products: default
BLAS/LAPACK: /home/jovyan/my-conda-envs/myenvSC/lib/libopenblasp-r0.3.7.so
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats4 parallel stats graphics grDevices utils datasets
[8] methods base
other attached packages:
[1] TxDb.Hsapiens.UCSC.hg38.knownGene_3.10.0
[2] GenomicFeatures_1.38.0
[3] org.Hs.eg.db_3.10.0
[4] AnnotationDbi_1.48.0
[5] Biobase_2.46.0
[6] rtracklayer_1.46.0
[7] GenomicRanges_1.38.0
[8] GenomeInfoDb_1.22.0
[9] IRanges_2.20.1
[10] S4Vectors_0.24.1
[11] BiocGenerics_0.32.0
[12] R.utils_2.9.2
[13] R.oo_1.23.0
[14] R.methodsS3_1.7.1
[15] cisTopic_0.2.2
loaded via a namespace (and not attached):
[1] bitops_1.0-6 matrixStats_0.55.0
[3] bit64_0.9-7 progress_1.2.2
[5] httr_1.4.1 tools_3.6.1
[7] backports_1.1.5 R6_2.4.1
[9] DBI_1.0.0 npsurv_0.4-0
[11] tidyselect_0.2.5 prettyunits_1.0.2
[13] curl_4.3 bit_1.1-14
[15] compiler_3.6.1 AUCell_1.8.0
[17] graph_1.64.0 RcisTarget_1.6.0
[19] DelayedArray_0.12.0 askpass_1.1
[21] rappdirs_0.3.1 stringr_1.4.0
[23] digest_0.6.23 Rsamtools_2.2.1
[25] XVector_0.26.0 pkgconfig_2.0.3
[27] htmltools_0.4.0 dbplyr_1.4.2
[29] fastmap_1.0.1 rlang_0.4.2
[31] RSQLite_2.1.4 shiny_1.4.0
[33] BiocParallel_1.20.0 dplyr_0.8.3
[35] RCurl_1.95-4.12 magrittr_1.5
[37] feather_0.3.5 GenomeInfoDbData_1.2.2
[39] Matrix_1.2-18 Rcpp_1.0.3
[41] lda_1.4.2 stringi_1.4.3
[43] MASS_7.3-51.4 SummarizedExperiment_1.16.0
[45] zlibbioc_1.32.0 plyr_1.8.5
[47] BiocFileCache_1.10.2 grid_3.6.1
[49] blob_1.2.0 promises_1.1.0
[51] crayon_1.3.4 doSNOW_1.0.18
[53] lattice_0.20-38 Biostrings_2.54.0
[55] splines_3.6.1 annotate_1.64.0
[57] hms_0.5.2 zeallot_0.1.0
[59] pillar_1.4.2 codetools_0.2-16
[61] biomaRt_2.42.0 XML_3.98-1.20
[63] glue_1.3.1 lsei_1.2-0
[65] data.table_1.12.8 vctrs_0.2.0
[67] httpuv_1.5.2 foreach_1.4.4
[69] openssl_1.4.1 purrr_0.3.3
[71] assertthat_0.2.1 mime_0.7
[73] xtable_1.8-4 later_1.0.0
[75] survival_3.1-8 tibble_2.1.3
[77] snow_0.4-3 iterators_1.0.10
[79] GenomicAlignments_1.22.1 memoise_1.1.0
[81] fitdistrplus_1.0-14 GSEABase_1.48.0
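A rough way to narrow this down: the "undefined columns" message lists names that were requested but are not among the columns of the feather file. The sketch below compares the two sets with base R. Both `query_regions` and `db_columns` are hypothetical stand-ins; in a real session they would come from the regions handed to the database lookup and from the column names of the feather file.

```r
# Hypothetical stand-ins: region IDs requested from the database and the
# column names actually present in the feather file.
query_regions <- c("chr1-reg496", "chr1-reg497", "chr1-reg10")
db_columns    <- c("features", "chr1-reg10", "chr1-reg11")

# Regions that would trigger "undefined columns" when used to index the database.
missing_regions <- setdiff(query_regions, db_columns)

# Regions that can be looked up safely.
usable_regions <- intersect(query_regions, db_columns)

cat(length(missing_regions), "of", length(query_regions),
    "requested regions are absent from the database\n")
print(missing_regions)
```

If most requested names turn out to be missing, a mismatch between the region set and the database version would be one thing to check.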
Hi,
I'm having this issue when plotting the heatmap with cellTopicHeatmap (as in this tutorial):
Code:
cellTopicHeatmap(cisTopicObject, method='Probability', colorBy=c('celltype', 'TREATMENT'), cluster_rows = FALSE, cluster_columns = FALSE)
Error message:
Error in ComplexHeatmap::Heatmap(data.matrix(topic.mat), col = colorPal(20), :
formal argument "cluster_columns" matched by multiple actual arguments
My suspicion is that the argument cluster_columns is forced to TRUE. This is an issue for me, since I would prefer to group columns by treatment or celltype rather than have them clustered.
Thank you.
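The error class itself can be reproduced in base R, which may help confirm the suspicion. The sketch below is not the cisTopic source: `inner_heatmap`, `wrapper_fixed` and `wrapper_flexible` are hypothetical functions, with `inner_heatmap` standing in for ComplexHeatmap::Heatmap. It shows that a wrapper which sets an argument itself and also forwards `...` fails when the caller passes the same argument, and one possible way around it.

```r
# Hypothetical inner function standing in for ComplexHeatmap::Heatmap.
inner_heatmap <- function(mat, cluster_columns = TRUE) {
  cluster_columns
}

# A wrapper that fixes cluster_columns itself AND forwards `...`.
wrapper_fixed <- function(mat, ...) {
  inner_heatmap(mat, cluster_columns = TRUE, ...)
}

# A wrapper that only sets a default when the caller did not supply one.
wrapper_flexible <- function(mat, ...) {
  args <- list(...)
  if (is.null(args$cluster_columns)) args$cluster_columns <- TRUE
  do.call(inner_heatmap, c(list(mat), args))
}

m <- matrix(1:4, nrow = 2)

# Passing cluster_columns through the first wrapper raises the same kind of error.
err <- tryCatch(wrapper_fixed(m, cluster_columns = FALSE),
                error = function(e) conditionMessage(e))
print(err)

# The flexible wrapper honours the caller's choice.
print(wrapper_flexible(m, cluster_columns = FALSE))
```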
I have installed all the dependencies cisTopic requires on R-3.5.2, but while trying to install the package from devtools or source I keep hitting this error on line 261 from an equal-sign operator.
Unsure if this is due to a change in the code-base causing some syntactical error on R.
> devtools::install_github("aertslab/cisTopic")
Downloading GitHub repo aertslab/cisTopic@master
Skipping 1 packages not available: text2vec
✔ checking for file ‘/tmp/Rtmp5FkfaK/remotese17e72b625e9/aertslab-cisTopic-3e3cd00/DESCRIPTION’ ...
─ preparing ‘cisTopic’:
✔ checking DESCRIPTION meta-information ...
─ checking for LF line-endings in source and make files and shell scripts
─ checking for empty or unneeded directories
─ looking to see if a ‘data/datalist’ file should be added
─ building ‘cisTopic_0.3.0.tar.gz’ (6.3s)
Installing package into ‘/usr/lib64/R/library’
(as ‘lib’ is unspecified)
I am working on Arabidopsis, which is not a UCSC-supported genome. Any ideas on how to deal with getting this info:
txdb <- TxDb.Hsapiens.UCSC.hg19.knownGene
getBigwigFiles(cisTopicObject, path='output/cisTopics_asBW', seqlengths=seqlengths(txdb))
and with other steps that need genome annotation info?
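For the sequence lengths specifically, `seqlengths(txdb)` returns a named integer vector, so one option might be to build such a vector by hand. The sketch below does that from a small chromosome sizes table. The names and lengths are placeholders, not real Arabidopsis values, and whether getBigwigFiles accepts a hand-built vector in place of `seqlengths(txdb)` is an assumption that has not been checked here.

```r
# Toy chromosome sizes table; names and lengths are placeholders, not real
# Arabidopsis values. A real table could come from a FASTA index (.fai) file.
chrom_sizes <- read.table(text = "
Chr1 3000
Chr2 2000
ChrM 150
", col.names = c("seqname", "length"), stringsAsFactors = FALSE)

# Named integer vector in the same shape that seqlengths(txdb) returns.
custom_seqlengths <- setNames(as.integer(chrom_sizes$length), chrom_sizes$seqname)
print(custom_seqlengths)
```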
Some files, like cisTopicObject_pbmc.Rds from the 10X 5k PBMC tutorial and the data/bamfiles/ folder from the melanoma tutorial, are missing.
Hi,
Would you guys mind sharing how to easily obtain the bulk ATAC narrowpeak files used in the PBMC tutorial as reference for enrichment testing (Corces et al. 2016 Nat Genet)?
Thanks,
Joe
Hello,
I was wondering whether it would be possible to upload some of the datasets used in the paper. Namely, I am interested in exploring the processed data (inputs to cisTopic) from the sections "Simulated epigenomes from FACS-sorted bulk ATAC-seq profiles from the hematopoietic system" and "scATAC-seq from FACS-sorted single-cell populations from the hematopoietic system"
Thanks!
I noticed that this package has been updated to version 0.3.0. However, the function ‘runCGSModels’, which is said to be equivalent to ‘runModels’ in version 0.2.1, returns an error.
More specifically, I ran the tutorial on simulated single-cell epigenomes from the melanoma cell line and replaced ‘runModels’ with ‘runCGSModels’. The error message is as follows:
cisTopicObject_tmp <- runCGSModels(cisTopicObject, topic=c(2, 5:15, 20, 25), seed=987, nCores=13, burnin = 120, iterations = 150, addModels=FALSE)
[1] "Formatting data..."
[1] "Exporting data..."
[1] "Running models..."
Error in do.ply(i) :
task 1 failed - "cannot coerce type 'closure' to vector of type 'double'"
Moreover, when I set ‘nCores=1’, I get a different error message:
cisTopicObject_tmp <- runCGSModels(cisTopicObject, topic=c(2, 5:15, 20, 25), seed=987, nCores=1, burnin = 120, iterations = 150, addModels=FALSE)
[1] "Formatting data..."
[1] "Running models..."
| | 0%
Error in lda.collapsed.gibbs.sampler(cellList, topic, regionList, num.iterations = iterations, :
object 'iterations' not found
So there is probably a bug in cisTopic v0.3.0. I hope these error messages help.
Hi, I want to access the estimated topic distributions and probabilities. How do I do this from a cisTopic fit? The documentation is not clear on this.
cisTopicObject<- getCistromes(cisTopicObject, annotation = 'Both', nCores=1)
Column 4 ['rep..TF_lowConf...length.peaks..'] of item 2 is missing in item 1. Use fill=TRUE to fill with NA (NULL for list columns), or use.names=FALSE to ignore column names. use.names='check' (default from v1.12.2) emits this message and proceeds as if use.names=FALSE for backwards compatibility. See news item 5 in v1.12.2 for options to control this message.
Column 4 ['rep..TF_lowConf...length.peaks..'] of item 2 is missing in item 1. Use fill=TRUE to fill with NA (NULL for list columns), or use.names=FALSE to ignore column names. use.names='check' (default from v1.12.2) emits this message and proceeds as if use.names=FALSE for backwards compatibility. See news item 5 in v1.12.2 for options to control this message.
Column 4 ['rep..TF_lowConf...length.peaks..'] of item 2 is missing in item 1. Use fill=TRUE to fill with NA (NULL for list columns), or use.names=FALSE to ignore column names. use.names='check' (default from v1.12.2) emits this message and proceeds as if use.names=FALSE for backwards compatibility. See news item 5 in v1.12.2 for options to control this message.
This message keeps repeating.....
Suggestions?
Thanks,
Hi,
Sorry, one more issue to open. Once again, following the 10x vignette.
plotFeatures(cisTopicObject_d0, method='tSNE', target='cell', topic_contr=NULL, colorBy=c('nCounts', 'nAcc','densityClust', 'graphBasedClusters_CRA'), cex.legend = 0.8,
factor.max=.75, dim=2, legend=TRUE, col.low='darkgreen', col.mid='yellow', col.high='brown1', intervals=10)
Error in plotFeatures(cisTopicObject_d0, method = "tSNE", target = "cell", :
The variable graphBasedClusters_CRA is not included in the cell data. Please check and re-run.
This is strange because I did run:
graphBasedClusters_CRA_d0 <- read.table(pathTographBasedClusters_CRA_d0, sep=',', header=TRUE, row.names = 1)
colnames(graphBasedClusters_CRA_d0) <- 'graphBasedClusters_CRA'
graphBasedClusters_CRA_d0[,1] <- as.factor(graphBasedClusters_CRA_d0[,1])
cisTopicObject_d0 <- addCellMetadata(cisTopicObject_d0, graphBasedClusters_CRA_d0)
which all ran just fine. Ideas would be appreciated.
Also,
cellTopicHeatmap(cisTopicObject_d0, method='Probability', colorBy=c('densityClust'))
Error in ComplexHeatmap::HeatmapAnnotation(df = object.cell.data[, colorBy, :
elements in `col` should be named vectors.
cisTopicObject_d0 <- addCellMetadata(cisTopicObject_d0, densityClust_d0)
ran without issues for this object.
Thank you,
Joe
Hi @cbravo93,
I have datasets from two conditions that I would like to compare. However, when I combined the peak count matrices and analyzed them with cisTopic, I noticed a batch effect. I'm thinking about projecting one sample into the existing topic space of the other.
I found this in your preprint: "Additionally, we projected the FAC-sorted single cell profiles (Optix-GFP+ and sens-GFP+) with at least 70% of the fragments within regulatory regions into the existing topic space. Briefly, the topic-cell distributions of the new cells were estimated by multiplying the binary count matrix (cell-regions) by the region-topic distributions of the existing models. The estimated topic-cell contributions were merged with the topic-cell distributions of the original cells, normalized (by Z-Score) and batch effects were corrected with Harmony (v1.0) [102]."
Could you share the script you used for this analysis?
Thanks.
Jason
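Going only by the quoted description, the projection step could look roughly like the base R sketch below. It is not the authors' script: the matrices are random toy data, all object names are hypothetical, and z-scoring per topic across cells is an assumption, since the quoted text does not say along which axis the normalization was done. The Harmony step is left out.

```r
set.seed(1)

# Toy binary accessibility matrix for the new cells (cells x regions).
new_cells <- matrix(rbinom(4 * 6, 1, 0.4), nrow = 4,
                    dimnames = list(paste0("cell", 1:4), paste0("region", 1:6)))

# Toy region-topic distributions from an existing model (regions x topics),
# rescaled so that each topic's column sums to 1.
region_topic <- matrix(runif(6 * 3), nrow = 6,
                       dimnames = list(paste0("region", 1:6), paste0("Topic", 1:3)))
region_topic <- sweep(region_topic, 2, colSums(region_topic), "/")

# Projection: (cells x regions) %*% (regions x topics) gives cells x topics.
projected <- new_cells %*% region_topic

# Z-score each topic across cells (assumed axis; see the note above).
projected_z <- scale(projected)
print(dim(projected))
```

In a real analysis the region order of the new count matrix would have to match the region order of the existing model before multiplying.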
Hi,
How do we obtain signatures for embryonic mouse kidneys when analyzing a scATAC dataset?
Thanks!
I am getting this error when trying to load an aggregation of two samples produced with the new CellRanger:
data_folder <- '/media/breunighp/SSD/ssrnaseq/scATAC/Aggr/filtered_peak_bc_matrix/'
metrics <- 'singlecell.csv'
cisTopicObjectt <- createcisTopicObjectFrom10Xmatrix(data_folder, metrics, project.name = "cisTopicProject")
Error in dimnamesGets(x, value) :
invalid dimnames given for “dgTMatrix” object
I can load it into cisTopic by using 10X's recommendation for creating a matrix and going through the workflow that way.
This is the first time I've had an issue with a 10X dataset, and I'm wondering whether it is due to the aggregation or to the new version of CellRanger.
Unfortunately, I am not able to access and download 'hg19-regions-9species.all_regions.mc8nr.feather' from 'https://resources.aertslab.org/cistarget/databases/homo_sapiens/hg19/refseq_r45/mc8nr/region_based/'. Could you please direct me to an alternate source or let me know when it will be made available?
Thanks,
Praveen
Hi,
selectModel fails when the model was fitted with runCGSModels using a single number as the topic argument (e.g. topic=c(30)).
The error message that I get is
Error in `$<-.data.frame`(`*tmp*`, "second_derivative", value = c(-Inf, :
replacement has 2 rows, data has 1
Calls: selectModel -> $<- -> $<-.data.frame
I also get an error when running runCGSModels with only two topic numbers (e.g. topic=c(29,30)), but then the error is different:
Error in plot.window(...) : need finite 'xlim' values
Calls: selectModel ... plot -> plot -> plot.default -> localWindow -> plot.window
In addition: Warning messages:
1: In min(x) : no non-missing arguments to min; returning Inf
2: In max(x) : no non-missing arguments to max; returning -Inf
3: In min(x) : no non-missing arguments to min; returning Inf
4: In max(x) : no non-missing arguments to max; returning -Inf
Execution halted
When I run the model with more than two topic numbers (e.g. topic=c(29,30,31)), it seems to work.
Best,
Wolfgang
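The first error mentions a "second_derivative" column, which suggests selectModel uses a second derivative of the log-likelihood across the tested topic numbers. As a general illustration of why that would need at least three models (this is plain finite differencing in base R, not the package's own code, and the log-likelihood values are made up): a second-order difference of n values has n - 2 entries.

```r
# Second-order finite difference of log-likelihoods over the tested topic numbers.
second_difference <- function(loglik) {
  diff(loglik, differences = 2)
}

# Made-up log-likelihood values, for illustration only.
one_model    <- second_difference(c(-1000))
two_models   <- second_difference(c(-1000, -950))
three_models <- second_difference(c(-1000, -950, -940))

# Number of second-difference values available in each case.
print(c(one = length(one_model), two = length(two_models),
        three = length(three_models)))
```

With one or two models there is nothing to compute a curvature from, which would be consistent with the empty-range warnings from min and max shown above.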
Good day!
I have been using this excellent package for analyzing tumor biology. Now I have another type of dataset with four samples: two are control cells collected at two time points (7 and 28 d), and the other two are treated cells collected at the same time points. I would like to know how to create a cisTopic object that contains all four datasets so that I can perform comparative analysis and topic modeling. I am interested in checking, in the same dimensional space, how the treatment affects chromatin accessibility (cell memory) and which features don't change.
I have checked your tutorial (cisTopic on simulated single-cell epigenomes from melanoma cell line), and it seems like an approach that could be used for this data. Still, I do not know how to create the object with the info for the four datasets, or whether it can be used for my purposes later in the downstream analysis.
Thank you in advance for your help!
Hi,
I am following the 5k PBMC tutorial. The line
cisTopicObject <- getSignaturesRegions(cisTopicObject, Bulk_ATAC_signatures, labels=labels, minOverlap = 0.4)
throws an error:
Error in getSignaturesRegions(cisTopicObject, Bulk_ATAC_signatures, labels = labels, :
There is at least a signature with the same label:Bcell, CD34-Bone-Marrow, CD34-Cord-Blood, CD4Tcell, CD8Tcell, Mono, NKcell. Please, rename it.
As far as I can tell, there are no duplicate labels in the signatures. Help would be appreciated!
sessionInfo()
R version 3.5.1 (2018-07-02)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: CentOS Linux 7 (Core)
Matrix products: default
BLAS: /sc/wo/app/R/v3.5.1/lib64/R/lib/libRblas.so
LAPACK: /sc/wo/app/R/v3.5.1/lib64/R/lib/libRlapack.so
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8 LC_PAPER=en_US.UTF-8
[8] LC_NAME=C LC_ADDRESS=C LC_TELEPHONE=C LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] parallel stats4 grid stats graphics grDevices utils datasets methods base
other attached packages:
[1] plyr_1.8.4 fitdistrplus_1.0-14 npsurv_0.4-0 lsei_1.2-0
[5] survival_2.44-1.1 MASS_7.3-51.4 AUCell_1.7.1 scatterplot3d_0.3-41
[9] plotly_4.9.0 ggplot2_3.1.1 Matrix_1.2-17 cisTopic_0.2.1
[13] BiocParallel_1.14.2 doParallel_1.0.14 iterators_1.0.10 foreach_1.4.4
[17] densityClust_0.3 org.Hs.eg.db_3.6.0 TxDb.Hsapiens.UCSC.hg19.knownGene_3.2.2 GenomicFeatures_1.34.8
[21] AnnotationDbi_1.44.0 Biobase_2.42.0 ChIPseeker_1.18.0 rGREAT_1.14.0
[25] GenomicRanges_1.34.0 GenomeInfoDb_1.18.2 IRanges_2.16.0 S4Vectors_0.20.1
[29] BiocGenerics_0.28.0 data.table_1.12.2 fastcluster_1.1.25 ComplexHeatmap_1.20.0
[33] Rtsne_0.15 umap_0.2.2.0 Rsubread_1.32.4
loaded via a namespace (and not attached):
[1] snow_0.4-3 circlize_0.4.6 fastmatch_1.1-0 igraph_1.2.2 lazyeval_0.2.2 GSEABase_1.42.0
[7] splines_3.5.1 feather_0.3.3 gridBase_0.4-7 urltools_1.7.3 digest_0.6.18 htmltools_0.3.6
[13] GOSemSim_2.8.0 viridis_0.5.1 GO.db_3.6.0 gdata_2.18.0 lda_1.4.2 magrittr_1.5
[19] memoise_1.1.0 Biostrings_2.48.0 annotate_1.60.1 matrixStats_0.54.0 R.utils_2.8.0 enrichplot_1.2.0
[25] prettyunits_1.0.2 colorspace_1.4-1 blob_1.1.1 ggrepel_0.8.0 dplyr_0.7.8 crayon_1.3.4
[31] RCurl_1.95-4.12 jsonlite_1.6 graph_1.60.0 bindr_0.1.1 glue_1.3.1 polyclip_1.10-0
[37] gtable_0.3.0 zlibbioc_1.28.0 XVector_0.22.0 UpSetR_1.3.3 GetoptLong_0.1.7 DelayedArray_0.8.0
[43] shape_1.4.4 scales_1.0.0 DOSE_3.8.2 DBI_1.0.0 Rcpp_1.0.1 plotrix_3.7-5
[49] viridisLite_0.3.0 xtable_1.8-4 progress_1.2.2 gridGraphics_0.4-1 reticulate_1.10 bit_1.1-14
[55] europepmc_0.3 DT_0.6 htmlwidgets_1.3 httr_1.4.0 fgsea_1.8.0 FNN_1.1.3
[61] gplots_3.0.1.1 RColorBrewer_1.1-2 R.methodsS3_1.7.1 pkgconfig_2.0.2 XML_3.98-1.19 farver_1.1.0
[67] later_0.7.5 ggplotify_0.0.3 tidyselect_0.2.5 rlang_0.3.4 reshape2_1.4.3 munsell_0.5.0
[73] tools_3.5.1 RSQLite_2.1.1 ggridges_0.5.1 stringr_1.4.0 yaml_2.2.0 bit64_0.9-7
[79] caTools_1.17.1.1 purrr_0.3.2 ggraph_1.0.2 bindrcpp_0.2.2 mime_0.6 R.oo_1.22.0
[85] DO.db_2.9 xml2_1.2.0 biomaRt_2.36.1 compiler_3.5.1 rstudioapi_0.10 tibble_2.1.1
[91] tweenr_1.0.1 stringi_1.2.4 lattice_0.20-38 pillar_1.4.0 triebeard_0.3.0 GlobalOptions_0.1.0
[97] cowplot_0.9.4 bitops_1.0-6 httpuv_1.4.5 rtracklayer_1.40.6 qvalue_2.14.1 R6_2.4.0
[103] promises_1.0.1 KernSmooth_2.23-15 gridExtra_2.3 RcisTarget_1.5.0 codetools_0.2-15 boot_1.3-22
[109] gtools_3.8.1 assertthat_0.2.1 SummarizedExperiment_1.10.1 rjson_0.2.20 withr_2.1.2 GenomicAlignments_1.18.1
[115] Rsamtools_1.32.3 GenomeInfoDbData_1.2.0 doSNOW_1.0.16 hms_0.4.2 tidyr_0.8.2 rvcheck_0.1.3
[121] ggforce_0.2.2 shiny_1.2.0
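Since the message lists every label, one possibility (an assumption, not confirmed anywhere in this thread) is that the labels collide with names already stored on the object, for example from an earlier run of the same step, rather than with each other. The sketch below checks both cases in base R. The `labels` vector is taken from the error message above; `existing_columns` is a hypothetical stand-in for whatever names are already present.

```r
# Labels from the error message; `existing_columns` is a hypothetical stand-in
# for names already stored on the object (for example after an earlier run).
labels <- c("Bcell", "CD34-Bone-Marrow", "CD34-Cord-Blood", "CD4Tcell",
            "CD8Tcell", "Mono", "NKcell")
existing_columns <- c("nCounts", "Bcell", "Mono")

# Duplicates within the label vector itself.
dup_within <- labels[duplicated(labels)]

# Labels that collide with names already present.
dup_existing <- intersect(labels, existing_columns)

print(dup_within)
print(dup_existing)
```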
Hi all,
cisTopic is a novel tool for analyzing single-cell ATAC data, which applies the latent Dirichlet allocation (LDA) algorithm to reduce dimensions. It was reported that cisTopic can specifically handle cells at low depth (around 3k per cell). This is very useful given the technical difficulties of single-cell experiments and sequencing. Here is my question: if I understand correctly, the cells in the simulated data in the cisTopic paper (cisTopic: cis-regulatory topic modeling on single-cell ATAC-seq data, Nature Methods 2019) are generally low depth (Sup. Figures 1, 2 and 4). But if a real dataset contains both high-depth and low-depth cells, will cisTopic treat them properly, without a bias for depth level?
My question comes from a concern about distinguishing broken cells from true low-depth cells. Some tools (like snapATAC and cellranger-atac) normalize all cells to their depth, and some encourage users to filter out low-depth cells beforehand. How do you suggest treating scATAC datasets that contain both high-depth and low-depth cells?