costalab / crosstalker


R package to do the Ligand Receptor Analysis Visualization

Home Page: http://costalab.github.io/CrossTalkeR

License: MIT License

R 4.05% Jupyter Notebook 95.95% CSS 0.01%

cell-cell-communication network single-cell

crosstalker's Introduction

Bioinformatics Lab RWTH Aachen SS '15

The course page can be found here: http://costalab.org/teaching/practical-course-in-bioinformatics

Discovering Context-Specific Sequencing Errors

Certain sequence contexts induce errors in next-generation sequencing reads, as detailed in our publications:

Manuel Allhoff, Alexander Schoenhuth, Marcel Martin, Ivan G. Costa, Sven Rahmann and Tobias Marschall. Discovering motifs that induce sequencing errors. BMC Bioinformatics (proceedings of RECOMB-seq), 2013, 14(Suppl 5):S1, DOI: 10.1186/1471-2105-14-S5-S1.

On this page, we maintain the source code of our program to discover error-causing motifs.

Dependencies

python 2.7 (tested with Python 2.7.3)
HTSeq
pysam
scipy
rpy2
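
A quick way to confirm that the packages listed above can be imported is sketched below. The helper name check_imports is hypothetical, and the import names are assumed to be identical to the package names in the list.

```python
def check_imports(module_names):
    """Return a dict mapping each module name to True if it can be imported."""
    status = {}
    for name in module_names:
        try:
            __import__(name)
            status[name] = True
        except ImportError:
            status[name] = False
    return status


if __name__ == "__main__":
    # Import names assumed to equal the package names from the list above.
    result = check_imports(["HTSeq", "pysam", "scipy", "rpy2"])
    for name in sorted(result):
        print("%-6s %s" % (name, "ok" if result[name] else "MISSING"))
```

The sketch only reports whether an import succeeds; it does not check package versions.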

Pipeline Dependencies TODO

Cluster Dependencies

python 2.6 (tested with Python 2.6.6)
HTSeq
pysam
scipy 0.15
numpy > 1.5.1
rpy2

Installation

install required python packages (see Dependencies)
checkout the code from the git repository

Analysis

Run

discovering_cse.py -h

to show the help information.

Our tool considers exactly one chromosome in the genome for the analysis. The default chromosome is 'chr1'. You can change it with the option -c. If the genome does not have chromosomes (for example, E. coli or B. subtilis genomes), you do not have to use the -c option.

We run the tool, for instance, with the following command:

discovering_cse.py hg19.fasta experiment.bam 6 1 -d 0 -c chr10 > results.data

Here, we consider chromosome 10 (-c chr10) of the human genome (hg19.fasta) and search for 6-grams with one allowed N. The analysis is based on the aligned reads contained in 'experiment.bam'. Moreover, we do not filter the output (-d 0), which is stored in 'results.data'.
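
To make "6-gram with one allowed N" concrete, the sketch below treats N as a wildcard that matches any base and counts the occurrences of such a motif in a toy sequence. The function name and the toy sequence are illustrative only; this is not the search implemented in discovering_cse.py, which works on the reference genome and the aligned reads.

```python
def count_motif(sequence, motif):
    """Count (possibly overlapping) occurrences of motif; 'N' matches any base."""
    q = len(motif)
    hits = 0
    for start in range(len(sequence) - q + 1):
        window = sequence[start:start + q]
        if all(m == "N" or m == base for m, base in zip(motif, window)):
            hits += 1
    return hits


toy = "GGCCAATCTTCCAGTCAA"
print(count_motif(toy, "CCANTC"))  # → 2 (CCAATC and CCAGTC)
```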

The output results.data is a tab delimited text file and looks like:

Sequence	Occurrence	Forward Match	Backward Match	Forward Mismatch	Backward Mismatch	Strand Bias Score	FER (Forward Error Rate)	RER (Reverse Error Rate)	ERD (Error rate Difference)
CCANTC	12384	35	37	331	9	12.3345029322	0.894557485622	0.222222222222	0.520215753219

We obtain the 6-gram CCANTC, which occurs 12384 times in the genome. It gives the following 2x2 contingency table:

	Match	Mismatch
Forward	35	331
Backward	37	9

This table corresponds to a strand bias score of 12.3345029322.
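
The mapping from an output row to its contingency table can be sketched as follows. The column names are taken from the header shown above; the function name contingency_tables is hypothetical, and the sample row is embedded as a string so that the sketch runs without a results file.

```python
import csv
import io

SAMPLE = (
    "Sequence\tOccurrence\tForward Match\tBackward Match\t"
    "Forward Mismatch\tBackward Mismatch\tStrand Bias Score\t"
    "FER (Forward Error Rate)\tRER (Reverse Error Rate)\t"
    "ERD (Error rate Difference)\n"
    "CCANTC\t12384\t35\t37\t331\t9\t12.3345029322\t"
    "0.894557485622\t0.222222222222\t0.520215753219\n"
)


def contingency_tables(handle):
    """Yield (motif, [[fwd_match, fwd_mismatch], [bwd_match, bwd_mismatch]], score)."""
    for row in csv.DictReader(handle, delimiter="\t"):
        table = [
            [int(row["Forward Match"]), int(row["Forward Mismatch"])],
            [int(row["Backward Match"]), int(row["Backward Mismatch"])],
        ]
        yield row["Sequence"], table, float(row["Strand Bias Score"])


for motif, table, score in contingency_tables(io.StringIO(SAMPLE)):
    print(motif, table, score)
```

For a real run, the same function could be given an open file handle for results.data instead of the in-memory sample.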

Align, statistics, SNP calling

This step must be performed before the actual CSE discovery, when BAM files are not available.

The script bio_pipeline.sh allows an easy and almost-painless alignment procedure, as well as statistics generation and finally SNP calling via GATK.

At the very least, it requires two files:

a FASTA reference genome (option -ref) (this MUST be in the current directory)
a SRA reads file (option -sra) (this can be in any directory, or even "fake" with option -sra-nocheck)

bio_pipeline.sh -ref sequence.fasta -sra reads.sra

Without other options, the script will:

assume the organism is haploid (i.e., has only one set of chromosomes)
extract the SRA file to FASTQ files and remove (in simple cases) the adaptor-only files; this step also infers whether the reads are paired-end or single-end
align the reads to the reference genome, using bwa-backtrack with default options
convert SAM file to BAM
remove duplicates, re-align near indels, produce a clean BAM file
generate plenty of statistics for the final BAM file
run GATK SNP calling and print the final number of SNPs

Note: the name of the SRA file determines the prefix for the fastq files, the vcf file and so on.

All the files are kept: it is up to you to delete the temporary files, or the unnecessary ones. A log is generated to output.log, containing both the stdout and stderr produced during the execution of bio_pipeline.sh.

The bare command presented above is OK for Illumina short-reads, either paired or single ends.

Among the options, some are notable:

-mem: use bwa-mem instead of backtrack (very important for Ion Torrent, 454)
-mem-pacbio: a special option for PacBio, instead of the previous one (adds "-x pacbio" to the bwa mem options)
-mem-ont2d: a special option for Oxford Nanopore reads (it simply adds "-x ont2d" to the bwa mem options)
-dbq 0: in the (rare) case that some reads have missing qualities for certain bases, set the quality of the missing bases to 0 instead of discarding the read. Might be useful for 454 reads
-nofix: GATK will complain and terminate if some reads' mapping quality is over ~60. This option lets GATK keep going. You are advised to first check whether GATK was right to complain (e.g., because the quality encoding is non-standard)
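
The aligner-related options above can be summarised as a small lookup. This is a sketch only: the platform labels used as dictionary keys and the helper name pipeline_command are hypothetical, while the option names themselves come from the list above.

```python
# Platform labels are illustrative; the options are those listed above.
PLATFORM_OPTIONS = {
    "illumina": [],               # default: bwa-backtrack
    "iontorrent": ["-mem"],
    "454": ["-mem"],
    "pacbio": ["-mem-pacbio"],
    "nanopore": ["-mem-ont2d"],
}


def pipeline_command(reference, sra, platform="illumina"):
    """Build the bio_pipeline.sh argument list for a sequencing platform."""
    key = platform.lower()
    if key not in PLATFORM_OPTIONS:
        raise ValueError("unknown platform: %r" % platform)
    return ["bio_pipeline.sh", "-ref", reference, "-sra", sra] + PLATFORM_OPTIONS[key]


print(" ".join(pipeline_command("sequence.fasta", "reads.sra", "pacbio")))
# → bio_pipeline.sh -ref sequence.fasta -sra reads.sra -mem-pacbio
```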

Examples

Remove technical reads

Many SRA files contain multiple sequences, all dumped when using fastq-dump --split-files reads.sra. The cases we have experienced are:

4 files: it means reads_1.fastq is technical, reads_2.fastq is not technical, reads_3.fastq is technical, reads_4.fastq is not technical. The first and third file must be removed, the second and fourth must be used in a paired-end alignment
3 files: it means reads_1.fastq is not technical, reads_2.fastq is technical, reads_3.fastq is not technical. The second file must be removed and the other two be used in a paired end alignment
2 files: either they are both non technical, or reads_1.fastq is technical while reads_2.fastq is not. In the first case, both must be used in paired end alignment. In the second case, reads_1.fastq must be removed and reads_2.fastq be used in a single-end alignment

The first two cases (4 files and 3 files) are handled automatically. In the third case (2 files), the script assumes by default that both files contain biological reads and uses them in a paired-end alignment. With the option -fs, however, it removes reads_1.fastq, renames reads_2.fastq to reads.fastq, and uses it in a single-end alignment.

bio_pipeline.sh -ref sequence.fasta -sra reads.sra -fs
# the only fastq file present here is reads.fastq
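
The selection rules described in this section can be written down as a short function. This is a sketch of the rules as stated in the text, not the code of bio_pipeline.sh; the function name is hypothetical, the single-file branch is an assumption (the text only lists the 4-, 3- and 2-file cases), and the renaming to reads.fastq done by -fs is not reproduced.

```python
def select_biological_reads(fastq_files, force_single=False):
    """Return (files_to_align, mode) following the cases described above.

    fastq_files: names produced by `fastq-dump --split-files`, in order.
    force_single: mirrors the -fs option for the two-file case.
    """
    n = len(fastq_files)
    if n == 4:
        # files 1 and 3 are technical
        return [fastq_files[1], fastq_files[3]], "paired"
    if n == 3:
        # file 2 is technical
        return [fastq_files[0], fastq_files[2]], "paired"
    if n == 2:
        if force_single:
            # file 1 is treated as technical
            return [fastq_files[1]], "single"
        return list(fastq_files), "paired"
    if n == 1:
        # assumption: a single file is a plain single-end run
        return list(fastq_files), "single"
    raise ValueError("unexpected number of FASTQ files: %d" % n)


files = ["reads_1.fastq", "reads_2.fastq", "reads_3.fastq"]
print(select_biological_reads(files))
# → (['reads_1.fastq', 'reads_3.fastq'], 'paired')
```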

FASTQ files already present

Suppose you are using FASTQ files directly instead of SRA files. In that case, you need to skip the SRA file check, or the script won't run. Exactly one or two FASTQ files must be present, never fewer or more.

bio_pipeline.sh -ref sequence.fasta -sra reads.sra -sra-nocheck

Note how you still need to specify the sra file name. This is used to extract the prefix (in this case, "reads") so as to fetch the correct fastq files.

# The above command will look, at least, for:
reads_1.fastq
# If the following is also present, it will proceed with
# a paired-end alignment. Otherwise, it will only use the first
# and proceed to single-end alignment
reads_2.fastq

If you have single-end fastq files in the more compact format "reads.fastq", you just need to add the option -fs.

bio_pipeline.sh -ref sequence.fasta -sra reads.sra -sra-nocheck -fs

This will look for reads.fastq and, if found, proceed with a single-end alignment.
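
The file lookup described in this section can be sketched as follows. It mirrors the behaviour as described in the text (prefix taken from the SRA file name, then reads_1.fastq and optionally reads_2.fastq, or reads.fastq with -fs); the function name and its parameters are hypothetical and not part of bio_pipeline.sh.

```python
import os
import tempfile


def find_fastq(sra_name, directory=".", single_file=False):
    """Return (fastq_paths, mode) according to the lookup described above.

    single_file mirrors -fs: look for <prefix>.fastq instead of
    <prefix>_1.fastq / <prefix>_2.fastq.
    """
    prefix = os.path.splitext(os.path.basename(sra_name))[0]
    if single_file:
        candidates = [prefix + ".fastq"]
    else:
        candidates = [prefix + "_1.fastq", prefix + "_2.fastq"]
    paths = [os.path.join(directory, name) for name in candidates]
    if not os.path.isfile(paths[0]):
        raise FileNotFoundError("missing FASTQ file: %s" % paths[0])
    found = [path for path in paths if os.path.isfile(path)]
    return found, ("paired" if len(found) == 2 else "single")


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        open(os.path.join(tmp, "reads_1.fastq"), "w").close()
        print(find_fastq("reads.sra", tmp)[1])  # → single
```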


crosstalker's Issues

Dependency ‘Rmagic’ is no longer available

"Package ‘Rmagic’ was removed from the CRAN repository."
See message here: https://cran.r-project.org/web/packages/Rmagic/index.html

When I try to install CrossTalkeR, I get the following related error:

Downloading GitHub repo CostaLab/CrossTalkeR@HEAD
Skipping 1 packages not available: Rmagic
── R CMD build ────────────────────────────────────────────────────────────────────────────────────────────────
✔ checking for file ‘/private/var/folders/zd/k6yfv3zs6n16n5x92kl74vtr0000gn/T/RtmpodZlpM/remotes28201ad4d828/CostaLab-CrossTalkeR-894b346/DESCRIPTION’ ...
─ preparing ‘CrossTalkeR’:
✔ checking DESCRIPTION meta-information ...
─ installing the package to build vignettes
-----------------------------------
ERROR: dependency ‘Rmagic’ is not available for package ‘CrossTalkeR’
─ removing ‘/private/var/folders/zd/k6yfv3zs6n16n5x92kl74vtr0000gn/T/RtmpJBUMLD/Rinst2cc02f5748e1/CrossTalkeR’
-----------------------------------
ERROR: package installation failed
Error: Failed to install 'CrossTalkeR' from GitHub:
! System command 'R' failed

I tried installing from what I believe to be the related GitHub repository (https://github.com/cran/Rmagic), but was unsuccessful. Is there a workaround, or are there plans to remove CrossTalkeR's dependency on Rmagic?

install error: Duplicate vignette titles

When I try to install the current version, I get this error:

E creating vignettes (2m 2.1s)
duplicated vignette title:
‘CrossTalkeR-HumanMyfib’

--- re-building ‘CrossTalkeR.Rmd’ using rmarkdown
--- finished re-building ‘CrossTalkeR.Rmd’

--- re-building ‘CrossTalker_install_basicusage.rmd’ using rmarkdown
--- finished re-building ‘CrossTalker_install_basicusage.rmd’

--- re-building ‘HumanFibrosis.rmd’ using rmarkdown
Warning: ggrepel: 62 unlabeled data points (too many overlaps). Consider increasing max.overlaps
--- finished re-building ‘HumanFibrosis.rmd’

--- re-building ‘run_liana.rmd’ using rmarkdown
--- finished re-building ‘run_liana.rmd’

Error: Duplicate vignette titles.
Ensure that the %\VignetteIndexEntry lines in the vignette sources
correspond to the vignette titles.
Execution halted
Error: Failed to install 'CrossTalkeR' from GitHub:
! System command 'R' failed
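
Since the message points at the %\VignetteIndexEntry lines, one way to locate the clash is to scan the vignette sources for index entries that occur more than once. The sketch below assumes the sources are .Rmd/.rmd files in a single directory (for example a local checkout's vignettes folder, which is an assumption here); the function name is hypothetical.

```python
import collections
import os
import re
import tempfile

# Matches the literal text %\VignetteIndexEntry{Some title}
INDEX_ENTRY = re.compile(r"%\\VignetteIndexEntry\{([^}]*)\}")


def duplicate_vignette_titles(vignette_dir):
    """Map each index-entry title used more than once to the files using it."""
    files_by_title = collections.defaultdict(list)
    for name in sorted(os.listdir(vignette_dir)):
        if not name.lower().endswith(".rmd"):
            continue
        path = os.path.join(vignette_dir, name)
        with open(path, encoding="utf-8") as handle:
            for title in INDEX_ENTRY.findall(handle.read()):
                files_by_title[title].append(name)
    return {t: f for t, f in files_by_title.items() if len(f) > 1}


if __name__ == "__main__":
    # Toy directory with two files sharing one title (file names are made up).
    with tempfile.TemporaryDirectory() as tmp:
        for name in ("a.Rmd", "b.rmd"):
            with open(os.path.join(tmp, name), "w", encoding="utf-8") as out:
                out.write("%\\VignetteIndexEntry{CrossTalkeR-HumanMyfib}\n")
        print(duplicate_vignette_titles(tmp))
```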

igraph error

Hello,

I was running your software on my data when I came across the following error (below). How do I resolve this ?

R[write to console]: Reading Files

CTR
"filtered_corrected.csv"
[1] 6
EXP
"filtered_corrected.csv"
[1] 6
R[write to console]: Create a Differential Table

R[write to console]: Error in igraph::E<-(*tmp*, value = *vtmp*) : invalid indexing
Calls: ... generate_report -> create_diff_table_wip ->

R[write to console]: In addition:
R[write to console]: Warning message:

R[write to console]: In max(table(final_data$cellpair)) :
R[write to console]:

R[write to console]: no non-missing arguments to max; returning -Inf

Problematic cell type names

Hi! I'm trying to run CrossTalkeR and got this error message:

Error in dimnames(x) <- dn: length of 'dimnames' [1] not equal to array extent
Traceback:

generate_report(paths, genes, sel_columns = c("Ligand", "Receptor",
. "Ligand.Cluster", "Receptor.Cluster", "isReceptor_fst", "isReceptor_scn",
. "MeanLR"), out_path = paste0(output, "/"), threshold = 0,
. out_file = "vignettes_example.html", output_fmt = "html_document",
. report = TRUE)
read_lr_single_condition(lrpaths, sel_columns, out_path, sep = ",",
. colors)
`rownames<-`(`*tmp*`, value = sort(unif_celltypes))

I went into the functions that read the data and I guess there's a problem with underscores in cell type names as the underscore is also used for separating cell type names when naming cell type pairs. Am I right?
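
The suspicion can be illustrated with a toy example. Whether CrossTalkeR actually builds and splits pair names this way is the hypothesis raised in this report and is not confirmed here; the cell type names and both helper functions are made up for illustration.

```python
def split_pair(pair_name, sep="_"):
    """Naively split a 'source<sep>target' label on every separator."""
    return pair_name.split(sep)


def make_safe(name, sep="_", replacement="-"):
    """Replace the separator character inside a single cell type name."""
    return name.replace(sep, replacement)


# Hypothetical cell type names; "T_reg" contains the separator itself.
print(split_pair("Fibroblast" + "_" + "Tcell"))             # → ['Fibroblast', 'Tcell']
print(split_pair("T_reg" + "_" + "Fibroblast"))             # → ['T', 'reg', 'Fibroblast']
print(split_pair(make_safe("T_reg") + "_" + "Fibroblast"))  # → ['T-reg', 'Fibroblast']
```

If the hypothesis holds, renaming cell types so that they no longer contain the separator before exporting the ligand-receptor table would avoid the ambiguous split.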

CellPhoneDB Analysis Hangs up in Python

Hello,

I am trying to generate the liana report. Firstly, I am a bit confused because I can't find in the CrossTalkeR documentation or the liana documentation where it explains how to generate a report for use in CrossTalkeR by using R and not python.

Anyway, when I run the following code block from the HumanFibrosis.rmd vignette, with the fields from my object in place of the generic listing:

for i in set(data.obs['condition']):
    print(i)
    lr=li.method.cellphonedb(data[data.obs['condition']==i],
                          groupby='cell.type',
                          expr_prop=0.1,
                          verbose=True,
                          resource_name='consensus',
                          inplace=False)
    lr.to_csv(f"{i}_lr_liana_consensus_unfiltered.csv")

The python module just hangs up and nothing happens. Any advice would be appreciated.

Error: You should have at least two distinct break values.

Hi. Thanks for this great tool.
I am trying to get through your last code from the cellphonedb tutorial and cannot figure out my problem.
I am comparing 2 cellphonedb-generated files with data successfully extracted as LR (s1a and s2a) and have successfully initiated the code using:

data <- generate_report(paths,
genes=genes1,
out_path='~/Desktop/',
threshold = 0,
out_file='All_DG.html' )

where genes1 = c("TGFB1", "TNF")

Here is what I get:

Reading Files
CTR
"s1a_filtered_corrected.csv"
EXP
"s2a_filtered_corrected.csv"
Create a Differential Table
Calculating CCI Ranking
EXP_x_CTR

Calculating GCI Ranking
EXP_x_CTR
Annotating the top Cell Genes
'select()' returned 1:many mapping between keys and columns
'select()' returned 1:many mapping between keys and columns
'select()' returned 1:many mapping between keys and columns
'select()' returned 1:many mapping between keys and columns
'select()' returned 1:many mapping between keys and columns
'select()' returned 1:many mapping between keys and columns
'select()' returned 1:many mapping between keys and columns
'select()' returned 1:many mapping between keys and columns
'select()' returned 1:many mapping between keys and columns
'select()' returned 1:many mapping between keys and columns
'select()' returned 1:many mapping between keys and columns
'select()' returned 1:many mapping between keys and columns
'select()' returned 1:many mapping between keys and columns
'select()' returned 1:many mapping between keys and columns
'select()' returned 1:many mapping between keys and columns
'select()' returned 1:many mapping between keys and columns
'select()' returned 1:many mapping between keys and columns
'select()' returned 1:1 mapping between keys and columns
'select()' returned 1:many mapping between keys and columns
'select()' returned 1:many mapping between keys and columns
'select()' returned 1:many mapping between keys and columns
'select()' returned 1:many mapping between keys and columns
'select()' returned 1:many mapping between keys and columns
'select()' returned 1:1 mapping between keys and columns
'select()' returned 1:many mapping between keys and columns
'select()' returned 1:1 mapping between keys and columns
'select()' returned 1:many mapping between keys and columns
'select()' returned 1:many mapping between keys and columns
'select()' returned 1:many mapping between keys and columns
'select()' returned 1:many mapping between keys and columns
'select()' returned 1:many mapping between keys and columns
'select()' returned 1:many mapping between keys and columns
Defining templates
Generating Report
Preparing Single Phenotype Report
Printing CCI CTR
Printing CCI EXP
Preparing Comparative Phenotype Report
Printing CCI EXP_x_CTR
Quitting from lines 133-145 (./Comparative_Condition_cgi.Rmd)
Quitting from lines 79-90 (./Comparative_Condition_cgi.Rmd)
Error: You should have at least two distinct break values.
In addition: Warning messages:
1: In eattrs[[name]][index] <- value :
number of items to replace is not a multiple of replacement length
2: In eattrs[[name]][index] <- value :
number of items to replace is not a multiple of replacement length
3: Removed 1290 rows containing missing values (geom_label_repel).
4: Removed 1291 rows containing missing values (geom_label_repel).
Error: You should have at least two distinct break values.

It looks to me like most of the code has run successfully. I have tried adding colors (2 colors or 7 colors), changing the number of genes to examine (3 genes, 2 genes, 1 gene). I would really appreciate your help with this.
Thanks!

Error: processing vignette 'CrossTalkeR.Rmd' failed with diagnostics:

Hi All,
I had an error when creating vignettes. Please see the error below. Any idea how to fix it?
Thanks!

── R CMD build ───────────────────────────────────────────────────────────────────────────────────────────────────
✔ checking for file 'C:\Users\Yan Liu\AppData\Local\Temp\RtmpkxR5Qo\remotes63a05faf5839\CostaLab-CrossTalkeR-312e957/DESCRIPTION' ...
─ preparing 'CrossTalkeR':
✔ checking DESCRIPTION meta-information ...
─ installing the package to build vignettes
E creating vignettes (57.9s)
--- re-building 'CrossTalkeR.Rmd' using rmarkdown
Quitting from lines 45-71 (./Comparative_Condition_cci.Rmd)
Error: processing vignette 'CrossTalkeR.Rmd' failed with diagnostics:
Length of new attribute value must be 1 or 25, the number of target edges, not 24
--- failed re-building 'CrossTalkeR.Rmd'

SUMMARY: processing the following file failed:
'CrossTalkeR.Rmd'

Error: Vignette re-building failed.
Execution halted
Error: Failed to install 'CrossTalkeR' from GitHub:
! System command 'Rcmd.exe' failed

Error in reading file while using vignette

Hi,
I was trying to run the vignette on my data (two samples - primary tumor and metastatic) and I am getting following error:

Error in read.table(file = file, header = header, sep = sep, quote = quote, :
no lines available in input
In addition: Warning message:
In file(file, "rt") :
file("") only supports open = "w+" and open = "w+b": using the former

My data contains read counts- genes are rows and cells as columns.

Group comparison

Hi, I am trying to run the analysis comparing disease to control, but I keep receiving this error:

R[write to console]: Error in prcomp.default(all_both, center = TRUE, scale = TRUE) :
cannot rescale a constant/zero column to unit variance

Error in prcomp.default(all_both, center = TRUE, scale = TRUE) :
cannot rescale a constant/zero column to unit variance

Any advice on what I may be doing wrong?

Thanks!
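
The message says that a column cannot be rescaled to unit variance because it is constant: scaling divides each column by its standard deviation, which is zero in that case. A minimal sketch for spotting such columns before a scaled PCA follows. The table layout and the column names are hypothetical, since the content of all_both is not shown in this report.

```python
def constant_columns(table):
    """Return the names of columns whose values are all identical."""
    return [name for name, values in table.items() if len(set(values)) <= 1]


# Toy data with hypothetical column names; "pair_b" is constant.
toy = {"pair_a": [0.1, 0.4, 0.3], "pair_b": [0.0, 0.0, 0.0]}
print(constant_columns(toy))  # → ['pair_b']
```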

Error when compare two group

Hello, when I run the function named generate_report, this error happened. Thank you.

generate_report_liana fails when reading data

I understand that the integration of CrossTalkeR with LIANA is still under development, but I am excited to use this combination of packages. I have run LIANA, aggregated the results, and used the aggregated output as input for CrossTalkeR. I am getting the following error:

Reading Data
Error in UseMethod("select") : 
  no applicable method for 'select' applied to an object of class "character"

Below is the code I used to generate the LIANA object and load it into generate_report_liana:

csf2 <- readRDS("./csf2.rds")
csf.only <- subset(csf2, subset = location == "CSF")
DefaultAssay(csf.only) <- "RNA"
#save cell cluster names as metadata to help squidpy run
csf.only[["clust.names"]] <- Idents(object = csf.only)
csf.only <- SetIdent(csf.only, value = "clust.names")
# Run liana
liana_test <- liana_wrap(csf.only,
                         method = c("sca", "natmi", "logfc", "connectome",
                                    "cellchat", "squidpy", "cellphonedb"),
                         resource = c('OmniPath'),
                         # CellChat requires normalized data
                         cellchat.params = list(.normalize=TRUE))  
saveRDS(liana_test, "./liana_test.rds")
liana_test %>% glimpse()
con.ranks <- liana_test %>% liana_aggregate()

#plotting and celltalkeR analysis
test.analy <- generate_report_liana(
  con.ranks,
  genes = NULL,
  out_path = "~/Documents/Research Projects/scRNA_seq_interact_preditct/",
  threshold = 0,
  colors = NULL,
  out_file = NULL,
  report = "test_liana.html",
  output_fmt = "html_document",
  sel_columns = c("source", "ligand", "target", "receptor", "aggregate_rank")
)

sessionInfo()

R version 4.1.0 (2021-05-18)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Big Sur 11.5.2

Matrix products: default
LAPACK: /Library/Frameworks/R.framework/Versions/4.1/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] CrossTalkeR_1.2.1           forcats_0.5.1               stringr_1.4.0              
 [4] dplyr_1.0.7                 purrr_0.3.4                 readr_2.0.2                
 [7] tibble_3.1.5                ggplot2_3.3.5               liana_0.0.1                
[10] SingleCellExperiment_1.14.1 SummarizedExperiment_1.22.0 Biobase_2.52.0             
[13] GenomicRanges_1.44.0        GenomeInfoDb_1.28.4         IRanges_2.26.0             
[16] S4Vectors_0.30.2            BiocGenerics_0.38.0         MatrixGenerics_1.4.3       
[19] matrixStats_0.61.0          SeuratObject_4.0.2         

loaded via a namespace (and not attached):
  [1] statnet.common_4.5.0      rsvd_1.0.5                colorBlindness_0.1.9     
  [4] ica_1.0-2                 svglite_2.0.0             ps_1.6.0                 
  [7] foreach_1.5.1             lmtest_0.9-38             rprojroot_2.0.2          
 [10] crayon_1.4.1              spatstat.core_2.3-0       MASS_7.3-54              
 [13] nlme_3.1-152              backports_1.2.1           reprex_2.0.0             
 [16] GOSemSim_2.18.1           rlang_0.4.11              XVector_0.32.0           
 [19] ROCR_1.0-11               readxl_1.3.1              irlba_2.3.3              
 [22] callr_3.7.0               limma_3.48.1              scater_1.18.6            
 [25] BiocParallel_1.26.2       rjson_0.2.20              bit64_4.0.5              
 [28] glue_1.4.2                rngtools_1.5.2            sctransform_0.3.2        
 [31] processx_3.5.2            vipor_0.4.5               spatstat.sparse_2.0-0    
 [34] AnnotationDbi_1.54.1      Rmagic_2.0.3              CellChat_1.1.3           
 [37] DOSE_3.18.3               spatstat.geom_2.2-2       haven_2.4.1              
 [40] tidyselect_1.1.1          usethis_2.0.1             fitdistrplus_1.1-6       
 [43] tidyr_1.1.4               zoo_1.8-9                 xtable_1.8-4             
 [46] magrittr_2.0.1            scuttle_1.2.1             cli_3.0.1                
 [49] zlibbioc_1.38.0           rstudioapi_0.13           miniUI_0.1.1.1           
 [52] logger_0.2.1              rpart_4.1-15              fastmatch_1.1-3          
 [55] treeio_1.16.2             shiny_1.7.1               BiocSingular_1.8.1       
 [58] clue_0.3-59               pkgbuild_1.2.0            cluster_2.1.2            
 [61] tidygraph_1.2.0           KEGGREST_1.32.0           ggrepel_0.9.1            
 [64] ape_5.5                   listenv_0.8.0             Biostrings_2.60.1        
 [67] png_0.1-7                 future_1.22.1             withr_2.4.2              
 [70] bitops_1.0-7              ggforce_0.3.3             plyr_1.8.6               
 [73] cellranger_1.1.0          dqrng_0.3.0               coda_0.19-4              
 [76] pillar_1.6.3              GlobalOptions_0.1.2       cachem_1.0.6             
 [79] fs_1.5.0                  GetoptLong_1.0.5          clusterProfiler_4.0.5    
 [82] DelayedMatrixStats_1.14.3 vctrs_0.3.8               ellipsis_0.3.2           
 [85] generics_0.1.0            devtools_2.4.2            NMF_0.23.0               
 [88] tools_4.1.0               beeswarm_0.4.0            munsell_0.5.0            
 [91] tweenr_1.0.2              fgsea_1.18.0              DelayedArray_0.18.0      
 [94] fastmap_1.1.0             compiler_4.1.0            pkgload_1.2.1            
 [97] abind_1.4-5               httpuv_1.6.3              sessioninfo_1.1.1        
[100] pkgmaker_0.32.2           plotly_4.10.0             GenomeInfoDbData_1.2.6   
[103] gridExtra_2.3             edgeR_3.34.0              ggnewscale_0.4.5         
[106] lattice_0.20-44           deldir_1.0-2              utf8_1.2.2               
[109] later_1.3.0               RobustRankAggreg_1.1      jsonlite_1.7.2           
[112] scales_1.1.1              ScaledMatrix_1.0.0        tidytree_0.3.5           
[115] pbapply_1.5-0             sparseMatrixStats_1.4.2   lazyeval_0.2.2           
[118] promises_1.2.0.1          doParallel_1.0.16         goftest_1.2-3            
[121] spatstat.utils_2.2-0      reticulate_1.22           sna_2.6                  
[124] checkmate_2.0.0           cowplot_1.1.1             statmod_1.4.36           
[127] Rtsne_0.15                downloader_0.4            uwot_0.1.10              
[130] igraph_1.2.6              survival_3.2-11           yaml_2.2.1               
[133] systemfonts_1.0.2         htmltools_0.5.2           memoise_2.0.0            
[136] Seurat_4.0.4              locfit_1.5-9.4            graphlayouts_0.7.1       
[139] here_1.0.1                viridisLite_0.4.0         digest_0.6.28            
[142] assertthat_0.2.1          mime_0.12                 rappdirs_0.3.3           
[145] registry_0.5-1            RSQLite_2.2.7             yulab.utils_0.0.4        
[148] future.apply_1.8.1        remotes_2.4.0             data.table_1.14.2        
[151] blob_1.2.1                ggsci_2.9                 splines_4.1.0            
[154] Cairo_1.5-12.2            RCurl_1.98-1.5            broom_0.7.8              
[157] hms_1.1.1                 modelr_0.1.8              colorspace_2.0-2         
[160] ggbeeswarm_0.6.0          shape_1.4.6               aplot_0.1.1              
[163] Rcpp_1.0.7                RANN_2.6.1                circlize_0.4.13          
[166] enrichplot_1.12.2         fansi_0.5.0               tzdb_0.1.2               
[169] parallelly_1.28.1         R6_2.5.1                  grid_4.1.0               
[172] ggridges_0.5.3            lifecycle_1.0.1           bluster_1.2.1            
[175] curl_4.3.2                leiden_0.3.9              testthat_3.0.3           
[178] DO.db_2.9                 Matrix_1.3-4              qvalue_2.24.0            
[181] desc_1.3.0                RcppAnnoy_0.0.19          org.Hs.eg.db_3.13.0      
[184] RColorBrewer_1.1-2        iterators_1.0.13          htmlwidgets_1.5.4        
[187] beachmat_2.8.1            polyclip_1.10-0           network_1.17.1           
[190] shadowtext_0.0.9          gridGraphics_0.5-1        rvest_1.0.0              
[193] ComplexHeatmap_2.8.0      mgcv_1.8-36               globals_0.14.0           
[196] patchwork_1.1.1           codetools_0.2-18          lubridate_1.7.10         
[199] GO.db_3.13.0              FNN_1.1.3                 metapod_1.0.0            
[202] gtools_3.9.2              prettyunits_1.1.1         dbplyr_2.1.1             
[205] gridBase_0.4-7            RSpectra_0.16-0           gtable_0.3.0             
[208] DBI_1.1.1                 ggalluvial_0.12.3         ggfun_0.0.4              
[211] tensor_1.5                httr_1.4.2                KernSmooth_2.23-20       
[214] stringi_1.7.5             progress_1.2.2            reshape2_1.4.4           
[217] farver_2.1.0              viridis_0.6.1             ggtree_3.0.4             
[220] xml2_1.3.2                BiocNeighbors_1.10.0      OmnipathR_3.1.4          
[223] ggplotify_0.1.0           scattermore_0.7           scran_1.20.1             
[226] bit_4.0.4                 scatterpie_0.1.7          spatstat.data_2.1-0      
[229] ggraph_2.0.5              pkgconfig_2.0.3

Any help on what is throwing this error would be much appreciated.

Compare CCIs for more than two conditions

Hi team,
I wonder if CrossTalkeR can be used to compare three conditions? Let's say, control, disease and treatment.
Thanks!

integrated or merged data?

I was wondering if this CrossTalkeR pipeline could be applied in merged data from two or more 10x runs.

I think that this makes more sense in integrated data, where cells with shared similar global identity are grouped together despite their differences, and so cellphonedb or CrossTalkeR can further explore LR interactions per condition.

However, in my case, integration is not an option, as it mixes populations of cells that need to stay separated in different clusters, so I merged the data directly. With merged data, clusters of cells are going to be very similar regardless of condition (and differences in cell count per cluster between conditions can also be important). Despite this, when I use CrossTalkeR I can see enrichment in some LR interactions.

Do you think that CrossTalkeR can be performed with merged data? If not, maybe a solution could be to integrate data just to follow this pipeline, or maybe to take very general clusters (e.g. NK + cd8_eff + gd --> cytotoxic), and so the effect of LR interactions between conditions will be easier to assess.

Vignette not working

Hello,

Thank you guys for making this package! I think that it will be very helpful.

I ran vignette('CrossTalkeR'), but received the following error:

Error in Vignette("crosstalker") : could not find function "Vignette"

How can I fix this? I am having other issues with the package, but I think that the vignette would help me understand my problem. I believe that everything else installed normally, as instructed on the homepage of this repository.

installation error

Dear Team,
Nice work. I have an installation issue. Do you know what might cause it?
I am running R 4.0 on ubuntu.
Thanks,
Zhang

library(CrossTalkeR)

Bioconductor version 3.12 (BiocManager 1.30.12), ?BiocManager::install for help
Error: package or namespace load failed for ‘CrossTalkeR’:
.onLoad failed in loadNamespace() for 'org.Hs.eg.db', details:
call: l$contains
error: $ operator is invalid for atomic vectors
In addition: Warning messages:
1: call dbDisconnect() when finished working with a connection
2: replacing previous import ‘dplyr::union’ by ‘igraph::union’ when loading ‘CrossTalkeR’
3: replacing previous import ‘dplyr::as_data_frame’ by ‘igraph::as_data_frame’ when loading ‘CrossTalkeR’
4: replacing previous import ‘dplyr::groups’ by ‘igraph::groups’ when loading ‘CrossTalkeR’
5: replacing previous import ‘clusterProfiler::simplify’ by ‘igraph::simplify’ when loading ‘CrossTalkeR’

which LR analysis tools does CrossTalkeR work with?

Hi,

Thanks for your tool. Really appreciate the idea of looking into differential LR interaction based on conditions!

I am having a hard time running it using the output of CellChat, so I thought of running CellPhoneDB first, as mentioned in the tutorial. However, I am stuck again, and I have a feeling that the workflow of CellPhoneDB has changed over time. The latest version of CellPhoneDB does not work the same way as its predecessors: it cannot be used as a command-line executable as described in the tutorial. Please correct me if I am wrong.

In case of the output from cellchat, I have organised the input as shown in the tutorial for both CTR and EXP but I get the following error :

data <- generate_report(paths,
                         genes,
                         out_path=paste0(output,'/'),
                         threshold=0,
                         out_file = 'vignettes_example.html',
                         output_fmt = "html_document",
                         report = FALSE)

Reading Files
Error in `dplyr::mutate()`:
ℹ In argument: `ccitype = paste(data1[[sel_columns[5]]], data1[[sel_columns[6]]])`.
Caused by error:
! `ccitype` must be size 2863 or 1, not 0.
Run `rlang::last_trace()` to see where the error occurred.

Here's a snapshot of the input :

	source	target	gene_A	gene_B	gene_type_A	gene_type_B	MeanLR
1	Cranial neural crest	Neural 3	wnt5b	fzd10	Ligand	Receptor	0.000714393318220126
2	Cranial neural crest	Cranial neural crest	wnt5b	fzd2	Ligand	Receptor	0.000260300985644489
3	Cranial neural crest	Pharyngeal mesenchyme	wnt5b	fzd2	Ligand	Receptor	0.000462609689846343
4	Cranial neural crest	Sox10+ neural crest	wnt5b	fzd2	Ligand	Receptor	0.000228502229095687
5	Cranial neural crest	Eye	wnt5b	fzd3a	Ligand	Receptor	9.15492336721415E-05

Is this because of the zebrafish genes? Or is it something else? Please let me know. I will be happy to help you with more information from my end if needed!

Cheers.
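
One thing worth checking for an error of this kind is whether the columns selected by sel_columns exist in the input: the failing expression indexes the data by sel_columns[5] and sel_columns[6], and a name that is absent from the table would yield nothing, which is consistent with the "size 0" in the message. The sketch below compares a header line with a list of expected names. The expected names are copied from the generate_report() call quoted in the "Problematic cell type names" report; treating them as the defaults used here is an assumption, and the helper name is hypothetical.

```python
# Names copied from the sel_columns argument quoted in an earlier report;
# assuming they are also the defaults in this call.
EXPECTED = ["Ligand", "Receptor", "Ligand.Cluster", "Receptor.Cluster",
            "isReceptor_fst", "isReceptor_scn", "MeanLR"]


def missing_columns(header_line, expected=EXPECTED, sep="\t"):
    """Return the expected column names that are absent from a header line."""
    present = {field.strip() for field in header_line.split(sep)}
    return [name for name in expected if name not in present]


header = "source\ttarget\tgene_A\tgene_B\tgene_type_A\tgene_type_B\tMeanLR"
print(missing_columns(header))
```

For the header shown in the snapshot, only MeanLR matches the expected names, so either renaming the columns or passing a matching sel_columns vector would be the next thing to try.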

Error in final code "Error in parse(text = x, srcfile = src) : attempt to use zero-length variable name"

Hi.
I shouldn't have closed the last session. Sorry. I am new to GitHub and to reporting issues.
I have successfully run your code before and obtained 2 HTML files (as well as 3 RDS files). This time, the code seems to run well until the very end, and I cannot figure out what I am doing wrong or, more importantly, where to look. I suspect it is a spelling or backtick problem, but I cannot catch it. I am using code identical to what worked before.
I have tried changing backticks, restarting R and RStudio, reinstalling crosstalker, and rerunning the code that produces the filtered_corrected.csv files. I also loaded 'knitr' just in case. I am using R 4.1.0 and RStudio 1.4.1 on a MacBook Pro running macOS 11.4.

I will add that everything runs fine with report=FALSE, but the error persists with genes=NA, report=TRUE.

I'm sorry for such a trivial problem but, as before, any help would be much appreciated.

I have attached a sample of the final code with error:

paths <- c('CTR' = '~/Desktop/s1_filtered_corrected.csv',
           'EXP' = '~/Desktop/s2_filtered_corrected.csv')

genes1 <- c('TGFB1', 'FGFR2', 'TNF')
data <- generate_report(paths,
                        genes=genes1,
                        out_path='~/Desktop/',
                        threshold = 0,
                        out_file='All_DG.html')

Reading Files
CTR
"/Desktop/soft_filtered_corrected.csv"
EXP
"/Desktop/stiff_filtered_corrected.csv"
Create a Differential Table
Calculating CCI Ranking
EXP_x_CTR

Calculating GCI Ranking
EXP_x_CTR
Annotating the top Cell Genes
'select()' returned 1:many mapping between keys and columns
[the message above was printed 32 times in total]
Defining templates
Generating Report
Preparing Single Phenotype Report
Printing CCI CTR
Printing CCI EXP

Table PCA CGI CTR_ggi

Quitting from lines 5-26 (./Single_Condition_cgi.Rmd)

PCA CGI

Here, a Principal Component Analysis (PCA) was performed using the cell-gene interaction topological measures. High-ranked observations (>=2$\sigma^{2}$) are labeled, and each measure's contribution is placed in the coordinate system.

PCA Tables

Quitting from lines NA-46 (./Single_Condition_cgi.Rmd)
Quitting from lines NA-46 (./Single_Condition_cgi.Rmd)
Error in parse(text = x, srcfile = src) :
attempt to use zero-length variable name
In addition: Warning messages:
1: In eattrs[[name]][index] <- value :
number of items to replace is not a multiple of replacement length
2: Removed 281 rows containing missing values (geom_label_repel).
3: Removed 291 rows containing missing values (geom_label_repel).
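
For anyone hitting the same message: one common way R produces "attempt to use zero-length variable name" is when two consecutive backticks reach the parser, for example when an R Markdown chunk fence is treated as R code. The parser reads the pair as a quoted name with nothing inside. The base R sketch below only demonstrates that mechanism. Whether this is what happens inside the `Single_Condition_cgi.Rmd` template in this report is an assumption, not something confirmed from the package source.

```r
# Parse a chunk fence as if it were R code and capture the error message.
# The first two backticks are read as an empty quoted name.
msg <- tryCatch(
  {
    parse(text = "```{r}")
    "parsed without error"
  },
  error = function(e) conditionMessage(e)
)
print(msg)
```

Since the log shows the failure inside `parse(text = x, srcfile = src)` while knitting, a reasonable place to look is any text passed into the templates at that point (for example labels or names that end up empty), rather than the `generate_report()` call itself.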

mouse/human gene symbols in CrosstalkeR

Hi,
CrossTalkeR is a great tool for our projects.
I tried the human data and it worked perfectly well, but when I tried mouse genes, it seemed to fail to read the gene symbols.
Is it compatible with mouse genes now? Or is there any way to make it work for mouse genes?
Thank you.

Below is the error shown in my run:
Annotating the top Cell Genes
'select()' returned 1:many mapping between keys and columns
Error in .testForValidKeys(x, keys, keytype, fks) :
None of the keys entered are valid keys for 'SYMBOL'. Please use the keys method to see a listing of valid arguments.

No CellPhoneDB interacions found in this input.

Hi. I was running the following command:
cellphonedb method statistical_analysis cellphonedb_meta.txt cellphonedb_countv2.txt --threads 30 --counts-data hgnc_symbol
and I was getting the "No CellPhoneDB interacions found in this input." error.
My data has around 82k cells divided into 6 sample groups, so I followed the suggestions in Teichlab/cellphonedb#309 and modified my command to:

cellphonedb method statistical_analysis cellphonedb_meta.txt cellphonedb_countv2.txt --threads 30 --counts-data hgnc_symbol --subsampling --subsampling-log False --subsampling-num-cells 50 --iterations=10
but I am getting the same error. My data has HGNC gene symbols. Any help would be appreciated. Thank you.
