broadinstitute / tangram
Spatial alignment of single cell transcriptomic data.
License: BSD 3-Clause "New" or "Revised" License
Hi, thank you for all your help with running Tangram!
For interpreting the project_genes results, I noticed that it returns a "spot-by-gene AnnData containing spatial gene expression from the single cell data." I am confused about how each spot has one measurement per gene, because I thought that the goal of deconvolution was to get to single-cell resolution (which would mean multiple cells per spot). If there were two cells in one spot, one with very high expression of a marker gene and the other with very low expression of the same gene, would project_genes return a "medium" expression of that gene?
I did look at the figure from plot_genes_sc and noticed that there is a difference between the predicted and measured values, but I am confused about what difference I should be looking for and exactly what it represents. One of the figures is below.
Finally, I would like to use the single-cell resolution to do some neighborhood analysis: looking at which individual genes have spatial patterns, which pairs or groups of genes have spatial patterns together, and then repeating this with cell type / cluster (which cells have spatial patterns, which pairs or groups of cells have spatial patterns together). I was looking through the squidpy tutorial and noticed that there was no deconvolution in the pipeline. How would you recommend approaching this?
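To make the "medium expression" question about project_genes concrete, here is a minimal sketch. It assumes the projected value for a spot is the mapping-weighted sum of the single-cell profiles (the transposed mapping matrix times the cell-by-gene matrix). The helper name `project_expression` and the toy numbers are hypothetical, and the real project_genes may apply additional scaling.

```python
def project_expression(mapping, expression):
    """Return a spot-by-gene matrix computed as mapping^T @ expression.

    mapping:    cells x spots, each row holds one cell's probabilities over spots
    expression: cells x genes
    """
    n_spots = len(mapping[0])
    n_genes = len(expression[0])
    projected = [[0.0] * n_genes for _ in range(n_spots)]
    for cell_probs, cell_expr in zip(mapping, expression):
        for s in range(n_spots):
            for g in range(n_genes):
                projected[s][g] += cell_probs[s] * cell_expr[g]
    return projected


# Two cells mapped entirely to spot 0: one with high marker expression, one with none.
mapping = [[1.0, 0.0],
           [1.0, 0.0]]
expression = [[10.0],
              [0.0]]
projected = project_expression(mapping, expression)
```

Under this assumption the spot receives one aggregated value per gene, so the contrast between the two cells is not visible in the projected matrix itself; it only survives in the mapping matrix.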
Hello,
I was trying out the Visium example, but couldn't find the marker gene data that is used to subset relevant markers.
The exact file that I think is missing is "spacejams_visp_markers.pkl". It would be nice if you could provide a download link, as you did for the snRNAseq data and others.
Best regards,
Heesoo
Hi I'm trying to run the vignette and the kernel keeps dying within seconds of using the "map_cells_to_space" function. Is this a known issue or am I doing something wrong? The code is below (the data is the same as in the Github repo data folder).
import scanpy as sc
import tangram as tg

ad_sp = sc.read_h5ad('path/test_ad_sp.h5ad')
ad_sc = sc.read_h5ad('path/test_ad_sc.h5ad')
tg.pp_adatas(ad_sc, ad_sp, genes=None)
ad_map = tg.map_cells_to_space(ad_sc, ad_sp,
                               mode="cells",
                               num_epochs=100,
                               device='cpu')
Hello
I'm Joy, a graduate student, trying to use Tangram. First, thanks for such a wonderful tool.
I was wondering whether the basic Tangram model has already been trained, and only on mouse brain tissue.
I'm trying to segment cells (from a tissue other than brain) with the cell segmentation function and then proceed with deconvolution on Visium data. If the model was trained only on brain tissue, the analysis could be incorrect.
If so, should I train the model?
I thought that I did not need to train the model before deconvolution because it had already been trained on some data.
If I do need to train the model to handle other tissues, please let me know how to train it with data from a few tissues and scRNA data.
Thank you.
Joy
I've been using your package to deconvolute some visium data following the tutorial. I have successfully made it to the ad_map creation. However, when I try to plot I receive the below error and it looks like it can't find the function. Any help is appreciated!
AttributeError: module 'tangram' has no attribute 'plot_cell_annotation_sc'
Hi @lewlin @ziqlu0722 @gscalia ,
following up on the chat re adding Tangram to Squidpy, I dug into the repo again and followed this tutorial: https://github.com/broadinstitute/Tangram/blob/master/example/1_tutorial_tangram.ipynb, here are my comments (in random order):
tg.pp_adatas
Very useful function, but I was wondering if it would be possible to make everything happen in place, instead of copying over the anndata and reindexing? I'm thinking of something like this: https://github.com/YosefLab/scvi-tools/blob/b4256ebb84ebebd70fb920f73d13df9b9bbb73db/scvi/data/_anndata.py#L79
docs: https://docs.scvi-tools.org/en/stable/api/reference/scvi.data.setup_anndata.html#scvi.data.setup_anndata
It would boil down to finding a common set of markers from either the two adatas or an external list (as you show in the tutorial) and adding it as a boolean series to the two adatas: adata_[space, sc]["markers_tangram"]. It would be straightforward for the mapper to subset and create the tensors afterward, e.g. here:
Tangram/tangram/mapping_utils.py
Line 132 in 909d70b
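The in-place idea above can be sketched as follows: instead of copying and reindexing, compute one boolean mask per dataset that flags the shared training genes. This is only an illustration on plain lists of gene names; `shared_marker_masks` is a hypothetical helper, not part of Tangram, and with AnnData the masks would be stored in a `var` column such as the suggested "markers_tangram".

```python
def shared_marker_masks(sc_genes, sp_genes, markers=None):
    """Return one boolean mask per dataset flagging the genes usable for training.

    A gene is usable when it appears in both datasets and, if an external
    marker list is given, also in that list.
    """
    shared = set(sc_genes) & set(sp_genes)
    if markers is not None:
        shared &= set(markers)
    sc_mask = [gene in shared for gene in sc_genes]
    sp_mask = [gene in shared for gene in sp_genes]
    return sc_mask, sp_mask


sc_genes = ["Gad1", "Slc17a7", "Pvalb", "Sst"]
sp_genes = ["Slc17a7", "Gad1", "Vip"]
sc_mask, sp_mask = shared_marker_masks(sc_genes, sp_genes, markers=["Gad1", "Vip"])
```

The original objects keep all their genes and their order; only the masks say which columns the mapper should read.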
tg.map_cells_to_space
Really cool that it accepts AnnData now! This makes it very convenient. There is a weird behaviour with the argument d (density?), which is set to None by default and is indeed exposed by the function, but is then re-set to either None or np.ones(G.shape[0])/G.shape[0] according to mode.
Tangram/tangram/mapping_utils.py
Line 152 in 909d70b
In our tutorial we set it to d = np.array(adata_st.obs.cell_count) / adata_st.obs.cell_count.sum(), since we show how to get segmentation mask counts (and coordinates for further plotting at the end) for each spot with the image container. Is this a bug, or has the behaviour changed? Also, it's missing from the docstring.
ut.project_cell_annotations
Line 114 in 909d70b
adata_space.obsm["[tangram/deconv]_results"]. In case we have info on the number of cells per spot (e.g. after segmentation), would this still work out?
# highest probability a cell i is filtered if F_i > 0.5
filtered_voxels_to_types = [
    (j, adata_sc.obs.cell_subclass[k])
    for i, j, k in zip(F_out, resulting_voxels, range(len(adata_sc)))
    if i > 0.5
]
I guess such a filter could also follow adata_map.X.T @ df, and potentially make the matrix sparse?
Meanwhile, I'll open a PR on squidpy for starting the external module addition.
Happy to provide more feedback or contribute, let me know what you think!
best,
Giovanni
Hi,
In the manuscript there is mention and demonstration (Fig 6, 7, 8) of a NN used to determine the coronal depth and automatically register slices to the ABA CCF. Is the automatic image registration and automated region calling capability available as part of Tangram?
Thanks!
Hi, thank you very much for developing such a helpful tool! I am interested in mapping multimodal data such as SHARE-seq to reveal spatial patterns of chromatin accessibility, but I don't know how to do that. I can map snRNA-seq to spatial data, but I don't know how to transfer the snATAC-seq profiles of the same cells to space. I want to visualize inferred spatial patterns of chromatin accessibility and transcription factor motif scores at single-cell resolution.
I would be very grateful for a tutorial or code.
Hello,
I've noticed a mismatch between histology pixel coordinates and spots (spatial coordinates) when using the tg.plot_cell_annotation_sc function, which went away when I manually supplied the scale_factor argument. If I understand correctly, when adata_st.uns['spatial'] exists, the idea is to use that information, which should include the scale factor, to accurately overlap the histology and spatial coords. In particular, my spatial AnnData had the scale factor stored in adata_st.uns['spatial'][sample_name]['scalefactors']['tissue_hires_scalef'], which I believe is a standard location where it should be found (rather than needing to specify scale_factor explicitly).
I traced the source of the "mismatch" to the default assignment of the scale factor as 0.1, which is passed to sc.pl.spatial even when adata_st.uns['spatial'] exists. I imagine the default of 0.1 only makes sense when adata_st.uns['spatial'] doesn't exist.
I believe the same issue exists for the tg.plot_genes_sc function, though I haven't explicitly checked this. Apologies if I'm misunderstanding, and I've stored the scale factor in a non-standard location. Note that I'm using tangram 1.0.2 as installed through pip, and am following along the squidpy tutorial.
Thanks for providing the Tangram software!
Best,
-Nick
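The behaviour requested above (use the stored scale factor when it exists, fall back to a default otherwise) could look roughly like the sketch below. It works on a plain dictionary shaped like adata_st.uns; `resolve_scale_factor` is a hypothetical helper, not a Tangram function, and the 0.1 default is simply the value quoted in the post.

```python
def resolve_scale_factor(uns, sample_name=None, key="tissue_hires_scalef", default=0.1):
    """Look up a Visium scale factor in an `uns`-like dict, else return `default`."""
    spatial = uns.get("spatial")
    if not spatial:
        return default
    if sample_name is None:
        # Assumption for this sketch: with no name given, use the first stored sample.
        sample_name = next(iter(spatial))
    scalefactors = spatial.get(sample_name, {}).get("scalefactors", {})
    return scalefactors.get(key, default)


uns_with_image = {"spatial": {"sampleA": {"scalefactors": {"tissue_hires_scalef": 0.17}}}}
stored = resolve_scale_factor(uns_with_image)
fallback = resolve_scale_factor({})
```

An explicitly passed scale_factor would still take precedence over both values; that part is left out to keep the sketch short.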
Hello,
This is a fairly small problem, but I figured it's still best to create an issue for it. In the most recent tutorial (example/1_tutorial_tangram.ipynb), the following function is called:
tg.plot_training_scores(ad_map, bins=50, alpha=.5)
This in turn seems to call seaborn.histplot, a function that was added in seaborn 0.11, but the environment.yml file in this repository installs seaborn 0.10.1. When following through the tutorial in my own Python file, I get the following error:
Traceback (most recent call last):
  File "/dcl02/lieber/ajaffe/SpatialTranscriptomics/LIBD/spython/tangram_testing/example_orig_nick.py", line 55, in <module>
    tg.plot_training_scores(ad_map, bins=50, alpha=.5)
  File "/users/neagles/.conda/envs/tangram/lib/python3.8/site-packages/tangram/plot_utils.py", line 26, in plot_training_scores
    sns.histplot(data=df, y='train_score', bins=10, ax=axs_f[0]);
AttributeError: module 'seaborn' has no attribute 'histplot'
Thus I believe environment.yml should be updated to require seaborn 0.11.1 instead of 0.10.1 (this fixed the issue for me).
Best,
-Nick
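A quick way to check an environment before running the tutorial is to compare the installed version string against the first release that ships histplot. The sketch below assumes that release is seaborn 0.11. `version_tuple` and `has_histplot` are hypothetical helper names, and in practice the string would come from `seaborn.__version__`.

```python
import re


def version_tuple(version):
    """Turn '0.11.1' into (0, 11, 1), stopping at the first non-numeric piece."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def has_histplot(seaborn_version):
    # Assumption: histplot is available from seaborn 0.11 onward.
    return version_tuple(seaborn_version) >= (0, 11)


pinned_ok = has_histplot("0.10.1")   # version pinned in environment.yml per the post
updated_ok = has_histplot("0.11.1")  # version that fixed the issue per the post
```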
Hi, I assume the following section in your tutorial should rather say 'spatial data' than 'scRNAseq'? At least that's what I see in my data.
Some genes are detected with very different levels of sparsity - typically they are much more sparse in the scRNAseq than in the spatial data. This is due to the fact that technologies like Visium are more prone to technical dropouts.
Hi Tommaso,
Thanks for the great tool! I'm trying to apply it to deconvolution of Visium data, and have a few questions.
First, what's the best way to normalize the data? In the example you mention that you didn't run log-normalization, but I didn't find actual discussion of that in the paper, while normalization was shown to be the most influential step for scRNA-seq analysis.
Second, what's the best way of selecting genes? I don't expect that we should just blindly take all 30k of them, right? It would be super-valuable to know, whether the method is robust to the gene set or whether one should try different sets until success.
Finally, would it be too bad if I provide constant density (parameter d) for all spots in the tissue? Our stainings are quite messed up, and I don't think segmentation is possible on them.
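For reference, the two density choices discussed in this thread differ only in how the per-spot weights are built; both must sum to 1. The sketch below contrasts a constant density with one derived from segmentation cell counts. The function names are hypothetical, and whether a constant density is adequate for a given tissue remains an open question for the authors.

```python
def uniform_density(n_spots):
    """Constant density: every spot gets the same weight 1 / n_spots."""
    return [1.0 / n_spots] * n_spots


def count_based_density(cell_counts):
    """Density proportional to the number of segmented cells per spot."""
    total = sum(cell_counts)
    if total <= 0:
        raise ValueError("cell counts must sum to a positive value")
    return [count / total for count in cell_counts]


d_uniform = uniform_density(4)
d_counts = count_based_density([1, 3])
```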
Hi, thanks so much for providing such a useful tool for Python users.
I am going to use scRNA-seq data to annotate my 10x Visium spatial transcriptomics data. I only have the coordinates of each spot, not of the cells within the spot. Can I still use this tool? In the example, I can see there is a pkl file that provides the number of cells as well as the individual cell coordinates in each spot. I don't have this information, and as far as I can see it is impossible to get it from the 10x Visium platform.
I want to use Tangram on my own data, but cannot find a way to create my own cell position file like Allen1_cell_centroids.pkl.
How to create Allen1_cell_centroids.pkl using my own Visium data?
Hi,
I'm very eager to test your pipeline now using my Visium data. However, in your example notebook, there is this section:
'Load cells coordinates on Visium image'
cells_coordinates = mapping.utils.read_pickle('data/Allen1_cell_centroids.pkl')
The file Allen1_cell_centroids.pkl is not present in your data folder, and it is not clear how you can reproduce the file.
Could you please give an example on how to generate this file from the Space Ranger output?
Thanks
Hi, I am interested in using Tangram for my research. I have Visium data and 10X genomics snRNA-seq data. I am following the tutorial, which works flawlessly for the data provided, but gives an error when running map_cells_to_space on my own data.
I am loading the Visium data straight from spaceranger, and I am constructing an anndata object from a .mtx file for the single nucleus data.
Here is the code that I am running:
import os, sys
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import scanpy as sc
import torch
from scipy import io
from scipy.sparse import coo_matrix, csr_matrix
import anndata
#sys.path.append('/home/exx/git/Tangram/') # uncomment for local import
import tangram as tg
%load_ext autoreload
%autoreload 2
%matplotlib inline
data_dir = 'data/'
# load one visium sample:
ad_sp = sc.read_visium('Visium1/outs/')
# load single-cell data
X = io.mmread("{}zhou_counts.mtx".format(data_dir))
# create anndata object
ad_sc = anndata.AnnData(
    X=X.transpose().tocsr()
)
# load sample metadata:
sample_meta = pd.read_csv("{}zhou_meta.csv".format(data_dir))
# load gene names:
with open("{}zhou_gene_names.csv".format(data_dir), 'r') as f:
    gene_names = f.read().splitlines()
ad_sc.obs = sample_meta
ad_sc.obs.index = ad_sc.obs['barcode']
ad_sc.obs = ad_sc.obs.drop(labels='barcode', axis=1)
ad_sc.var.index = gene_names
markers = pd.read_csv('data/zhou_marker_genes.csv')
# only keep markers that are in both dataset:
markers = markers[markers.gene.isin(ad_sp.var.index)]
# prepare for mapping
tg.pp_adatas(ad_sc, ad_sp, genes=markers.gene.unique())
assert ad_sc.uns['training_genes'] == ad_sp.uns['training_genes']
ad_map = tg.map_cells_to_space(
    adata_sc=ad_sc,
    adata_sp=ad_sp,
    #device='cpu'
    device='cuda'
)
The error when running map_cells_to_space is the following:
INFO:root:Allocate tensors for mapping.
INFO:root:Begin training with 1500 genes and rna_count_based density_prior in cells mode...
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-130-a433d3224cd2> in <module>
3 adata_sp=ad_sp,
4 #device='cpu'
----> 5 device='cuda'
6 )
/dfs3b/swaruplab/smorabit/bin/software/miniconda3/envs/scvi-env/lib/python3.7/site-packages/tangram/mapping_utils.py in map_cells_to_space(adata_sc, adata_sp, cv_train_genes, cluster_label, mode, device, learning_rate, num_epochs, scale, lambda_d, lambda_g1, lambda_g2, lambda_r, lambda_count, lambda_f_reg, target_count, random_state, verbose, density_prior)
311 )
312 mapper = mo.Mapper(
--> 313 S=S, G=G, d=d, device=device, random_state=random_state, **hyperparameters,
314 )
315
/dfs3b/swaruplab/smorabit/bin/software/miniconda3/envs/scvi-env/lib/python3.7/site-packages/tangram/mapping_optimizer.py in __init__(self, S, G, d, d_source, lambda_g1, lambda_d, lambda_g2, lambda_r, device, adata_map, random_state)
59 self.target_density_enabled = d is not None
60 if self.target_density_enabled:
---> 61 self.d = torch.tensor(d, device=device, dtype=torch.float32)
62
63 self.source_density_enabled = d_source is not None
ValueError: too many dimensions 'matrix'
Hi,
I am using Tangram to align scRNA-Seq reference onto spatial data. My scRNA-Seq reference is a collection of samples from different datasets. As a result, the raw gene expression of the scRNA-Seq reference showed a batch effect due to sample sources. May I ask if you have any suggestions to harmonize the batch effect when aligning to spatial data? Thanks.
Hi,
Thanks for sharing Tangram. It is very interesting work. I wonder whether the Tangram algorithm works well when the cell-type ratios of the spatial data and the single-cell data are different. For example, cell type A makes up 90% of the spatial data but only 10% of the single-cell data. Although the "clusters" mode is provided for data from different samples/tissues, do you still assume that the cell-type ratios of the single-cell data and the spatial data should be similar?
Look forward to your reply. Thank you very much!
Best,
Cathy
Hello, I am interested in using Tangram to integrate my Visium/single cell data and I wanted to better understand the output of project_cell_annotations stored in tangram_ct_pred. Are these values useful on the absolute scale or only relative? I notice that your plotting function standardizes all of these values to 0-1 scale. Thanks!
Thank you for creating Tangram, an incredibly useful and impressive tool for cell deconvolution in spatial transcriptomics.
I wondered whether Tangram can recover cell-type-specific expression values after single-cell mapping, i.e. obtain cell-type-specific expression values for the mapped cells.
Hi,
Thanks for developing this tool. I have scRNA-seq data and Visium data - but have HE, non-fluorescent images. I am wondering if I can run Tangram to integrate the data? I am particularly interested in the deconvolution approach.
Thanks,
I have been working on my own dataset following the Tangram deconvolution tutorial.
All the code up to count_cell_annotations works without problems.
However, when I choose the specific annotation for count_cell_annotations, I get the error below.
ad_map, adata_sc, and adata_st were all formatted as in the Tangram tutorial and had worked for all previous steps.
TypeError Traceback (most recent call last)
~\AppData\Local\Temp/ipykernel_36176/1374048480.py in
----> 1 tg.count_cell_annotations(
2 ad_map,
3 adata_sc,
4 adata_st,
5 annotation="our_annotation",
~\anaconda3\lib\site-packages\tangram\utils.py in count_cell_annotations(adata_map, adata_sc, adata_sp, annotation, threshold)
281
282 for k, v in vox_ct:
--> 283 df_vox_cells.iloc[k, df_vox_cells.columns.get_loc(v)] += 1
284
285 adata_sp.obsm["tangram_ct_count"] = df_vox_cells
~\anaconda3\lib\site-packages\pandas\core\indexes\base.py in get_loc(self, key, method, tolerance)
3359 casted_key = self._maybe_cast_indexer(key)
3360 try:
-> 3361 return self._engine.get_loc(casted_key)
3362 except KeyError as err:
3363 raise KeyError(key) from err
~\anaconda3\lib\site-packages\pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()
~\anaconda3\lib\site-packages\pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()
Your idea is very interesting. Thanks to your useful tutorial link, I got my results by processing my scRNA-seq data. But because I use R in my daily work, I'm unfamiliar with how to save the image to PDF after tg.plot_cell_annotation_sc. I hope you can give me a hand.
Thanks a lot!
In Section https://github.com/broadinstitute/Tangram#how-to-run-tangram-at-cell-level, it says
This allows to extend gene throughput, or correct for dropouts, if the single cells have higher quality (or more genes) than single cell data.
Do you mean spatial data instead of single cell data?
Hi, I am interested in using Tangram for my data.
We have multiple types of spatial transcriptome data (Visium/merFISH...) and single-cell data (10X/smart-seq), so we hope to run different data pairs with different parameters to get the best results. Which parameters need to be adjusted for different types of data pairs? What do these parameters mean?
Thank you,
Longfei Li
I am attempting to use your tool for visium data and was hoping to clarify my analysis pipeline with your great tool. I currently am using publicly available single cell data, but will eventually have matched single cell data for the visium sample. I would ideally like to map the matched single cell data onto the visium spatial slide. From reading your tutorial, it seems that this would have to be done in two steps: deconvolution of the visium spatial spots and then mapping of the single cell data onto the slide. It was unclear from the tutorial if tangram is the appropriate method to be using for both these steps and which functions I should be using. Here are my questions:
As I have never done this type of analysis before, any guidance would be appreciated.
Thank you so much for your time.
Hello.
I've read your tutorial about tangram via the link https://github.com/broadinstitute/Tangram/blob/master/tutorial_tangram_without_squidpy.ipynb. This link contains a snRNA-seq dataset and a slide-seq2 dataset collected from MOp area in adult mouse brain. I want to use this dataset for further scientific research but this website doesn't provide a clear citation. I am wondering whether you could provide the data source or published paper to help me.
Thanks for your help.
Hi, thanks for your excellent work, which transfers the cell type annotations onto space by a probabilistic mapping. Using these results, can I calculate the distance between two cell types (not only colocalization)?
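One crude way to attach a number to "distance between two cell types" is to compare probability-weighted centroids: weight each spot's coordinates by the projected score of a cell type and measure the distance between the two resulting points. This is only an illustrative sketch, not a Tangram feature; `weighted_centroid` and `centroid_distance` are hypothetical names, and a single centroid hides multi-modal spatial patterns.

```python
import math


def weighted_centroid(coords, weights):
    """Average the (x, y) spot coordinates, weighted by per-spot scores."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    x = sum(w * cx for (cx, _), w in zip(coords, weights)) / total
    y = sum(w * cy for (_, cy), w in zip(coords, weights)) / total
    return x, y


def centroid_distance(coords, weights_a, weights_b):
    """Euclidean distance between the weighted centroids of two cell types."""
    ax, ay = weighted_centroid(coords, weights_a)
    bx, by = weighted_centroid(coords, weights_b)
    return math.hypot(ax - bx, ay - by)


# Toy example: type A sits entirely in the first spot, type B in the second.
coords = [(0.0, 0.0), (3.0, 4.0)]
distance = centroid_distance(coords, [1.0, 0.0], [0.0, 1.0])
```

The result is in the same units as the spot coordinates, so it needs the same scale-factor care as any plot overlay.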
Hi,
from the matrix P(spatial spots X cells) I'd like to get the percentages for each cell type per spot.
Starting from the probability matrix spatial spots X cells, I was thinking to proceed in this way:
Do you see any pitfalls in this methodology? Would you recommend any other strategies to calculate cell type percentages per spot? I observed that generalising to cell types starting from individual cell probabilities gives me slightly better predictions than calculating the probability for each cell type, but maybe it's data-dependent.
Best,
Carlo
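For readers with the same goal, one possible route from a cell-by-spot probability matrix to per-spot cell type percentages is sketched below. It is not necessarily the procedure the poster had in mind: sum the probabilities of all cells sharing a label within each spot, then normalize each spot to 100%. `cell_type_percentages` is a hypothetical helper and the matrix is toy data.

```python
from collections import defaultdict


def cell_type_percentages(mapping, cell_types):
    """Aggregate a cells x spots probability matrix into per-spot percentages.

    Returns one dict per spot mapping each cell type label to its percentage.
    """
    n_spots = len(mapping[0])
    result = []
    for s in range(n_spots):
        totals = defaultdict(float)
        for row, label in zip(mapping, cell_types):
            totals[label] += row[s]
        spot_sum = sum(totals.values())
        if spot_sum > 0:
            result.append({label: 100.0 * v / spot_sum for label, v in totals.items()})
        else:
            # No probability mass landed on this spot.
            result.append({label: 0.0 for label in totals})
    return result


mapping = [[0.25, 0.75],
           [0.50, 0.50],
           [0.25, 0.75]]
cell_types = ["A", "B", "B"]
percentages = cell_type_percentages(mapping, cell_types)
```

Note that these percentages describe shares of mapped probability mass; they equal shares of cells only if every cell contributes comparably to the spots it is mapped to.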
Hi, thanks to the authors for creating such a necessary tool for the spatial transcriptomics field. I have a question about the ad_map object: does the probability of each cluster summed over all spatial cells equal 1, or does the probability of each spatial cell summed over all clusters equal 1?
I read the paper, but it was not quite clear to me. When I print the value of b, it is 1, as you can see below. What is the reasoning behind each cluster's probabilities summing to 1?
ad_map = tg.map_cells_to_space(ad_sc, ad_sp,mode='clusters',cluster_label='knownClusters')
gives ad_map object
AnnData object with n_obs × n_vars = 23 × 39521
obs: 'knownClusters', 'cluster_density'
var: 'uniform_density', 'rna_count_based_density'
uns: 'train_genes_df', 'training_history'
a= np.sum(ad_map.X,axis=0)
b= np.sum(ad_map.X,axis=1)
print(len(a), len(b))
39521, 23
print(b[0:10]) gives
[1.0000064 1.0000123 1.0000033 1.0000218 1.0000048 0.99999535
0.99999803 0.9999933 0.99999917 0.9999956 ]
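The printed row sums are what one would expect if each row of the mapping matrix is normalized over spots, for example by a softmax applied row by row. The sketch below assumes exactly that (it is an assumption about the mapping, not a statement of Tangram's internals) and shows that rows then sum to 1 while column sums are unconstrained. `softmax_rows` is a hypothetical helper.

```python
import math


def softmax_rows(matrix):
    """Apply a numerically stable softmax to each row independently."""
    result = []
    for row in matrix:
        peak = max(row)
        exps = [math.exp(value - peak) for value in row]
        total = sum(exps)
        result.append([e / total for e in exps])
    return result


# Two clusters (rows) distributed over three spots (columns).
raw_scores = [[2.0, 0.5, -1.0],
              [0.0, 0.0, 3.0]]
probs = softmax_rows(raw_scores)
row_sums = [sum(row) for row in probs]
col_sums = [sum(row[j] for row in probs) for j in range(3)]
```

Read this way, each row answers "where in space does this cluster go?", which is why b is close to 1 up to float32 rounding, while a (the column sums) reflects how much mass each spot receives.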
Hello,
My colleagues and I have been able to use Tangram to successfully map genes onto our Visium spatial dataset thanks to your incredibly helpful tutorial. However, we would like to implement deconvolution (i.e. assigning cell types to each spot of the Visium slide) as part of our Tangram pipeline in order to generate images similar to the one included in the manuscript and attached below for reference, but we are unsure how to go about it. I was not able to find functions related to deconvolution in the Git repository, with the exception of df_to_cell_types in utils.py:
Lines 376 to 402 in eb867f5
which is commented out. Is deconvolution forthcoming or am I overlooking something?
Like I mentioned before, we have otherwise been able to implement some of Tangram’s functionality already and we appreciate the help you’ve given us in the past.
Thank you,
Arta Seyedian
Dear Tommaso,
I want to run the mapping-visium-example.ipynb notebook. Currently I could not find where to get the Allen1_cell_count.h5ad dataset. Could you please tell me how to get the dataset?
Thanks!
Hello!
I don't have much experience using PyTorch, and I was wondering if Tangram could be easily modified to parallelize over multiple GPUs? I am trying to map onto a spatial dataset which is quite large (~500k cells) and am running into this error:
RuntimeError: CUDA out of memory.
Tried to allocate 52.38 GiB (GPU 0; 39.59 GiB total capacity;
860.74 MiB already allocated;
37.90 GiB free;
882.00 MiB reserved in total by PyTorch)
If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation.
See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
The GPUs I am using have a 40GB capacity so this error makes sense to me. Is there a way to split the computation across 2 GPUs in PyTorch? I also understand that using mode="clusters" can help reduce the resources required, but was curious about this issue nonetheless.
Thank you!
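A back-of-the-envelope estimate helps decide whether a mapping can fit on one GPU at all. The sketch below assumes the dominant allocation is a dense float32 matrix with one row per single cell (or cluster) and one column per spatial voxel; the function names and the example sizes are hypothetical, and real runs need extra headroom for gradients and optimizer state.

```python
def dense_matrix_gib(n_rows, n_cols, bytes_per_value=4):
    """Size in GiB of a dense n_rows x n_cols matrix (float32 by default)."""
    return n_rows * n_cols * bytes_per_value / 2**30


def max_rows_for_budget(n_cols, budget_gib, bytes_per_value=4):
    """Largest row count whose dense matrix still fits in `budget_gib`."""
    return int(budget_gib * 2**30 // (n_cols * bytes_per_value))


size = dense_matrix_gib(1024, 262144)      # 1024 rows x 262144 columns
rows = max_rows_for_budget(262144, 1.0)    # rows that fit in 1 GiB
```

Because the size grows with the product of both dimensions, collapsing cells into clusters or mapping the spatial data in parts shrinks the requirement linearly in the reduced dimension.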
Hello,
I am following the tutorial but using a single-nucleus dataset as reference, and trying to map it to a Visium sample. I am getting a very poor training score on the reference dataset (see attached figure). Is there some way I could improve that?
Thank you for creating and maintaining the package.
It would be super useful if you could add an option to tg.plot_genes_sc() and tg.plot_cell_annotation_sc() so that the tissue image is shown in black and white. The pink colour of HE stains tends to blur the spot colours. Also, an option to change the alpha of tissue images and spots would be appreciated. Scanpy has similar options: sc.pl.spatial(visium, color="total_counts", bw=True, alpha_img=0.8). Thanks!
Hi,
Thanks for your great work.
Now I'm trying to run Tangram with squidpy. I am confused about the density_prior parameter. The help text says:
density_prior (str, ndarray or None): Spatial density of spots, when is a string, value can be 'rna_count_based' or 'uniform', when is a ndarray, shape = (number_spots,). This array should satisfy the constraints sum() == 1. If None, the density term is ignored. Default value is 'rna_count_based'.
What is "density_prior", and how is the rna_count_based density computed (using the raw counts, normalized counts or scaled data)? I got lots of negative values. In my case, the scaled data is in adata.X, the log-normalized data is in adata.raw, and the raw counts are not stored in my anndata object, so I think the negative values could be caused by the lack of raw counts. And what does "spatial density" mean? Is it the number of cells in a spot?
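To illustrate why scaled data can produce negative densities, the sketch below assumes that an RNA-count-based density is each spot's total count divided by the grand total over all spots. That is an assumption for illustration, not a confirmed description of Tangram's code, and `rna_count_density` is a hypothetical helper. With raw counts the result is non-negative and sums to 1; with z-scored values the row sums can be negative, so the sketch rejects such input.

```python
def rna_count_density(counts_per_spot):
    """Per-spot density proportional to total counts; input must be non-negative."""
    if any(value < 0 for row in counts_per_spot for value in row):
        raise ValueError("negative values found: pass raw counts, not scaled data")
    totals = [sum(row) for row in counts_per_spot]
    grand_total = sum(totals)
    if grand_total <= 0:
        raise ValueError("total counts must be positive")
    return [total / grand_total for total in totals]


raw_counts = [[1, 1],
              [2, 4]]
density = rna_count_density(raw_counts)
```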
Hi,
was wondering if you plan to release a pip/conda installable version of Tangram? Even just a setup.py file that would make it installable via git would be very useful!
Thank you!
EDIT: just saw #8, looking forward to seeing it merged!
I was trying to run project_genes and got an error that my .obs indices were not equal. I looked through the documentation and was confused about which inputs project_genes takes: the single cell adata or the spatial adata? In the code the argument name adata_sc implies that it should be single cell, but the argument description states spatial.
To follow up, what should the obs.index represent in these adata inputs?
def project_genes(adata_map, adata_sc, cluster_label=None, scale=True):
    """
    Transfer gene expression from the single cell onto space.

    Args:
        adata_map (AnnData): single cell data
        adata_sp (AnnData): gene spatial data
        cluster_label (AnnData): Optional. Should be consistent with the 'cluster_label' argument passed to map_cells_to_space function.
        scale (bool): Optional. Should be consistent with the 'scale' argument passed to map_cells_to_space function.

    Returns:
        AnnData: spot-by-gene AnnData containing spatial gene expression from the single cell data.
    """
Your idea is very interesting. I'm processing data from nanoball, which is a new method for spatial RNA-seq.
I found your paper because you have tried to assign one voxel to more than one cell, which I think is very meaningful.
But I have a question about your model: do you add a constraint on voxels or cells to avoid distant voxels being assigned to a cell? I think your loss function has no restriction on spatial position; or do you restrict it in another part of your model?
My understanding is that the assumption behind Tangram is that the same biological processes generated both the SC data and the ST data, and that ideally both data sets will come from the same sample.
Is Tangram expecting raw counts for the SC data? Or will it still work if the count data is normalized in some way?
More generally, how well can we expect Tangram to work if the SC reference data is from other, biologically similar samples? Or even a composite SC reference built by integrating multiple samples? In this last case we would normally do SCT Transform / Harmony to integrate samples, which is a kind of normalization ...
I guess we can always try it and see, but interested in your thoughts.
Thanks :)
Hello,
First, thanks for your amazing method and easy-to-follow tutorials.
I've used your method to analyze a Visium-generated dataset. Our H&E images have low resolution and we cannot trust the deconvolution result, as we have a low AUC score and segmentation seems to fail to capture cell counts. I would like to know if you have any suggestions on how we can analyze co-localization using the mapped annotations, since you also mention in your tutorial that you recommend not performing deconvolution for this type of analysis.
Thanks for your time.
Tangram/tangram/mapping_utils.py
Lines 40 to 41 in 3c87a25
I think it would be useful if this operation were optional.
Hi,
thank you for this very cool method! Very interesting paper; I am finally getting around to trying out the method on Visium data.
I have a couple of questions with respect to data cropping and spatial data shape.
spatial data matrix with shape voxels-by-genes. Voxel can contain multiple cells.
What exactly is a voxel in the case of Visium?
The way I understand it, the maximum voxel is a spot, and so the spatial matrix would be of shape (spots, genes). The minimum therefore would be a pixel? If that's the case, then the mapping (if perfectly converged) would be identical for all the pixels under each spot?
Does the prior on cell density have shape (voxels,)?
If the above is true, then if I have cell segmentation info for each spot, would it make sense to generate voxels with shape voxel_shape < spot_shape, so as to have something like: voxel_1, voxel_2 = spot_n[0:x,:], spot_n[x:,:], which would make voxel_1, voxel_2 identical in gene expression space, but with different cell density priors (e.g. 3 cells under voxel_1 and 5 cells under voxel_2)?
I run out of memory when I map: what should I do?
Reduce your spatial data in various parts and map each single part. If that is not sufficient, you will need to downsample your single cell data as well.
This would essentially only happen with non-Visium data, where there are pixel-level gene expression values? (Or also when using an atlas as reference.)
Thank you !
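The FAQ advice quoted above (split the spatial data into parts and map each part) comes down to iterating over index chunks. A minimal sketch follows; `chunk_indices` is a hypothetical helper, and how each chunk is then subset and passed to the mapper is left open.

```python
def chunk_indices(n_items, chunk_size):
    """Split range(n_items) into consecutive chunks of at most `chunk_size`."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [
        range(start, min(start + chunk_size, n_items))
        for start in range(0, n_items, chunk_size)
    ]


# Ten voxels mapped four at a time.
chunks = chunk_indices(10, 4)
```

Splitting by consecutive index is the simplest choice; splitting by spatial region may be preferable when neighbouring voxels should be mapped together.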