yezhengstat / adtnorm

ADTnorm normalizes the cell surface protein measurement of CITE-seq data, facilitating across batches and across studies data integration.

Home Page: https://yezhengstat.github.io/ADTnorm/articles/ADTnorm-tutorial.html

License: GNU General Public License v3.0

R 99.86% Dockerfile 0.14%

cite-seq data-integration single-cell surface-protein-normalization

adtnorm's People

juyeongkim helenlindsay daniel-caron ansonrel

adtnorm's Issues

What is ADTseqDepth?

I am a bit unclear about what the 'ADTseqDepth' column in the 'cell_x_feature' demo data represents.
The documentation refers to ADTseqDepth as 'total UMI per cell', but does this mean it is simply the sum of all antibody reads for each cell? For instance, suppose I ran a 5-antibody panel sequencing experiment and obtained a row (representing one cell) with the following raw reads:
CD3  CD4  CD8  CD14  CD19
18   138  13   491   3

Then, is the 'ADTseqDepth' for this cell simply 18+138+13+491+3= 663?

Currently, I don't think I have separate 'ADTseqDepth' data. The only output I receive from the experiment is the raw read matrix for each antibody, just like the 'cell_x_adt' demo data. (I also get a column of the exact unique barcodes used for the cells, e.g. AAGTTGTCTAC for row 1, ATTCTTTCGTTT for row 2, etc., but I don't think this information is relevant.) So I was wondering what I should substitute for 'ADTseqDepth' in the cell_x_feature parameter.
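As a minimal sketch of the arithmetic in the question: if ADTseqDepth really is the per-cell total of raw ADT counts (an assumption taken from the 'total UMI per cell' wording quoted above, not something confirmed on this page), it can be derived from the raw count matrix with base R's `rowSums()`. The cell name, sample label, and batch label below are hypothetical placeholders, and whether ADTnorm requires this column at all is left open here.

```r
# Raw counts for one cell, copied from the 5-antibody example in the question.
cell_x_adt <- matrix(
  c(18, 138, 13, 491, 3),
  nrow = 1,
  dimnames = list("cell_1", c("CD3", "CD4", "CD8", "CD14", "CD19"))
)

# Assumption: ADTseqDepth is the sum of raw ADT counts per cell (per row).
adt_seq_depth <- rowSums(cell_x_adt)

# Hypothetical minimal cell_x_feature holding only sample, batch, and the
# derived depth; the labels are placeholders, not values from the demo data.
cell_x_feature <- data.frame(
  sample = factor("sample_1"),
  batch = factor("batch_1"),
  ADTseqDepth = adt_seq_depth,
  row.names = rownames(cell_x_adt)
)

print(cell_x_feature$ADTseqDepth)  # 663 for the example row
```

For this row the result matches the hand calculation 18 + 138 + 13 + 491 + 3 = 663.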

Furthermore, what can be done if I don't have 'sample_status' or 'cell_type_l1' data? I could use surrogates for these, but the distinction between healthy and tumor cells would not be clear. Are all 7 columns in the demo 'cell_x_feature' necessary (and equally important) when ADTnorm is run? Or is it okay if I provide only the 'sample' and 'batch' columns in 'cell_x_feature'?

Thank you!

Error in (function (unregfd, ximarks, x0marks, x0lim = NULL, WfdPar = NULL,

Hi Ye,

I was trying to run your ADTnorm with a dataset of 15820 cells and 207 features (198 proteins + 9 isotype controls).

The function for normalisation:

  adt206_adt_adtnorm <- ADTnorm(
    cell_x_adt = adt206_raw_adt_ctl,  #Matrix of ADT raw counts in cells (rows) by ADT markers (columns) format.
    cell_x_feature = cell_x_feature, #Matrix of cells (rows) by cell features (columns) such as sample, batch, or other cell-type related information.
    save_outpath = outpath, 
    study_name = "adt198", 
    marker_to_process = NULL, 
    save_fig = TRUE
  )

However, I ran into the below error:

Progress:  Each dot is a curve
..
Progress:  Each dot is a curve
..
Progress:  Each dot is a curve
..
Error in (function (unregfd, ximarks, x0marks, x0lim = NULL, WfdPar = NULL,  : 
  Argument ximarks has values outside of range of unregfd.
In addition: Warning messages:
1: Removed 1 rows containing missing values (`geom_segment()`). 
2: Removed 1 rows containing missing values (`geom_segment()`).

Below I'm showing the input data (hope it will help).

Thank you and I'm looking forward to hearing from you.

Warm regards,
Hsiao-Chi

missing value where TRUE/FALSE needed

Dear Ye Zheng,

First of all, thank you for creating this amazing package.
I have ADT data from single-cell DNA-antibody sequencing experiments (rather than CITE-seq), for which I want to (1) correct the batch effects across timepoints (within the same patient) and (2) correct for the mouse background signals (the IgGs) for better accuracy.

I have two questions:
(1) I keep getting the error message

Error in if (length(y_valley[x_valley > real_peak[1]]) == 0 || (y_valley[x_valley > :
missing value where TRUE/FALSE needed

after I run the following:
cell_x_adt_norm <- ADTnorm(
  cell_x_adt = my_adt,
  cell_x_feature = my_feat,
  save_outpath = save_outpath,
  study_name = run_name2,
  marker_to_process = c("CD3", "CD4", "CD8"),
  save_intermediate_fig = TRUE
)

Here my_adt is a matrix of 48 antibodies (including three IgG isotypes), and my_feat is a matrix in which 'sample', 'batch', and 'study_names' are the timepoints. Since I didn't have all the information given in the demo data 'cell_x_feature', I had to arbitrarily make up some of the variables in my_feat.

(2) Since we have the exact IgG reads as part of our ADT data, I was wondering whether it is compatible to use dsb (denoised and scaled by background) first and then run ADTnorm on the same dataset (so the data fed into ADTnorm is already normalized). I'm not too familiar with the math behind it, so if this practice is not recommended, please let me know!

Error in lnsrch_morph

Hello,

I am testing ADTnorm with only one CITE-seq PBMC sample that has 138 ADTs and ~17K cells. I am using default params. See code below. However, I am getting this error: "Error in lnsrch_morph(bvecold, fold, grad, pvec, fngrad_morph, morphList, :
Initial slope not negative." Any ideas? Thank you very much in advance!

cell_x_adt_norm = ADTnorm(
  cell_x_adt = cell_x_adt,
  cell_x_feature = cells_meta,
  save_outpath = output_path,
  study_name = run_name,
  marker_to_process = NULL, ## setting it to NULL by default will process all available markers in cell_x_adt.
  bimodal_marker = NULL, ## setting it to NULL will trigger ADTnorm to try different settings to find bimodal peaks for all the markers.
  brewer_palettes = "Dark2", ## color brewer palettes setting for the density plot
  save_fig = TRUE
)

attempting batch correction with ADTnorm

Hello,

I installed ADTnorm from github using remotes::install_github("yezhengSTAT/ADTnorm", build_vignettes = FALSE)

I have an ADT-seq dataset that was generated in 4 different batches/runs with multiple samples in each batch/run. I am seeing an effect of the run even after normalizing with the DSB method. For example:

I have tried using ADTnorm with various parameters, but still the data is very separated by run. Below is my code along with the resulting UMAP plots:

Option 1:
`#### option 1
A316.VDJ <- readRDS(file = here::here("A316_final_vdj_all.rds"))
save_outpath <- "/Users/spanglerab"
run_name <- "ADTnorm_demoRun"

cell_x_adt <- t(as.data.frame(GetAssayData(A316.VDJ, assay = "Prot", slot = "counts")))
cell_x_feature <- A316.VDJ@meta.data

cell_x_feature$sample = factor(cell_x_feature$run)
cell_x_feature$batch = factor(cell_x_feature$run)

cell_x_adt_norm <- ADTnorm(
cell_x_adt = cell_x_adt,
cell_x_feature = cell_x_feature,
save_outpath = save_outpath,
study_name = run_name,
save_intermediate_fig = TRUE
)

A316.VDJ <- SetAssayData(A316.VDJ, assay="Prot",slot = "data", new.data=t(cell_x_adt_norm))

DefaultAssay(A316.VDJ) <- "Prot"
A316.VDJ <- ScaleData(A316.VDJ, features = rownames(A316.VDJ))
A316.VDJ <- RunPCA(A316.VDJ, assay = "Prot", slot = "data", features = rownames(A316.VDJ), reduction.name = "apca")
A316.VDJ <- RunUMAP(A316.VDJ, reduction = "apca", dims = 1:18, assay = "Prot", reduction.name = "prot.umap", reduction.key = "protUMAP_", n.neighbors = 40, min.dist = 0.3, local.connectivity = 3, spread = 3)
pdf(file = here::here("DimPlot_prot_UMAP_all_adt_norm.pdf"))
DimPlot(A316.VDJ, reduction = "prot.umap", label = TRUE, group.by = "run")
dev.off()`

Thanks for your help,

Abby

Missionbio tapestri : error

Hello,

Thank you for your package!

I'm trying to develop an analysis pipeline for Mission Bio multi-omics data. They have something similar to CITE-seq but with DNA-seq and protein, so I thought about using your package to normalize my protein read counts.

I used your Docker image and tried to make it work with my data, but I'm getting this error:

Error in lnsrch_morph(bvecold, fold, grad, pvec, fngrad_morph, morphList,  : 
  Initial slope not negative.

Perhaps you have an idea about what I did wrong?

Thank you for your time!

Error in chol.default(temp): the leading minor of order 11 is not positive definite

Hello!!

Thank you for this tool! I am currently trying to run ADTnorm on a dataset with 45 antibodies and 97K cells. I am using the following code:

cell_x_feature$sample = factor(unlist(cell_x_feature$sample))
cell_x_feature$batch = factor(unlist(cell_x_feature$sample))

cell_x_adt_norm = ADTnorm(
  cell_x_adt = cell_x_adt, 
  cell_x_feature = cell_x_feature, 
  study_name = study_name,
  marker_to_process = NULL,
  bimodal_marker = NULL)

It seems to run smoothly until reaching the normalization of a specific antibody, when I get this error:

........Error in chol.default(temp) : 
  the leading minor of order 11 is not positive definite
In addition: There were 40 warnings (use warnings() to see them)

Do you know what could be causing it?
Thank you!
Marta

Error in landmarkRegion[[i]]

Thank you for this tool! I'm excited to see how it works.

I'm currently trying to run it on a dataset with 207K cells and 46 CITE-Seq antibodies. I have 18 10X lanes merged into one object. From there, I created a dataframe with the raw ADT counts and exported the batch information as a matrix. I named those objects cell_x_adt (for the counts) and cell_x_feature (for the batch information). I then ran the following code:

cell_x_adt_norm <- ADTnorm(cell_x_adt = cell_x_adt, cell_x_feature = cell_x_feature,
save_outpath = save_outpath, study_name = study_name)

And got the following error:

[1] "ADTnorm will process all the ADT markers from the ADT matrix:hash1, hash2, hash3, hash4, CD69.1, CD107a, CD154, HLA-DR, CD20, CD16, CD14.1, CD123, CD8, CD11b, CD11c, CD3, NKG2A, CD8B, CD4.1, CD27, CD1c, CD2.1, CD66b, CD56, CD206, CD163.1, TCR-Vdelta2, CD127, CD25, CD28.1, CD196-aka-CCR6, CD95, TCR-Va24-Ja18, CD183-aka-CXCR3, CD161, TCR-Va-7point2, TCR-gd-Vd1, CD279-aka-PD1, CD278-aka-ICOS, CX3CR1.1, CD194-aka-CCR4, TCR-Vg9, mIgG1-K-Isotype-Control, mIgG2b-K-Isotype-Control, mIgG2a-K-Isotype-Control, Armenian-Hasmter-IgG-Isotype-Control"
[1] "hash1"
Error in landmarkRegion[[i]] <- matrix(NA, ncol = 2, nrow = length(sample_name_list)) :
attempt to select less than one element in integerOneIndex

Any idea what could be causing this error or what could alleviate it? Thank you!

Edit: I also named save_outpath and study_name earlier in the script.
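The report above passes the batch information as a matrix, whereas the other reports on this page pass a `cell_x_feature` with `sample` and `batch` columns stored as factors. The sketch below only shows how such a matrix could be reshaped into that form and sanity-checked before calling `ADTnorm()`. It does not diagnose the error, and the toy counts, lane labels, and the one-sample-per-lane mapping are all hypothetical.

```r
# Hypothetical stand-ins for the objects described in the report.
cell_x_adt <- data.frame(CD3 = c(5, 0, 12), CD4 = c(40, 2, 7))
batch_info <- matrix(
  c("lane1", "lane1", "lane2"),
  ncol = 1,
  dimnames = list(NULL, "batch")
)

# Convert the matrix to a data frame and store batch as a factor.
cell_x_feature <- as.data.frame(batch_info, stringsAsFactors = FALSE)
cell_x_feature$batch <- factor(cell_x_feature$batch)

# Assumption: each 10X lane is treated as its own sample.
cell_x_feature$sample <- cell_x_feature$batch

# Basic consistency checks on the two inputs.
stopifnot(
  nrow(cell_x_feature) == nrow(cell_x_adt),
  all(c("sample", "batch") %in% colnames(cell_x_feature)),
  is.factor(cell_x_feature$sample),
  is.factor(cell_x_feature$batch)
)
```

If any of these checks fail on the real data, that mismatch is worth resolving before looking further into the error itself.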

Unable to download ADTnorm

Hello,

I have been trying to download and use ADTnorm, and I keep running into this error:
ERROR: dependency 'flowStats' is not available for package 'ADTnorm'
I followed the instructions and ran remotes::install_github("yezhengSTAT/ADTnorm", build_vignettes = FALSE, lib=libraryPath). I also tried installing flowWorkspace and flowCore individually, but that didn't fix the issue. This is my session info:

R version 4.1.2 (2021-11-01)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 20.04.6 LTS

Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/liblapack.so.3

locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 LC_MONETARY=en_US.UTF-8
[6] LC_MESSAGES=en_US.UTF-8 LC_PAPER=en_US.UTF-8 LC_NAME=C LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] remotes_2.5.0

loaded via a namespace (and not attached):
[1] ps_1.7.6 prettyunits_1.2.0 crayon_1.5.2 withr_3.0.0 rprojroot_2.0.4 R6_2.5.1 rlang_1.1.3
[8] cli_3.6.2 curl_5.2.0 rstudioapi_0.15.0 callr_3.7.3 tools_4.1.2 compiler_4.1.2 processx_3.8.3
[15] pkgbuild_1.3.1 sessioninfo_1.2.2 tcltk_4.1.2
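One thing that may be worth trying here: flowStats is distributed through Bioconductor rather than CRAN, so a plain CRAN dependency lookup may not find it. The sketch below installs it through BiocManager and then retries the GitHub install command quoted in the report. Whether this resolves the error on this particular R 4.1.2 setup is an assumption, not something confirmed on this page.

```r
# Install BiocManager from CRAN if it is not already available.
if (!requireNamespace("BiocManager", quietly = TRUE)) {
  install.packages("BiocManager")
}

# Install the missing dependency from Bioconductor.
BiocManager::install("flowStats")

# Retry the ADTnorm installation with the command from the report.
remotes::install_github("yezhengSTAT/ADTnorm", build_vignettes = FALSE)
```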