
MuDataSeurat's Introduction

MuDataSeurat


Documentation | Preprint | Discord

MuDataSeurat is a package that provides I/O functionality for .h5mu files and Seurat objects.

You can learn more about multimodal data containers in the mudata documentation.

Installation

remotes::install_github("pmbio/MuDataSeurat")

Quick start

MuDataSeurat provides a set of I/O operations for multimodal data.

MuDataSeurat implements WriteH5MU() that saves Seurat objects to .h5mu files that can be further integrated into workflows in multiple programming languages, including the muon Python library and the Muon.jl Julia library. ReadH5MU() reads .h5mu files into Seurat objects.

MuDataSeurat currently works for Seurat objects of v3 and above.

Writing files

Start with an existing dataset, e.g. a Seurat object with CITE-seq data:

library(SeuratData)
InstallData("bmcite")
bm <- LoadData(ds = "bmcite")

WriteH5MU() saves the object to a .h5mu file:

library(MuDataSeurat)
WriteH5MU(bm, "bmcite.h5mu")

Please note that only standardised parts of the object are written to the file; method-specific extras stored in the Seurat object may be omitted when writing.
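To see exactly what ended up in the file, you can inspect the HDF5 structure directly. A minimal sketch using hdf5r (which MuDataSeurat builds on); the group names follow the mudata on-disk layout:

library(hdf5r)

h5 <- H5File$new("bmcite.h5mu", mode = "r")
h5$ls(recursive = FALSE)$name   # top-level groups, e.g. mod, obs, obsm, ...
h5[["mod"]]$ls()$name           # one group per modality, e.g. RNA and ADT
h5$close_all()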

Reading files

bm <- ReadH5MU("bmcite.h5mu")

Please note that only the intersection of cells across modalities is currently loaded into the Seurat object, due to limitations of the Seurat object structure. Multimodal embeddings (the global .obsm slot) are loaded with the assay.used field set to the default assay. Embedding names are changed to comply with R and Seurat naming requirements and conventions.
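For example, after reading you can inspect which reductions were created and how they were renamed (a sketch; the exact names depend on the file):

Reductions(bm)    # embedding names may differ from the original .obsm keys
DefaultAssay(bm)  # multimodal (global .obsm) embeddings use this as assay.used
head(Embeddings(bm, reduction = Reductions(bm)[1]))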

Relevant projects

Other R packages for multimodal I/O include:

MuDataSeurat's People

Contributors

gtca, ilia-kats


MuDataSeurat's Issues

Error in WriteH5ADHelper(object, assay, h5, global = TRUE): no slot of name "meta.features" for this object of class "Assay5"

Hello, author. I got an error when converting a Seurat v5 object to .h5ad:
MuDataSeurat::WriteH5AD(a, "a.h5ad",assay="RNA")

Error in WriteH5ADHelper(object, assay, h5, global = TRUE): no slot of name "meta.features" for this object of class "Assay5"
Traceback:

1. MuDataSeurat::WriteH5AD(a, "a.h5ad", assay = "RNA")
2. MuDataSeurat::WriteH5AD(a, "a.h5ad", assay = "RNA")
3. WriteH5ADHelper(object, assay, h5, global = TRUE)

Actions fail because of SeuratData

When configuring the dependencies, SeuratData seems to be installed before Seurat, which fails.

While this should be handled by the build system, it is not for some reason...

I've tested the patch but I still see the same error. Strangely, the error disappears upon forcing my ADT matrix to a sparse matrix using `Seurat::as.sparse`. Now I can load my `mudata` file.


Originally posted by @mdmanurung in #2 (comment)

I have the same issue on the bone marrow data used to illustrate the package functionality:

library(SeuratData)
InstallData("bmcite")
bm <- LoadData(ds = "bmcite")

library(MuDataSeurat)
WriteH5MU(bm, "bmcite.h5mu")


test<- ReadH5MU("bmcite.h5mu")

Error in `.rowNamesDF<-`(x, value = value) : 
  duplicate 'row.names' are not allowed
In addition: Warning message:
non-unique values when setting 'row.names': ‘CD14’, ‘CD19’, ‘CD27’, ‘CD28’, ‘CD34’, ‘CD38’, ‘CD4’, ‘CD69’ 

Also, reading the same object in Python fails (with or without using Seurat::as.sparse on the ADT):

import muon as mu

mu.read_h5mu("bmcite.h5mu")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/fabiola.curion/Documents/devel/miniconda3/envs/R405py39/lib/python3.9/site-packages/mudata/_core/io.py", line 380, in read_h5mu
    ad = _read_h5mu_mod(gmods[m], manager, backed not in (None, False))
  File "/Users/fabiola.curion/Documents/devel/miniconda3/envs/R405py39/lib/python3.9/site-packages/mudata/_core/io.py", line 513, in _read_h5mu_mod
    ad = AnnData(**d)
  File "/Users/fabiola.curion/Documents/devel/miniconda3/envs/R405py39/lib/python3.9/site-packages/anndata/_core/anndata.py", line 291, in __init__
    self._init_as_actual(
  File "/Users/fabiola.curion/Documents/devel/miniconda3/envs/R405py39/lib/python3.9/site-packages/anndata/_core/anndata.py", line 521, in _init_as_actual
    self._check_dimensions()
  File "/Users/fabiola.curion/Documents/devel/miniconda3/envs/R405py39/lib/python3.9/site-packages/anndata/_core/anndata.py", line 1843, in _check_dimensions
    raise ValueError(
ValueError: Observations annot. `obs` must have number of rows of `X` (25), but has 30672 rows.
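The duplicated names ('CD14', 'CD19', ...) are ADT antibodies that share names with RNA genes, which is a plausible trigger for the duplicate row.names error when the per-modality feature tables are combined. A quick check of the overlap (a sketch using the bm object from above; the interpretation is an assumption, not a confirmed diagnosis):

shared <- intersect(rownames(bm[["RNA"]]), rownames(bm[["ADT"]]))
shared   # feature names present in both modalities, e.g. "CD14" "CD19" "CD27" ...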

Null categorical codes written incorrectly

Writing

suppressWarnings(SeuratData::InstallData("pbmc3k", force.reinstall = F))
suppressWarnings(data("pbmc3k"))
seuratObj <- suppressWarnings(pbmc3k)

WriteH5AD(seuratObj, "mudata_seurat.h5ad")

Reading

import anndata as ad

ad.read_h5ad("./mudata_seurat.h5ad")
File ~/miniconda3/envs/seurat-conversion/lib/python3.10/site-packages/pandas/core/arrays/categorical.py:709, in Categorical.from_codes(cls, codes, categories, ordered, dtype)
    706     raise ValueError("codes need to be array-like integers")
    708 if len(codes) and (codes.max() >= len(dtype.categories) or codes.min() < -1):
--> 709     raise ValueError("codes need to be between -1 and len(categories)-1")
    711 return cls(codes, dtype=dtype, fastpath=True)

ValueError: codes need to be between -1 and len(categories)-1
Full Traceback
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Input In [17], in <cell line: 1>()
----> 1 ad.read_h5ad("./mudata_seurat.h5ad")

File ~/miniconda3/envs/seurat-conversion/lib/python3.10/site-packages/anndata/_io/h5ad.py:236, in read_h5ad(filename, backed, as_sparse, as_sparse_fmt, chunk_size)
    233     assert False, "unexpected raw format"
    234 elif k in {"obs", "var"}:
    235     # Backwards compat
--> 236     d[k] = read_dataframe(f[k])
    237 else:  # Base case
    238     d[k] = read_elem(f[k])

File ~/miniconda3/envs/seurat-conversion/lib/python3.10/site-packages/anndata/_io/h5ad.py:301, in read_dataframe(group)
    299     return read_dataframe_legacy(group)
    300 else:
--> 301     return read_elem(group)

File ~/miniconda3/envs/seurat-conversion/lib/python3.10/site-packages/anndata/_io/specs/registry.py:183, in read_elem(elem, modifiers)
    178 def read_elem(
    179     elem: Union[H5Array, H5Group, ZarrGroup, ZarrArray],
    180     modifiers: frozenset(str) = frozenset(),
    181 ) -> Any:
    182     """Read an element from an on disk store."""
--> 183     return _REGISTRY.get_reader(type(elem), get_spec(elem), frozenset(modifiers))(elem)

File ~/miniconda3/envs/seurat-conversion/lib/python3.10/site-packages/anndata/_io/specs/methods.py:564, in read_dataframe_0_1_0(elem)
    561 columns = _read_attr(elem.attrs, "column-order")
    562 idx_key = _read_attr(elem.attrs, "_index")
    563 df = pd.DataFrame(
--> 564     {k: read_series(elem[k]) for k in columns},
    565     index=read_series(elem[idx_key]),
    566     columns=list(columns),
    567 )
    568 if idx_key != "_index":
    569     df.index.name = idx_key

File ~/miniconda3/envs/seurat-conversion/lib/python3.10/site-packages/anndata/_io/specs/methods.py:564, in <dictcomp>(.0)
    561 columns = _read_attr(elem.attrs, "column-order")
    562 idx_key = _read_attr(elem.attrs, "_index")
    563 df = pd.DataFrame(
--> 564     {k: read_series(elem[k]) for k in columns},
    565     index=read_series(elem[idx_key]),
    566     columns=list(columns),
    567 )
    568 if idx_key != "_index":
    569     df.index.name = idx_key

File ~/miniconda3/envs/seurat-conversion/lib/python3.10/site-packages/anndata/_io/specs/methods.py:586, in read_series(dataset)
    584     categories = read_elem(categories_dset)
    585     ordered = bool(_read_attr(categories_dset.attrs, "ordered", False))
--> 586     return pd.Categorical.from_codes(
    587         read_elem(dataset), categories, ordered=ordered
    588     )
    589 else:
    590     return read_elem(dataset)

File ~/miniconda3/envs/seurat-conversion/lib/python3.10/site-packages/pandas/core/arrays/categorical.py:709, in Categorical.from_codes(cls, codes, categories, ordered, dtype)
    706     raise ValueError("codes need to be array-like integers")
    708 if len(codes) and (codes.max() >= len(dtype.categories) or codes.min() < -1):
--> 709     raise ValueError("codes need to be between -1 and len(categories)-1")
    711 return cls(codes, dtype=dtype, fastpath=True)

ValueError: codes need to be between -1 and len(categories)-1

Checking out the file:

import h5py
import pandas as pd

f = h5py.File("./mudata_seurat.h5ad")

pd.value_counts(f["obs"]["seurat_annotations"][:])
 0             697
 1             483
 2             480
 3             344
 4             271
 5             162
 6             155
-2147483648     62
 7              32
 8              14
dtype: int64

It looks like R and pandas encode categorical missing values quite differently: R stores NA_integer_ as the smallest 32-bit integer (-2147483648, which is what shows up above), while pandas (and anndata) expect null values to have a code of -1.
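For reference, a minimal base-R sketch of the remapping pandas expects (illustrative only, not the package's actual writer code): 0-based codes with missing values written as -1.

x <- factor(c("B", "NK", NA, "B"))
codes <- as.integer(x) - 1L     # 0-based category codes; NA stays NA
codes[is.na(codes)] <- -1L      # pandas/anndata convention for missing values
levels(x)                       # category labels stored alongside the codes
codes
# [1]  0  1 -1  0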

`mu.read` ValueError

Dear author,

I encountered the following issue upon reading a mudata object that was converted from Seurat:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Input In [6], in <module>
----> 1 mdata2 = mu.read("data/processed/mudata.h5mu/ADT")

File /exports/para-lipg-hpc/mdmanurung/conda/envs/scanpy/lib/python3.8/site-packages/mudata/_core/io.py:409, in read(filename, **kwargs)
    406     return read_h5mu(filepath, **kwargs)
    407 elif m[3] == "":
    408     # .h5mu/<modality>
--> 409     return read_h5ad(filepath, m[2], **kwargs)
    410 elif m[2] == "mod":
    411     # .h5mu/mod/<modality>
    412     return read_h5ad(filepath, m[3], **kwargs)

File /exports/para-lipg-hpc/mdmanurung/conda/envs/scanpy/lib/python3.8/site-packages/mudata/_core/io.py:372, in read_h5ad(filename, mod, backed)
    370 with h5py.File(filename, hdf5_mode) as f_root:
    371     f = f_root["mod"][mod]
--> 372     return _read_h5mu_mod(f, manager, backed)

File /exports/para-lipg-hpc/mdmanurung/conda/envs/scanpy/lib/python3.8/site-packages/mudata/_core/io.py:320, in _read_h5mu_mod(g, manager, backed)
    318     elif k != "raw":
    319         d[k] = read_attribute(g[k])
--> 320 ad = AnnData(**d)
    321 if manager is not None:
    322     ad.file = AnnDataFileManager(ad, os.path.basename(g.name), manager)

File /exports/para-lipg-hpc/mdmanurung/conda/envs/scanpy/lib/python3.8/site-packages/anndata/_core/anndata.py:308, in AnnData.__init__(self, X, obs, var, uns, obsm, varm, layers, raw, dtype, shape, filename, filemode, asview, obsp, varp, oidx, vidx)
    306     self._init_as_view(X, oidx, vidx)
    307 else:
--> 308     self._init_as_actual(
    309         X=X,
    310         obs=obs,
    311         var=var,
    312         uns=uns,
    313         obsm=obsm,
    314         varm=varm,
    315         raw=raw,
    316         layers=layers,
    317         dtype=dtype,
    318         shape=shape,
    319         obsp=obsp,
    320         varp=varp,
    321         filename=filename,
    322         filemode=filemode,
    323     )

File /exports/para-lipg-hpc/mdmanurung/conda/envs/scanpy/lib/python3.8/site-packages/anndata/_core/anndata.py:526, in AnnData._init_as_actual(self, X, obs, var, uns, obsm, varm, varp, obsp, raw, layers, dtype, shape, filename, filemode)
    523 # Backwards compat for connectivities matrices in uns["neighbors"]
    524 _move_adj_mtx({"uns": self._uns, "obsp": self._obsp})
--> 526 self._check_dimensions()
    527 self._check_uniqueness()
    529 if self.filename:

File /exports/para-lipg-hpc/mdmanurung/conda/envs/scanpy/lib/python3.8/site-packages/anndata/_core/anndata.py:1837, in AnnData._check_dimensions(self, key)
   1835     key = {key}
   1836 if "obs" in key and len(self._obs) != self._n_obs:
-> 1837     raise ValueError(
   1838         "Observations annot. `obs` must have number of rows of `X`"
   1839         f" ({self._n_obs}), but has {self._obs.shape[0]} rows."
   1840     )
   1841 if "var" in key and len(self._var) != self._n_vars:
   1842     raise ValueError(
   1843         "Variables annot. `var` must have number of columns of `X`"
   1844         f" ({self._n_vars}), but has {self._var.shape[0]} rows."
   1845     )

ValueError: Observations annot. `obs` must have number of rows of `X` (163), but has 62773 rows.

I then tried to load each modality one by one. I could load my RNA data, but not my ADT. My ADT data has 163 features in it. For both modalities, I have 62773 observations.

Considering that, I am a bit confused by the error. Why would obs of my ADT data expect 163 rows, which should be the number of features?

Thanks for taking the time.

Regards,
Mikhael

dgCMatrix

I pulled down two h5ad files from https://developmental.cellatlas.io/fetal-bone-marrow.
(1) the Human fetal BM 10x dataset and (2) the Human fetal BM Down syndrome 10x dataset.

The Down syndrome dataset loads fine using MuDataSeurat::ReadH5AD(). The other dataset gives me this error:

d21 <- MuDataSeurat::ReadH5AD("/tmp/fig1b_fbm_scaled_gex_updated_dr_20210104.h5ad")
Warning: Feature names cannot have underscores ('_'), replacing with dashes ('-')
Error in (function (cl, name, valueClass) :
assignment of an object of class “dgCMatrix” is not valid for @‘scale.data’ in an object of class “Assay”; is(value, "matrix") is not TRUE
In addition: Warning message:
In read_layers_to_assay(h5) :
The var_names from modality have been renamed as feature names cannot contain '_'. E.g. RP11-442N24__B.1 -> RP11-442N24--B.1.
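The error reflects a general Seurat constraint rather than something specific to this file: the scale.data slot of a v3/v4 Assay must be a dense base matrix. A minimal sketch of that constraint on toy data (not the package's reader code; behaviour may differ for Seurat v5 Assay5 objects):

library(Seurat)
library(Matrix)

counts <- matrix(rpois(20, 5), nrow = 4,
                 dimnames = list(paste0("gene", 1:4), paste0("cell", 1:5)))
obj <- CreateSeuratObject(counts = Matrix(counts, sparse = TRUE))

scaled_sparse <- Matrix(scale(counts), sparse = TRUE)   # dgCMatrix

# With a v3/v4 Assay, assigning a sparse matrix to scale.data hits the same class check:
# obj <- SetAssayData(obj, slot = "scale.data", new.data = scaled_sparse)

# Densifying first satisfies it:
obj <- SetAssayData(obj, slot = "scale.data", new.data = as.matrix(scaled_sparse))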

String categories written by MuDataSeurat are read in as bytes by anndata

Using the same setup in #5, with the fix that closed it:

suppressWarnings(SeuratData::InstallData("pbmc3k", force.reinstall = F))
suppressWarnings(data("pbmc3k"))
seuratObj <- suppressWarnings(pbmc3k)

WriteH5AD(seuratObj, "mudata_seurat.h5ad")
import anndata as ad

a = ad.read_h5ad("./mudata_seurat.h5ad")
a.obs
              orig.ident  nCount_RNA  nFeature_RNA seurat_annotations
AAACATACAACCAC  b'pbmc3k'      2419.0           779    b'Memory CD4 T'
AAACATTGAGCTAC  b'pbmc3k'      4903.0          1352               b'B'
AAACATTGATCAGC  b'pbmc3k'      3147.0          1129    b'Memory CD4 T'
AAACCGTGCTTCCG  b'pbmc3k'      2639.0           960      b'CD14+ Mono'
AAACCGTGTATGCG  b'pbmc3k'       980.0           521              b'NK'
...                   ...         ...           ...                ...
TTTCGAACTCTCAT  b'pbmc3k'      3459.0          1153      b'CD14+ Mono'
TTTCTACTGAGGCA  b'pbmc3k'      3443.0          1224               b'B'
TTTCTACTTCCTCG  b'pbmc3k'      1684.0           622               b'B'
TTTGCATGAGAGGC  b'pbmc3k'      1022.0           452               b'B'
TTTGCATGCCTCAC  b'pbmc3k'      1984.0           723     b'Naive CD4 T'

The categoricals should be read in as strings. I would also suggest writing the more recent dataframe and categorical encodings, where everything is more self-contained and annotated, while you're at it.
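On the writer side, one way to avoid the bytes issue is to store strings with a variable-length UTF-8 datatype. A hedged hdf5r sketch (file and dataset names are illustrative, this is not MuDataSeurat's actual writer, and the exact H5T_STRING calls reflect my reading of the hdf5r API):

library(hdf5r)

h5 <- H5File$new("utf8_demo.h5", mode = "w")
str_type <- H5T_STRING$new(size = Inf)   # variable-length string type
str_type$set_cset("UTF-8")               # mark the character set as UTF-8
h5$create_dataset("annotations",
                  robj = c("Memory CD4 T", "B", "NK"),
                  dtype = str_type)
h5$close_all()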

can't read mudata created with muon (python)

Hello, thanks for working on interoperability between seurat and mudata!

I can't use ReadH5MU() to read a MuData file that I created following your multimodal tutorial:

test<-ReadH5MU("data_test.dir/pbmc_w3_teaseq.h5mu")
Error in dataset[[name]]$read() : attempt to apply non-function

I have no problems loading the object with muon in Python:

import muon as mu
mu.read_h5mu("data_test.dir/pbmc_w3_teaseq.h5mu")
MuData object with n_obs × n_vars = 5805 × 113187
  obs:  'sample', 'well', 'leiden_multiplex', 'leiden_mofa', 'leiden_wnn'
  var:  'highly_variable', 'gene_ids', 'feature_types', 'genome', 'interval'
  obsm: 'X_mofa', 'X_umap', 'X_wnn_umap'
  varm: 'LFs'
  obsp: 'mofa_connectivities', 'mofa_distances', 'wnn_connectivities', 'wnn_distances'
  3 modalities
    rna:        5805 x 16381
      obs:      'n_genes_by_counts', 'total_counts', 'total_counts_mt', 'pct_counts_mt', 'leiden'
      var:      'gene_ids', 'feature_types', 'genome', 'interval', 'mt', 'n_cells_by_counts', 'mean_counts', 'pct_dropout_by_counts', 'total_counts', 'highly_variable', 'means', 'dispersions', 'dispersions_norm', 'mean', 'std'
      uns:      'hvg', 'leiden', 'leiden_colors', 'log1p', 'neighbors', 'pca', 'umap'
      obsm:     'X_pca', 'X_umap'
      varm:     'PCs'
      layers:   'lognorm'
      obsp:     'connectivities', 'distances'
    atac:       5805 x 96760
      obs:      'n_fragments', 'n_duplicate', 'n_mito', 'n_unique', 'altius_count', 'altius_frac', 'gene_bodies_count', 'gene_bodies_frac', 'peaks_count', 'peaks_frac', 'tss_count', 'tss_frac', 'barcodes', 'cell_name', 'well_id', 'chip_id', 'batch_id', 'pbmc_sample_id', 'DoubletScore', 'DoubletEnrichment', 'TSSEnrichment', 'n_genes_by_counts', 'total_counts', 'n_counts', 'leiden'
      var:      'gene_ids', 'feature_types', 'genome', 'interval', 'n_cells_by_counts', 'mean_counts', 'pct_dropout_by_counts', 'total_counts', 'highly_variable', 'means', 'dispersions', 'dispersions_norm', 'mean', 'std'
      uns:      'hvg', 'leiden', 'leiden_colors', 'log1p', 'neighbors', 'pca', 'umap'
      obsm:     'X_pca', 'X_umap'
      varm:     'PCs'
      layers:   'counts', 'lognorm'
      obsp:     'connectivities', 'distances'
    prot:       5805 x 46
      obs:      'total_counts'
      var:      'highly_variable'
      uns:      'neighbors', 'pca', 'umap'
      obsm:     'X_pca', 'X_umap'
      varm:     'PCs'
      layers:   'counts'
      obsp:     'connectivities', 'distances'

I can explore the HDF5 file, but it breaks where the error says. It also seems to expect some attributes that are not present in my MuData file:

h5 <- open_and_check_mudata("~/Documents/devel/data_test.dir/pbmc_w3_teaseq.h5mu")
metadata <- read_with_index(h5[["obs"]])
dataset <- h5[["obs"]]
dataset_attr <- tryCatch({
  h5attributes(dataset)
}, error = function(e) {
  list("_index" = "_index")
})
indexcol <- "_index"
if ("_index" %in% names(dataset_attr)) {
  indexcol <- dataset_attr$`_index`
}
dataset_attr
columns <- names(dataset)
columns <- columns[columns != "__categories"]
columns

dataset[["sample"]]$read()

Error in dataset[[name]]$read() : attempt to apply non-function

values_attr <- h5attributes(dataset)
values_attr
$`column-order`
[1] "sample" "well"  

$`_index`
[1] "_index"

$`encoding-type`
[1] "dataframe"

$`encoding-version`
[1] "0.2.0"

# so the following line will be NULL
# values_attr$categories
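The encoding-version 0.2.0 attribute suggests the file uses the newer anndata on-disk format, where each categorical column is an HDF5 group with categories and codes datasets rather than a plain dataset plus a shared __categories group; that would also explain why $read() is not available on dataset[["sample"]]. A hedged sketch of reading such a column manually (names follow the anndata spec, not ReadH5MU's actual code path):

col <- dataset[["sample"]]
if (inherits(col, "H5Group")) {
  categories <- col[["categories"]]$read()
  codes <- col[["codes"]]$read()        # 0-based codes; -1 marks missing values
  values <- rep(NA_character_, length(codes))
  values[codes >= 0] <- categories[codes[codes >= 0] + 1]
  values <- factor(values, levels = categories)
}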

Any suggestions?

Thanks!

R version 4.1.2 (2021-11-01)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Big Sur 11.3.1

Matrix products: default
LAPACK: /Library/Frameworks/R.framework/Versions/4.1/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] bmcite.SeuratData_0.3.0 pbmc3k.SeuratData_3.1.4 SeuratData_0.2.2        hdf5r_1.3.5             MuDataSeurat_0.0.0.9000 magrittr_2.0.3          datapasta_3.1.0        
 [8] forcats_0.5.1           stringr_1.4.0           dplyr_1.0.8             purrr_0.3.4             readr_2.1.2             tidyr_1.2.0             tibble_3.1.6           
[15] ggplot2_3.3.5           tidyverse_1.3.1        

loaded via a namespace (and not attached):
  [1] readxl_1.4.0          backports_1.4.1       plyr_1.8.7            igraph_1.3.0          lazyeval_0.2.2        splines_4.1.2         listenv_0.8.0         scattermore_0.8      
  [9] digest_0.6.29         htmltools_0.5.2       fansi_1.0.3           tensor_1.5            cluster_2.1.3         ROCR_1.0-11           tzdb_0.3.0            remotes_2.4.2        
 [17] globals_0.14.0        modelr_0.1.8          matrixStats_0.62.0    spatstat.sparse_2.1-0 prettyunits_1.1.1     colorspace_2.0-3      rappdirs_0.3.3        rvest_1.0.2          
 [25] ggrepel_0.9.1         haven_2.4.3           callr_3.7.0           crayon_1.5.1          jsonlite_1.8.0        spatstat.data_2.1-4   survival_3.3-1        zoo_1.8-9            
 [33] glue_1.6.2            polyclip_1.10-0       gtable_0.3.0          leiden_0.3.9          clipr_0.8.0           pkgbuild_1.3.1        future.apply_1.8.1    abind_1.4-5          
 [41] scales_1.1.1          DBI_1.1.2             spatstat.random_2.2-0 miniUI_0.1.1.1        Rcpp_1.0.8.3          viridisLite_0.4.0     xtable_1.8-4          reticulate_1.24      
 [49] spatstat.core_2.4-2   bit_4.0.4             htmlwidgets_1.5.4     httr_1.4.2            anndata_0.7.5.3       RColorBrewer_1.1-3    ellipsis_0.3.2        Seurat_4.1.0         
 [57] ica_1.0-2             pkgconfig_2.0.3       uwot_0.1.11           dbplyr_2.1.1          deldir_1.0-6          utf8_1.2.2            tidyselect_1.1.2      rlang_1.0.2          
 [65] reshape2_1.4.4        later_1.3.0           munsell_0.5.0         cellranger_1.1.0      tools_4.1.2           cli_3.3.0             generics_0.1.2        broom_0.7.12         
 [73] ggridges_0.5.3        fastmap_1.1.0         goftest_1.2-3         processx_3.5.3        bit64_4.0.5           fs_1.5.2              fitdistrplus_1.1-8    RANN_2.6.1           
 [81] pbapply_1.5-0         future_1.24.0         nlme_3.1-157          mime_0.12             formatR_1.12          xml2_1.3.3            compiler_4.1.2        rstudioapi_0.13      
 [89] plotly_4.10.0         curl_4.3.2            png_0.1-7             spatstat.utils_2.3-0  reprex_2.0.1          stringi_1.7.6         ps_1.6.0              lattice_0.20-45      
 [97] Matrix_1.4-1          SeuratDisk_0.0.0.9019 vctrs_0.3.8           pillar_1.7.0          lifecycle_1.0.1       spatstat.geom_2.4-0   lmtest_0.9-40         RcppAnnoy_0.0.19     
[105] addinexamples_0.1.0   data.table_1.14.2     cowplot_1.1.1         irlba_2.3.5           httpuv_1.6.5          patchwork_1.1.1       R6_2.5.1              promises_1.2.0.1     
[113] KernSmooth_2.23-20    gridExtra_2.3         parallelly_1.31.0     codetools_0.2-18      MASS_7.3-56           assertthat_0.2.1      rprojroot_2.0.3       withr_2.5.0          
[121] SeuratObject_4.0.4    sctransform_0.3.3     mgcv_1.8-40           parallel_4.1.2        hms_1.1.1             grid_4.1.2            rpart_4.1.16          Rtsne_0.15           
[129] shiny_1.7.1           lubridate_1.8.0      

WriteH5MU fails

Hi, thanks for working on interoperability between seurat and mudata!

I am failing to save the Seurat object offered in this tutorial:

reference <- LoadH5Seurat("../data/pbmc_multimodal.h5seurat")
WriteH5MU(reference, "tea.h5mu")

Defining highly variable features...
Defining highly variable features...
Error in self$exists(name) : 
STRING_ELT() can only be applied to a 'character vector', not a 'NULL'

I tried different things, like removing some of the reductions, but still no luck.
The file is created, but the process breaks somewhere along the way; I still haven't figured out where exactly (and of course it can't be read with ReadH5MU()).
Any idea what I should check next?
Thank you!

> sessionInfo()
R version 4.1.2 (2021-11-01)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Big Sur 11.3.1

Matrix products: default
LAPACK: /Library/Frameworks/R.framework/Versions/4.1/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] bmcite.SeuratData_0.3.0 pbmc3k.SeuratData_3.1.4 SeuratData_0.2.2        hdf5r_1.3.5             MuDataSeurat_0.0.0.9000 magrittr_2.0.3          datapasta_3.1.0        
 [8] forcats_0.5.1           stringr_1.4.0           dplyr_1.0.8             purrr_0.3.4             readr_2.1.2             tidyr_1.2.0             tibble_3.1.6           
[15] ggplot2_3.3.5           tidyverse_1.3.1        

loaded via a namespace (and not attached):
  [1] readxl_1.4.0          backports_1.4.1       plyr_1.8.7            igraph_1.3.0          lazyeval_0.2.2        splines_4.1.2         listenv_0.8.0        
  [8] scattermore_0.8       digest_0.6.29         htmltools_0.5.2       fansi_1.0.3           tensor_1.5            cluster_2.1.3         ROCR_1.0-11          
 [15] tzdb_0.3.0            remotes_2.4.2         globals_0.14.0        modelr_0.1.8          matrixStats_0.62.0    spatstat.sparse_2.1-0 prettyunits_1.1.1    
 [22] colorspace_2.0-3      rappdirs_0.3.3        rvest_1.0.2           ggrepel_0.9.1         haven_2.4.3           callr_3.7.0           crayon_1.5.1         
 [29] jsonlite_1.8.0        spatstat.data_2.1-4   survival_3.3-1        zoo_1.8-9             glue_1.6.2            polyclip_1.10-0       gtable_0.3.0         
 [36] leiden_0.3.9          clipr_0.8.0           pkgbuild_1.3.1        future.apply_1.8.1    abind_1.4-5           scales_1.1.1          DBI_1.1.2            
 [43] spatstat.random_2.2-0 miniUI_0.1.1.1        Rcpp_1.0.8.3          viridisLite_0.4.0     xtable_1.8-4          reticulate_1.24       spatstat.core_2.4-2  
 [50] bit_4.0.4             htmlwidgets_1.5.4     httr_1.4.2            anndata_0.7.5.3       RColorBrewer_1.1-3    ellipsis_0.3.2        Seurat_4.1.0         
 [57] ica_1.0-2             pkgconfig_2.0.3       uwot_0.1.11           dbplyr_2.1.1          deldir_1.0-6          utf8_1.2.2            tidyselect_1.1.2     
 [64] rlang_1.0.2           reshape2_1.4.4        later_1.3.0           munsell_0.5.0         cellranger_1.1.0      tools_4.1.2           cli_3.3.0            
 [71] generics_0.1.2        broom_0.7.12          ggridges_0.5.3        fastmap_1.1.0         goftest_1.2-3         processx_3.5.3        bit64_4.0.5          
 [78] fs_1.5.2              fitdistrplus_1.1-8    RANN_2.6.1            pbapply_1.5-0         future_1.24.0         nlme_3.1-157          mime_0.12            
 [85] formatR_1.12          xml2_1.3.3            compiler_4.1.2        rstudioapi_0.13       plotly_4.10.0         curl_4.3.2            png_0.1-7            
 [92] spatstat.utils_2.3-0  reprex_2.0.1          stringi_1.7.6         ps_1.6.0              lattice_0.20-45       Matrix_1.4-1          SeuratDisk_0.0.0.9019
 [99] vctrs_0.3.8           pillar_1.7.0          lifecycle_1.0.1       spatstat.geom_2.4-0   lmtest_0.9-40         RcppAnnoy_0.0.19      addinexamples_0.1.0  
[106] data.table_1.14.2     cowplot_1.1.1         irlba_2.3.5           httpuv_1.6.5          patchwork_1.1.1       R6_2.5.1              promises_1.2.0.1     
[113] KernSmooth_2.23-20    gridExtra_2.3         parallelly_1.31.0     codetools_0.2-18      MASS_7.3-56           assertthat_0.2.1      rprojroot_2.0.3      
[120] withr_2.5.0           SeuratObject_4.0.4    sctransform_0.3.3     mgcv_1.8-40           parallel_4.1.2        hms_1.1.1             grid_4.1.2           
[127] rpart_4.1.16          Rtsne_0.15            shiny_1.7.1           lubridate_1.8.0      

invalid class “DimReduc” object

Hi,

ReadH5MU was giving me the following error:

Warning: Feature names cannot have underscores ('_'), replacing with dashes ('-')
Warning: No columnames present in cell embeddings, setting to 'atacpca_1:50'
Error in validObject(.Object) : 
  invalid class “DimReduc” object: invalid object for slot "feature.loadings" in class "DimReduc": got class "NULL", should be or extend class "matrix"
In addition: Warning messages:
1: In missing_on_read("/var", "global variables metadata") :
  Missing on read: /var. Seurat does not support global variables metadata.
2: In missing_on_read("/varp", "pairwise annotation of variables") :
  Missing on read: /varp. Seurat does not support pairwise annotation of variables.

I managed to fix it by computing the PCA embeddings explicitly; before that I was plotting UMAPs without computing PCA first.

Putting it here in case someone encounters the same error. Does the conversion to Seurat require PCA embeddings? Or maybe only when a UMAP is present?

Anyway thanks for this compatibility tool!

Best,

GJ

Error when converting a Seurat v5 object to an .h5ad file

When I try to convert a Seurat object to an .h5ad file, I get an error:

WriteH5AD(seu.filtered, 'data/allcell/allcell_filtered.h5ad')
Error in WriteH5ADHelper(object, assay, h5, global = TRUE) : 
  no slot of name "meta.features" for this object of class "Assay5"

Here is my environment:

R version 4.3.1 (2023-06-16)
Platform: x86_64-conda-linux-gnu (64-bit)
Running under: Ubuntu 22.04.2 LTS

Matrix products: default
BLAS/LAPACK: /home/software_install/miniconda3/envs/r_envs/lib/libopenblasp-r0.3.21.so;  LAPACK version 3.9.0

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8     LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8    LC_PAPER=en_US.UTF-8      
 [8] LC_NAME=C                  LC_ADDRESS=C               LC_TELEPHONE=C             LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

time zone: Asia/Shanghai
tzcode source: system (glibc)

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] MuDataSeurat_0.0.0.9000 lubridate_1.9.3         forcats_1.0.0           stringr_1.5.0           dplyr_1.1.3             purrr_1.0.2             readr_2.1.4             tidyr_1.3.0            
 [9] tibble_3.2.1            ggplot2_3.4.4           tidyverse_2.0.0         scCustomize_1.1.3       Seurat_5.0.0            SeuratObject_5.0.0      sp_2.1-1               

loaded via a namespace (and not attached):
  [1] RColorBrewer_1.1-3     rstudioapi_0.15.0      jsonlite_1.8.7         shape_1.4.6            magrittr_2.0.3         spatstat.utils_3.0-4   ggbeeswarm_0.7.2       GlobalOptions_0.1.2    vctrs_0.6.4           
 [10] ROCR_1.0-11            spatstat.explore_3.2-5 paletteer_1.5.0        janitor_2.2.0          htmltools_0.5.7        sctransform_0.4.1      parallelly_1.36.0      KernSmooth_2.23-22     htmlwidgets_1.6.2     
 [19] ica_1.0-3              plyr_1.8.9             plotly_4.10.3          zoo_1.8-12             igraph_1.5.1           mime_0.12              lifecycle_1.0.4        pkgconfig_2.0.3        Matrix_1.6-2          
 [28] R6_2.5.1               fastmap_1.1.1          snakecase_0.11.1       fitdistrplus_1.1-11    future_1.33.0          shiny_1.7.5.1          digest_0.6.33          colorspace_2.1-0       rematch2_2.1.2        
 [37] patchwork_1.1.3        tensor_1.5             RSpectra_0.16-1        irlba_2.3.5.1          progressr_0.14.0       timechange_0.2.0       fansi_1.0.5            spatstat.sparse_3.0-3  httr_1.4.7            
 [46] polyclip_1.10-6        abind_1.4-5            compiler_4.3.1         withr_2.5.2            bit64_4.0.5            fastDummies_1.7.3      MASS_7.3-60            tools_4.3.1            vipor_0.4.5           
 [55] lmtest_0.9-40          beeswarm_0.4.0         httpuv_1.6.11          future.apply_1.11.0    goftest_1.2-3          glue_1.6.2             nlme_3.1-163           promises_1.2.1         grid_4.3.1            
 [64] Rtsne_0.16             cluster_2.1.4          reshape2_1.4.4         generics_0.1.3         hdf5r_1.3.8            gtable_0.3.4           spatstat.data_3.0-3    tzdb_0.4.0             hms_1.1.3             
 [73] data.table_1.14.8      utf8_1.2.4             spatstat.geom_3.2-7    RcppAnnoy_0.0.21       ggrepel_0.9.4          RANN_2.6.1             pillar_1.9.0           spam_2.10-0            RcppHNSW_0.5.0        
 [82] ggprism_1.0.4          later_1.3.1            circlize_0.4.15        splines_4.3.1          lattice_0.22-5         survival_3.5-7         bit_4.0.5              deldir_1.0-9           tidyselect_1.2.0      
 [91] miniUI_0.1.1.1         pbapply_1.7-2          gridExtra_2.3          scattermore_1.2        matrixStats_1.1.0      stringi_1.8.1          lazyeval_0.2.2         codetools_0.2-19       cli_3.6.1             
[100] uwot_0.1.16            xtable_1.8-4           reticulate_1.34.0      munsell_0.5.0          Rcpp_1.0.11            globals_0.16.2         spatstat.random_3.2-1  png_0.1-8              ggrastr_1.0.2         
[109] parallel_4.3.1         ellipsis_0.3.2         dotCall64_1.1-0        listenv_0.9.0          viridisLite_0.4.2      scales_1.2.1           ggridges_0.5.4         crayon_1.5.2           leiden_0.4.3          
[118] rlang_1.1.2            cowplot_1.1.1    

Is this problem caused by the Seurat upgrade? How can I convert a Seurat v5 object into AnnData?
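One possible workaround (an assumption on my part, not an official fix): convert the v5 Assay5 back to the classic Assay class, which has the meta.features slot MuDataSeurat expects, before writing:

library(Seurat)
library(MuDataSeurat)

# seu.filtered is the object from the report above
# seu.filtered[["RNA"]] <- JoinLayers(seu.filtered[["RNA"]])  # if counts/data are split into layers
seu.filtered[["RNA"]] <- as(seu.filtered[["RNA"]], "Assay")
WriteH5AD(seu.filtered, "data/allcell/allcell_filtered.h5ad", assay = "RNA")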

Columns of type character converted into byte strings

Dear author,

I noticed that character-type variables of my cell metadata were converted into byte strings (with b'' surrounding the entries). I could fix this by applying .str.decode('utf-8') to the affected columns. Leaving the columns as bytes caused scanpy to fail to detect them (e.g. when passing them as the color argument to plotting functions).

Regards,
Mikhael

Remote install fails

Remote install fails with

Error: Failed to install 'MuDataSeurat' from GitHub:
  Unknown remote type: SeuratData=github
  object 'seuratdata=github_remote' of mode 'function' was not found
Traceback:

1. remotes::install_github("PMBio/MuDataSeurat")
2. install_remotes(remotes, auth_token = auth_token, host = host, 
 .     dependencies = dependencies, upgrade = upgrade, force = force, 
 .     quiet = quiet, build = build, build_opts = build_opts, build_manual = build_manual, 
 .     build_vignettes = build_vignettes, repos = repos, type = type, 
 .     ...)
3. tryCatch(res[[i]] <- install_remote(remotes[[i]], ...), error = function(e) {
 .     stop(remote_install_error(remotes[[i]], e))
 . })
4. tryCatchList(expr, classes, parentenv, handlers)
5. tryCatchOne(expr, names, parentenv, handlers[[1L]])
6. value[[3L]](cond)

R version 4.1.1 (2021-08-10)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 20.04.2 LTS
other attached packages:
[1] remotes_2.4.1

This was fixed by changing

Remotes: SeuratData=github::satijalab/seurat-data

to

Remotes: github::satijalab/seurat-data

WriteH5MU fails with NULL as varm_key after running Seurat workflow

Saving a Seurat object after a normal Seurat workflow fails due to a NULL value of varm_key.
I'm not sure why, or whether it occurs here or here, but wrapping these lines in if (!is.null(varm_key)) {} solved it for me, and no keys were obviously missing in the resulting .h5mu object.
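A minimal, self-contained sketch of the guard described above (the file, group, and variable names are illustrative, not the package's actual internals):

library(hdf5r)

h5 <- H5File$new("guard_demo.h5", mode = "w")
varm_group <- h5$create_group("varm")
varm_key <- NULL                      # what a standard Seurat workflow can leave behind
loadings <- matrix(rnorm(6), nrow = 3)

if (!is.null(varm_key)) {
  # only write the loadings when the reduction actually defines a varm key
  varm_group[[varm_key]] <- loadings
}
h5$close_all()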

R version 4.1.1 (2021-08-10)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 20.04.2 LTS

Packages:
hdf5r_1.3.4
SeuratObject_4.0.2
Seurat_4.0.4
MuDataSeurat_0.0.0.9000

Thanks for developing this package. It helps a lot in switching between Seurat and scanpy!

HDF5-API Errors: error #000: H5D.c in H5Dvlen_reclaim(): line 732: invalid argument

Hi, I'm trying to convert a MuData file to a Seurat object. I'm getting the following error while using ReadH5MU():

seurat <- ReadH5MU("str.h5mu")
Error in self$read_low_level(file_space = self_space_id, mem_space = mem_space_id, :
HDF5-API Errors:
error #000: H5D.c in H5Dvlen_reclaim(): line 732: invalid argument
class: HDF5
major: Invalid arguments to routine
minor: Bad value

Kindly help, thank you
