tidygate's Introduction

tidygate: add gate information to your tibble

Lifecycle:maturing

Please also have a look at

  • nanny for tidy high-level data analysis and manipulation

  • tidyHeatmap for producing heatmaps following tidy principles

  • tidybulk for tidy and modular transcriptomics analyses

Installation

# From GitHub
devtools::install_github("stemangiola/tidygate")

# From CRAN
install.packages("tidygate")

What is tidygate

It interactively or programmatically labels points within custom gates on two dimensions, according to tidyverse principles. The information is added to your tibble. It is based on the package gatepoints by Wajid Jawaid.

The main benefits are

  • in interactive mode you can draw your gates on extensive ggplot-like scatter plots
  • you can draw multiple gates
  • you can save your gates and apply them programmatically.

Input

A tibble of this kind

dimension1 dimension2 annotations
chr or fctr numeric

Step-by-step instructions for RStudio

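1) Execute the following code in the console panel
library(dplyr)     # provides %>% and mutate()
library(tidygate)  # provides gate_chr() and tidygate_data

tidygate_gate <-
  tidygate_data %>%
  mutate( gate = gate_chr( Dim1, Dim2 ) )
2) Look at the Viewer and draw a gate by clicking at least three times on the plot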

3) Click the finish button on the top-right corner, or press escape on your keyboard

The output tibble

tidygate_gate
## # A tibble: 2,240 x 9
##    group   hierarchy `ct 1`    `ct 2`    relation cancer_ID   Dim1    Dim2 gate 
##    <chr>       <dbl> <chr>     <chr>        <dbl> <chr>      <dbl>   <dbl> <chr>
##  1 adrenal         1 endothel… epitheli…    -1    ACC       -0.874 -0.239  0    
##  2 adrenal         1 endothel… fibrobla…    -1    ACC       -0.740  0.114  1    
##  3 adrenal         1 endothel… immune_c…    -1    ACC       -0.988  0.118  0    
##  4 adrenal         1 epitheli… endothel…     1    ACC        0.851  0.261  0    
##  5 adrenal         1 epitheli… fibrobla…     1    ACC        0.839  0.320  0    
##  6 adrenal         1 epitheli… immune_c…     1    ACC        0.746  0.337  0    
##  7 adrenal         1 fibrobla… endothel…     1    ACC        0.722 -0.0696 0    
##  8 adrenal         1 fibrobla… epitheli…    -1    ACC       -0.849 -0.317  0    
##  9 adrenal         1 fibrobla… immune_c…     0.52 ACC       -0.776 -0.383  0    
## 10 adrenal         1 immune_c… endothel…     1    ACC        0.980 -0.116  0    
## # … with 2,230 more rows

Gates are saved in a temporary file for later use

## [[1]]
##            x          y
## 1 -0.9380459  0.2784375
## 2 -0.9555544 -0.1695209
## 3 -0.3310857  0.2116150
## 
## [[2]]
##             x          y
## 1  0.01324749  0.2165648
## 2 -0.31065917 -0.1026984
## 3 -0.11514794 -0.2982161
## 4  0.48013998  0.1225183

Programmatic gating

We can use previously drawn gates to add the gate column programmatically

tidygate_data %>%
  mutate( gate = gate_chr(
    Dim1, Dim2,
     # Pre-defined gates
    gate_list = my_gates
  ))
## # A tibble: 2,240 x 9
##    group   hierarchy `ct 1`    `ct 2`    relation cancer_ID   Dim1    Dim2 gate 
##    <chr>       <dbl> <chr>     <chr>        <dbl> <chr>      <dbl>   <dbl> <chr>
##  1 adrenal         1 endothel… epitheli…    -1    ACC       -0.874 -0.239  0    
##  2 adrenal         1 endothel… fibrobla…    -1    ACC       -0.740  0.114  1    
##  3 adrenal         1 endothel… immune_c…    -1    ACC       -0.988  0.118  0    
##  4 adrenal         1 epitheli… endothel…     1    ACC        0.851  0.261  0    
##  5 adrenal         1 epitheli… fibrobla…     1    ACC        0.839  0.320  0    
##  6 adrenal         1 epitheli… immune_c…     1    ACC        0.746  0.337  0    
##  7 adrenal         1 fibrobla… endothel…     1    ACC        0.722 -0.0696 0    
##  8 adrenal         1 fibrobla… epitheli…    -1    ACC       -0.849 -0.317  0    
##  9 adrenal         1 fibrobla… immune_c…     0.52 ACC       -0.776 -0.383  0    
## 10 adrenal         1 immune_c… endothel…     1    ACC        0.980 -0.116  0    
## # … with 2,230 more rows
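
To make the idea of applying stored gates concrete, here is a minimal base-R sketch. It assumes each gate is a data frame of polygon vertices with columns x and y, as in the saved gates printed above. The helpers point_in_polygon() and label_with_gates() are hypothetical names for illustration; they use a plain ray-casting test and are not tidygate's own implementation.

```r
# Ray-casting test: is each point (px, py) inside the polygon (vx, vy)?
point_in_polygon <- function(px, py, vx, vy) {
  n <- length(vx)
  inside <- rep(FALSE, length(px))
  j <- n
  for (i in seq_len(n)) {
    # Does edge j -> i straddle the horizontal line through the point?
    crosses <- (vy[i] > py) != (vy[j] > py)
    # x-coordinate where the edge meets that horizontal line
    x_at_y <- (vx[j] - vx[i]) * (py - vy[i]) / (vy[j] - vy[i]) + vx[i]
    # Toggle on every crossing to the right of the point
    inside <- xor(inside, crosses & (px < x_at_y))
    j <- i
  }
  inside
}

# Label each point with the index of the first gate containing it, else "0".
label_with_gates <- function(x, y, gate_list) {
  label <- rep("0", length(x))
  for (g in seq_along(gate_list)) {
    hit <- point_in_polygon(x, y, gate_list[[g]]$x, gate_list[[g]]$y)
    label[hit & label == "0"] <- as.character(g)
  }
  label
}

# Gate vertices copied from the saved gates shown above.
my_gates <- list(
  data.frame(x = c(-0.9380459, -0.9555544, -0.3310857),
             y = c(0.2784375, -0.1695209, 0.2116150)),
  data.frame(x = c(0.01324749, -0.31065917, -0.11514794, 0.48013998),
             y = c(0.2165648, -0.1026984, -0.2982161, 0.1225183))
)

# Three illustrative points: one in gate 1, one outside, one in gate 2.
labels <- label_with_gates(c(-0.740, -0.874, 0), c(0.114, -0.239, 0), my_gates)
print(labels)
```

The first two points correspond to rows 2 and 1 of the example output (gate 1 and gate 0 respectively); the third is an arbitrary point chosen to fall inside the second gate.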

tidygate's People

Contributors

davisvaughan, hadley, mblue9, stemangiola

tidygate's Issues

Gate not reflected as I did

Gates used to work properly, but suddenly all the gates I made are shown at the top left at a smaller size. Is this a bug, or am I doing something wrong? Below are my code and the error message. Thanks!

harmonized_gated1 <-
  harmonized_seurat %>%
  mutate(gated1 = tidygate::gate_chr(
    UMAP_1, UMAP_2, .color = seurat_clusters,
    .size = 0.1
  ))
Mark region on plot.
Error in dplyr::mutate():
ℹ In argument: gated1 = tidygate::gate_chr(...).
Caused by error in map():
ℹ In index: 1.
Caused by error in data[i, 1]:
! subscript out of bounds
Run rlang::last_trace() to see where the error occurred.

Improve performances and interface of `tidygate` for large single-cell objects

tidygate allows gating points from a tibble, visually or procedurally, in a reproducible way that is compatible with a pipe-oriented programming style.

https://github.com/stemangiola/tidygate

With a very large number of points (for example, 1M cells), it struggles, mostly at the visualisation step.

We propose a downsampling strategy for the visual preview, while the selection would still be done on all data points.
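
The proposal can be sketched in base R as follows. This is only an illustration under stated assumptions: max_preview is a hypothetical parameter, the data are simulated, and the gate is a simple rectangle standing in for a polygon drawn on the preview. It does not reflect how tidygate is or will be implemented.

```r
set.seed(1)
n <- 200000
cells <- data.frame(Dim1 = rnorm(n), Dim2 = rnorm(n))

# 1) Downsample only for the visual preview (max_preview is hypothetical).
max_preview <- 10000
preview_idx <- sample.int(n, size = min(n, max_preview))
preview <- cells[preview_idx, ]
# plot(preview$Dim1, preview$Dim2, pch = ".")  # the gate would be drawn here

# 2) Stand-in for a gate drawn on the preview: a rectangle, for brevity.
gate <- list(xmin = -1, xmax = 0, ymin = -0.5, ymax = 0.5)

# 3) Apply the gate to ALL points, not only the previewed ones.
in_gate <- cells$Dim1 >= gate$xmin & cells$Dim1 <= gate$xmax &
  cells$Dim2 >= gate$ymin & cells$Dim2 <= gate$ymax
cells$gate <- ifelse(in_gate, "1", "0")

cat("previewed:", nrow(preview), " labelled:", nrow(cells), "\n")
```

Only the plotting cost depends on the preview size; the labelling step is a vectorised comparison over every row, so no cell is left without a gate label.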

PLEASE IGNORE THIS AT THE MOMENT: As a possible second PR, the plotting backend might be changed from the old base-R interactive plotting style (often broken in RStudio Server). A more modern, Photoshop-like lasso-style selector would be great. An example is CellSelector() from Seurat. However, we would need to implement lasso-style gating and multiple gating.

Not able to select points

Hi,
I would love to use your package, but somehow I am not able to make it work. It would be great if you could help me understand what is going wrong here. When I try the following example, I always get this error:

ggplot(tidygate_data)+geom_point(mapping = aes(x=Dim1,y=Dim2))

tidygate_gate <-
  tidygate_data %>%
  mutate( gate = gate_chr( Dim1, Dim2 ) )
Error: Problem with mutate() column gate.
gate = gate_chr(Dim1, Dim2).
x Must extract column with a single valid subscript.
x Subscript var has size 0 but must be size 1.
Run rlang::last_error() to see where the error occurred.

Backtrace:

  1. tidygate_data %>% mutate(gate = gate_chr(Dim1, Dim2))
  2. dplyr:::pull.data.frame(., color_hexadecimal)
  3. tidyselect::vars_pull(names(.data), !!enquo(var))
  4. tidyselect:::pull_as_location2(loc, n, vars)
  5. vctrs::num_as_location2(i, n = n, negative = "ignore", arg = "var")
  6. vctrs:::result_get(...)

When I run the second part, including gate_chr, a new plot appears in the plot area (I am using RStudio), but it is empty.

Let me know if you need any other info!
Thanks for your help!

rasterization needed for big datasets

Hey everyone,
I have a dataset of 370k cells, and when I run tidygate it is very laggy; it works without crashing only about one time out of three. I think rasterization is needed once you have more than 100k cells, as DimPlot in Seurat does.
Once I reduce the number of plotted cells, the algorithm runs more smoothly.
Cheers

Loading/gating

File loaded into R:

> library(readxl)
> LiveDead <- read_excel("Documents/LiveDead.xlsx", 
+     col_types = c("numeric", "numeric", "text"))
> View(LiveDead)                                                                                    
> LiveDead
# A tibble: 5,046 x 3
   Mean488x_525m Mean560x_607m Cell_Status
           <dbl>         <dbl> <chr>      
 1         1381.          302. Live       
 2         7104           266. Live       
 3         8073.          251. Live       
 4          960.          594. Live       
 5          814.          426. Live       
 6         8985.          250. Live       
 7        23917.          292. Live       
 8          990.          548. Live       
 9         1317.          464. Live       
10         7952.          323. Live       
# … with 5,036 more rows

LiveDead.xlsx

Attempt to gate data

> LiveDead %>% gate(.data = LiveDead, .dim1 = Mean488x_525m, .dim2 = Mean560x_607m)
Error: Must subset columns with a valid subscript vector.
x Subscript has the wrong type `tbl_df<
  Mean488x_525m: double
  Mean560x_607m: double
  Cell_Status  : character
>`.
ℹ It must be numeric or character.
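
One possible reading of this error, offered as an assumption rather than a confirmed diagnosis: the call both pipes LiveDead and passes it again as .data, so the piped copy may land in the next free argument slot. The sketch below shows that general R behaviour with toy_gate(), a hypothetical stand-in whose argument layout is assumed and is not tidygate's gate() signature. It uses the native pipe |> (R 4.1 or later), which places the left-hand side first in the same way as %>%.

```r
# Hypothetical stand-in with an assumed argument layout; NOT tidygate's gate().
toy_gate <- function(.data, .element = NULL, .dim1 = NULL, .dim2 = NULL) {
  # Report which argument slots received a data frame.
  c(.data = is.data.frame(.data), .element = is.data.frame(.element))
}

df <- data.frame(a = c(1, 2), b = c(3, 4))

# Piping AND naming .data: the piped copy falls into the next free slot.
piped_twice <- df |> toy_gate(.data = df, .dim1 = "a", .dim2 = "b")

# Piping only: the data frame fills .data and nothing spills over.
piped_once <- df |> toy_gate(.dim1 = "a", .dim2 = "b")

print(piped_twice)
print(piped_once)
```

If this is what happens here, dropping either the pipe or the explicit .data = LiveDead from the call would be the first thing to try.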
