
emptydrops's Introduction

emptydrops

Python implementation of emptydrops-like cell calling as in CellRanger v3.0.2

Disclaimer:

All code originally comes from https://github.com/10XGenomics/cellranger with minimal modifications for packaging and running under python3.

Usage:

from emptydrops import find_nonambient_barcodes
from emptydrops.matrix import CountMatrix

matrix = CountMatrix.from_legacy_mtx(mtx_dir)

find_nonambient_barcodes(
    matrix,          # Full expression matrix
    orig_cell_bcs,   # (iterable of str): Strings of initially-called cell barcodes
    min_umi_frac_of_median=0.01,
    min_umis_nonambient=500,
    max_adj_pvalue=0.01
)

Returns:

[
    'eval_bcs',      # Candidate barcode indices in addition to those in `orig_cell_bcs` (n)
    'log_likelihood',# Ambient log likelihoods (n)
    'pvalues',       # pvalues (n)
    'pvalues_adj',   # B-H adjusted pvalues (n)
    'is_nonambient', # Boolean nonambient calls (n)
]
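The `pvalues_adj` field is described above as B-H (Benjamini-Hochberg) adjusted. As a rough illustration of what that adjustment does, here is a standalone sketch. It is not the code this package runs, and `bh_adjust` is a hypothetical helper name introduced only for this example.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjustment of a list of p-values (illustrative sketch)."""
    n = len(pvalues)
    # Indices of the p-values in ascending order
    order = sorted(range(n), key=lambda i: pvalues[i])
    adjusted = [0.0] * n
    running_min = 1.0  # adjusted values are capped at 1
    # Walk from the largest p-value down so adjusted values stay monotone
    for rank in range(n, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * n / rank)
        adjusted[idx] = running_min
    return adjusted


if __name__ == "__main__":
    print(bh_adjust([0.01, 0.04, 0.03, 0.005]))
```

Under this reading, a barcode would be called non-ambient when its adjusted value is at or below `max_adj_pvalue`; the exact comparison used by the package is an assumption here.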

emptydrops's People

Contributors

nh3


Forkers

gww

emptydrops's Issues

Ambiguous type for find_nonambient_barcodes input

Hi,

The function find_nonambient_barcodes lists the input requirement:

orig_cell_bcs (iterable of str): Strings of initially-called cell barcodes.

However, because the default meaning of "str" changed between Python 2 (bytes) and Python 3 (unicode), this breaks under Python 3. Worse, with unicode input none of the barcodes in the original list appear to be matched, and the code hits an uninformative "return None".

    # Choose candidate cell barcodes
    orig_cell_bc_set = set(orig_cell_bcs)
    orig_cells = np.flatnonzero(np.fromiter((bc in orig_cell_bc_set for bc in matrix.bcs),
                                            count=len(matrix.bcs), dtype=bool))

    # No good incoming cell calls
    if orig_cells.sum() == 0:
        return None

Suggestions for fixing it are

  1. Encode each string to bytes before building the set: orig_cell_bcs = tuple(i.encode('ascii') for i in orig_cell_bcs)
  2. Raise an informative exception instead of returning None when no supplied barcode matches the matrix

Thanks for making this code more accessible!
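Both suggestions can be combined in a small wrapper applied before calling find_nonambient_barcodes. The following is only a sketch: `coerce_barcodes` is a hypothetical helper, not part of the package, and it assumes the matrix barcodes are all of one type (all bytes or all str) and are plain ASCII.

```python
def coerce_barcodes(orig_cell_bcs, matrix_bcs):
    """Convert orig_cell_bcs to the type (bytes or str) used by matrix_bcs.

    Raises ValueError instead of silently matching nothing.
    """
    matrix_bcs = list(matrix_bcs)
    if not matrix_bcs:
        raise ValueError("matrix contains no barcodes")
    want_bytes = isinstance(matrix_bcs[0], bytes)

    coerced = []
    for bc in orig_cell_bcs:
        if want_bytes and isinstance(bc, str):
            bc = bc.encode("ascii")
        elif not want_bytes and isinstance(bc, bytes):
            bc = bc.decode("ascii")
        coerced.append(bc)

    # Fail loudly if nothing overlaps, rather than returning None later
    if not set(coerced) & set(matrix_bcs):
        raise ValueError(
            "none of the supplied cell barcodes are present in the matrix; "
            "check barcode type (bytes vs str) and suffixes such as '-1'"
        )
    return coerced


if __name__ == "__main__":
    print(coerce_barcodes(["AAAC-1"], [b"AAAC-1", b"TTTG-1"]))
```

The caller would then pass `coerce_barcodes(orig_cell_bcs, matrix.bcs)` as the second argument.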

how to plot the barcode rank plot

Excellent work!!
I still have a question about the barcode rank plot.
Could you explain how to use the emptydrops result to draw it?
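One possible approach, sketched under assumptions: take the total UMI count per barcode (in the same order as matrix.bcs), rank barcodes from highest to lowest count, and mark which ones were called as cells. The names `barcode_rank_points`, `plot_barcode_rank`, `umis_per_bc` and `cell_indices` are hypothetical and not part of this package; `cell_indices` is assumed to hold the indices of the original cell barcodes plus the entries of eval_bcs where is_nonambient is true. Plotting uses matplotlib, which is assumed to be installed.

```python
def barcode_rank_points(umis_per_bc, cell_indices):
    """Return (rank, umi_count, is_cell) tuples sorted by descending UMI count."""
    cell_set = set(int(i) for i in cell_indices)
    order = sorted(range(len(umis_per_bc)),
                   key=lambda i: umis_per_bc[i], reverse=True)
    points = []
    for rank, idx in enumerate(order, start=1):
        count = umis_per_bc[idx]
        if count <= 0:
            break  # zero counts cannot be drawn on a log axis
        points.append((rank, count, idx in cell_set))
    return points


def plot_barcode_rank(points, path="barcode_rank.png"):
    """Draw the log-log barcode rank plot and save it to `path`."""
    import matplotlib
    matplotlib.use("Agg")  # render to file, no display needed
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    for flag, label, color in ((False, "background", "lightgray"),
                               (True, "cells", "tab:blue")):
        xs = [r for r, _, f in points if f == flag]
        ys = [c for _, c, f in points if f == flag]
        ax.scatter(xs, ys, s=4, color=color, label=label)
    ax.set_xscale("log")
    ax.set_yscale("log")
    ax.set_xlabel("Barcode rank")
    ax.set_ylabel("UMI count")
    ax.legend()
    fig.savefig(path, dpi=150)
    plt.close(fig)


if __name__ == "__main__":
    pts = barcode_rank_points([5, 100, 0, 20], cell_indices=[1, 3])
    print(pts)
```

Calling `plot_barcode_rank(pts)` afterwards writes the figure to disk.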

Code finds *additional* barcodes, filters existing

I believe this code does not perform the task as expected.

Specifically, it masks out the original cell barcodes such that the non_ambient calls are only made for the cells other than the originally supplied barcodes. In the CellRanger package (but not the code included here) this is followed by extending the set of original filtered barcodes:

    # Update the lists of cell-associated barcodes
    for genome in genomes:
        eval_bc_strs = np.array(gg_matrix.bcs)[result.eval_bcs]
        filtered_bcs_groups[(gg, genome)].extend(eval_bc_strs[(genome_calls == genome) & (result.is_nonambient)])

This is fairly specific behavior, and it is not what a user would expect based on R's emptydrops package (which notably does not require a starting barcode list as input). Curious to hear your thoughts.
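For the single-genome case, the missing extension step could look roughly like the sketch below. `final_cell_barcodes` is a hypothetical helper, not part of this package or of CellRanger; it assumes the result exposes `eval_bcs` and `is_nonambient` as attributes (as in the snippet above) and that the function may return None. `DemoResult` is a stand-in used only to make the example runnable.

```python
from collections import namedtuple

# Stand-in for the real result object, for illustration only
DemoResult = namedtuple("DemoResult", ["eval_bcs", "is_nonambient"])


def final_cell_barcodes(matrix_bcs, orig_cell_bcs, result):
    """Original cell barcodes plus any newly called non-ambient barcodes."""
    cells = list(orig_cell_bcs)
    if result is None:
        # No usable incoming calls or nothing evaluated: keep the originals
        return cells
    seen = set(cells)
    for idx, keep in zip(result.eval_bcs, result.is_nonambient):
        bc = matrix_bcs[idx]
        if keep and bc not in seen:
            cells.append(bc)
            seen.add(bc)
    return cells


if __name__ == "__main__":
    bcs = ["AAAC-1", "CCCA-1", "GGGT-1", "TTTG-1"]
    res = DemoResult(eval_bcs=[1, 2, 3], is_nonambient=[True, False, True])
    print(final_cell_barcodes(bcs, ["AAAC-1"], res))
```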
