Comments (3)
Hi,
thanks for the question. I don't think that the read count for chains with equal UMI count provides any useful information about true abundance. But it's also not worse than random. There are two reasons it's implemented that way:
- the sorting becomes more reproducible, as it doesn't depend on random order in the case of ties
- the same default sorting keys can be used for datasets that do and don't provide UMI counts
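To illustrate both points outside of scirpy, here is a minimal plain-Python sketch. The records and the field names `umi_count` and `read_count` are made up for the example and are not necessarily the keys scirpy uses internally; a missing UMI count is simply treated as 0.

```python
# Hypothetical chain records; field names are illustrative only.
chains = [
    {"id": "A", "umi_count": 5, "read_count": 40},
    {"id": "B", "umi_count": 5, "read_count": 90},
    {"id": "C", "umi_count": 7, "read_count": 10},
]

# A dataset without UMI counts: only read counts are available.
chains_no_umi = [
    {"id": "X", "read_count": 12},
    {"id": "Y", "read_count": 30},
]


def sort_chains(records):
    """Sort descending by UMI count, breaking ties by read count."""
    return sorted(
        records,
        # A missing UMI count falls back to 0, so the same key works for both datasets.
        key=lambda c: (c.get("umi_count", 0), c["read_count"]),
        reverse=True,
    )


order = [c["id"] for c in sort_chains(chains)]
# The result does not depend on the input order, because ties are resolved by the second key.
order_reversed_input = [c["id"] for c in sort_chains(list(reversed(chains)))]
order_no_umi = [c["id"] for c in sort_chains(chains_no_umi)]
```

Without the second key, the relative order of A and B would depend on the order in which they happen to appear in the input.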
You have a point that it's currently not possible to specify whether to sort ascending or descending in the index_chains function. I'm a bit wary, though, of introducing even more complexity to that function.
Would something like
adata.obsm["airr"]["sort_key2"] = -adata.obsm["airr"]["sort_key"]
ir.pp.index_chains(adata, sort_chains_by={"sort_key2": float("-inf")})
work for you as a workaround?
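Why negating the key helps: assuming chains are ranked from the highest key value to the lowest (which is what the `float("-inf")` default for missing values suggests), negating the read count puts the chain with the fewest reads first. A minimal plain-Python sketch, without scirpy; the records, the field names, and the choice to combine the negated key with the UMI count are assumptions for illustration only:

```python
# Hypothetical chain records; "D" has no read count.
chains = [
    {"id": "A", "umi_count": 5, "read_count": 40},
    {"id": "B", "umi_count": 5, "read_count": 90},
    {"id": "C", "umi_count": 7, "read_count": 10},
    {"id": "D", "umi_count": 5},
]


def rank(records):
    """Rank by UMI count (high first), then by read count (low first)."""

    def key(chain):
        reads = chain.get("read_count")
        # Negate so that a descending sort prefers fewer reads;
        # missing values get -inf and therefore rank last among ties.
        neg_reads = -reads if reads is not None else float("-inf")
        return (chain["umi_count"], neg_reads)

    return [c["id"] for c in sorted(records, key=key, reverse=True)]


ranking = rank(chains)
```

Here C ranks first because of its higher UMI count, and among the chains tied at 5 UMIs the one with the fewest reads (A) comes before B, with the chain lacking a read count (D) last.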
from scirpy.
Hi @grst,
Thank you for the quick reply. In the context of 10X VDJ sequencing, the read count is determined not only by UMI abundance but also by the efficiency of PCR amplification and by sampling luck during sequencing. In other words, the read count represents the number of observations. Unique molecules that require a higher number of observations to be found are likely less abundant than those that require fewer observations. That's the rationale. In practice, it would affect only a very small subset of cells.
Thank you for the workaround. I can live with that :). Thanks!
Closing because you said you could live with the workaround. Let me know if there are any other issues.