Comments (5)
In GitLab by @grst on Jan 24, 2020, 10:10
10x files:
We use the file filtered_contig_annotations.csv.
# Human B cell chains
(default) sturm@hochvogel Downloads % cut -f6 -d, vdj_v1_hs_pbmc3_b_filtered_contig_annotations.csv | sort | uniq -c
1 chain
929 IGH
624 IGK
506 IGL
# Human T cell chains
(default) sturm@hochvogel Downloads % cut -f6 -d, vdj_v1_hs_pbmc3_t_filtered_contig_annotations.csv | sort | uniq -c
1 chain
46 Multi
4907 TRA
5168 TRB
# Mouse B cell chains
(default) sturm@hochvogel Downloads % cut -f6 -d, vdj_v1_mm_pbmc4_b_filtered_contig_annotations.csv| sort | uniq -c
1 chain
5215 IGH
5475 IGK
2573 IGL
# Mouse T cell chains
(default) sturm@hochvogel Downloads % cut -f6 -d, vdj_v1_mm_pbmc4_t_filtered_contig_annotations.csv| sort | uniq -c
1 chain
7 Multi
761 TRA
1301 TRB
There are indeed a bunch of barcodes that have more than 4 chains, e.g.
TTTACTGTCACCAGGC-1 True TTTACTGTCACCAGGC-1_contig_1 True 648 TRB TRBV19 None TRBJ2-1 TRBC2 True True CASSISTDWGNEQFF
TTTACTGTCACCAGGC-1 True TTTACTGTCACCAGGC-1_contig_2 True 511 TRA TRAV23/DV6 None TRAJ58 TRAC True True CAASQETSGSRLTF
TTTACTGTCACCAGGC-1 True TTTACTGTCACCAGGC-1_contig_3 True 521 TRB TRBV6-5 None TRBJ2-1 TRBC2 True True CASSYRTGSSYNEQFF
TTTACTGTCACCAGGC-1 True TTTACTGTCACCAGGC-1_contig_4 True 659 TRA TRAV8-6 None TRAJ6 TRAC True True CAVNPGGSYIPTF
TTTACTGTCACCAGGC-1 True TTTACTGTCACCAGGC-1_contig_5 True 427 TRB TRBV7-8 None TRBJ2-1 TRBC2 True False CQQLRKTSYNEQFF
TTTACTGTCACCAGGC-1 True TTTACTGTCACCAGGC-1_contig_6 True 463 TRA TRAV13-1 None TRAJ4 TRAC True False CSKFLFSGGYNKLIF
But for those I checked, there are only four that are productive.
For now, I think it's fine to just use the productive chains and emit a warning that there might be more.
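A minimal sketch of that idea, not the scirpy implementation: keep only the productive contigs per barcode and warn when a barcode carries more contigs than expected. It assumes the CSV header uses the column names `barcode` and `productive`, and that booleans are written as the string `True`. The function name and the `MAX_CHAINS` cap are hypothetical.

```python
import csv
import warnings
from collections import defaultdict

# Assumed cap: two chains per locus for alpha/beta, i.e. four per cell.
MAX_CHAINS = 4


def productive_chains_by_barcode(handle):
    """Group productive contigs by barcode.

    `handle` is an open text file (or any iterable of CSV lines) whose
    header contains at least the columns `barcode` and `productive`.
    Emits a warning for every barcode with more than MAX_CHAINS contigs.
    """
    productive = defaultdict(list)
    total = defaultdict(int)
    for row in csv.DictReader(handle):
        total[row["barcode"]] += 1
        # Assumption: booleans are serialized as "True" / "False".
        if row["productive"] == "True":
            productive[row["barcode"]].append(row)

    for barcode, n_contigs in total.items():
        if n_contigs > MAX_CHAINS:
            n_kept = len(productive.get(barcode, []))
            warnings.warn(
                f"{barcode}: {n_contigs} contigs found, "
                f"{n_kept} productive ones kept; there might be more chains."
            )
    return dict(productive)
```

Calling it with `open("filtered_contig_annotations.csv", newline="")` would return a dict mapping each barcode to its productive rows.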
from scirpy.
In GitLab by @grst on Jan 24, 2020, 10:55
TraCeR
- Supports alpha, beta, gamma and delta. Per default, it will only look for alpha/beta
- The final output ("filtered") will contain only the two most highly expressed chains from each locus. That means at most 4 for alpha/beta. No idea what this looks like if it searches for gamma/delta as well.
Files to use:
- recombinants.txt: Output of tracer summary. Contains CDR3 sequences, V and J gene of the filtered chains.
- <CELL>/filtered_TCR_seqs/filtered_TCRs.txt: Contains V gene, J gene, CDR3 seqs and TPM of the filtered chains. Seems to be the only way to get the TPMs unfortunately.
For now, let's go for TraCeR alpha/beta only.
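The filtering rule described above (at most the two most highly expressed chains per locus) could be sketched as follows. This only illustrates the rule and is not TraCeR's code. The dict keys `locus` and `tpm` and the function name are hypothetical and do not reflect the layout of the TraCeR output files.

```python
from collections import defaultdict


def top_chains_per_locus(chains, per_locus=2):
    """Keep the `per_locus` most highly expressed chains for each locus.

    `chains` is a list of dicts with the hypothetical keys
    `locus` (e.g. "A" or "B") and `tpm` (expression as a float).
    """
    by_locus = defaultdict(list)
    for chain in chains:
        by_locus[chain["locus"]].append(chain)

    kept = []
    for locus in sorted(by_locus):
        # Rank by expression, highest first, and truncate.
        ranked = sorted(by_locus[locus], key=lambda c: c["tpm"], reverse=True)
        kept.extend(ranked[:per_locus])
    return kept
```

With alpha and beta as the only loci, the result can never hold more than four chains per cell, which matches the limit discussed for the 10x data.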
In GitLab by @szabogtamas on Jan 24, 2020, 13:04
Totally agree: even four chains is a lot and we can safely assume that there shouldn't be more than four productive chains in a cell. Since we only have datasets with alpha/beta, I would also leave gamma/delta for now. We can include them later.
In GitLab by @grst on Feb 14, 2020, 16:18
alpha/beta is fine for now. But this needs to be documented.
In GitLab by @grst on Mar 27, 2020, 11:31
assigned to @szabogtamas