Comments (11)
Hi Alex,
Thanks for the feedback. Updating the file in place may cause confusion for users who use older version of reference conventions. Adding new reference VCF file for new conventions is an option, but it could be redundant if there are many versions of conventions. Alternatively, "bcftools annotate --rename-chrs" tool could be used to update the chromosome names.
Cheers,
Xianjie
from cellsnp.
Thank you! No worries, just wanted to give a heads-up. Thanks for pointing out the bcftools
method too, didn't know about it!
from cellsnp.
Hi @hxj5 - just wanted to ask a quick question.
I just ran the conversion of the reference VCF (which quite clearly worked - all main chromosomes are now renamed to chr1-chr22), and then ran cellSNP
in "mode 1" (using this reference VCF) with a BAM file with the same naming convention (chr1 etc)
The results, however, are empty - and the header of cellSNP.cells.vcf.gz
looks like this:
##fileformat=VCFv4.2
##source=cellSNP_v0.3.2
##FILTER=<ID=PASS,Description="All filters passed">
##FILTER=<ID=.,Description="Filter info not available">
##INFO=<ID=DP,Number=1,Type=Integer,Description="total counts for ALT and REF">
##INFO=<ID=AD,Number=1,Type=Integer,Description="total counts for ALT">
##INFO=<ID=OTH,Number=1,Type=Integer,Description="total counts for other bases from REF and ALT">
##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
##FORMAT=<ID=PL,Number=G,Type=Integer,Description="List of Phred-scaled genotype likelihoods">
##FORMAT=<ID=DP,Number=1,Type=Integer,Description="total counts for ALT and REF">
##FORMAT=<ID=AD,Number=1,Type=Integer,Description="total counts for ALT">
##FORMAT=<ID=OTH,Number=1,Type=Integer,Description="total counts for other bases from REF and ALT">
##FORMAT=<ID=ALL,Number=5,Type=Integer,Description="total counts for all bases in order of A,C,G,T,N">
##contig=<ID=1>
##contig=<ID=2>
##contig=<ID=3>
##contig=<ID=4>
##contig=<ID=5>
##contig=<ID=6>
##contig=<ID=7>
##contig=<ID=8>
##contig=<ID=9>
##contig=<ID=10>
##contig=<ID=11>
##contig=<ID=12>
##contig=<ID=13>
##contig=<ID=14>
##contig=<ID=15>
##contig=<ID=16>
##contig=<ID=17>
##contig=<ID=18>
##contig=<ID=19>
##contig=<ID=20>
##contig=<ID=21>
##contig=<ID=22>
##contig=<ID=X>
##contig=<ID=Y>
Do you have any idea how it might have happened? Can't seem to find the way to debug this... Thank you in advance!
P.S. My cellSNP
is version 0.3.2 installed with pip
.
from cellsnp.
Hi, for the "empty" VCF, could you share the log information, especially the error message?
I was also wondering what the type of the data is. Is it scRNA, or scDNA/scATAC? sometimes the empty output is due to improper setting of --UMItag
for scDNA/scATAC data.
from cellsnp.
The data is single nucleus RNA-seq aligned with STARsolo
. There are no obvious errors in the log file, here's the concatenated version:
Processing sample Pla_HDBR11923125..
[cellSNP] mode 1: fetch given SNPs in 67849 single cells.
[cellSNP] loading the VCF file for given SNPs ...
[cellSNP] fetching 7352497 candidate variants ...
2.00% positions processed.
...............................
100.00% positions processed.
[cellSNP] fetched 7352497 variants, now merging temp files ...
[cellSNP] 38 lines in final vcf file
[cellSNP] All done: 376 min 45.6 sec
from cellsnp.
Thanks for the information. seems no obvious error. Could you share your command line and a few lines of 1) the BAM records 2) the VCF file and 3) the barcode file?
from cellsnp.
I've extracted the bam file of chromosome 22 of my sample. Here are the links to the bam file, barcode list, and the reference VCF:
https://www.dropbox.com/s/eynica8ftbzsdc4/barcodes.tsv
https://www.dropbox.com/s/69p16aucuno39p0/chr22.bam
https://www.dropbox.com/s/nd4oiiw9adzlwne/sorted.vcf.gz
Thank you for your help!
from cellsnp.
Command line was as follows:
cellSNP -s $TAG.bam -b $TAG.cut2k_barcodes.tsv -O $TAG.cellsnp.out -R $REF -p 16 --minMAF 0.1 --minCOUNT 20
where REF
is the reference VCF file.
from cellsnp.
Hi, seems the bam file does not contain the UR
tag, which is the default UMI tag when setting --UMItag Auto
and barcode file is given. You may try setting --UMItag UB
as the bam file contains the UB
tag.
from cellsnp.
Thank you very much! UB
is the tag for error-corrected UMI, while UR
is one for raw. I guess that's the difference between Cell Ranger
and STARsolo
-generated BAM files.
I'll let you know if it worked.
from cellsnp.
It did! 👍 (Totally forgot about this discussion.)
Thanks again for your help.
from cellsnp.
Related Issues (20)
- Question about running cellSNP on merged BAM files from multiple samples HOT 4
- Change the flag filtering default to include PCR duplicates HOT 4
- Running cellSNP on transcriptomic BAM from salmon alevin HOT 4
- Reference Genome HOT 1
- Wrong number of chromosomes in output file? HOT 3
- Does it work for mouse? HOT 1
- Some SNPs are not in gene HOT 1
- AD in cells HOT 1
- Empty AD&DP mtx HOT 23
- KeyError: '.' in cellSNP/pileup_utils.py causing failure of temp file merging HOT 2
- Run time estimation HOT 1
- generating VCF for REGION_VCF from RNA-seq data HOT 1
- Adapting cellSNP for gene-of-interest / indels / multiple scRNA-seq samples
- RuntimeWarning: divide by zero encountered in log HOT 9
- Default nan-handling policy is a memory hog HOT 1
- Is it expected to generate a "position x barcode" matrix? HOT 1
- output sparse matrix HOT 11
- No SNPs called in more than 1 barcodes HOT 3
- Run time estimation HOT 3
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from cellsnp.