Comments (11)

hxj5 commented on June 8, 2024

Hi Alex,

Thanks for the feedback. Updating the file in place may confuse users who rely on an older version of the reference conventions. Adding a new reference VCF file for each new convention is an option, but it could become redundant if there are many versions of the conventions. Alternatively, "bcftools annotate --rename-chrs" can be used to update the chromosome names.
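
For readers following along, here is a minimal sketch of that renaming step. It assumes the reference VCF uses 1-22, X, Y and should become chr1-chr22, chrX, chrY; the file names reference.vcf.gz, reference.chr.vcf.gz and chr_name_map.txt are placeholders, not files shipped with cellSNP. Other contigs (for example a mitochondrial one) would need their own lines in the map.

```shell
# Build a two-column map: old name, new name (1 -> chr1, ..., X -> chrX, Y -> chrY).
for c in $(seq 1 22) X Y; do
    printf '%s\tchr%s\n' "$c" "$c"
done > chr_name_map.txt

# Placeholder file names; substitute the actual reference VCF.
in_vcf="reference.vcf.gz"
out_vcf="reference.chr.vcf.gz"

# Run only when bcftools and the input file are present.
if command -v bcftools >/dev/null 2>&1 && [ -f "$in_vcf" ]; then
    # Rename contigs, write a compressed VCF, then index it.
    bcftools annotate --rename-chrs chr_name_map.txt -Oz -o "$out_vcf" "$in_vcf"
    bcftools index -t "$out_vcf"
fi
```

The original file is left untouched, so users on the older convention are not affected.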

Cheers,
Xianjie

from cellsnp.

apredeus commented on June 8, 2024

Thank you! No worries, just wanted to give a heads-up. Thanks for pointing out the bcftools method too, didn't know about it!

apredeus commented on June 8, 2024

Hi @hxj5 - just wanted to ask a quick question.

I just ran the conversion of the reference VCF (which quite clearly worked: all main chromosomes are now renamed to chr1-chr22), and then ran cellSNP in "mode 1" (using this reference VCF) on a BAM file with the same naming convention (chr1, etc.).

The results, however, are empty - and the header of cellSNP.cells.vcf.gz looks like this:

##fileformat=VCFv4.2
##source=cellSNP_v0.3.2
##FILTER=<ID=PASS,Description="All filters passed">
##FILTER=<ID=.,Description="Filter info not available">
##INFO=<ID=DP,Number=1,Type=Integer,Description="total counts for ALT and REF">
##INFO=<ID=AD,Number=1,Type=Integer,Description="total counts for ALT">
##INFO=<ID=OTH,Number=1,Type=Integer,Description="total counts for other bases from REF and ALT">
##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
##FORMAT=<ID=PL,Number=G,Type=Integer,Description="List of Phred-scaled genotype likelihoods">
##FORMAT=<ID=DP,Number=1,Type=Integer,Description="total counts for ALT and REF">
##FORMAT=<ID=AD,Number=1,Type=Integer,Description="total counts for ALT">
##FORMAT=<ID=OTH,Number=1,Type=Integer,Description="total counts for other bases from REF and ALT">
##FORMAT=<ID=ALL,Number=5,Type=Integer,Description="total counts for all bases in order of A,C,G,T,N">
##contig=<ID=1>
##contig=<ID=2>
##contig=<ID=3>
##contig=<ID=4>
##contig=<ID=5>
##contig=<ID=6>
##contig=<ID=7>
##contig=<ID=8>
##contig=<ID=9>
##contig=<ID=10>
##contig=<ID=11>
##contig=<ID=12>
##contig=<ID=13>
##contig=<ID=14>
##contig=<ID=15>
##contig=<ID=16>
##contig=<ID=17>
##contig=<ID=18>
##contig=<ID=19>
##contig=<ID=20>
##contig=<ID=21>
##contig=<ID=22>
##contig=<ID=X>
##contig=<ID=Y>

Do you have any idea how it might have happened? Can't seem to find the way to debug this... Thank you in advance!

P.S. My cellSNP is version 0.3.2 installed with pip.

hxj5 commented on June 8, 2024

Hi, for the "empty" VCF, could you share the log information, especially the error message?

I was also wondering what type of data this is: scRNA, or scDNA/scATAC? Sometimes empty output is caused by an improper --UMItag setting for scDNA/scATAC data.

apredeus commented on June 8, 2024

The data is single-nucleus RNA-seq aligned with STARsolo. There are no obvious errors in the log file; here's the concatenated version:

Processing sample Pla_HDBR11923125..
[cellSNP] mode 1: fetch given SNPs in 67849 single cells.
[cellSNP] loading the VCF file for given SNPs ...
[cellSNP] fetching 7352497 candidate variants ...
2.00% positions processed.
...............................
100.00% positions processed.

[cellSNP] fetched 7352497 variants, now merging temp files ... 
[cellSNP] 38 lines in final vcf file
[cellSNP] All done: 376 min 45.6 sec
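
The "38 lines in final vcf file" message would be consistent with a header-only file (the header listing earlier in the thread already has 37 "##" lines). A quick way to confirm is to count the non-header records directly. This is only a sketch: count_vcf_records is a helper name made up for illustration, and the output path in the usage comment is assumed from the -O value used in this thread.

```shell
# Print the number of non-header records in a gzip-compressed VCF.
count_vcf_records() {
    gzip -dc "$1" | awk '!/^#/ { n++ } END { print n + 0 }'
}

# Usage (assumed output location):
#   count_vcf_records "$TAG.cellsnp.out/cellSNP.cells.vcf.gz"
```

A result of 0 means cellSNP wrote the header but kept no variants.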

hxj5 commented on June 8, 2024

Thanks for the information. There seems to be no obvious error. Could you share your command line and a few lines of 1) the BAM records, 2) the VCF file, and 3) the barcode file?

apredeus commented on June 8, 2024

I've extracted the bam file of chromosome 22 of my sample. Here are the links to the bam file, barcode list, and the reference VCF:

https://www.dropbox.com/s/eynica8ftbzsdc4/barcodes.tsv
https://www.dropbox.com/s/69p16aucuno39p0/chr22.bam
https://www.dropbox.com/s/nd4oiiw9adzlwne/sorted.vcf.gz

Thank you for your help!

apredeus commented on June 8, 2024

Command line was as follows:

cellSNP -s $TAG.bam -b $TAG.cut2k_barcodes.tsv -O $TAG.cellsnp.out -R $REF -p 16 --minMAF 0.1 --minCOUNT 20

where REF is the reference VCF file.

hxj5 commented on June 8, 2024

Hi, it seems the BAM file does not contain the UR tag, which is the default UMI tag when --UMItag is set to Auto and a barcode file is given. You may try setting --UMItag UB, as the BAM file does contain the UB tag.
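
For anyone hitting the same symptom, a rough way to see which UMI tag a BAM file carries is to count the records that have each tag. This is a sketch: count_tag is a helper name made up for illustration, it reads SAM text on stdin, and the samtools calls in the comments assume samtools is installed.

```shell
# Count SAM records on stdin that carry a given two-letter tag (e.g. UB or UR).
# Optional tags start at column 12 in SAM, so only those fields are scanned.
count_tag() {
    awk -v tag="$1" -F '\t' '
        { for (i = 12; i <= NF; i++) if (index($i, tag ":") == 1) { n++; break } }
        END { print n + 0 }'
}

# Usage on the first 10000 records of the shared BAM file:
#   samtools view chr22.bam | head -n 10000 | count_tag UR
#   samtools view chr22.bam | head -n 10000 | count_tag UB
#
# If only UB is present, the command from this thread can be rerun with the flag added:
#   cellSNP -s $TAG.bam -b $TAG.cut2k_barcodes.tsv -O $TAG.cellsnp.out -R $REF \
#       -p 16 --minMAF 0.1 --minCOUNT 20 --UMItag UB
```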

apredeus commented on June 8, 2024

Thank you very much! UB is the tag for error-corrected UMI, while UR is one for raw. I guess that's the difference between Cell Ranger and STARsolo-generated BAM files.

I'll let you know if it worked.

apredeus commented on June 8, 2024

It did! 👍 (Totally forgot about this discussion.)

Thanks again for your help.
