xcltk's Introduction

xcltk: Toolkit for XClone Preprocessing

XClone is a statistical method to detect allele- and haplotype-specific copy number variations (CNVs) and reconstruct tumour clonal substructure from scRNA-seq data, by integrating the expression levels (read depth ratio; RDR signals) and the allelic balance (B-allele frequency; BAF signals). It takes three matrices as input: the allele-specific AD and DP matrices (BAF signals) and the total read depth matrix (RDR signals).
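As a rough illustration of how the two allele-specific matrices relate to the BAF signal, the sketch below computes a per-entry ratio AD / DP on made-up toy counts. It assumes that BAF is simply this ratio and that entries with zero depth should be left undefined; the matrix values and the helper name baf_matrix are hypothetical, and this is not XClone's actual model.

```python
# Toy illustration only: 2 features x 3 cells, all values are made up.
# AD: reads supporting one allele; DP: total allele-informative depth.
AD = [[2, 0, 5],
      [1, 3, 0]]
DP = [[4, 0, 10],
      [4, 3, 0]]


def baf_matrix(ad, dp):
    """Element-wise AD / DP; entries with zero depth are returned as None."""
    return [
        [(a / d if d > 0 else None) for a, d in zip(ad_row, dp_row)]
        for ad_row, dp_row in zip(ad, dp)
    ]


baf = baf_matrix(AD, DP)
print(baf)  # [[0.5, None, 0.5], [0.25, 1.0, None]]
```

Leaving zero-depth entries as None rather than 0 keeps "no allelic information" distinct from "no B-allele reads", which matters for sparse scRNA-seq counts.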

The xcltk package implements a preprocessing pipeline to generate the three matrices from SAM/BAM/CRAM files. It supports data from multiple single-cell sequencing platforms, including droplet-based (e.g., 10x Genomics) and well-based (e.g., SMART-seq) platforms.

News

You can find the full manual of the xcltk preprocessing pipeline at preprocess/README.md.

All release notes are available at docs/release.rst.

Installation

Install via pip (latest stable version)

xcltk is available through PyPI.

pip install -U xcltk

Install from this GitHub repo (latest stable/dev version)

pip install -U git+https://github.com/hxj5/xcltk

In either case, if you don't have write permission for your current Python environment, we suggest creating a separate conda environment or adding --user to the pip command for your current one.

Manual

You can check the full parameters with xcltk -h.

Program: xcltk (Toolkit for XClone Preprocessing)
Version: 0.3.1

Usage:   xcltk <command> [options]

Commands:
  -- BAF calculation
     allelefc         Allele-specific feature counting.
     baf              Preprocessing pipeline for XClone BAF.
     fixref           Fix REF allele mismatches based on reference FASTA.
     rpc              Reference phasing correction.

  -- RDR calculation
     basefc           Basic feature counting.

  -- Tools
     convert          Convert between different formats of genomic features.

  -- Others
     -h, --help       Print this message and exit.
     -V, --version    Print version and exit.

xcltk's People

Contributors

huangyh09, hxj5, rongtingting

xcltk's Issues

Got error when I run xcltk

Hi,

I have successfully installed xcltk on my local computer. However, when I attempted to run the tool with the necessary parameters, I encountered the following error:

[I::pipeline::pipeline_wrapper] xcltk BAF preprocessing starts ...
[I::pipeline::pipeline_wrapper] check args ...
Traceback (most recent call last):
  File "/Users/cpan/opt/anaconda3/envs/xcltk/bin/xcltk", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "/Users/cpan/opt/anaconda3/envs/xcltk/lib/python3.11/site-packages/xcltk/xcltk.py", line 49, in main
    elif command == "baf": baf_baf(sys.argv)
                           ^^^^^^^^^^^^^^^^^
  File "/Users/cpan/opt/anaconda3/envs/xcltk/lib/python3.11/site-packages/xcltk/baf/pipeline.py", line 118, in pipeline_main
    ret = pipeline_wrapper(
          ^^^^^^^^^^^^^^^^^
  File "/Users/cpan/opt/anaconda3/envs/xcltk/lib/python3.11/site-packages/xcltk/baf/pipeline.py", line 152, in pipeline_wrapper
    assert_n(label)
  File "/Users/cpan/opt/anaconda3/envs/xcltk/lib/python3.11/site-packages/xcltk/utils/base.py", line 9, in assert_n
    assert x is not None and len(x) > 0
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AssertionError

For context, I can run cellsnp-lite successfully on the same system, which suggests that the general setup is functional. However, given the enhancements in xcltk, I am eager to transition to using this tool. Please help me check this problem, thanks a lot!

Best,
Chu
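The traceback above ends in assert_n(label), and its last frame shows the check being applied. The sketch below rebuilds that one-line check from the traceback to show which values trip it. It assumes that label holds a user-supplied value that was left unset or empty; the report does not include the command line, so this is a guess rather than a diagnosis, and the helper check is hypothetical.

```python
def assert_n(x):
    # Reconstructed from the traceback line in xcltk/utils/base.py;
    # the real function may differ in details.
    assert x is not None and len(x) > 0


def check(value):
    """Return True if assert_n accepts the value, False if it raises."""
    try:
        assert_n(value)
        return True
    except AssertionError:
        return False


# `label` stands in for the pipeline argument of the same name.
for label in (None, "", "sample1"):
    print(repr(label), "->", "ok" if check(label) else "AssertionError")
```

If this reading is right, listing the parameters with xcltk -h and supplying a non-empty value for the one that feeds label would be the first thing to try.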

Sanger Imputation Service issue

Dear xcltk developers,

First of all, thank you for developing such an amazing tool along with XClone. The documentation is great!
I have encountered an issue while trying to upload the VCF file generated from the baf_pre_phase.sh script. I am unsure of how to resolve this problem since I used the reference genome you mentioned that is compatible with Sanger. Could you please suggest some solutions?

--- Aborted Job ---
  The genotype probability distribution in the input file does not match the reference
  panel frequencies well. The number of genotypes expected with low frequencies under HWE
  (with P<=0.1) is too big in the user data: 0.32 whereas the threshold is 0.26. For comparison,
  the number of these genotypes in 1000Genomes data is 0.17, the attached plot shows typical
  GT distributions

  This is usually an indicator of REF,ALT alleles being on incorrect strand. Another frequent
  problem is the VCF using a different reference sequence, for example GRCh38 instead of GRCh37.

I also have another question, regarding the xcltk basefc parameter -r <region file>: would it be possible to provide an example of what this file should look like?
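The aborted-job message above names two usual causes: REF/ALT alleles on the wrong strand, or a VCF built against a different reference. One way to tell them apart is to compare each record's REF allele with the base found in the reference at that position. The sketch below does this for single-base records on made-up in-memory data; the function classify_ref, the toy sequence and the records are all hypothetical, and it is not how the fixref command listed in the manual is implemented.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def classify_ref(ref_base, vcf_ref, vcf_alt):
    """Compare a single-base VCF record with the reference base at its position."""
    if vcf_ref == ref_base:
        return "match"
    if vcf_alt == ref_base:
        return "swapped"        # REF and ALT appear exchanged
    if COMPLEMENT.get(vcf_ref) == ref_base:
        return "strand_flip"    # REF matches on the opposite strand
    return "mismatch"           # possibly a different genome build


# Hypothetical toy reference and records: (chrom, 1-based pos, REF, ALT).
reference = {"chr1": "ACGTACGT"}
records = [
    ("chr1", 1, "A", "G"),
    ("chr1", 2, "T", "C"),
    ("chr1", 3, "C", "T"),
    ("chr1", 4, "G", "C"),
]

statuses = [
    classify_ref(reference[chrom][pos - 1], ref, alt)
    for chrom, pos, ref, alt in records
]
print(statuses)  # ['match', 'swapped', 'strand_flip', 'mismatch']
```

On real data, a large share of "mismatch" records would point towards a build difference, while many "swapped" or "strand_flip" records would point towards allele orientation.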
