
ga4k-sv-finder's Introduction

Changelog:

05/03/2024 -

Amended end coordinate of insertion calls and overlapping gene IDs

04/04/2024 -

Re-uploaded the raw data TSV. Added homozygote counts. Improved the accuracy of the cohort allele frequency (cohort_af) for observed genotypes. Removed frequencies for European (eur) and Admixed American (amr) ancestries based on somalier probabilities.

04/01/2024 -

We identified issues with calculations in the TSV file and have removed the file until they are fixed. A corrected file will be uploaded soon.

GA4K SV Finder

GA4K SV Finder is a tool for searching structural variants (SVs) and their associated genes across 497 probands in the Genomic Answers for Kids (GA4K) cohort (2023), whose HiFi long-read genomes were processed with PBSV (v2.6.2). You can search by gene, by SV coordinates, or with a query file that mixes both, and the results include variant frequencies, giving a view into the GA4K rare disease cohort.

Note: SV coordinates are relative to the human reference genome hg38. Cohort alleles with frequencies that would identify a single person have been excluded.

Dependencies

  • Python 3.7 or higher
  • Tkinter (usually comes with Python)
  • Pandas

Installation

Python Installation

For Linux (Debian/Ubuntu)

sudo apt update
sudo apt install python3

For Linux (CentOS/Red Hat)

sudo dnf install python3

For macOS

brew install python

Pandas Installation

After installing Python, you can install Pandas using pip:

pip install pandas
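
Once the steps above are done, a quick check can confirm that the listed dependencies are visible to the interpreter. The following is a minimal sketch and not part of the GA4K SV Finder scripts; the function name check_dependencies is hypothetical.

```python
import importlib.util
import sys


def check_dependencies(min_version=(3, 7), modules=("tkinter", "pandas")):
    """Return a dict mapping each requirement to True (found) or False (missing)."""
    # Compare the running interpreter against the minimum version from the README.
    report = {"python": sys.version_info >= min_version}
    for name in modules:
        # find_spec returns None when a top-level module cannot be located.
        report[name] = importlib.util.find_spec(name) is not None
    return report


if __name__ == "__main__":
    for requirement, available in check_dependencies().items():
        print(requirement, "OK" if available else "MISSING")
```

If tkinter is reported as missing on Debian/Ubuntu, it is typically packaged separately as python3-tk.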

Application Usage

In a terminal, change to the directory that contains the downloaded raw data TSV and ga4ksvf-app.py (cd path/to/your/download).

Execute the following command: python3 ga4ksvf-app.py

Command Line Usage

GA4K SV Finder supports the modes of operation described below. In a terminal, change to the directory that contains the downloaded raw data TSV and ga4ksvf-cmd.py (cd path/to/your/download).

Genes

To search for a specific gene, use:

python3 ga4ksvf-cmd.py GENENAME

Coordinates

To search by genomic coordinates:

python3 ga4ksvf-cmd.py chr3:179121491-179374301
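
To illustrate the coordinate format, a query like the one above can be split into a chromosome, a start, and an end. The sketch below is not taken from ga4ksvf-cmd.py: the names Region, parse_region, and overlaps are hypothetical, and treating the range as a closed interval matched by overlap is an assumption about how such a search might behave.

```python
import re
from typing import NamedTuple


class Region(NamedTuple):
    chrom: str
    start: int
    end: int


# Matches strings such as "chr3:179121491-179374301".
_REGION_RE = re.compile(r"^(chr[0-9A-Za-z_]+):(\d+)-(\d+)$")


def parse_region(query: str) -> Region:
    """Parse 'chrN:start-end' into a Region, raising ValueError on bad input."""
    match = _REGION_RE.match(query.strip())
    if match is None:
        raise ValueError(f"not a coordinate query: {query!r}")
    chrom, start, end = match.group(1), int(match.group(2)), int(match.group(3))
    if start > end:
        raise ValueError(f"start is greater than end: {query!r}")
    return Region(chrom, start, end)


def overlaps(a: Region, b: Region) -> bool:
    """True when both regions are on the same chromosome and share a position.

    Assumption: both intervals are closed (start and end inclusive).
    """
    return a.chrom == b.chrom and a.start <= b.end and b.start <= a.end
```

For example, parse_region("chr3:179121491-179374301") yields a Region with chrom "chr3", start 179121491, and end 179374301, which could then be compared against each SV's coordinates with overlaps.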

Query Files (Both)

To run multiple queries from a file:

python3 ga4ksvf-cmd.py [options]

  • -f FILE, --file FILE: Process queries from a specified file.
  • -e, --export: Export accumulated query results to a CSV file. Prompts for a file name, defaulting to result.csv.

Interactive Mode

If the script is run without any arguments, it enters interactive mode. In this mode, you can input queries directly into the terminal. Type 'exit' to quit the interactive mode.

python3 ga4ksvf-cmd.py

Example queries.txt content:

chr3:179121491-179374301
chr16:53841057-53841060
chr9:121275764-121307053
MFN1
FTO
GSN
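
A query file like the one above mixes coordinate ranges and gene names, one per line. As a rough sketch of how such lines could be told apart (this is an assumption for illustration, not the logic used by ga4ksvf-cmd.py; classify_query and read_queries are hypothetical names), anything shaped like chrN:start-end is treated as a region and everything else as a gene name:

```python
import re
from typing import List, Tuple

# Assumed shape of a coordinate query, e.g. "chr16:53841057-53841060".
REGION_PATTERN = re.compile(r"^chr[0-9A-Za-z_]+:\d+-\d+$")


def classify_query(query: str) -> str:
    """Label a single query as 'region' or 'gene'."""
    return "region" if REGION_PATTERN.match(query) else "gene"


def read_queries(text: str) -> List[Tuple[str, str]]:
    """Return (kind, query) pairs for every non-blank line of a query file."""
    queries = []
    for line in text.splitlines():
        query = line.strip()
        if not query:
            continue  # skip blank lines
        queries.append((classify_query(query), query))
    return queries


example = """chr3:179121491-179374301
chr16:53841057-53841060
chr9:121275764-121307053
MFN1
FTO
GSN
"""

for kind, query in read_queries(example):
    print(kind, query)
```

With the example content, the first three lines are labeled as regions and the last three as genes.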
