IAGS: Inferring Ancestor Genome Structure under a wide range of evolutionary scenarios

The number of novel species with high-quality genomes is growing rapidly, signaling the start of a golden age for the study of genome structure evolution. Here, we develop IAGS, a novel, generalized computational framework for inferring ancestral genome structure under a variety of evolutionary scenarios. IAGS provides four basic models that solve simple single-copy ancestor problems (GMP and GGHP) and complex multi-copy ancestor problems (Multi-copy GMP and Multi-copy GGHP), together with block and endpoint matching optimization strategies (self-BMO and EMO). These models and strategies can be combined to decode a complex evolutionary history in a bottom-up manner.

The previous version is available as IAG; it was designed only for three Papaver species.

Dependencies

Python 3.6

Package | Version used in research
numpy | 1.19.2
pandas | 1.1.5
matplotlib | 3.3.4

Gurobi solver 9.1.2 with Academic License.

conda install -c gurobi gurobi 
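After installation, it can help to confirm which versions are actually present. The following is a minimal sketch, not part of IAGS: the function name `installed_versions` is hypothetical, and it assumes each package exposes a `__version__` attribute (it falls back to "unknown" otherwise). `gurobipy` is the Python module installed by the Gurobi conda package.

```python
import importlib

# Versions listed in the Dependencies section of this README.
EXPECTED = {"numpy": "1.19.2", "pandas": "1.1.5", "matplotlib": "3.3.4"}


def installed_versions(module_names):
    """Return {module name: version string}, with None for modules that cannot be imported."""
    found = {}
    for name in module_names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            found[name] = None
            continue
        # Not every module defines __version__, so fall back to a placeholder.
        found[name] = getattr(module, "__version__", "unknown")
    return found


if __name__ == "__main__":
    report = installed_versions(list(EXPECTED) + ["gurobipy"])
    for name, version in report.items():
        expected = EXPECTED.get(name, "n/a")
        print(f"{name}: installed={version}, used in research={expected}")
```

A value of None means the module could not be imported in the current environment.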

Development environment: Windows 10
Development tool: Pycharm

Usage

Detailed instructions: see UserGuide.pdf in docs.
Example usages: see scenarios.

Introductions

docs

User guide

Prepare input file

Please refer to processDrimm to generate the input file.

dataSturcture

Basic data structure for IAGS.
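This README does not describe the data structure itself. As a rough illustration only, the sketch below shows one common way genome rearrangement tools represent a chromosome: as a sequence of signed blocks, each split into two endpoints (tail `t` and head `h`), with adjacencies joining the endpoints of neighboring blocks. The names `block_endpoints`, `adjacencies` and `TELOMERE` are hypothetical and are not taken from the IAGS source code.

```python
TELOMERE = "$"  # assumed marker for the end of a linear chromosome


def block_endpoints(block):
    """Return the (entry, exit) endpoints of a signed block such as '3' or '-3'.

    A forward block is traversed tail -> head; a reversed block head -> tail.
    """
    if block.startswith("-"):
        name = block[1:]
        return name + "h", name + "t"
    return block + "t", block + "h"


def adjacencies(blocks, circular=False):
    """List the endpoint pairs that are adjacent along one chromosome."""
    ends = [block_endpoints(b) for b in blocks]
    if not ends:
        return []
    # Join the exit of each block to the entry of the next one.
    result = [(left[1], right[0]) for left, right in zip(ends, ends[1:])]
    if circular:
        # Close the circle: last exit joins first entry.
        result.append((ends[-1][1], ends[0][0]))
    else:
        # Linear chromosome: both ends are capped with a telomere.
        result.insert(0, (TELOMERE, ends[0][0]))
        result.append((ends[-1][1], TELOMERE))
    return result


if __name__ == "__main__":
    print(adjacencies(["1", "-2", "3"]))
    print(adjacencies(["1", "-2", "3"], circular=True))
```

Working with endpoints rather than whole blocks lets a reversal be expressed as a change in which endpoints are adjacent, which is why endpoint-based formulations are common in this area.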

inferringAncestorGenomeStructure

Contains the source code of four formulations: GMP, GGHP, BMO and EMO.

models

Contains the source code of the four basic IAGS models: GMP, GGHP, Multi-copy GMP and Multi-copy GGHP.

util

Utilities for downstream analysis.

inputdata

Four real datasets used in our research, including block sequences for three Brassica, nine Yeast, five Gramineae and three Papaver species. The dataset sources are as follows.

Species | Genome URL | Block source
Brassica rapa (field mustard) | https://www.ncbi.nlm.nih.gov/assembly/GCF_000309985.2/ | https://www.nature.com/articles/s41477-020-0735-y#Sec19
Brassica nigra | http://cruciferseq.ca | https://www.nature.com/articles/s41477-020-0735-y#Sec19
Brassica oleracea (wild cabbage) | https://www.ncbi.nlm.nih.gov/assembly/GCA_900416815.2 | https://www.nature.com/articles/s41477-020-0735-y#Sec19
Eremothecium gossypii | http://gryc.inra.fr/index.php?page=download | Orthofinder and Drimm-Synteny
Lachancea kluyveri | http://gryc.inra.fr/index.php?page=download | Orthofinder and Drimm-Synteny
Kluyveromyces lactis | http://gryc.inra.fr/index.php?page=download | Orthofinder and Drimm-Synteny
Zygosaccharomyces rouxii | http://gryc.inra.fr/index.php?page=download | Orthofinder and Drimm-Synteny
Lachancea thermotolerans | http://gryc.inra.fr/index.php?page=download | Orthofinder and Drimm-Synteny
Lachancea waltii | http://gryc.inra.fr/index.php?page=download | Orthofinder and Drimm-Synteny
Naumovozyma castellii | http://gryc.inra.fr/index.php?page=download | Orthofinder and Drimm-Synteny
Kazachstania naganishii | http://gryc.inra.fr/index.php?page=download | Orthofinder and Drimm-Synteny
Saccharomyces cerevisiae | https://www.ncbi.nlm.nih.gov/assembly/GCF_000146045.2/ | Orthofinder and Drimm-Synteny
Gordon, et al. pre-WGD Ancestor (Version 7 Aug2012) | http://ygob.ucd.ie/ | Orthofinder and Drimm-Synteny
Zea mays | https://www.ncbi.nlm.nih.gov/assembly/GCF_902167145.1 | Orthofinder and Drimm-Synteny
Sorghum bicolor | https://www.ncbi.nlm.nih.gov/assembly/GCF_000003195.3 | Orthofinder and Drimm-Synteny
Oryza sativa | https://www.ncbi.nlm.nih.gov/assembly/GCF_001433935.1/#/st | Orthofinder and Drimm-Synteny
Brachypodium distachyon | https://www.ncbi.nlm.nih.gov/assembly/GCF_000005505.3 | Orthofinder and Drimm-Synteny
Thinopyrum elongatum | https://bigd.big.ac.cn/gwh/Assembly/965/show | Orthofinder and Drimm-Synteny
Papaver rhoeas | https://xjtu-omics.github.io/Papaver-Genomics/ | https://github.com/xjtu-omics/IAG/tree/master/inputFiles
Papaver somniferum | https://xjtu-omics.github.io/Papaver-Genomics/ | https://github.com/xjtu-omics/IAG/tree/master/inputFiles
Papaver setigerum | https://xjtu-omics.github.io/Papaver-Genomics/ | https://github.com/xjtu-omics/IAG/tree/master/inputFiles

scenarios

Pipelines and example usages for the four real datasets.

simulations

Includes Non-CRBs and CRBs simulations.

Contact

If you have any questions, please feel free to contact: [email protected], [email protected], [email protected]

Reference

Please cite the following paper when you use IAGS in your work:

Shenghan Gao, Xiaofei Yang, Jianyong Sun, Xixi Zhao, Bo Wang, Kai Ye, IAGS: Inferring Ancestor Genome Structure under a Wide Range of Evolutionary Scenarios, Molecular Biology and Evolution, Volume 39, Issue 3, March 2022, msac041, https://doi.org/10.1093/molbev/msac041
