
tanghaibao / treecut


Find nodes in hierarchical clustering that are statistically significant

Home Page: http://chibba.agtec.uga.edu/duplication/cut

Python 100.00%
clustering statistical-analysis unsupervised-learning


treecut's People

Contributors

tanghaibao


treecut's Issues

Study whether the hierarchical clustering method is suited for grouping genotypes with markers

Explore alternative methods, for example STRUCTURE, as discussed in the following paper:

http://www.ncbi.nlm.nih.gov/pubmed/21472410
Determination of genetic structure of germplasm collections: are traditional hierarchical clustering methods appropriate for molecular marker data?

In population genetics, STRUCTURE is much more useful than hierarchical clustering. Correlating phenotype information with STRUCTURE results will become an emerging problem to solve.

Study this and discuss possible implementations.

Incorporate clustering uncertainties

Based on Jingping's idea below:

Since the program takes an existing tree, and tree construction itself can be unreliable, I wonder whether the algorithm could also consider alternative tree shapes to improve cutting. Searching over tree shapes is an NP-hard problem, but a simplified version might be doable. For example, the algorithm could take not only the tree shape but also the node support values of the input tree. Then, for nodes whose support is low, the algorithm could try switching around their children. For example, in the attached figure, if the support for node N4 is low, the algorithm could try the alternative clustering shown in the right part of the figure, which in this case might narrow down the definition of the group (from 130, 150, 90 to 130, 150). I assume that in other cases this could expand the group definition instead.
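
One way to read the idea above is as a nearest-neighbor interchange around each low-support node: the node's sibling is swapped with one of its two children, which gives two alternative groupings per node. The following is only a minimal sketch under that assumption. The `Node` class, the `alternative_clades` function, the `min_support` threshold of 0.7, and the leaf names are all hypothetical and are not part of treecut; the small tree is made up for illustration and does not reproduce the attached figure.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple, FrozenSet


@dataclass
class Node:
    """Hypothetical tree node; `support` is e.g. a bootstrap proportion."""
    name: str
    support: float = 1.0
    children: List["Node"] = field(default_factory=list)

    def leaves(self) -> List[str]:
        """Return the names of all leaves below (or at) this node."""
        if not self.children:
            return [self.name]
        out: List[str] = []
        for child in self.children:
            out.extend(child.leaves())
        return out


def alternative_clades(
    root: Node, min_support: float = 0.7
) -> Iterator[Tuple[str, FrozenSet[str]]]:
    """Yield (node name, alternative leaf set) for low-support binary nodes.

    For a low-support node with children a, b and sibling s, swapping s
    with one child replaces the clade {a, b} by {a, s} or {b, s}.
    """
    stack = [root]
    while stack:
        parent = stack.pop()
        stack.extend(parent.children)
        if len(parent.children) != 2:
            continue  # this sketch only handles binary splits
        for node, sibling in (parent.children, parent.children[::-1]):
            if len(node.children) != 2 or node.support >= min_support:
                continue
            a, b = node.children
            yield node.name, frozenset(a.leaves() + sibling.leaves())
            yield node.name, frozenset(b.leaves() + sibling.leaves())


# Made-up example: N4 groups s150 with s90 but has low support.
n4 = Node("N4", support=0.4, children=[Node("s150"), Node("s90")])
root = Node("N5", support=0.9, children=[n4, Node("s130")])
alternatives = list(alternative_clades(root))
```

Each alternative leaf set could then be scored with the same statistical test used for the original nodes, and the better-supported grouping kept. Whether that is statistically sound (for example, how to correct for the extra comparisons) is an open question for discussion.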
