tanghaibao / treecut
Find nodes in hierarchical clustering that are statistically significant
Home Page: http://chibba.agtec.uga.edu/duplication/cut
Explore alternative methods, for example STRUCTURE; this is discussed in the following paper:
http://www.ncbi.nlm.nih.gov/pubmed/21472410
Determination of genetic structure of germplasm collections: are traditional hierarchical clustering methods appropriate for molecular marker data?
In population genetics, STRUCTURE is much more useful than hierarchical clustering. Correlating phenotype information with STRUCTURE results will become an emerging problem to solve.
Study this and discuss possible implementations.
Currently only two tests are performed on internal nodes, and both are quite simplistic: a t-test (for continuous data) and Fisher's exact test (for categorical data). Study the theory behind this package:
http://www.is.titech.ac.jp/~shimo/prog/pvclust/
and possibly port some of its functions over.
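For reference, the categorical test at an internal node can be sketched roughly as follows. This is a minimal standard-library illustration, not the code treecut actually runs: the function names `fisher_exact_two_sided` and `node_enrichment` are hypothetical, and it assumes the comparison is between the leaves under the node and all remaining leaves. The continuous case would be analogous, with a two-sample t-test on the same split.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x counts in the top-left cell,
        # with all row and column totals held fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum over every table that is no more likely than the observed one.
    # The small tolerance guards against floating-point ties.
    return sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))


def node_enrichment(clade_labels, other_labels, category):
    """Test whether `category` is over- or under-represented under a node.

    clade_labels: labels of the leaves below the internal node (assumed input)
    other_labels: labels of all remaining leaves in the tree (assumed input)
    """
    a = sum(1 for x in clade_labels if x == category)
    b = len(clade_labels) - a
    c = sum(1 for x in other_labels if x == category)
    d = len(other_labels) - c
    return fisher_exact_two_sided(a, b, c, d)
```

With three leaves under the node all labelled `"wild"` and three outside all labelled `"cult"`, `node_enrichment` gives a p-value of 0.1, which shows how little power such a test has on small clades.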
Based on Jingping's idea below:
Since the program takes an existing tree, and tree construction itself can be fishy sometimes, I wonder if the algorithm might also consider alternative tree shapes to improve cutting. Searching over tree shapes is itself an NP-hard problem, but something simplified might be doable. For example, the algorithm could take not only the tree shape but also the node support values for the input tree. Then, for nodes whose support is low, the algorithm could try switching around their "children". For example, in the attached figure, if the support for node N4 is low, the algorithm could try the alternative clustering shown in the right part of the figure, which in this case might narrow down the group definition (from 130, 150, 90 to 130, 150). I assume in some other cases this could result in an expansion of the group definition too.
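One possible reading of "switching around the children" is a nearest-neighbor interchange at the low-support node. The sketch below is only an illustration under that assumption: the tuple representation `(support, left, right)`, the function names, and the 0.5 threshold are all hypothetical, and the toy tree is not the one from the figure.

```python
def nni_alternatives(node):
    """Return rearrangements of a binary node made by swapping one
    grandchild with its "uncle" (nearest-neighbor interchange).

    A node is either a leaf name (str) or a tuple (support, left, right).
    """
    support, left, right = node
    alts = []
    for inner, uncle in ((left, right), (right, left)):
        if isinstance(inner, tuple):
            _, a, b = inner
            # The regrouped inner node has no known support, so store None.
            alts.append((support, (None, a, uncle), b))
            alts.append((support, (None, uncle, b), a))
    return alts


def leaf_names(node):
    """List the leaf names below a node, left to right."""
    if isinstance(node, str):
        return [node]
    _, left, right = node
    return leaf_names(left) + leaf_names(right)


def candidates(node, min_support=0.5):
    """Return the node itself plus its rearrangements if its support is low."""
    support = node[0]
    if support is not None and support < min_support:
        return [node] + nni_alternatives(node)
    return [node]


# Toy example: a weakly supported node grouping (A, B) with C.
n4 = (0.3, (0.9, "A", "B"), "C")
for tree in candidates(n4):
    _, left, right = tree
    print(leaf_names(left), leaf_names(right))
```

Each candidate could then be scored with the same per-node tests, keeping whichever grouping gives the most significant cut. Only the immediate neighborhood of one node is rearranged here, which keeps the search far smaller than exploring all tree shapes.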