schlegelp / tanglegram
Plot tanglegrams from two dendrograms
License: GNU General Public License v3.0
As an alternative to the matplotlib backend.
These would bundle existing functions such as .plot, .untangle and .entanglement while also providing a platform for future additions like similarity metrics.
Internally, they would make it easier to pass data between functions (currently we're passing linkages and labels around).
Produce proper docs (check out the furo theme).
We are interested in using your tanglegram code as a dependency for our tool, KEGGDecoder (binderized repo here, pull request to integrate tanglegram here). Would you be willing to make your package pip installable?
Would it be possible to implement the color functionality contained in the original R code so that the middle panel isn't just grey? I have tons of items in my dendrogram, so currently the connecting bars all run together as a single grey bar. Being able to change the color of those objects would be super helpful.
Hi!
I recently updated tanglegram to pick up the changes available through the 'master' install. I am particularly interested in the options for untangling the output figures. I ran into this issue:
When running tg.plot(..., sort=False) there are no issues, but any argument that tries to sort the results (i.e., sort=True, sort='step1side', etc.) results in the error: NameError: name 'invert' is not defined
Tracing this to the offending line of code (Ln 753), all of the sort functions go through the function leaf_order(), where 'invert' is undefined. Have you run into that issue, and do you have any suggestions for resolving it?
Thanks!
Hi @Vannghia69. I just now had a chance to test your new implementation of the dendrogram optimization on a more realistic and larger dataset: depending on the size and the "alignability" (i.e. how easy it is to disentangle) of the two dendrograms, it becomes a prohibitively slow procedure.
For example, I had two dendrograms with just 14 leaves where I stopped the optimization after 10 minutes because it didn't look like it was going to converge anytime soon. With cdd3022 I made a couple of changes, such as using leaves_list instead of dendrogram(..., no_plot=True) and avoiding re-calculation of the leaf order wherever possible.
However, the biggest issue was still that there were too many nested loops and that the number of linkage variants to test increased exponentially (2, 4, 8, 16, 32, ...) with every li_MID step. Bottom line: we have to cut down on the number of nested loops somehow. I hope I read your code correctly, but my understanding is that your implementation broadly did this (pseudo code):
for li_MID in range(number of hinges) (first loop):
    get all possible variations of rotating around the top li_MID hinges
    for every variant of the left dendrogram (second loop):
        for every variant of the right dendrogram (third loop):
            coarse refinement by testing all possible rotations from the bottom to li_MID (this contains multiple nested loops)
            fine refinement by rotating only the leaves (also loops)
            test entanglement of refined version and keep if better than what we've seen so far
    if entanglement has not improved twice break loop
I was wondering whether it would be sufficient to do the refinement only once at the very end, outside the main loop. Like this:
for li_MID in range(number of hinges) (first loop):
    get all possible variations of rotating around the top li_MID hinges
    for every variant of the left dendrogram (second loop):
        for every variant of the right dendrogram (third loop):
            test entanglement and keep if better than what we've seen so far
    if entanglement has not improved twice break loop
coarse refinement of the best linkage from the bottom to li_MID
fine refinement by rotating only the leaves
Thoughts?
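To make the proposal concrete, here is a minimal sketch of the "search first, refine once at the end" structure. It is not the tanglegram implementation: the helper names (leaf_order, rotated, entanglement, untangle) are hypothetical, a linkage is reduced to a list of [left_child, right_child] integer pairs (the first two columns of a SciPy linkage), both trees are assumed to share leaf ids and have at least two leaves, the entanglement score is a simple normalized squared rank difference, and the final refinement is simplified to a single greedy pass over all hinges.

```python
from itertools import product

def leaf_order(Z):
    """Left-to-right leaf order; Z[i] = [left_child, right_child]."""
    n = len(Z) + 1
    def walk(node):
        if node < n:                      # ids below n are leaves
            return [node]
        left, right = Z[node - n]
        return walk(left) + walk(right)
    return walk(2 * n - 2)                # the last merge is the root

def rotated(Z, rows):
    """Copy of Z with the two children of the given rows swapped."""
    return [[r, l] if i in rows else [l, r] for i, (l, r) in enumerate(Z)]

def entanglement(Za, Zb):
    """0 = identical leaf order, 1 = fully reversed (assumed score)."""
    pos_b = {leaf: i for i, leaf in enumerate(leaf_order(Zb))}
    order_a = leaf_order(Za)
    n = len(order_a)
    worst = sum((i - (n - 1 - i)) ** 2 for i in range(n))
    return sum((i - pos_b[leaf]) ** 2 for i, leaf in enumerate(order_a)) / worst

def subsets(rows):
    """All 2**len(rows) choices of which hinges to rotate."""
    for mask in product([False, True], repeat=len(rows)):
        yield {r for r, flip in zip(rows, mask) if flip}

def untangle(Za, Zb, patience=2):
    best = (entanglement(Za, Zb), Za, Zb)
    n_hinges, stale = len(Za), 0
    for top in range(1, n_hinges + 1):                    # first loop
        top_rows = list(range(n_hinges - top, n_hinges))  # top `top` hinges
        improved = False
        for rows_a in subsets(top_rows):                  # second loop
            cand_a = rotated(Za, rows_a)
            for rows_b in subsets(top_rows):              # third loop
                cand_b = rotated(Zb, rows_b)
                score = entanglement(cand_a, cand_b)      # no refinement here
                if score < best[0]:
                    best, improved = (score, cand_a, cand_b), True
        stale = 0 if improved else stale + 1
        if stale >= patience:                             # not improved twice
            break
    # Refinement happens once, outside the main loop (simplified greedy pass).
    score, Za, Zb = best
    for row in range(n_hinges):                           # bottom to top
        for side in ("a", "b"):
            cand_a = rotated(Za, {row}) if side == "a" else Za
            cand_b = rotated(Zb, {row}) if side == "b" else Zb
            cand = entanglement(cand_a, cand_b)
            if cand < score:
                score, Za, Zb = cand, cand_a, cand_b
    return score, Za, Zb

# Two 4-leaf trees whose leaf orders are exact mirror images.
Za = [[0, 1], [2, 3], [4, 5]]
Zb = [[3, 2], [1, 0], [4, 5]]
score, A, B = untangle(Za, Zb)
```

The number of variants per step still doubles with every additional top hinge, but each variant now costs only one entanglement evaluation instead of a full nested refinement.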
It would be great to have objective measures of how well two dendrograms match.
Are you considering adding something equivalent to Tal Galili's cor.dendlist (mainly the cophenetic and Baker scores) and entanglement functions?
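As an illustration of what a cophenetic score could look like on the SciPy side, here is a rough sketch. The function name cophenetic_correlation is hypothetical and not part of tanglegram; it assumes both linkages were built from the same observations in the same order, so that leaf i refers to the same item in both trees.

```python
import numpy as np
from scipy.cluster.hierarchy import cophenet, linkage

def cophenetic_correlation(Za, Zb):
    """Pearson correlation between the cophenetic distances of two linkages."""
    # cophenet(Z) returns the condensed cophenetic distance matrix of a tree.
    return float(np.corrcoef(cophenet(Za), cophenet(Zb))[0, 1])

rng = np.random.default_rng(0)
X = rng.normal(size=(10, 3))              # toy data: 10 observations

Z_avg = linkage(X, method="average")
Z_single = linkage(X, method="single")

r_self = cophenetic_correlation(Z_avg, Z_avg)      # a tree against itself
r_cross = cophenetic_correlation(Z_avg, Z_single)  # two different trees
```

A value near 1 means the two trees merge the same pairs of leaves at similar relative heights; the score is independent of how the dendrograms are rotated for plotting, unlike entanglement.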
Passing a square matrix to the linkage function makes linkage treat it as a set of observation vectors rather than as distances. For the values to be interpreted as distances, linkage needs a condensed distance matrix.
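A small sketch of the conversion, assuming the square matrix is symmetric with a zero diagonal (the variable names and toy data are illustrative only):

```python
import numpy as np
from scipy.cluster.hierarchy import linkage
from scipy.spatial.distance import pdist, squareform

rng = np.random.default_rng(0)
X = rng.normal(size=(6, 4))               # toy data: 6 observations

D = squareform(pdist(X))                  # square 6x6 distance matrix

# squareform converts square -> condensed (and condensed -> square).
condensed = squareform(D)                 # 1-D vector of 6*5/2 = 15 distances

# linkage interprets a 1-D input as condensed distances.
Z = linkage(condensed, method="average")
```

With six observations the condensed vector holds 15 pairwise distances and the resulting linkage has five rows, one per merge.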
I have dendrograms with hundreds of entries, so I need a larger plot. I suggest adding figsize or figheight as a parameter:
def gen_tangle(a, b, labelsA=None, labelsB=None, optimize_order=10000,
               color_by_diff=True, link_kwargs={}, dend_kwargs={}, figsize=(8, 8)):
    ...
    fig = pylab.figure(figsize=figsize)
Hi Phillipp, I'd love to try your package, but I'm getting an error on pip install:
pip install git+git://github.com/schlegelp/tanglegram@master
...
ERROR: Complete output from command python setup.py egg_info:
ERROR: Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/private/tmp/pip-req-build-bt81skgp/setup.py", line 15, in <module>
    with open('requirements.txt') as f:
FileNotFoundError: [Errno 2] No such file or directory: 'requirements.txt'
Can you please check what is missing here? Thanks!