Comments (14)

jeetsukumaran commented on July 17, 2024

This is a wonderful idea! Fantastically wonderful. I've been looking for a lightweight, portable, visualization solution, and this would be it.

Would be great to have you work on it! Otherwise, I can try an implementation myself (I love the idea so much!).

If you want to work on it, let me know, and we can discuss organization and API.

For the former, I think it should go into the dendropy.dataio hierarchy, maybe in its own module, dendropy.dataio.d3writer?

Perhaps the API should follow that of, e.g., the NewickWriter in terms of what gets rendered (node labels, edge lengths, etc.). Also, support for other rendering features (colors, node shapes, edge thicknesses, tree styles, what gets collapsed, etc.) should be planned in the API even if we do not get around to implementing it all?

from dendropy.

Zsailer commented on July 17, 2024

Fantastic! I'd be happy to help!

If development would go faster through you, I don't mind doing code-review and branching off your work. Whatever works best for you! Otherwise, I'd be happy to take a crack at it myself (hopefully sometime today or tomorrow).

Yeah, I agree it makes sense to have this API live in its own module in dendropy.dataio. There is likely a format specified/standardized by Vega for this kind of data structure and all the rendering features you mentioned. This might be a good place to start planning the organization. Vega was started with D3 in mind. I'll look through their docs for ideas.

jeetsukumaran commented on July 17, 2024

Great!

I think given my familiarity with the codebase, one approach would be
for me to set up all the "scaffolding" --- i.e., the hooks into the data
schema API, the basic class to handle the writing etc., and leave the
main "write" method as a stub to be fleshed out. Then, if you want and
have the time, you can work at translating the tree structure into the
required JSON. I can work on this over this weekend or next week. In the
mean time, if you are agreeable, maybe get familiar with the Vega/D3
API, features, etc. (if you are not already)?

Zsailer commented on July 17, 2024

Sounds good! Ping me when you have the basic hooks in place. I'll work on the JSON format and post some ideas here.

jeetsukumaran commented on July 17, 2024

Will do!

jeetsukumaran commented on July 17, 2024

Ok,

I've put the scaffolding in place:

https://github.com/jeetsukumaran/DendroPy/blob/d3writer/dendropy/dataio/d3writer.py#L148-L160

The _write_tree_list() is for any overhead/meta stuff required for a
group of trees, while _write_tree() works on a single tree. You can
see the corresponding methods in the NewickWriter class for examples on
how this is handled.

Test framework in place at:

https://github.com/jeetsukumaran/DendroPy/blob/d3writer/dendropy/test/test_dataio_d3_writer.py

Test design is going to take some thinking. Typically, the approach has
been to round-trip read-write-read, and then confirm that the objects of
second reading semantically correspond to the objects of the first
reading. Lots of infrastructure to support this.

With a write-only paradigm here, we might have to do a brute-force /
dumb approach, i.e., check if the generated strings match exactly what
is expected. This works, but is fragile -- i.e., non-semantic changes in
the rendering pipeline will break the test (e.g., placement of spaces,
newlines, etc.). But that's not a deal-breaker, I suppose, being only
majorly annoying in the main development phase and usually
easily-fixable. I am open to other suggestions if you have any.
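
One possible middle ground between round-tripping and exact string matching, sketched here on the assumption that the writer emits valid JSON: parse both the generated and the expected output and compare the resulting objects. Whitespace and key order then stop mattering, while the order of children in each array is still checked. The helper name `json_equivalent` is hypothetical and not part of DendroPy.

```python
import json


def json_equivalent(generated, expected):
    """Return True if two JSON strings parse to equal Python objects.

    Hypothetical test helper: whitespace, newlines, and key order are
    ignored, but list order (e.g. the order of child nodes) is not.
    """
    return json.loads(generated) == json.loads(expected)


# The same tree rendered with two different layouts.
compact = '{"name":"A","children":[{"name":"B","children":[]}]}'
pretty = """
{
    "children": [
        {"name": "B", "children": []}
    ],
    "name": "A"
}
"""

# The strings differ, but the parsed content is the same.
print(compact == pretty.strip())
print(json_equivalent(compact, pretty))
```

A test written this way would survive cosmetic changes in the rendering pipeline and fail only when the structure or values change.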

I might find time to work on the actual D3 composition implementation
next week or later. If you want to give it a go in the mean time, that
would be great!

-- jeet

Zsailer commented on July 17, 2024

Awesome, thanks for getting that in place! I've forked the d3writer branch and should get some time to work on it today/tomorrow.

Zsailer commented on July 17, 2024

Hi @jeetsukumaran

Sorry for the long delay on this. The summer proved to be a busy time for me. But I was able to work on this idea yesterday. I've even got a prototype branch here.

This is far from finished. Note, I haven't added most of the keyword arguments, nor have I written tests. I was mostly familiarizing myself with DendroPy's data I/O API.

I did make a simple, static example of my working branch visible in this gist.

Currently, I construct a nested dictionary for a tree. This basically converts the Tree object into a hierarchical metadata dictionary. Then, I use Python's standard json library to encode the metadata as a JSON string. This library handles all None-to-null, True-to-true, and False-to-false conversions.

It appears that there is not yet a clear, standardized JSON/Vega grammar defined for hierarchical, tree-like data. There is a conversation going on involving the "Open Tree of Life" group. In summary, Vega is starting to make a point of flattening JSON formats for readability, which doesn't work well for hierarchical data. My opinion is that we stick to the examples from D3. Every node has a "name" and "children" key-value pair. The "children" value is an array of child nodes. Child nodes have a "parent" argument pointing back to the parent name. I think, at the very least, this basic structure is acceptable to the general user-base:

{
    "name" : "A",
    "children" : [
        {
            "name" : "B",
            "children" : []
        },
        {
            "name" : "C",
            "children" : []
        }
    ]
}

Other items can be specific to DendroPy, but don't affect the generality of the output. For example, annotations, if not suppressed, are included as an "annotations" key mapped to a set of key-value pairs for each annotation. Lengths of branches, if not suppressed, are included as "length" in each child node's data.
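
To make the structure described above concrete, here is a minimal sketch of the nested-dictionary approach. It is not the code from the prototype branch: the `Node` class is a hypothetical stand-in for a DendroPy node, and the function name and `suppress_*` keyword names are assumptions chosen for illustration.

```python
import json
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """Hypothetical stand-in for a tree node (not DendroPy's Node class)."""
    name: Optional[str] = None
    length: Optional[float] = None
    annotations: dict = field(default_factory=dict)
    children: List["Node"] = field(default_factory=list)


def node_to_dict(node, parent=None, suppress_lengths=False,
                 suppress_annotations=False):
    """Recursively convert a node and its descendants to a nested dict."""
    data = {"name": node.name}
    if parent is not None:
        # Child nodes point back to their parent by name.
        data["parent"] = parent.name
    if not suppress_lengths and node.length is not None:
        data["length"] = node.length
    if not suppress_annotations and node.annotations:
        data["annotations"] = dict(node.annotations)
    data["children"] = [
        node_to_dict(child, node, suppress_lengths, suppress_annotations)
        for child in node.children
    ]
    return data


tree = Node("A", children=[
    Node("B", length=0.1),
    Node("C", length=0.2, annotations={"support": 0.95}),
])

# json.dumps takes care of None -> null, True -> true, False -> false.
print(json.dumps(node_to_dict(tree), indent=4))
```

Keeping the conversion to a plain dictionary separate from the JSON encoding means the same dictionary can be inspected in tests or passed to another serializer.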

Finally, as a small aside: I'm thinking we should rename the writer to JSONWriter instead of D3Writer. I think this type of output is more general than just D3. It has other advantages like portability to other languages. I also find the JSON format more human-readable compared to other tree formats. It doesn't matter to me too much. If you'd rather keep the focus on D3, I'm fine with that as well.

jeetsukumaran commented on July 17, 2024

Great stuff!

Really like what is happening here.

I agree with you that readability is nice, but really, really, really, really, really, should not take priority over usability. And there is strong reason to keep hierarchical data hierarchical. At the same time, we are (presumably?) not out to create a new data format, but rather render the data model in a format that can be consumed by an existing visualization technology, and it makes sense to use the format/conventions/standards/expectations of the visualization technology that we are targeting to condition output. TL;DR: I agree, stick to D3 examples!

As far as the naming goes, given that I imagine at some point we would like to take advantage of some D3-specific expression capability, my suggestion would be to have a JsonWriter class that handles all generic JSON stuff, and a D3JsonWriter that specializes it. Client code would then specify schema="json-d3" or schema="d3" (for example; we can decide the name later) to render the tree as D3-specific JSON and schema="json" for more generic JSON (if we want to support that). The D3JsonWriter would, of course, over-ride the _write etc. methods as needed, and also call on the base class _write as needed. I think the addition of the class hierarchy complexity is offset by the gains in modularity, abstraction, and DNRY-ness?
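
A rough sketch of what that hierarchy could look like, using plain dictionaries as input rather than DendroPy trees so that it runs on its own. Everything here is illustrative: the `_compose_node` hook, the `get_writer` lookup, and the `node_color` option are assumptions, not existing DendroPy API, and "color" is only a placeholder for whatever D3-specific attributes end up being needed.

```python
import json


class JsonWriter:
    """Hypothetical base writer: generic nested JSON."""

    def _compose_node(self, node):
        # Generic part: name plus recursively composed children.
        return {
            "name": node["name"],
            "children": [self._compose_node(c)
                         for c in node.get("children", [])],
        }

    def write(self, tree):
        return json.dumps(self._compose_node(tree))


class D3JsonWriter(JsonWriter):
    """Hypothetical specialization adding visualization attributes."""

    def __init__(self, node_color="#ccc"):
        self.node_color = node_color

    def _compose_node(self, node):
        # Reuse the generic composition, then add D3-specific keys.
        data = super()._compose_node(node)
        data["color"] = self.node_color
        return data


# Schema names map to writer classes (names are placeholders).
_WRITERS = {"json": JsonWriter, "json-d3": D3JsonWriter}


def get_writer(schema, **kwargs):
    return _WRITERS[schema](**kwargs)


tree = {"name": "A", "children": [{"name": "B"}, {"name": "C"}]}
print(get_writer("json").write(tree))
print(get_writer("json-d3", node_color="#000").write(tree))
```

Because the base class recurses through `self._compose_node`, the subclass override is applied to every node in the tree without duplicating the traversal logic.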

But that is just my suggestion. If you feel that simply renaming it "JSONWriter" makes more sense (with, maybe, the optional specification of a keyword dialect="D3" to activate D3-specific rendering), then that would be the way to go!

Zsailer commented on July 17, 2024

Yes, I definitely agree that we aren't trying to create a new data format (haha). I'm just surprised that there isn't a defined "tree" grammar for JSON format already out there (at least not that I could find immediately). It seems like there are so many great visualization tools that are prime for such a grammar. By including such a writer in DendroPy, we might be inadvertently contributing to the creation of such a grammar.

I agree with keeping the hierarchy in the output, and D3 seems to honor that as well.

I really like the idea of subclassing a JsonWriter class. I'll add that to my next implementation. I think D3 is one use-case (and likely the most popular use-case) of a JSON format. It would be great to connect DendroPy to fresh visualization tools like D3. A lot of these tools are written as JavaScript libraries, so JSON is the natural porting mechanism. A subclassed writer would likely include extra visualization attributes (e.g., window size, colors, etc.). The more general JSON format, however, would be useful for porting DendroPy tree data to other APIs or languages. I'm saying this for selfish reasons ;)

Thanks for talking through this stuff! I'll keep working on it and keep you updated!

Zsailer commented on July 17, 2024

Also, in the interest of making a general JSON format that is portable, would a JsonReader class make sense as well? Or do you think this is outside the scope of DendroPy?

jeetsukumaran commented on July 17, 2024

A reader would indeed be nice. I imagine the use case would be more limited, especially if the JSON is narrowly defined (DendroPy/D3-specific)? Though, having a reader will be useful for tests (round-tripping). Just as relevant, with projects like this, it is not always necessary to stick exclusively to what is important/needed/useful; I always tend to work in things that I like/want, even if it is very idiosyncratic and done more for interest than utility. So if the idea of writing a reader appeals to you, go for it!
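
As one low-effort way to prototype such a reader (a sketch, not a proposal for the final design): convert the nested "name"/"children"/"length" JSON into a Newick string, which an existing Newick parser can then consume. The function name `json_to_newick` is hypothetical, and the sketch deliberately skips label quoting and annotations.

```python
import json


def json_to_newick(node):
    """Convert one nested JSON node (already parsed) to Newick text.

    Simplifying assumptions: labels contain no characters that would
    need quoting in Newick, and annotations are ignored.
    """
    label = node.get("name") or ""
    children = node.get("children", [])
    # Internal nodes wrap their children's Newick text in parentheses.
    subtree = ""
    if children:
        subtree = "(" + ",".join(json_to_newick(c) for c in children) + ")"
    length = node.get("length")
    suffix = f":{length}" if length is not None else ""
    return f"{subtree}{label}{suffix}"


text = """
{
    "name": "A",
    "children": [
        {"name": "B", "length": 0.1, "children": []},
        {"name": "C", "length": 0.2, "children": []}
    ]
}
"""

newick = json_to_newick(json.loads(text)) + ";"
print(newick)  # (B:0.1,C:0.2)A;
```

With DendroPy installed, the resulting string could presumably be handed to `dendropy.Tree.get(data=newick, schema="newick")`, which would give a write-read path for round-trip tests without a dedicated JSON parser.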

WRT JSON tree grammar/data format, the OTOL folks are using a JSON-based derivation of NeXML. I've been meaning to write a DendroPy parser for it, but it's been on the back-burner. Not saying that it should be used here, but mentioning it for reference or as a source of ideas.

Zsailer commented on July 17, 2024

Hey! @jeetsukumaran

I wanted to mention a new project I've been working on here. PhyloVega is a Python package that uses Vega's (JSON) specifications to draw interactive trees.

While writing this package, I (finally) figured out the Vega grammar for drawing trees. I think it's pretty powerful. With Vega's declarative grammar, I can style my tree any way I want. In this example below, I read in a tree using PhyloPandas and style the tree using a declarative grammar API. Underneath the hood, the TreeChart object is just building a JSON spec for Vega.

from phylopandas import read_newick
from phylovega.api import TreeChart

# Read tree using PhyloPandas
df = read_newick('tree.newick')

# Construct Vega Specification
chart = TreeChart(
    df,
    height_scale=200,

    # Node attributes
    node_size=200,
    node_color="#ccc",

    # Leaf attributes
    leaf_labels="id",

    # Edge attributes
    edge_width=2,
    edge_color="#000",
)

# Display in Jupyter
chart.display()

This may be an interesting light-weight visualization solution for DendroPy. If I can find some time, I'll write up documentation for this grammar. Then, it would be pretty easy to write a JSON I/O tool for DendroPy. What do you think?

jeetsukumaran commented on July 17, 2024
