
Comments (4)

jaychia commented on May 17, 2024

I wrote a small custom step decorator for myself that tags each task with its parent tasks. I had to do this because I couldn't find any mechanism built into the Metaflow API to track task lineage. (Do correct me if I'm wrong: it seems easy to check which step a task came from with .parent, but it doesn't seem possible to check which task spawned the current task.)

from functools import wraps
from inspect import signature
from typing import Any, Callable, Optional

from metaflow import current


def _track_dag(step_func: Callable) -> Callable:
    """Decorator on steps to update self.parent_tasks and self.curr_task so as to enable
    visualization of the DAG during analysis
    """

    @wraps(step_func)
    def wrapper(*args: Any, **kwargs: Any) -> None:
        self_ = args[0]
        # curr_task carried over from the previous task, if any, is this task's parent.
        curr_task: Optional[str] = getattr(self_, "curr_task", None)
        if "inputs" in signature(step_func).parameters:
            # Join step: one parent per incoming branch.
            inputs = args[1] if len(args) > 1 else kwargs["inputs"]
            self_.parent_tasks = [i.curr_task for i in inputs]
        else:
            self_.parent_tasks = [curr_task] if curr_task else []
        # Record this task's own pathspec for its children to pick up.
        self_.curr_task = current.pathspec
        return step_func(*args, **kwargs)
    return wrapper

So my flows look like:

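from metaflow import FlowSpec, step

class MyFlow(FlowSpec):
    @step
    @_track_dag
    def start(self):  # named start so the method does not shadow the @step decorator
        pass
    ...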

And when I'm looking at my run in a notebook, I have another helper that follows the parent pointers and uses PyGraphviz to display the DAG, with some custom coloring to show the status of each task:

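from IPython.display import Image
import pygraphviz as pgv

# Note: parse_pathspec is a helper that is not shown in this comment; it turns
# a task pathspec into a node label.

class MetaflowUtils:
    @staticmethod
    def show_graph(run):
        """Render the task-level DAG of a run as a PNG image."""
        G = pgv.AGraph(directed=True)

        # One node per task, colored by status.
        for s in run:
            for t in s:
                if not t.finished:
                    color = 'black'
                elif t.successful:
                    color = 'green'
                else:
                    color = 'red'
                G.add_node(parse_pathspec(t.data.curr_task), color=color)

        # One edge from each recorded parent task to its child.
        for s in run:
            for t in s:
                for p in t.data.parent_tasks:
                    G.add_edge(parse_pathspec(p), parse_pathspec(t.data.curr_task))

        return Image(G.draw(format='png', prog='dot'))

# run is a Metaflow Run object loaded through the client
MetaflowUtils.show_graph(run)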

(image: the rendered DAG)
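
The parse_pathspec helper used in show_graph is not included in the comment. A minimal sketch of what it might look like, assuming its only job is to shorten a pathspec of the form FlowName/run_id/step_name/task_id into a compact node label (this body is a guess, not the author's code):

```python
def parse_pathspec(pathspec: str) -> str:
    """Hypothetical helper: keep only the 'step_name/task_id' tail of a pathspec.

    Assumes the pathspec looks like 'FlowName/run_id/step_name/task_id'.
    Flow name and run id are the same for every node in one run, so they
    are dropped to keep the graph labels short.
    """
    parts = pathspec.split("/")
    return "/".join(parts[-2:])


print(parse_pathspec("MyFlow/12/start/345"))
```

Any function that maps each pathspec to a unique string would work here, since the labels only need to be consistent between add_node and add_edge.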

Is this perhaps close to what you're looking for, @SteNicholas?

from metaflow.

dvukolov commented on May 17, 2024

Metaflow has some related built-in functionality, which requires Graphviz:

$ python playlist.py output-dot | dot -Tpng -o playlist.png

(image: playlist.png, the rendered flow graph)

The format can be changed, e.g. to SVG, if you're having issues with PNG dependencies.
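
For use from a script or notebook, the same pipeline can be driven from Python. This is only a sketch: it assumes the flow file supports the output-dot command shown above and that the Graphviz dot binary is on the PATH. The function names are made up for illustration.

```python
import subprocess
import sys
from pathlib import Path
from typing import List


def dot_command(fmt: str, out_path: str) -> List[str]:
    """Build the Graphviz command line, e.g. dot -Tsvg -o playlist.svg."""
    return ["dot", f"-T{fmt}", "-o", out_path]


def render_flow_graph(flow_file: str, fmt: str = "svg") -> Path:
    """Run '<flow_file> output-dot' and pipe the result through dot."""
    out_path = Path(flow_file).with_suffix(f".{fmt}")
    # Step 1: ask the flow for its graph in DOT format.
    dot_source = subprocess.run(
        [sys.executable, flow_file, "output-dot"],
        check=True,
        capture_output=True,
    ).stdout
    # Step 2: feed the DOT text to Graphviz on stdin.
    subprocess.run(dot_command(fmt, str(out_path)), input=dot_source, check=True)
    return out_path


# Example (needs Metaflow, Graphviz, and playlist.py in the working directory):
#   render_flow_graph("playlist.py", fmt="svg")
```

Switching fmt between "png" and "svg" corresponds to changing the -T flag in the shell command.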

savingoyal commented on May 17, 2024

@SteNicholas Can you elaborate on your ask? You should be able to easily stand up custom dashboards using the metaflow.client in a Jupyter notebook. A very basic example is in the docs. The client gives you access to the flow state and all the data artifacts. In my experience, dashboards and visualizations depend on a particular business context, and visualizing via a notebook is a very good first step.
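
As a rough illustration of that first step, the sketch below walks a run and collects one status row per task, which could then feed a table or plot in a notebook. It relies only on the client behavior already used earlier in this thread (iterating a run yields steps, iterating a step yields tasks, and tasks expose finished and successful), plus the id attribute on client objects. The function names and the flow name in the usage comment are placeholders.

```python
from typing import Any, Iterable, List, Tuple


def task_status(task: Any) -> str:
    """Map a task's finished/successful flags to a short status label."""
    if not task.finished:
        return "unfinished"
    return "ok" if task.successful else "failed"


def summarize_run(run: Iterable[Any]) -> List[Tuple[str, str, str]]:
    """Return (step id, task id, status) for every task in the run."""
    rows = []
    for step in run:
        for task in step:
            rows.append((step.id, task.id, task_status(task)))
    return rows


# Usage in a notebook (needs Metaflow and an existing run; the flow name
# below is a placeholder):
#   from metaflow import Flow
#   for row in summarize_run(Flow("PlayListFlow").latest_run):
#       print(row)
```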

SteNicholas commented on May 17, 2024

@jaychia Yes, I will close this issue.
