
django-chartflo's People

Contributors

brylie, reduxionist, synw


django-chartflo's Issues

Roadmap

  • Clean up the code
  • Move to Chart.js instead of Amcharts (see #1)
  • Freeze the api
  • Release on pip

And then add features:

  • Consider adding a ready to use view that could draw charts from model instances or paths
  • Merge django-mptt-graph: to draw hierarchical trees
  • Maybe merge django-chartmodels: to draw charts for models stats

Dashboards views

Goal: make it easy to compose dashboards with a base view and template

We provide a base template (a simplified version of Admin LTE) and views to load different pages in a dashboard. Dashboards are registered in the database, each with its authorized groups.

TODO: demo and update the docs

Add support for hierarchical trees

It would be nice if this charting package could support hierarchical trees.

Goal

Add support for hierarchical trees, by incorporating a charting library/component that provides tree layouts.

Possible solution

Merge django-mptt-graph to draw hierarchical trees

Clean up the code for 0.2.0 release

Identify parts of the code to clean up in preparation for 0.2.0 release, such as:

  • software libraries incompatible with open source license terms
  • inconsistent naming conventions
  • boilerplate code or markup (e.g. placeholder paragraphs or 'dummy' headers)
  • commented out code
  • unused files/folders

Chart streaming data

Add the ability to update the charts with streaming data coming from websockets or other sources.

Support for Chartjs

I started to add Chartjs support. It works with the same api in Dataswim and is integrated in the Chartflo dashboards. The Chartjs related code is isolated in a Pychartjs module.

Chartjs has advantages: the charts are nice and it is easy to use, as it has fewer abstraction layers than the other rendering engines we use.

Support multiple rendering engines

It would be good to support multiple rendering engines. The current plan is to replace Amcharts with Chart.js, and we could take the opportunity of this refactoring to implement it. This would make the module extensible and less tightly coupled to one library. There are a lot of great JavaScript libraries that we could benefit from.

I will implement this so that we can keep the current working js library as the default while we work on the switch, then just change the default when it is done and drop Amcharts after that.

I am currently learning Bokeh, which renders charts directly from Python. It would work differently but has significant advantages: it generates chunks of html, so there is much less frontend maintenance and zero javascript fatigue. It can also produce png and svg images, which can be a nice lightweight alternative for embedded material: once generated, they would be much cheaper to serve and render than js and db hits.
This approach could match the case of using pre-aggregated data described in proposal #10, as it always has to pregenerate the charts: on-demand generation would be too costly. I may later make a branch to see if it could fit in the mix, once I am more comfortable working with Bokeh.

Use Altair to encode data to the Vega Lite format

Instead of a custom encoder, use Altair to encode the data to the Vega Lite format. It is very convenient, for example to set encodings. I started to implement this in the vegalite branch.

This means depending on altair and pandas, which sounds like the cleanest way to serialize the data. We would have had to depend on pandas sooner or later anyway, considering how good and widely used it is for the kind of tasks we are doing.

The declarative approach is really nice, and we can still use other js libs to render the charts, isolating the boring imperative stuff in individual units for the different rendering engines.

Server side rendering engine

I made a new branch using the Bokeh library to render the charts from python. This approach is very different from what we use now.

[Edit]: it is here

How it works

The queries are constructed in the admin interface. When a Question object is saved from the admin the chart is generated by python: the resulting html and js are stored in the database.

When a user requests a chart, it will just fetch the corresponding html without expensive queries. This made it possible to code a dashboard that aggregates several charts. It would also be possible to generate files and call the charts from templates resulting in no queries at all.
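The generate-on-save idea can be sketched in plain Python, without Django or Bokeh. This is only an illustration under assumptions: `ChartStore`, `save_question`, `get_chart` and `render_dashboard` are hypothetical names, a dict stands in for the database table, and a simple html list stands in for the real chart markup.

```python
from html import escape


class ChartStore:
    """In-memory stand-in for the table that holds pre-rendered chart html."""

    def __init__(self):
        self._html = {}
        self.expensive_queries = 0  # counts how often the data is recomputed

    def save_question(self, slug, title, compute_data):
        # Expensive part: runs once, when the question is saved
        self.expensive_queries += 1
        rows = "".join(
            f"<li>{escape(str(label))}: {value}</li>"
            for label, value in compute_data()
        )
        self._html[slug] = (
            f"<div class='chart'><h3>{escape(title)}</h3><ul>{rows}</ul></div>"
        )

    def get_chart(self, slug):
        # Cheap part: a user request only reads the stored html
        return self._html[slug]

    def render_dashboard(self, slugs):
        # A dashboard is the concatenation of the stored chunks
        return "\n".join(self.get_chart(slug) for slug in slugs)


store = ChartStore()
store.save_question("groups", "Groups", lambda: [("Users", 12), ("Staff", 3)])
store.save_question("signups", "Signups", lambda: [("Monday", 4)])
page = store.render_dashboard(["groups", "signups"])
```

Rendering the dashboard does not call the data functions again: the cost is paid at save time only.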

Datastructure

Models:

  • Filter: defines a specific filter
  • Query: defines a query, m2m link to filters
  • Question: defines a chart, aggregates queries in m2m relation
  • Dashboard: defines a set of questions to assemble in a view, m2m with Question
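To make the relations easier to picture, here is a rough sketch using plain dataclasses as stand-ins for the Django models. The attribute names (`field_name`, `model_path`, `html` and so on) are assumptions for illustration, and lists play the role of the m2m relations.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Filter:
    field_name: str
    value: str


@dataclass
class Query:
    model_path: str
    filters: List[Filter] = field(default_factory=list)  # m2m to Filter


@dataclass
class Question:
    title: str
    queries: List[Query] = field(default_factory=list)  # m2m to Query
    html: str = ""  # pre-rendered chart, filled in on save


@dataclass
class Dashboard:
    title: str
    questions: List[Question] = field(default_factory=list)  # m2m to Question


staff = Query("auth.User", [Filter("is_staff", "True")])
superusers = Query("auth.User", [Filter("is_superuser", "True")])
question = Question("User types", [staff, superusers])
dashboard = Dashboard("Users", [question])
```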

Demo code

To run the demo code see the install instructions

Status

This has just been put together and needs quite a lot of work to be usable. Partial list:

  • Add options for operators
  • Add timeline chart and manage date fields
  • Make a models registration mechanism to regenerate the data on change and/or use a time based worker
  • Make the png images generation work
  • Add more chart types

I will definitely continue to code in this direction, as I find this approach better suited to what I need, very productive and maintainable.

Use consistent words for 'chart'

We may be mixing terminology when referring to charts. In the following example, we use both 'chart' and 'graph'.

context["graph_type"] = self.graph_type
context["title"] = context["label"] = self.title
context["chart_url"] = self._get_chart_url()

In general, we should use the same words for things, so our code is clean and easy to follow.

About the API surface

I wonder what the top-level api should cover. I made a few convenience methods that essentially wrap the equivalent ones in the underlying library. In fact I don't use them and tend to generate the charts fully with Dataswim. Where Chartflo is interesting is in distributing the charts and composing dashboards. This and the events-based autogeneration mechanism are the features we need in this module.

My question is: should I keep the methods draw, stack and export, which are just wrappers? I also made a convert_dataset method that translates a Django orm query into a dataframe, but I don't use it... [Edit]: I have found a use for it since

My method in a case like this is to remove everything that I don't need. I don't want to maintain useless code, and the module has been vastly simplified by the use of external libs, so this is not the moment to bloat it again. What do you really need? Is it ok to proceed like that?

Serve ChartsView over rest

The ChartsView's default template currently extends base.html. It would be good to have a way to serve this over rest, without extending the base template.

Will implement with a request.is_ajax() check.
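A minimal sketch of what such a check could look like, written against a plain headers dict so that it runs without Django. `request.is_ajax()` tested whether the `X-Requested-With` header equals `XMLHttpRequest` (note that it was deprecated in Django 3.1). The template names below are hypothetical.

```python
def is_ajax(headers):
    """Mimic the header test that Django's request.is_ajax() performed."""
    return headers.get("X-Requested-With") == "XMLHttpRequest"


def pick_template(headers):
    # Hypothetical template names: the partial one does not extend base.html
    if is_ajax(headers):
        return "chartflo/chart_partial.html"
    return "chartflo/chart.html"


ajax_template = pick_template({"X-Requested-With": "XMLHttpRequest"})
page_template = pick_template({})
```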

Use the Vega Lite specification for charts data

I checked the Vega Lite spec and it looks very clear and useful to structure the data. As suggested by @brylie, we could start using it for the rest views (#14), isolating the serialization logic in something like serializers.py so that we can reuse it later if needed.
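As a rough idea of what such a serializer could produce, here is a small sketch that builds a Vega Lite spec as a Python dict. The function name `to_vega_lite` and its signature are assumptions, not the project's api; the `data`, `mark` and `encoding` keys follow the Vega Lite specification.

```python
import json


def to_vega_lite(rows, x, y, mark="bar"):
    """Serialize a list of dicts to a Vega Lite spec (as a Python dict).

    x and y are (field, type) pairs, the type being one of the Vega Lite
    types: "nominal", "ordinal", "quantitative" or "temporal".
    """
    x_field, x_type = x
    y_field, y_type = y
    return {
        "data": {"values": list(rows)},
        "mark": mark,
        "encoding": {
            "x": {"field": x_field, "type": x_type},
            "y": {"field": y_field, "type": y_type},
        },
    }


rows = [
    {"group": "Users", "number": 12},
    {"group": "Staff", "number": 3},
]
spec = to_vega_lite(rows, ("group", "nominal"), ("number", "quantitative"))
payload = json.dumps(spec)  # what a rest view could return
```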

About Altair: it looks very similar to Bokeh. I'll definitely give it a shot and explore the Python-generated charts approach.

Move ChartController's non-django methods to a lower level python module

Goal: move all the ChartController's non-django datasources methods to an external lower level python module.

I need those generators outside of Django: the generators using dictionaries and Altair data objects as datasources will move to a chartflo python module. Our ChartController class would inherit from it, so that our api will not change due to this modification. The class will only be responsible for the generators that take Django queries as datasources.

Advantages: the generators are usable outside of Django, as wrappers around the Altair api, generating html. We will also get better maintainability through simplification and single responsibility.
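The split could look roughly like the sketch below. The class `BaseChartController` and the method names are hypothetical, and any iterable of objects stands in for a Django queryset so that the example runs on its own.

```python
from types import SimpleNamespace


class BaseChartController:
    """Lower-level generators, usable without Django (hypothetical name)."""

    def generate_from_dict(self, dataset, x_label, y_label):
        # Turn {label: value} into rows that a serializer can consume
        return [{x_label: key, y_label: value} for key, value in dataset.items()]


class ChartController(BaseChartController):
    """Django-facing class: only adds the generators that take queries."""

    def generate_from_query(self, query, field, x_label, y_label):
        # `query` is any iterable of objects; in Django it would be a queryset
        counts = {}
        for obj in query:
            key = getattr(obj, field)
            counts[key] = counts.get(key, 0) + 1
        return self.generate_from_dict(counts, x_label, y_label)


users = [
    SimpleNamespace(is_staff=True),
    SimpleNamespace(is_staff=False),
    SimpleNamespace(is_staff=False),
]
rows = ChartController().generate_from_query(users, "is_staff", "staff", "number")
```

Code that only needs dictionaries can use the base class directly, while existing callers of ChartController keep the same methods through inheritance.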

Eliminate ChartsView

Proposal: to eliminate the old ChartsView that was used to display only one chart. Now that we have the dashboards this appears to be useless: charts can be included directly in templates now.

I made a generic dashboard view so that users don't have to set up a view.

Add docstrings to all class and function definitions

In preparation for the 0.2 release, it would be helpful to make the source code as well-documented as possible. To this end, every class and function should have a docstring.

Goal

Well-documented sourcecode.

Task

Add docstrings in the following places:

Remove empty files

When a file is empty, it should likely be deleted.

Goal

Clean codebase by removing unused/empty files.

Task

Remove the following empty files:

  • /chartflow/admin.py
  • /chartflow/models.py
  • /chartflow/tests.py - that or write at least one test! 😄

Create 0.2.0 milestone

If you want to use semantic versioning, it might be helpful to create a GitHub milestone to track work towards the next release.

After creating the initial release (issue #4), the next version would be 0.2.0. Create the 0.2.0 milestone, and assign the '0.2.0 Roadmap' task (issue #3) to that milestone.

Display tabular data in dashboards

For info I made a module that can generate chunks of html to display tabular data in the dashboards: https://github.com/synw/django-tabular

I isolated it from Chartflo since it is not strictly related to charts. No screenshot yet: the module is just starting, and some features are still to be implemented, like sorting, pagination and custom filters.

It is designed to be included in Chartflo's generators and to work the same way as chart generation: it produces chunks of html to be included in dashboards.

Add LICENSE file

After removing the dependency on AmCharts (issue #1), this project can be offered under an open source license.

Choose a license for the project, and add the license file to the root of the repository. The conventional name for the file is LICENSE or LICENSE.md.

Development roadmap and directions

Some work has been done towards our main immediate objective: to stop using Amcharts. We are currently structuring the module so that it can be extended.

What we have now

  • A serialization engine that uses Altair to produce some Vega Lite data
  • A generic view that can draw one chart from a query or a dataset
  • The Vega Lite rendering engine for charts

What we would need

  • A first prototype of external javascript library integration to get an idea on how it would work and how to convert the data in the appropriate format for the particular lib. Chart.js was our initial idea #1
  • To refine the serializers and review the possible options

Proposal: to switch from Amcharts to the Vega Lite rendering engine, make the Vega Lite serialization the standard for passing data around, and try to make a first stable release with this. It is easy to add new rendering engines using the CHARTFLO_ENGINE setting.

@brylie: for now I will concentrate on the core, especially the serializers and the api design. If you wish to take care of the external js integration mechanisms and try with one lib, feel free to go ahead in a branch. This is an important task to see if we are really heading in the right direction.

Directions of research

The goal is to make composable views that can render multiple charts. Different ways are currently being explored:

  • Server side data aggregation with a query constructor in the admin interface. The chart data is stored in the database and rendered statically. The data can be stored in VL format or as pre-generated html

  • If we have static data, we can generate files to be automatically included in templates, eliminating queries, which can be interesting

  • Consider serving charts from templatetags

  • Make dashboards that integrate multiple charts

The query constructor and dashboard parts are quite advanced but still need some work: see the dashboards branch. I'll post details in another issue about the possible options for data pre-aggregation: this brings some challenges, especially how to handle data changes.

Improve project test coverage

Let's start playing the test coverage game. The only rule is:

a change to the codebase should not reduce test coverage

Right now, test coverage is at zero % (woohoo! We're winning! Dangit, I lost the game.)

Goal

Increase the project test coverage by a teensy-weensy bit.

Task

Pick one or more function(s) and write a test case.

Support for Holoviews / Bokeh rendering engine

Support for Holoviews with the Bokeh rendering engine has been added #13 . We can now choose what rendering engine to use for each chart. It is possible to compose dashboards that use both engines.

Technically many things have changed. All the chart generation logic now lives in an external module: Dataswim, a data analytics library of my own making that is equipped to handle charts.
The generation logic for Altair is isolated in a specific module and frozen while waiting for Altair 2.

Chartflo is now only responsible for Django related stuff: mostly the dashboards view and the events-based charts generation. As a result both the code and the api have been drastically simplified. For example:

from chartflo.charts import chart
from django.contrib.auth.models import User

# get the data
all_users = User.objects.filter(is_active=True)
staff = all_users.filter(is_staff=True).count()
superusers = all_users.filter(is_superuser=True).count()
users = all_users.filter(is_superuser=False, is_staff=False).count()

# declare the data
data = [users, staff, superusers]
index = ["Users", "Staff", "Superusers"]
columns = ["Number"]
chart.load_data("Groups", data, columns=columns, index=index)

# get the chart
c = chart.draw("Groups", "Number", chart_type="bar")
# in a Jupyter notebook, just type 'c' to draw the chart
# store the chart for later exporting
chart.stack("registrations1", "User registrations 1", c)
# ... make other charts
# then export to files in the folder templates/data/html
chart.export("data/html")

Example of a generator that takes the user registration dates, aggregates them by day, and draws a line chart using Bokeh and a point chart using Altair:

from dataswim import ds
from django.contrib.auth.models import User
from chartflo.charts import chart


def run(events=None):
    # 1. crunch data
    q = User.objects.all()
    # load data from a django query
    ds.load_django(q, dateindex="date_joined")
    # keep only the relevant data
    ds.keep("date_joined", "username")
    # resample data by one day periods
    ds.rsum("1D")
    # 2. draw charts
    # Note: ds.df is a pandas DataFrame instance
    c = chart.draw("date", "num", ds.df, "line")
    chart.stack("registrations1", "User registrations 1", c)
    chart.engine = "altair"
    x = ("date", "date:T")
    y = ("num", "num:Q")
    # if no dataset is passed, it will use the previously declared one
    c2 = chart.draw(x, y, chart_type="circle")
    chart.stack("user_registrations", "User registrations", c2)
    # write the charts html to files
    chart.export("data/html")

The Holoviews Bokeh rendering engine is now default. Altair 1 is getting old and is lagging behind: we are waiting for Altair 2. Furthermore Bokeh is more powerful and offers a very nice interactivity.

To sum up, we now have:

  • More power
  • More laziness

I will update the doc and the examples soon and release.

[Edit]: comments in code

Improvements in version 0.5

A complete rewrite has been made for version 0.5. Main improvements:

  • Use Bulma css instead of Admin lte
  • Use Vuejs to make it a single page app and improve the user experience while navigating in the dashboards
  • Simplify the dashboards creation and focus on productivity
  • Remove all the bloat: the dependency tree is much lighter

Note: this module is now responsible only for the dashboards management and widgets creation, the charts generation logic has been externalized to the Dataswim module. It is possible to use any other library or code to generate the charts

The doc has been updated as well as the demo project. The doc is a bit light for the moment and needs to be improved. I'm too close to the code to see if the doc is clear enough: feedback is welcome

Composable dashboards

The goal is to be able to compose dashboards embedding different charts.

Principle: the data is pre-aggregated and chunks of html are generated for each chart: the dashboard assembles these chunks. It would be too costly to query data for multiple charts at each request. Options for building the charts:

Query constructor

This option has already been explored in the server_side_charts branch. Principle: to use a dashboard constructor in the admin interface with these models:

  • Query: an individual query with filters, using a custom line protocol to define the filters
  • Question: a chart composed by one or multiple queries. This is where the chunks of html live: a question must produce the html to render the chart. It uses the Vega Lite conventions to declare fields and for data transformation. It stores the chart data in html format and Vega Lite format. Has m2m relation to Query.
  • Dashboard: assembles questions to render in a view. Has m2m relation with Question

This question constructor currently generates the charts html on save. The issue of chart regeneration when data changes is set aside for now and will be treated separately.

Generators

Principle: define questions in the code, assembling queries just like we do now to define the charts data. Some process should be able to generate the data based on triggers.

The generated html chunks would live in a model like Chart and could be assembled to compose a dashboard.

Online demo

An online demo is now available. It is a dashboard with inflation numbers, showing line charts and layouts with Bokeh.

To run the generator install the demo locally from the repository and run:

python3 manage.py gen inflation

Add a ready to use view that could draw charts from model instances or paths

Goal

Find an easy declarative way to render charts from input parameters without having to code anything. This means some kind of view that can draw charts on demand.

Prerequisite

In all cases a questions constructor is needed in the backend. It must take the input parameters from several queries or filters, construct orm queries, get the data and package it into template variables.

I suggest adopting the Metabase terminology here: a question is an aggregator of several queries.

Options

Url based constructor

The parameters are declared in the path that hits an endpoint and returns the chart. Using something inspired by the Influxdb line protocol can probably do the job: I will use the simple example on the readme page that compares the user types:

model_path::filter1:val1

/chart/q=auth.User::is_staff:False+auth.User::is_superuser:True+auth.User::is_staff:True;is_superuser:False

Advantage: very declarative, nothing to do for the user other than constructing a line for their request

Drawback: not very clear, can probably get too complex or confusing at some point
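To get a feeling for the complexity, here is a small sketch of a parser for the line format above, assuming `+` separates queries, `::` separates the model path from its filters, `;` separates filters and `:` separates a filter name from its value. The function name `parse_question` is hypothetical, and only boolean values are converted.

```python
def parse_question(line):
    """Parse 'app.Model::field:value;field:value+app.Model::...' into queries."""
    queries = []
    for chunk in line.split("+"):
        model_path, _, raw_filters = chunk.partition("::")
        filters = {}
        for raw in raw_filters.split(";"):
            name, _, value = raw.partition(":")
            # Only booleans are converted in this sketch; other values stay strings
            filters[name] = {"True": True, "False": False}.get(value, value)
        queries.append((model_path, filters))
    return queries


line = (
    "auth.User::is_staff:False"
    "+auth.User::is_superuser:True"
    "+auth.User::is_staff:True;is_superuser:False"
)
queries = parse_question(line)
```

In a Django view, each `(model_path, filters)` pair could then be resolved with `apps.get_model(model_path)` and passed on as `Model.objects.filter(**filters)`.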

Models based constructor

The questions are constructed and stored in the database as model instances. The user then requests for a question and receives a chart.

Advantage: powerful, it makes it possible to extend the concept to custom dashboards

Drawback: hits the db. Lots of work to do to build a questions constructor on the frontend

The user creates a question and constructs some queries to be attached to it. To serve the data we have two possibilities:

  • The question object is just aware of how to get the data and will hit the db to grab it at each request. Caching may be possible.

  • The data is pre-aggregated and stored so that when the user requests a question the data is instantly returned without any extra db hit. The data aggregation can be done automatically, either with a time based worker or by registering the involved models so that they update the aggregated data on each save (see the django-mqueue code, where models are registered in settings and connected to signals that watch them and perform actions on update/save/delete)
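The second possibility, with registered models, could be sketched as follows. This is plain Python under assumptions: `AggregateRegistry` and its methods are hypothetical names, and `on_save` plays the role that a Django `post_save` signal handler would have.

```python
class AggregateRegistry:
    """Map model names to functions that rebuild their aggregated data."""

    def __init__(self):
        self._aggregators = {}
        self.cache = {}  # the pre-aggregated data served to users

    def register(self, model_name, aggregator):
        self._aggregators.setdefault(model_name, []).append(aggregator)

    def on_save(self, model_name, rows):
        # Called whenever an instance of a registered model is saved
        for aggregator in self._aggregators.get(model_name, []):
            key, value = aggregator(rows)
            self.cache[key] = value


def count_staff(rows):
    return "staff_count", sum(1 for row in rows if row["is_staff"])


registry = AggregateRegistry()
registry.register("auth.User", count_staff)

users = [{"is_staff": True}, {"is_staff": False}]
registry.on_save("auth.User", users)
first_count = registry.cache["staff_count"]

users.append({"is_staff": True})
registry.on_save("auth.User", users)
second_count = registry.cache["staff_count"]
```

Reading `registry.cache` at request time costs nothing extra: the aggregation work happens on save, which is the tradeoff described above.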

So now?

I realize that this approach will, at the end of the day, translate into building a data visualization tool in Django. It could be very useful. And why not? This is a bit ambitious, but the hard work has to be done if we want to be lazy.

I will start researching the questions constructor for the backend in another module: this is a job for django-introspection. It is an interesting challenge, and we must get this first to be able to go forward.

@brylie: what do you think of this, does it sound more or less realistic to you?

Freeze the api

In order for this project to be easy to integrate into other projects, we need a stable API.

Goal

Create a stable API for 0.x series releases.

Sub-tasks

  • create a low-fidelity description of the hypothetical API
  • get feedback on the API design/description
  • make modifications to the API design/description based on feedback
  • write code to provide the basic API (remember the API can be extended, but should not break backwards compatibility in the 0.x series)

Charts regeneration on data change

The responsibility of regenerating charts on data changes is left to the user. A simple example of how it can be done is in the readme. It uses an events queue to watch models and trigger regeneration when data changes.

Release on pypi

In order for this package to be used in other projects, it would be helpful to release a pip-installable package.

Goal

Make it easy to integrate this package into Django projects by releasing it on PyPI

Sub-tasks

  • choose package license (issue #6)
  • create package description file
  • add relevant metadata to description file
  • create package (preferably with Python Wheels format)

Ability to draw charts

Thinking about it, I should put back the ability to draw charts inside this module: it would be more convenient for people, and it was made for this after all. This would mean inheriting from the Plot class of Dataswim, where the charting logic now lives, and taking on this dependency.

I will also refactor to a class with methods instead of importing functions all the time. So the api will change ... again, but we are close to reaching something usable: stability is not that far away anymore.
