synw / django-chartflo
Charts for the lazy ones in Django
License: MIT License
Based on discussion in #20, the amcharts library is being deprecated. If it is no longer needed, remove the amcharts directory entirely from this repository.
I created a repository for Jupyter demo notebooks to run with django-extensions.
There is only one for now demonstrating how to do a basic bar chart.
And then add features:
Goal: make it easy to compose dashboards with a base view and template
We provide a base template (a simplified version of Admin LTE) and views to load different pages in a dashboard. Dashboards are registered in the database along with the groups authorized to view each one.
TODO: demo and update the docs
It would be nice if this charting package could support hierarchical trees.
Add support for hierarchical trees, by incorporating a charting library/component that provides tree layouts.
Merge django-mptt-graph to draw hierarchical trees
Identify parts of the code to clean up in preparation for 0.2.0 release, such as:
Ability to update the charts with streaming data coming from websockets or by other means
I started to add Chart.js support. It works with the same API in Dataswim and is integrated in the Chartflo dashboards. The Chart.js related code is isolated in a Pychartjs module.
Chart.js has advantages: the charts look good and it is easy to use, as it has fewer abstraction layers than the other rendering engines we use.
The readme is getting big. It would be good to find the time to start proper documentation.
It could be good to add support for multiple rendering engines. The current plan is to replace Amcharts with Chart.js. We could take the opportunity of this refactoring to implement it. This would make the module extensible and less tightly coupled to one library. There are a lot of great JavaScript libraries that we could benefit from.
I will implement this so that we can keep the current working js library as the default, work on the switch, and just change the defaults when it is done. We kick out Amcharts after that.
I am currently learning Bokeh, which renders charts directly from python. It would work differently but has significant advantages: it generates chunks of html, so there is much less frontend maintenance work and zero javascript fatigue. It can also produce png and svg images, which can be a nice lightweight alternative for embedded material: once generated they would be much cheaper to serve and render than js and db hits.
This approach could match the case of using pre-aggregated data described in proposal #10, as the charts always have to be pregenerated: generating them on demand would be too costly. I may later make a branch to see if it could fit in the mix, once I am more comfortable working with Bokeh.
Instead of a custom encoder, use Altair to encode the data to the Vega Lite format. It is very convenient, for example to set encodings. I started to implement this in the vegalite branch.
This means depending on altair and pandas: it sounds like the cleanest way to serialize the data. We would have had to depend on pandas sooner or later anyway, considering how good and widely used it is for the kind of tasks we are doing.
The declarative approach is really nice and we can still use other js libs to render the charts, isolating the boring imperative stuff in individual units for different rendering engines.
I made a new branch using the Bokeh library to render the charts from python. This approach is very different from what we use now.
[Edit]: it is here
The queries are constructed in the admin interface. When a Question object is saved from the admin the chart is generated by python: the resulting html and js are stored in the database.
When a user requests a chart, it will just fetch the corresponding html without expensive queries. This made it possible to code a dashboard that aggregates several charts. It would also be possible to generate files and call the charts from templates resulting in no queries at all.
Models:
Filter: defines a specific filter
Query: defines a query, m2m link to filters
Question: defines a chart, aggregates queries in m2m relation
Dashboard: defines a set of questions to assemble in a view, m2m with Question
To run the demo code see the install instructions
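As a rough sketch of the relationships listed above, here is the same structure written with plain dataclasses rather than Django models, so that it runs without a project. The field names (`lookup`, `value`, `model_path`, `html`) and the `render` callable are assumptions made for illustration, not the branch's actual schema.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Filter:
    # a single lookup such as is_staff=True (hypothetical field names)
    lookup: str
    value: object


@dataclass
class Query:
    model_path: str
    # stands in for the m2m link to filters
    filters: List[Filter] = field(default_factory=list)


@dataclass
class Question:
    title: str
    # stands in for the m2m relation to queries
    queries: List[Query] = field(default_factory=list)
    # pre-generated chart markup, filled in on save
    html: str = ""

    def save(self, render: Callable[[List[Query]], str]) -> None:
        # the chart is generated once, when the question is saved
        self.html = render(self.queries)


@dataclass
class Dashboard:
    title: str
    questions: List[Question] = field(default_factory=list)

    def render(self) -> str:
        # serving a dashboard only joins stored chunks: no query is run
        return "\n".join(q.html for q in self.questions)
```

The point of the design is visible in `Dashboard.render`: the expensive work happens in `Question.save`, and a page request only reads stored html.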
This has just been put together and needs quite a lot of work to be usable. Partial list:
I will definitely continue to code in this direction, as I find this approach better suited to what I need, and very productive and maintainable.
We may be mixing terminology when referring to charts. In the following example, we use both 'chart' and 'graph'.
django-chartflo/chartflo/views.py
Lines 26 to 28 in 6961287
In general, we should use the same words for things, so our code is clean and easy to follow.
I wonder what the top level API should cover. I made a few methods for convenience and these essentially wrap the equivalent ones in the underlying library. In fact I don't use them and tend to generate the charts fully with Dataswim. Where Chartflo is interesting is for distributing the charts and composing dashboards. This and the events-based autogeneration mechanism are the features we need in this module.
My question is: should I keep the methods draw, stack and export that are just wrappers? I also made a convert_dataset method that translates a Django orm query into a dataframe, but I don't use it... [Edit]: I have since found a use for it
My method in a case like this is to remove everything that I don't need. I don't want to maintain useless code, and the module has been vastly simplified by the use of external libs, so this is not the moment to bloat it again. What do you really need? Is it ok to proceed like that?
The ChartsView's default template currently extends base.html; it would be good to have a way to serve this over REST, without extending the base template.
Will implement with a request.is_ajax() check.
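A minimal sketch of that check, written as a plain function so it does not need a Django request object. It looks at the `X-Requested-With` header, which is what `request.is_ajax()` relied on (later Django versions deprecated that method, so an explicit header check may be the safer choice). The template names are hypothetical.

```python
def pick_template(headers,
                  full="chartflo/charts.html",
                  partial="chartflo/chart_partial.html"):
    """Return the partial template for XMLHttpRequest calls, else the full page.

    headers is any mapping of request header names to values; both
    template paths are placeholders, not the project's real ones.
    """
    if headers.get("X-Requested-With", "") == "XMLHttpRequest":
        # ajax call: serve only the chart chunk, without base.html
        return partial
    return full
```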
I checked the Vega Lite spec and it looks very clear and useful to structure the data. As suggested by @brylie we could start using it for the REST views (#14), isolating the serialization logic in something like serializers.py so that we can reuse it later if needed.
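To give an idea of what such a serializer could look like, here is a small sketch that packages rows into a Vega Lite style specification (`data`, `mark`, `encoding`). The function name `to_vega_lite` and its signature are assumptions; only the overall shape of the spec follows the Vega Lite format.

```python
import json


def to_vega_lite(rows, x, y, mark="bar"):
    """Build a minimal Vega Lite style spec.

    rows: list of dicts, one per data point
    x, y: (field, type) pairs, e.g. ("group", "nominal")
    """
    return {
        "data": {"values": list(rows)},
        "mark": mark,
        "encoding": {
            "x": {"field": x[0], "type": x[1]},
            "y": {"field": y[0], "type": y[1]},
        },
    }


rows = [
    {"group": "Users", "number": 10},
    {"group": "Staff", "number": 2},
]
spec = to_vega_lite(rows, ("group", "nominal"), ("number", "quantitative"))
# the spec is plain data, so a REST view can return it as json
payload = json.dumps(spec)
```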
About Altair: it looks very similar to Bokeh. I'll definitely give it a shot and explore the python-generated charts way
Due to license concerns pointed out by @brylie in this discussion we plan to move to Chart.js
[Edit] I made an amcharts branch for the old code. As this module is not yet released we can use the master to work on the new code.
Since this code works, it deserves an 'official' release. Currently, the GitHub releases tab for this project is empty. Go ahead and make a '0.1.0' release, if you want to use semantic versioning.
Goal: move all the ChartController's non-Django datasource methods to an external lower level python module.
I need those generators outside of Django: the generators using dictionary and Altair data objects as datasource will move to a chartflo python module. Our ChartController class would inherit from it, so that our API will not change due to this modification. The class will then be responsible only for the generators that take Django queries as datasources.
Advantages: the generators are usable outside of Django, as wrappers around the Altair API, generating html. We will also get better maintainability through simplification and single responsibility.
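The split described above could be sketched as follows. This is only an illustration of the inheritance idea: the class name `BaseChartController`, the method names and the html output are all invented here, and the real generators would produce Altair charts rather than a list.

```python
from types import SimpleNamespace


class BaseChartController:
    """Framework-agnostic generators (would live in the lower level module)."""

    def from_dict(self, data, x, y):
        rows = [{x: key, y: value} for key, value in data.items()]
        return self.render(rows, x, y)

    def render(self, rows, x, y):
        # placeholder rendering: a real implementation would build a chart
        items = "".join("<li>%s: %s</li>" % (r[x], r[y]) for r in rows)
        return "<ul>%s</ul>" % items


class ChartController(BaseChartController):
    """Django-facing class: only adds generators that take queries."""

    def from_query(self, query, x, y):
        # query is any iterable of objects; a Django queryset would fit here
        rows = [{x: getattr(obj, x), y: getattr(obj, y)} for obj in query]
        return self.render(rows, x, y)


controller = ChartController()
fake_query = [SimpleNamespace(username="alice", logins=3)]
html_from_query = controller.from_query(fake_query, "username", "logins")
html_from_dict = controller.from_dict({"Staff": 2}, "group", "number")
```

Code that only needs dictionaries can import the base class without Django installed, while existing callers of `ChartController` keep the same methods.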
Proposal: eliminate the old ChartsView that was used to display only one chart. Now that we have the dashboards it appears to be useless: charts can be included directly in templates.
I made a generic dashboard view so that users don't have to set up a view.
In preparation for the 0.2 release, it would be helpful to make the source code as well-documented as possible. To this end, every class and function should have a docstring.
Well-documented sourcecode.
Add docstrings in the following places:
django-chartflo/chartflo/factory.py, lines 7 to 8 in 3a9bddd
django-chartflo/chartflo/factory.py, line 9 in 3a9bddd
django-chartflo/chartflo/factory.py, line 25 in 3a9bddd
django-chartflo/chartflo/factory.py, line 70 in 3a9bddd
django-chartflo/chartflo/factory.py, line 97 in 3a9bddd
django-chartflo/chartflo/factory.py, line 105 in 3a9bddd
django-chartflo/chartflo/views.py, line 7 in 3a9bddd
django-chartflo/chartflo/views.py, line 18 in 3a9bddd
django-chartflo/chartflo/views.py, line 21 in 3a9bddd
django-chartflo/chartflo/views.py, line 32 in 3a9bddd
When a file is empty, it should likely be deleted.
Clean codebase by removing unused/empty files.
Remove the following empty files:
If you want to use semantic versioning, it might be helpful to create a GitHub milestone to track work towards the next release.
After creating the initial release (issue #4), the next version would be 0.2.0. Create the 0.2.0 milestone, and assign the '0.2.0 Roadmap' task (issue #3) to that milestone.
For info I made a module that can generate chunks of html to display tabular data in the dashboards: https://github.com/synw/django-tabular
I isolated it from Chartflo since it is not strictly related to charts. No screenshot yet: the module is just starting and some features are still to be implemented, such as sorting, pagination and custom filters.
It is designed to be included in Chartflo's generators and work the same way as chart generation: it produces chunks of html to be included in dashboards.
After publishing on pypi (#9), we can add badges to our README to show our pypi pride. Consider the pypi badges on shields.io and add the relevant pypi badges to this project's README
After removing the dependency on AmCharts (issue #1), this project can be offered under an open source license.
Choose a license for the project, and add the license file to the root of the repository. The conventional name for the file is LICENSE or LICENSE.md.
Some work has been done towards our main immediate objective: to stop using Amcharts. We are currently structuring the module so that it is extensible.
Proposal: switch from Amcharts to the Vega Lite rendering engine, make the Vega Lite serialization a standard to pass data around, and try to make a first stable release with this. It is easy to add new rendering engines using the CHARTFLO_ENGINE setting.
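One way a setting like CHARTFLO_ENGINE can drive the choice of renderer is a small registry keyed by engine name. The sketch below is an assumption about how this could be wired, not the module's actual mechanism: the registry, the decorator, the default value and the html output are all hypothetical.

```python
ENGINES = {}


def register_engine(name):
    """Decorator that records a rendering function under an engine name."""
    def decorator(func):
        ENGINES[name] = func
        return func
    return decorator


@register_engine("vegalite")
def render_vegalite(spec):
    return '<div class="vegalite" data-mark="%s"></div>' % spec["mark"]


@register_engine("chartjs")
def render_chartjs(spec):
    return '<canvas class="chartjs" data-type="%s"></canvas>' % spec["mark"]


def render_chart(spec, settings):
    # settings stands in for django.conf.settings; the default is assumed
    engine = settings.get("CHARTFLO_ENGINE", "vegalite")
    try:
        renderer = ENGINES[engine]
    except KeyError:
        raise ValueError("Unknown rendering engine: %s" % engine)
    return renderer(spec)
```

Adding an engine then means writing one function and registering it, with no change to the calling code.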
@brylie: for now I will concentrate on the core, especially the serializers and the API design. If you wish to take care of the external js integration mechanisms and try with one lib, feel free to go ahead in a branch. This is an important task to see if we are really heading in the right direction.
The goal is to make composable views that can render multiple charts. Different approaches are currently being explored:
Server side data aggregation with a query constructor in the admin interface. The chart data is stored in the database and rendered statically. The data can be stored in VL format or as pre-generated html
If we have static data we can generate files to be automatically included in templates, eliminating queries, which could be interesting
Consider serving charts from templatetags
Make dashboards that integrate multiple charts
The query constructor and dashboard parts are quite advanced but still need some work: see the dashboards branch. I'll post details in another issue about the possible options for data pre-aggregation: this brings some challenges, especially how to handle data changes
Let's start playing the test coverage game. The only rule is:
a change to the codebase should not reduce test coverage
Right now, test coverage is at zero % (woohoo! We're winning! Dangit, I lost the game.)
Increase the project test coverage by a teensy-weensy bit.
Pick one or more function(s) and write a test case.
Support for Holoviews with the Bokeh rendering engine has been added (#13). We can now choose what rendering engine to use for each chart. It is possible to compose dashboards that use both engines.
Technically many things have changed. All the chart generation logic now lives in an external module: Dataswim, a data analytics library that I wrote and that is equipped to handle charts.
The generation logic for Altair is isolated in a specific module and frozen while waiting for Altair 2.
Chartflo is now only responsible for Django related stuff: mostly the dashboards view and the events-based chart generation. As a result both the code and the API have been drastically simplified. Example:
from chartflo.charts import chart
from django.contrib.auth.models import User
# get the data
all_users = User.objects.filter(is_active=True)
staff = all_users.filter(is_staff=True).count()
superusers = all_users.filter(is_superuser=True).count()
users = all_users.filter(is_superuser=False, is_staff=False).count()
# declare the data
data = [users, staff, superusers]
index = ["Users", "Staff", "Superusers"]
columns = ["Number"]
chart.load_data("Groups", data, columns=columns, index=index)
# get the chart
c = chart.draw("Groups", "Number", chart_type="bar")
# now in a jupyter notebook you can just type 'c' to draw the chart
# store the chart for later exporting
chart.stack("registrations1", "User registrations 1", c)
# ... make other charts
# then export to files in the folder templates/data/html
chart.export("data/html")
Example of a generator that takes the user registration dates, aggregates them by day and draws a line chart using Bokeh and a points chart using Altair:
from dataswim import ds
from django.contrib.auth.models import User
from chartflo.charts import chart
def run(events=None):
    # 1. crunch data
    q = User.objects.all()
    # load data from a django query
    ds.load_django(q, dateindex="date_joined")
    # keep only the relevant data
    ds.keep("date_joined", "username")
    # resample data by one day periods
    ds.rsum("1D")
    # 2. draw charts
    # Note: ds.df is a pandas DataFrame instance
    c = chart.draw("date", "num", ds.df, "line")
    chart.stack("registrations1", "User registrations 1", c)
    chart.engine = "altair"
    x = ("date", "date:T")
    y = ("num", "num:Q")
    # if no dataset is passed, it will use the previously declared one
    c2 = chart.draw(x, y, chart_type="circle")
    chart.stack("user_registrations", "User registrations", c2)
    # Write the charts html to files
    chart.export("data/html")
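For readers unfamiliar with the resampling step, the sketch below shows the kind of result it is assumed to produce, using only the standard library: one row per day with the number of registrations on that day. This is an illustration of the idea, not Dataswim's implementation, and `count_per_day` is a made-up helper name.

```python
from collections import Counter
from datetime import datetime


def count_per_day(timestamps):
    """Group datetimes by calendar day and count them, sorted by date."""
    counts = Counter(ts.date() for ts in timestamps)
    return sorted(counts.items())


joined = [
    datetime(2017, 1, 1, 9, 30),
    datetime(2017, 1, 1, 18, 0),
    datetime(2017, 1, 2, 8, 15),
]
per_day = count_per_day(joined)
# per_day holds (date, count) pairs, ready to be used as the x and y of a chart
```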
The Holoviews Bokeh rendering engine is now the default. Altair 1 is getting old and is lagging behind: we are waiting for Altair 2. Furthermore Bokeh is more powerful and offers very nice interactivity.
To summarize, we now have:
I will update the docs and the examples soon and release.
[Edit]: comments in code
A complete rewrite has been made for version 0.5. Main improvements:
Note: this module is now responsible only for dashboard management and widget creation: the chart generation logic has been externalized to the Dataswim module. It is possible to use any other library or code to generate the charts.
The doc has been updated as well as the demo project. The doc is a bit light for the moment and needs to be improved. I'm too close to the code to see if the doc is clear enough: feedback is welcome
The goal is to be able to compose dashboards embedding different charts.
Principle: the data is pre-aggregated and chunks of html are generated for each chart: the dashboard assembles these chunks. It would be too costly to query data for multiple charts at each request. Options for building the charts:
This option has already been explored in the server_side_charts branch. Principle: use a dashboard constructor in the admin interface with these models:
Query: an individual query with filters, using a custom line protocol to define the filters
Question: a chart composed of one or multiple queries. This is where the chunks of html live: a question must produce the html to render the chart. It uses the Vega Lite conventions to declare fields and for data transformation. It stores the chart data in html format and Vega Lite format. Has m2m relation to Query.
Dashboard: assembles questions to render in a view. Has m2m relation with Question
This question constructor currently generates the charts html on save. The question of chart regeneration when data changes is set aside and will be treated separately.
Principle: define questions in the code, assembling queries just like we do now to define the charts data. Some process should be able to generate the data based on triggers.
The generated html chunks would live in a model like Chart and could be assembled to compose a dashboard.
An online demo is now available. It is a dashboard with inflation numbers, showing line charts and layouts with Bokeh.
To run the generator install the demo locally from the repository and run:
python3 manage.py gen inflation
Find an easy declarative way to render charts from input parameters without having to code anything. This means some kind of view that can draw charts on demand.
In all cases a questions constructor is needed in the backend. It must take the input parameters from several queries or filters, construct orm queries, get the data and package it into template variables.
I suggest adopting the Metabase terminology here: a question is an aggregator of several queries.
The parameters are declared in the path that hits an endpoint and returns the chart. Something inspired by the Influxdb line protocol can probably do the job. I will use the simple example on the readme page that compares the user types:
model_path::filter1:val1
/chart/q=auth.User::is_staff:False+auth.User::is_superuser:True+auth.User::is_staff:True;is_superuser:False
Advantage: very declarative, the user only has to construct a line for their request
Drawback: not very clear, can probably get too complex or confusing at some point
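To make the line format above concrete, here is a small parser sketch. It assumes, from the example only, that queries are separated by `+`, that `::` separates the model path from the filters, that filters are separated by `;` and that `:` separates a filter name from its value. The function names and the handling of `True`/`False` are illustrative choices.

```python
def parse_value(raw):
    """Turn the strings 'True' and 'False' into booleans, keep the rest."""
    if raw == "True":
        return True
    if raw == "False":
        return False
    return raw


def parse_question(line):
    """Parse a question line into a list of (model_path, filters) pairs."""
    queries = []
    for chunk in line.split("+"):
        model_path, _, filter_part = chunk.partition("::")
        filters = {}
        for pair in filter_part.split(";"):
            if not pair:
                continue
            name, _, value = pair.partition(":")
            filters[name] = parse_value(value)
        queries.append((model_path, filters))
    return queries


line = ("auth.User::is_staff:False"
        "+auth.User::is_superuser:True"
        "+auth.User::is_staff:True;is_superuser:False")
queries = parse_question(line)
```

Each resulting pair maps naturally to an orm call of the form `Model.objects.filter(**filters)`, once the model path has been resolved to a model class.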
The questions are constructed and stored in the database as model instances. The user then requests a question and receives a chart.
Advantage: powerful, it makes it possible to extend the concept to custom dashboards
Drawback: hits the db. Lots of work to do to build a questions constructor on the frontend
The user creates a question and constructs some queries to be attached to it. To serve the data we have two possibilities:
The question object is just aware of how to get the data and will hit the db to grab it at each request. Caching may be possible.
The data is pre-aggregated and stored, so that when the user requests a question the data is returned instantly without any extra db hit. The data aggregation can be done automatically, either with a time based worker or by registering the involved models so that they update the aggregated data on each save (see the django-mqueue code, where models are registered in settings and connected to signals that watch them and perform actions on update, save and delete)
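The second possibility can be sketched without Django as a small store where models are registered together with an aggregation function, and where a save notification recomputes the stored value. In a real project the `on_save` call would be triggered by a signal handler; the class, its method names and the example key are assumptions made for illustration.

```python
from collections import defaultdict


class AggregateStore:
    """Keeps pre-aggregated values up to date when watched models change."""

    def __init__(self):
        self._watchers = defaultdict(list)
        self.data = {}

    def register(self, model_name, key, aggregate):
        # aggregate is a callable that turns the model's rows into a value
        self._watchers[model_name].append((key, aggregate))

    def on_save(self, model_name, rows):
        # stand-in for a post-save signal handler: recompute every
        # aggregate registered for the model that was just saved
        for key, aggregate in self._watchers[model_name]:
            self.data[key] = aggregate(rows)


store = AggregateStore()
store.register(
    "auth.User",
    "staff_count",
    lambda rows: sum(1 for row in rows if row["is_staff"]),
)
store.on_save("auth.User", [{"is_staff": True}, {"is_staff": False}])
# serving a question is now a dictionary lookup, with no db hit
staff_count = store.data["staff_count"]
```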
I realize that this approach will at the end of the day translate into building a data visualization tool in Django. It could be very useful. And why not? This is a bit ambitious but the hard work has to be done if we want to be lazy.
I will start to research the questions constructor for the backend in another module: this is a job for django-introspection. It is an interesting challenge, and we must get this first to be able to go forward.
@brylie: what do you think of this, does it sound more or less realistic to you?
In order for this project to be easy to integrate into other projects, we need a stable API.
Create a stable API for 0.x series releases.
The responsibility for chart regeneration on data changes is left to the user. A simple example of how it can be done is in the readme. It uses an events queue to watch models and trigger regeneration when data changes.
In order for this package to be used in other projects, it would be helpful to release a pip-installable package.
Make it easy to integrate this package into Django projects by releasing it on PyPI
Thinking about it, I should put back the ability to draw charts inside this module: it would be more convenient for people, and it was made for this after all. This would mean inheriting from the Plot class of Dataswim, where the charting logic now lives, and having this dependency.
I will also refactor with a class that has methods instead of importing functions all the time. So the API will change ... again, but we are close to reaching something usable: stability is not that far away anymore.