vega / vegafusion

Serverside scaling for Vega and Altair visualizations

Home Page: https://vegafusion.io

License: BSD 3-Clause "New" or "Revised" License

Rust 82.83% JavaScript 0.61% Python 14.80% TypeScript 0.53% SCSS 0.25% Dockerfile 0.04% Java 0.95%
altair charting-library jupyter vega vega-lite

vegafusion's Introduction


VegaFusion provides serverside acceleration for the Vega visualization grammar. While not limited to Python, an initial application of VegaFusion is the acceleration of the Altair Python interface to Vega-Lite.

The core VegaFusion algorithms are implemented in Rust. Python integration is provided using PyO3 and JavaScript integration is provided using wasm-bindgen.

Documentation

See the documentation at https://vegafusion.io

Project Status

VegaFusion is a young project, but it is already fairly well tested and used in production at Hex. The integration test suite includes image comparisons with over 600 specifications from the Vega, Vega-Lite, and Altair galleries.

Quickstart 1: Overcome MaxRowsError with VegaFusion

The VegaFusion mime renderer can be used to overcome the Altair MaxRowsError by performing data-intensive aggregations on the server and pruning unused columns from the source dataset. First install the vegafusion Python package with the embed extras enabled

pip install "vegafusion[embed]"

Then open a Jupyter notebook (either the classic notebook or a notebook inside JupyterLab), and create an Altair histogram of a 1 million row flights dataset

import pandas as pd
import altair as alt

flights = pd.read_parquet(
    "https://vegafusion-datasets.s3.amazonaws.com/vega/flights_1m.parquet"
)

delay_hist = alt.Chart(flights).mark_bar().encode(
    alt.X("delay", bin=alt.Bin(maxbins=30)),
    alt.Y("count()")
)
delay_hist
---------------------------------------------------------------------------
MaxRowsError                              Traceback (most recent call last)
...
MaxRowsError: The number of rows in your dataset is greater than the maximum allowed (5000). For information on how to plot larger datasets in Altair, see the documentation

This results in an Altair MaxRowsError, as by default Altair is configured to allow no more than 5,000 rows of data to be sent to the browser. This is a safety measure to avoid crashing the user's browser. The VegaFusion mime renderer can be used to overcome this limitation by performing data intensive transforms (e.g. filtering, binning, aggregation, etc.) in the Python kernel before the resulting data is sent to the web browser.

Run these two lines to import and enable the VegaFusion mime renderer

import vegafusion as vf
vf.enable()

Now the chart displays quickly without errors

delay_hist

Flight Delay Histogram

Quickstart 2: Extract transformed data

By default, data transforms in an Altair chart (e.g. filtering, binning, aggregation, etc.) are performed by the Vega JavaScript library running in the browser. This has the advantage of making the charts produced by Altair fully standalone, not requiring access to a running Python kernel to render properly. But it has the disadvantage of making it difficult to access the transformed data (e.g. the histogram bin edges and count values) from Python. Since VegaFusion evaluates these transforms in the Python kernel, it's possible to access them from Python using the vegafusion.transformed_data() function.

For example, the following code demonstrates how to access the histogram bin edges and counts for the example above:

import pandas as pd
import altair as alt
import vegafusion as vf

flights = pd.read_parquet(
    "https://vegafusion-datasets.s3.amazonaws.com/vega/flights_1m.parquet"
)

delay_hist = alt.Chart(flights).mark_bar().encode(
    alt.X("delay", bin=alt.Bin(maxbins=30)),
    alt.Y("count()")
)
vf.transformed_data(delay_hist)
    bin_maxbins_30_delay  bin_maxbins_30_delay_end  __count
 0                   -20                         0   419400
 1                    80                       100    11000
 2                     0                        20   392700
 3                    40                        60    38400
 4                    60                        80    21800
 5                    20                        40    92700
 6                   100                       120     5300
 7                   -40                       -20     9900
 8                   120                       140     3300
 9                   140                       160     2000
10                   160                       180     1800
11                   320                       340      100
12                   180                       200      900
13                   240                       260      100
14                   -60                       -40      100
15                   260                       280      100
16                   200                       220      300
17                   360                       380      100
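
The rows in the output above are not ordered by bin. The following is a minimal sketch, assuming the result is an ordinary pandas DataFrame with the column names shown above, of sorting it by bin start. The small frame is copied by hand from the first rows of the output rather than computed, so the snippet runs without VegaFusion installed; the `width` column is an illustrative addition, not something VegaFusion returns.

```python
import pandas as pd

# A few rows copied by hand from the output above (not recomputed).
hist = pd.DataFrame(
    {
        "bin_maxbins_30_delay": [-20, 80, 0, 40, 60, 20],
        "bin_maxbins_30_delay_end": [0, 100, 20, 60, 80, 40],
        "__count": [419400, 11000, 392700, 38400, 21800, 92700],
    }
)

# Order the bins from left to right so they read like the histogram's x axis.
ordered = hist.sort_values("bin_maxbins_30_delay").reset_index(drop=True)

# Illustrative extra column: the width of each bin (end minus start).
ordered["width"] = (
    ordered["bin_maxbins_30_delay_end"] - ordered["bin_maxbins_30_delay"]
)
print(ordered)
```

With the real chart, the same `sort_values` call would presumably be applied directly to the value returned by vf.transformed_data(delay_hist).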

Quickstart 3: Accelerate interactive charts

While the VegaFusion mime renderer works great for non-interactive Altair charts, it's not as well suited for interactive charts visualizing large datasets. This is because the mime renderer does not maintain a live connection between the browser and the Python kernel, so all the data that participates in an interaction must be sent to the browser.

To address this situation, VegaFusion provides a Jupyter Widget based renderer that does maintain a live connection between the chart in the browser and the Python kernel. In this configuration, selection operations (e.g. filtering to the extents of a brush selection) can be evaluated interactively in the Python kernel, which eliminates the need to transfer the full dataset to the client in order to maintain interactivity.

The VegaFusion widget renderer is provided by the vegafusion-jupyter package.

pip install "vegafusion-jupyter[embed]"

Instead of enabling the mime renderer with vf.enable(), the widget renderer is enabled with vf.enable_widget(). Here is a full example that uses the widget renderer to display an interactive Altair chart that implements linked histogram brushing for a 1 million row flights dataset.

import pandas as pd
import altair as alt
import vegafusion as vf

vf.enable_widget()

flights = pd.read_parquet(
    "https://vegafusion-datasets.s3.amazonaws.com/vega/flights_1m.parquet"
)

brush = alt.selection(type='interval', encodings=['x'])

# Define the base chart, with the common parts of the
# background and highlights
base = alt.Chart().mark_bar().encode(
    x=alt.X(alt.repeat('column'), type='quantitative', bin=alt.Bin(maxbins=20)),
    y='count()'
).properties(
    width=160,
    height=130
)

# gray background with selection
background = base.encode(
    color=alt.value('#ddd')
).add_selection(brush)

# blue highlights on the selected data
highlight = base.transform_filter(brush)

# layer the two charts & repeat
chart = alt.layer(
    background,
    highlight,
    data=flights
).transform_calculate(
    "time",
    "hours(datum.date)"
).repeat(column=["distance", "delay", "time"])
chart

[Video: flights_brush_histogram.mov]

Histogram binning, aggregation, and selection filtering are now evaluated in the Python kernel process with efficient parallelization, and only the aggregated data (one row per histogram bar) is sent to the browser.

You can see that the VegaFusion widget renderer maintains a live connection to the Python kernel by noticing that the Python kernel is running as the selection region is created or moved. You can also notice the VegaFusion logo in the dropdown menu button.
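
To make the data reduction concrete, here is a rough, library-free sketch of what binning and counting does to row volume. It is not VegaFusion's implementation (the real transforms run in Rust); the function name `bin_and_count`, the synthetic delay values, and the fixed 20-unit bin step are all assumptions chosen for illustration.

```python
import random
from collections import Counter


def bin_and_count(values, step):
    """Return {bin_start: count} using fixed-width bins of size `step`."""
    counts = Counter()
    for v in values:
        # Floor division places v in the half-open bin [start, start + step).
        counts[(v // step) * step] += 1
    return dict(counts)


random.seed(0)
# Synthetic stand-in for a delay column; not the real flights data.
delays = [random.randint(-60, 379) for _ in range(100_000)]

bars = bin_and_count(delays, step=20)
print(f"{len(delays)} input rows -> {len(bars)} aggregated rows")
```

However many input rows there are, the number of aggregated rows is bounded by the number of bins, which is why only a handful of rows needs to reach the browser.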

Motivation for VegaFusion

Vega makes it possible to create declarative JSON specifications for rich interactive visualizations that are fully self-contained. They can run entirely in a web browser without requiring access to an external database or a Python kernel.

For datasets of a few thousand rows or fewer, this architecture results in extremely smooth and responsive interactivity. However, this architecture does not scale very well to datasets of hundreds of thousands of rows or more. This is the problem that VegaFusion aims to solve.

DataFusion integration

Apache Arrow DataFusion is an SQL compatible query engine that integrates with the Rust implementation of Apache Arrow. VegaFusion uses DataFusion to implement many of the Vega transforms, and it compiles the Vega expression language directly into the DataFusion expression language. In addition to being quite fast, a particularly powerful characteristic of DataFusion is that it provides many interfaces that can be extended with custom Rust logic. For example, VegaFusion defines many custom UDFs that are designed to implement the precise semantics of the Vega expression language and the Vega expression functions.
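
As a toy illustration of what compiling one expression language into another involves, the sketch below walks a tiny expression tree and renders it as SQL-style text. Everything in it is hypothetical: the `Field`, `Literal`, and `Binary` classes, the `OPS` table, and the `to_sql` function are invented for this example, and VegaFusion's real compiler is written in Rust and targets DataFusion's own expression type rather than strings.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Field:
    """A reference such as datum.delay in a Vega expression."""
    name: str


@dataclass
class Literal:
    value: float


@dataclass
class Binary:
    op: str
    left: "Expr"
    right: "Expr"


Expr = Union[Field, Literal, Binary]

# Vega-style operators on the left, SQL spellings on the right (tiny subset).
OPS = {"&&": "AND", "||": "OR", "==": "=", ">": ">", "<": "<", "+": "+"}


def to_sql(expr: Expr) -> str:
    """Recursively render the expression tree as parenthesized SQL text."""
    if isinstance(expr, Field):
        return f'"{expr.name}"'
    if isinstance(expr, Literal):
        return repr(expr.value)
    if isinstance(expr, Binary):
        return f"({to_sql(expr.left)} {OPS[expr.op]} {to_sql(expr.right)})"
    raise TypeError(f"unsupported node: {expr!r}")


# Tree for the Vega expression: datum.delay > 20 && datum.distance < 500
expr = Binary(
    "&&",
    Binary(">", Field("delay"), Literal(20)),
    Binary("<", Field("distance"), Literal(500)),
)
print(to_sql(expr))  # (("delay" > 20) AND ("distance" < 500))
```

A one-to-one operator table like `OPS` glosses over semantic differences between the two languages (for example, how equality treats mixed types), which is the kind of gap the custom UDFs mentioned above are meant to close.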

License

As of version 1.0, VegaFusion is licensed under the BSD-3 license. This is the same license used by Vega, Vega-Lite, and Altair.

Prior versions were released under the AGPLv3 license.

About the Name

There are two meanings behind the name "VegaFusion"

  • It's a reference to the Apache Arrow DataFusion library which is used to implement many of the supported Vega transforms
  • Vega and Altair are named after stars, and stars are powered by nuclear fusion

Building VegaFusion

If you're interested in building VegaFusion from source, see BUILD.md

Roadmap

Supporting serverside acceleration for Altair in Jupyter was chosen as the first application of VegaFusion, but there are a lot of exciting ways that VegaFusion can be extended in the future. For more information, see the Roadmap.

vegafusion's People

Contributors

dj-mcculloch, jkillian, jonmmease

vegafusion's Issues

Resources

Issue for holding images / movies that can be linked to from elsewhere

vegafusion_interactive_average.mov

Investigate reducing the size of the vegafusion-wasm bundle

The uncompressed WASM bundle for the vegafusion-wasm package is around 4.2MB right now (~1MB compressed)

I've tried the LTO and wasm-opt suggestions from https://rustwasm.github.io/book/reference/code-size.html#size-profiling, but these didn't make much difference in bundle size (compared to the default release build, which already uses wasm-opt).

I think the next step is to try profiling the size with twiggy per https://rustwasm.github.io/book/reference/code-size.html#the-twiggy-code-size-profiler to see what's taking up the most space.

Unable to use the CDN version of vegafusion-jupyter

Hi, maintainer of the Jupyter extension for VS Code here.
We were planning on lighting up vegafusion (wasm) in VS Code notebooks and managed to get this working.
However with the latest version (0.6.0) things don't work correctly.
Looks like the index.js attempts to load a resource that does not exist.

If you look at the code at https://cdn.jsdelivr.net/npm/[email protected]/dist/index.js you'll see a reference to an rc version, and as a result downloading additional resources does not work: ne.p="https://unpkg.com/[email protected]/dist/"

Here's the network tab (it downloads index.js correctly, but not the other files).
FYI: this used to work in older versions such as 0.4.0, which is what I had until this morning.

Screen Shot 2022-07-02 at 09 59 27

Screen Shot 2022-07-02 at 09 59 38

Screen Shot 2022-07-02 at 09 59 43

Move feather transformer to 'vegafusion' package

I need the Altair vegafusion-feather renderer for panel-vegafusion. Currently it's in vegafusion_jupyter, which means I will have that as a dependency.

I believe it should only be necessary to depend on the vegafusion library.

Context

I need it to enable the communication.

image

Document how VegaFusion solves real problems

After having read the material surrounding vegafusion I think the value proposition could be made clearer. I am not an experienced Altair/ Vega user and read it from that perspective. So starting out with "small" examples that already work in Altair does not really show what difference vegafusion provides.

Altair documents how to work with big datasets

According to https://altair-viz.github.io/user_guide/faq.html#maxrowserror-how-can-i-plot-large-datasets you can work with large datasets if you just follow some best practices.

Why and how does VegaFusion provide something better?

A user on Twitter says he has worked with Altair on 20 million rows without problems.

https://twitter.com/ArnaudovKrum/status/1485971877029888007?s=20&t=GXoAzB20cc-8Z7mlA2ZwMg

image

Alternatives

There are Python alternatives like Datashader+Holoviews and Vaex (?). How does vegafusion compare to those?

Users with performance problems

I believe a Google search will show lots of users who have hit the limit of what Altair/Vega can do. Can VegaFusion solve those issues?

New, real world use cases

Exemplify what VegaFusion can do via some new big data use case that solves a real problem or provides real insights.

Setup CI

Set up continuous integration (probably using GitHub Actions) for:

  • rust fmt compliance
  • clippy compliance
  • black compliance for Python code
  • Prettier JS / TS formatting
  • Integration tests (Each operating system)
  • Python package build (Each operating system)

Support Vega-Embed Themes

With Altair I can normally set a dark theme using alt.themes.enable("dark"). It seems not to be working when using vegafusion-jupyter.

Works without VegaFusion

image

Does not work with VegaFusion

image

Code

import vegafusion_jupyter as vf
vf.enable()

import altair as alt
from vega_datasets import data

alt.themes.enable("dark")

source = data.seattle_weather()
brush = alt.selection(type='interval', encodings=['x'])

bars = alt.Chart().mark_bar().encode(
    x='month(date):O',
    y='mean(precipitation):Q',
    opacity=alt.condition(brush, alt.OpacityValue(1), alt.OpacityValue(0.7)),
).add_selection(
    brush
)

line = alt.Chart().mark_rule(color='firebrick').encode(
    y='mean(precipitation):Q',
    size=alt.SizeValue(3)
).transform_filter(
    brush
)

chart = alt.layer(bars, line, data=source)
chart

Cross-compiling to i686-linux fails (for 0.1.0)

i686-linux used to build without issue, now it fails with the following error:

Full logs: https://gist.github.com/jeremiahpslewis/8730431d91a7ccd64ce77bc4a3f97ead

[20:39:39] WARN rustc_codegen_ssa::back::link Linker does not support -no-pie command line option. Retrying without.
[20:39:42] error: linking with `i686-linux-musl-cc` failed: exit status: 1
[20:39:42]   |
[20:39:42]   = note: "i686-linux-musl-cc" "-m32" "-Wl,-melf_i386" "/opt/x86_64-linux-musl/toolchains/1.59.0-x86_64-unknown-linux-musl/lib/rustlib/i686-unknown-linux-musl/lib/self-contained/crt1.o" "/opt/x86_64-linux-musl/toolchains/1.59.0-x86_64-unknown-linux-musl/lib/rustlib/i686-unknown-linux-musl/lib/self-contained/crti.o" "/opt/x86_64-linux-musl/toolchains/1.59.0-x86_64-unknown-linux-musl/lib/rustlib/i686-unknown-linux-musl/lib/self-contained/crtbegin.o" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/vegafusion_server-e2e78b68e3853b39.vegafusion_server.3b1e682e-cgu.0.rcgu.o" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/vegafusion_server-e2e78b68e3853b39.vegafusion_server.3b1e682e-cgu.1.rcgu.o" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/vegafusion_server-e2e78b68e3853b39.vegafusion_server.3b1e682e-cgu.10.rcgu.o" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/vegafusion_server-e2e78b68e3853b39.vegafusion_server.3b1e682e-cgu.11.rcgu.o" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/vegafusion_server-e2e78b68e3853b39.vegafusion_server.3b1e682e-cgu.12.rcgu.o" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/vegafusion_server-e2e78b68e3853b39.vegafusion_server.3b1e682e-cgu.13.rcgu.o" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/vegafusion_server-e2e78b68e3853b39.vegafusion_server.3b1e682e-cgu.14.rcgu.o" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/vegafusion_server-e2e78b68e3853b39.vegafusion_server.3b1e682e-cgu.15.rcgu.o" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/vegafusion_server-e2e78b68e3853b39.vegafusion_server.3b1e682e-cgu.2.rcgu.o" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/vegafusion_server-e2e78b68e3853b39.vegafusion_server.3b1e682e-cgu.3.rcgu.o" 
"/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/vegafusion_server-e2e78b68e3853b39.vegafusion_server.3b1e682e-cgu.4.rcgu.o" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/vegafusion_server-e2e78b68e3853b39.vegafusion_server.3b1e682e-cgu.5.rcgu.o" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/vegafusion_server-e2e78b68e3853b39.vegafusion_server.3b1e682e-cgu.6.rcgu.o" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/vegafusion_server-e2e78b68e3853b39.vegafusion_server.3b1e682e-cgu.7.rcgu.o" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/vegafusion_server-e2e78b68e3853b39.vegafusion_server.3b1e682e-cgu.8.rcgu.o" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/vegafusion_server-e2e78b68e3853b39.vegafusion_server.3b1e682e-cgu.9.rcgu.o" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/vegafusion_server-e2e78b68e3853b39.bz0t8jy3z4inpkx.rcgu.o" "-Wl,--as-needed" "-L" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps" "-L" "/workspace/srcdir/vegafusion/target/release/deps" "-L" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/build/ring-bacc20c2f800b9d5/out" "-L" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/build/lz4-sys-c9e32c7d4ccc1980/out" "-L" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/build/zstd-sys-8ed80be47440e577/out" "-L" "/opt/x86_64-linux-musl/toolchains/1.59.0-x86_64-unknown-linux-musl/lib/rustlib/i686-unknown-linux-musl/lib" "-Wl,-Bstatic" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libtonic_web-6dcb9c4c2f56b6f4.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libclap-389a8b75e07d9e6f.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libstrsim-e740f280d34fd19c.rlib" 
"/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libatty-7e68e008d4a4bbfe.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libtermcolor-484363f0791b06c4.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libtextwrap-a63cfb1003e3da38.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libos_str_bytes-9a942725b015dcc2.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libvegafusion_rt_datafusion-e36bc819f7b51e7b.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libreqwest-41c130a19152e5b8.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/librustls_pemfile-11f106b73fbc1656.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libhyper_rustls-d4fd539d59169065.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libserde_urlencoded-bc2f0528257ceeb9.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libwebpki_roots-d0fbe4fd7d36237d.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libipnet-05e789b5341494b8.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libtokio_rustls-26bc935391cf7497.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libmime-d0d89c69452b0921.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libencoding_rs-5c28791a3e6eaff0.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/librustls-55eb9a256072d568.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libsct-e55434511ef2ec94.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libwebpki-f9fbe4c85ca55eba.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/liburl-0ce969ace1459bbc.rlib" 
"/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libidna-8c47ada9f0e30b32.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libunicode_normalization-c934307132afe820.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libtinyvec-ccebce24ab5029da.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libtinyvec_macros-6146693d6782bd60.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libunicode_bidi-a95b8d29f79a5947.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libform_urlencoded-a0b86c25aee1437a.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libmatches-915b9f3a3d16ac64.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libchrono_tz-673507ae3c041376.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libphf-002da301dfd4f456.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libphf_shared-dfea9be95a8a7caf.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libuncased-aa25cd2e8d309f54.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libsiphasher-932ec4806377dcaa.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libfloat_cmp-952bec0d05743058.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/liblru-50641a1e3e1fd10d.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libhashbrown-5835842880d5aa42.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libasync_lock-7d43c2f197a3b285.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libevent_listener-dbfa6e6568a8c029.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libdatafusion-4e2bff57aa8119c2.rlib" 
"/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libdatafusion_physical_expr-054b313fd6b34525.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libunicode_segmentation-0609bd73f87e3444.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libsha2-8a8019bede05569d.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libcpufeatures-d5eb6fb6890ecd19.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libmd5-ad29a03a6f71a9f1.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libblake3-d255093ff72eb9a7.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libconstant_time_eq-23b0518f6b18f96e.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libarrayvec-42200407ebd31300.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libarrayref-15f5b9a8c198752f.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libblake2-c4bcbe0cc462a040.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libdigest-137eb57deb8ed1bc.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libsubtle-46a788260fa2ee76.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libblock_buffer-90925029ab8d4834.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libcrypto_common-e7810c9e1a8c1203.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libgeneric_array-bf2308aa51d2f48e.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libtypenum-88ad7b87e0d6fe1b.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libdatafusion_expr-0bbca4b38a48f7e3.rlib" 
"/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libhashbrown-ff453f6855100273.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libahash-b38ad5379b711a67.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libtempfile-79a2ab46bb90bc54.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libremove_dir_all-13977637d1acd5d1.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libfutures-0175a04dbe07c4f9.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libfutures_executor-61e1857cd57361c8.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libvegafusion_core-1fbbc27dd1736a6e.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libdeterministic_hash-adaebcb86c26e724.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libpetgraph-48bd22b8651c5aaf.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libfixedbitset-4a9a29b609b617ec.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libitertools-799931e3c3a5d648.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libeither-6cd67abc6a27d1b5.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libdatafusion_common-4282351925a853f7.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libordered_float-64aa980a2c723e75.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libparquet-3599b96e42f055ef.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libzstd-013af66588b44c60.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libzstd_safe-87658cd287668a8a.rlib" 
"/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libzstd_sys-481a69ba610df7b8.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/liblz4-752becff09c9f368.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/liblz4_sys-e51a2e15ad5a56be.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libbrotli-8abd74d1a5036e39.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libbrotli_decompressor-9121aaa7a795d6c2.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/liballoc_stdlib-01eb0153714d2509.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/liballoc_no_stdlib-96178ddcac9911a9.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libflate2-02c7892a245f4132.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libminiz_oxide-0de3e5495dcdbcd0.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libadler-aa82cb97df8f1293.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libcrc32fast-d609e353dc8e3ada.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libsnap-f1e27bde21760484.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libparquet_format-ccbb80509311c702.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libthrift-12d019cb8e082dff.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libthreadpool-37fe8b2d0c666115.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libinteger_encoding-09568b2466b8e666.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libordered_float-ce8074f161b84b17.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libbyteorder-53448fdae9da7d2a.rlib" 
"/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libsqlparser-ed115cd500025ebf.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libarrow-5e4f9446475783f5.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/liblexical_core-cd7f1611872702c3.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/liblexical_write_float-6208445ab07b1d81.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/liblexical_write_integer-e635f134af468984.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/liblexical_parse_float-07c7fc184f261422.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/liblexical_parse_integer-075099ea44e9ac9f.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/liblexical_util-5ae3efa035ff522a.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libstatic_assertions-8d910f08e1c6c4fa.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libcomfy_table-16ec18d2edf29b1a.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libstrum-b5f61e2b761dc941.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libunicode_width-53e635643f8bf08b.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libflatbuffers-fc6d0f7f23e6a963.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libthiserror-9306cbb35e13ecc3.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libbitflags-bec36a01f267c661.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libcsv-677423ef7042df9b.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libcsv_core-05325d8a7af82138.rlib" 
"/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libbstr-b2453f55f413b3b8.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libregex_automata-2bf413566be8146e.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libregex-f9a8503af6854c85.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libaho_corasick-543de1ed44f4102f.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libregex_syntax-b70d7a50044cf713.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libmultiversion-ac414e728b7bf006.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libhex-de34283c7f666c80.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libchrono-a90d11b452111b5f.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libtime-5da6a60565bb559c.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libnum-25573e36e99a2eed.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libnum_iter-6b35daf7a0f38196.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libnum_rational-dcdbd2db9e656b2c.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libnum_complex-cd2e609613ed7356.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libnum_bigint-4dbaf0edfae7cdd6.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libnum_integer-b5f76896e5816ff3.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libnum_traits-5d20063db7433627.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libhalf-d6b4a9c37371a70b.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libserde_json-aabdb15cceaa8538.rlib" 
"/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libryu-e47f6569e3231c9d.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libserde-458c8c6d3e6f415c.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libtonic-59c22596ac9f96ea.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libhyper_timeout-83c526f34cb16505.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libtokio_io_timeout-af50e4a914666dc1.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libasync_stream-50996b4b27c85e59.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libpercent_encoding-aee9f6bfb4dca177.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libtokio_stream-f9e8baf7adbca79f.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libtokio_rustls-16feedb4b93167b1.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/librustls-81e30bd375b20c14.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libbase64-2b998f58a909094a.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libsct-b84edcb5443d90d0.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libwebpki-c7bfc988720414aa.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libring-7fe61574f15b906b.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libspin-634995e5614657b4.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libuntrusted-a62f62861335b267.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libonce_cell-c8575f6892fa0e53.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libhyper-0332746af4152069.rlib" 
"/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libwant-4050b9ed3eec00e0.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libtry_lock-1555d0309bf574e6.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libitoa-7192b9c8564321f0.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libhttparse-35df7da05832f4c5.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libh2-9c375d9a8a322fbc.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libtokio_util-af2931b3e9a80858.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libhttpdate-1b91bc274e6b3f2a.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libtower-81e8265797beae09.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libindexmap-c6db0625456e1fae.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libhashbrown-6788d3136ebd09e0.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libtokio_util-1fe0a0b1d645f519.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libtokio-a9eb305e593df9da.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libnum_cpus-512267d6b899edf7.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libsocket2-797843de4d0547c5.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libmio-b97716360557b2f8.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libparking_lot-92ff26cdcfb1458f.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libparking_lot_core-5e2f21e988abccbc.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libsmallvec-accfb7b6fabdc598.rlib" 
"/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/liblock_api-f989d485a25aaee3.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libscopeguard-babcb1dc8f37ce67.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/librand-08263cf3baa7854d.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/librand_chacha-7e05d6ea8d27f452.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libppv_lite86-e08817877a0f85cf.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/librand_core-3fc2e323d5f33214.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libgetrandom-afdb2d7efb21b204.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/liblibc-54f4c3153d61633c.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libtower_layer-4d03ceaf48a6ef6f.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libprost-2d1fe6d482cc46f1.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libpin_project-4320376969b172ff.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libtracing-fa4f4f0a14b2307c.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libtracing_core-0f250504405a87e3.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/liblazy_static-39fae77bb53c9de7.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/liblog-790cc0d0d33403b9.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libcfg_if-0e23a038f974a64b.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libtower_service-6cac7cfd918daf4b.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libfutures_util-4cde65aaf3af8646.rlib" 
"/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libmemchr-31354d1aca2fe4f1.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libfutures_io-a5e56a4667a3ef61.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libslab-5f2610e01b1caafa.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libfutures_channel-31a9129a143a0712.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libfutures_sink-51d8ad19b5ee5967.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libfutures_task-31beb8ec9b5cab1e.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libpin_utils-e6413b4dc7fdabae.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libfutures_core-bf3f3d8470060c4e.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libhttp_body-133fc16a8c51733b.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libpin_project_lite-f76321a7e627b6c2.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libhttp-fead9d6d037b624a.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libitoa-ba5bf3350ad919a3.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libfnv-17cfe961a23a6955.rlib" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libbytes-8447d32f3ffe8246.rlib" "-Wl,--start-group" "/opt/x86_64-linux-musl/toolchains/1.59.0-x86_64-unknown-linux-musl/lib/rustlib/i686-unknown-linux-musl/lib/libstd-05200585159df6ea.rlib" "/opt/x86_64-linux-musl/toolchains/1.59.0-x86_64-unknown-linux-musl/lib/rustlib/i686-unknown-linux-musl/lib/libpanic_unwind-b334f19119327840.rlib" "/opt/x86_64-linux-musl/toolchains/1.59.0-x86_64-unknown-linux-musl/lib/rustlib/i686-unknown-linux-musl/lib/libminiz_oxide-962396e6f7f256ed.rlib" 
"/opt/x86_64-linux-musl/toolchains/1.59.0-x86_64-unknown-linux-musl/lib/rustlib/i686-unknown-linux-musl/lib/libadler-c0c1a86e204bab75.rlib" "/opt/x86_64-linux-musl/toolchains/1.59.0-x86_64-unknown-linux-musl/lib/rustlib/i686-unknown-linux-musl/lib/libobject-7057d878764e3c29.rlib" "/opt/x86_64-linux-musl/toolchains/1.59.0-x86_64-unknown-linux-musl/lib/rustlib/i686-unknown-linux-musl/lib/libmemchr-bdcfefa8728cf306.rlib" "/opt/x86_64-linux-musl/toolchains/1.59.0-x86_64-unknown-linux-musl/lib/rustlib/i686-unknown-linux-musl/lib/libaddr2line-9a4f91b4ed0345f5.rlib" "/opt/x86_64-linux-musl/toolchains/1.59.0-x86_64-unknown-linux-musl/lib/rustlib/i686-unknown-linux-musl/lib/libgimli-60c06717fbc38f7d.rlib" "/opt/x86_64-linux-musl/toolchains/1.59.0-x86_64-unknown-linux-musl/lib/rustlib/i686-unknown-linux-musl/lib/libstd_detect-724e1ffba168709f.rlib" "/opt/x86_64-linux-musl/toolchains/1.59.0-x86_64-unknown-linux-musl/lib/rustlib/i686-unknown-linux-musl/lib/librustc_demangle-9209c2d36dda230b.rlib" "/opt/x86_64-linux-musl/toolchains/1.59.0-x86_64-unknown-linux-musl/lib/rustlib/i686-unknown-linux-musl/lib/libhashbrown-c24f547f30c1468c.rlib" "/opt/x86_64-linux-musl/toolchains/1.59.0-x86_64-unknown-linux-musl/lib/rustlib/i686-unknown-linux-musl/lib/librustc_std_workspace_alloc-556abb1a00867d5a.rlib" "/opt/x86_64-linux-musl/toolchains/1.59.0-x86_64-unknown-linux-musl/lib/rustlib/i686-unknown-linux-musl/lib/libunwind-1ca2c562691c634e.rlib" "-lunwind" "/opt/x86_64-linux-musl/toolchains/1.59.0-x86_64-unknown-linux-musl/lib/rustlib/i686-unknown-linux-musl/lib/libcfg_if-c5d687b2098ef5a4.rlib" "/opt/x86_64-linux-musl/toolchains/1.59.0-x86_64-unknown-linux-musl/lib/rustlib/i686-unknown-linux-musl/lib/liblibc-e588e4402e712f7f.rlib" "-lc" "/opt/x86_64-linux-musl/toolchains/1.59.0-x86_64-unknown-linux-musl/lib/rustlib/i686-unknown-linux-musl/lib/liballoc-4b857a621ce7267b.rlib" 
"/opt/x86_64-linux-musl/toolchains/1.59.0-x86_64-unknown-linux-musl/lib/rustlib/i686-unknown-linux-musl/lib/librustc_std_workspace_core-652601b15c1fb60e.rlib" "/opt/x86_64-linux-musl/toolchains/1.59.0-x86_64-unknown-linux-musl/lib/rustlib/i686-unknown-linux-musl/lib/libcore-fae282f218187ccb.rlib" "-Wl,--end-group" "/opt/x86_64-linux-musl/toolchains/1.59.0-x86_64-unknown-linux-musl/lib/rustlib/i686-unknown-linux-musl/lib/libcompiler_builtins-41a9a21cc733a648.rlib" "-Wl,-Bdynamic" "-Wl,--eh-frame-hdr" "-Wl,-znoexecstack" "-nostartfiles" "-L" "/opt/x86_64-linux-musl/toolchains/1.59.0-x86_64-unknown-linux-musl/lib/rustlib/i686-unknown-linux-musl/lib" "-L" "/opt/x86_64-linux-musl/toolchains/1.59.0-x86_64-unknown-linux-musl/lib/rustlib/i686-unknown-linux-musl/lib/self-contained" "-o" "/workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/vegafusion_server-e2e78b68e3853b39" "-Wl,--gc-sections" "-static" "-Wl,-zrelro,-znow" "-Wl,-O1" "-nodefaultlibs" "/opt/x86_64-linux-musl/toolchains/1.59.0-x86_64-unknown-linux-musl/lib/rustlib/i686-unknown-linux-musl/lib/self-contained/crtend.o" "/opt/x86_64-linux-musl/toolchains/1.59.0-x86_64-unknown-linux-musl/lib/rustlib/i686-unknown-linux-musl/lib/self-contained/crtn.o"
[20:39:42]   = note: /workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libparquet-3599b96e42f055ef.rlib(parquet-3599b96e42f055ef.parquet.84dadfee-cgu.0.rcgu.o): In function `parquet::encodings::encoding::get_encoder::hfb43799d733abc6b':
[20:39:42]           parquet.84dadfee-cgu.0:(.text._ZN7parquet9encodings8encoding11get_encoder17hfb43799d733abc6bE+0x12d): undefined reference to `__atomic_load'
[20:39:42]           /workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libparquet-3599b96e42f055ef.rlib(parquet-3599b96e42f055ef.parquet.84dadfee-cgu.0.rcgu.o): In function `parquet::encodings::encoding::PlainEncoder$LT$T$GT$::new::h16f154d6100aff1c':
[20:39:42]           parquet.84dadfee-cgu.0:(.text._ZN7parquet9encodings8encoding21PlainEncoder$LT$T$GT$3new17h16f154d6100aff1cE+0x50): undefined reference to `__atomic_load'
[20:39:42]           /workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libparquet-3599b96e42f055ef.rlib(parquet-3599b96e42f055ef.parquet.84dadfee-cgu.0.rcgu.o): In function `parquet::encodings::encoding::DictEncoder$LT$T$GT$::new::h6991e01b28a72842':
[20:39:42]           parquet.84dadfee-cgu.0:(.text._ZN7parquet9encodings8encoding20DictEncoder$LT$T$GT$3new17h6991e01b28a72842E+0x6c): undefined reference to `__atomic_load'
[20:39:42]           /workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libparquet-3599b96e42f055ef.rlib(parquet-3599b96e42f055ef.parquet.84dadfee-cgu.0.rcgu.o): In function `parquet::encodings::encoding::DictEncoder$LT$T$GT$::new::h717ce2f3d102b981':
[20:39:42]           parquet.84dadfee-cgu.0:(.text._ZN7parquet9encodings8encoding20DictEncoder$LT$T$GT$3new17h717ce2f3d102b981E+0x6c): undefined reference to `__atomic_load'
[20:39:42]           /workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libparquet-3599b96e42f055ef.rlib(parquet-3599b96e42f055ef.parquet.84dadfee-cgu.0.rcgu.o): In function `parquet::encodings::encoding::DictEncoder$LT$T$GT$::new::h88e3967c345e78bc':
[20:39:42]           parquet.84dadfee-cgu.0:(.text._ZN7parquet9encodings8encoding20DictEncoder$LT$T$GT$3new17h88e3967c345e78bcE+0x6c): undefined reference to `__atomic_load'
[20:39:42]           /workspace/srcdir/vegafusion/target/i686-unknown-linux-musl/release/deps/libparquet-3599b96e42f055ef.rlib(parquet-3599b96e42f055ef.parquet.84dadfee-cgu.0.rcgu.o):parquet.84dadfee-cgu.0:(.text._ZN7parquet9encodings8encoding20DictEncoder$LT$T$GT$3new17h9d6bddc37cb15a3aE+0x6c): more undefined references to `__atomic_load' follow
[20:39:42]           collect2: error: ld returned 1 exit status

It seems like it might be fixed by updating the build.rs script to tweak the build args, but I'm not 100% sure about this. (see hint here https://www.reddit.com/r/cpp_questions/comments/4680t0/16byte_atomics_on_clang_with_libstdc/)

Avoid transferring the full DataFrame to the client

Overview

For the Altair use case of building a visualization from a Pandas DataFrame (rather than a reference to an external file), it's important to avoid serializing the DataFrame into the JSON specification.

Instead, the DataFrame should be extracted from the specification and registered with the TaskGraphRuntime. Then a reference to the registered DataFrame should be added back to the specification. Hopefully, this can be done through the Altair data transformer system (https://altair-viz.github.io/user_guide/data_transformers.html).

Often, the full DataFrame will not need to be sent to the client and instead a smaller aggregated form will be sent. But even for cases where the full DataFrame is required by the client, this workflow will allow the DataFrame to be binary serialized as an Arrow RecordBatch rather than inline JSON.

Rust implications

On the Rust side, the TaskGraphRuntime should have the ability to register a persistent TaskValue. Then, a new reference TaskValue variant should be added that stores a reference to the registered value in a TaskGraph. We will also need the ability to unregister a value. I think it makes sense to use the TaskValue hash as the name to de-duplicate values.
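The hash-as-name idea can be sketched in a few lines. This is only an illustration in Python, not the Rust TaskGraphRuntime API: the `ValueRegistry` class and its method names are hypothetical, and SHA-256 over raw bytes stands in for whatever hash a TaskValue would actually use.

```python
import hashlib


class ValueRegistry:
    """Hypothetical registry that names each value by its content hash."""

    def __init__(self):
        self._values = {}

    def register(self, payload: bytes) -> str:
        # The content hash is the name, so identical payloads share one entry.
        key = hashlib.sha256(payload).hexdigest()
        self._values.setdefault(key, payload)
        return key

    def get(self, key: str) -> bytes:
        return self._values[key]

    def unregister(self, key: str) -> None:
        # Unregistering an unknown key is a no-op.
        self._values.pop(key, None)

    def __len__(self) -> int:
        return len(self._values)
```

Registering the same bytes twice returns the same key and stores the payload once, which is the de-duplication behavior described above.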

Separately from Python, this functionality could be useful in a BI setting for uploading datasets through the browser for use during a single user session. For long-term storage, the dataset should be saved to parquet. And perhaps eventually this registration mechanism has the notion of temporary vs. persistent values, where persistent values are saved to parquet.

Python implications

On the Python side, we can convert the DataFrame to Arrow using pyarrow. Then we'll need to work through how to interop between pyarrow in Python and arrow-rs in Rust. I believe DataFusion already does this.

To avoid leaking memory, it'll be important to unregister the TaskValue associated with a DataFrame when the associated DataFrame is dropped. We should be able to do this with weakref callbacks.
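A rough sketch of the weakref approach, using only the standard library. The `Frame` class is a stand-in for a DataFrame, and `REGISTERED` and `track` are hypothetical names standing in for the runtime's registration calls.

```python
import weakref

# Hypothetical stand-in for the set of keys registered with the runtime.
REGISTERED = set()


class Frame:
    """Stand-in for a pandas DataFrame (any weak-referenceable object works)."""


def track(frame, key: str) -> None:
    """Register `key` and unregister it automatically once `frame` is collected."""
    REGISTERED.add(key)
    # The callback must not hold a strong reference to `frame`,
    # otherwise the object would never be collected.
    weakref.finalize(frame, REGISTERED.discard, key)
```

In CPython the finalizer runs as soon as the last reference to the object is dropped; other interpreters may delay it until a garbage collection pass.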

Select CLA and setup CLA bot

VegaFusion will initially be released under the AGPL3 license. To maintain future licensing and funding model flexibility, we'll initially require contributors to sign a Contributor License Agreement (CLA).

Some research is still needed to select an appropriate CLA, but cla-assistant looks like a pretty low friction way to administer one.

Investigate stringify datetimes on Safari

It looks like toDate on Safari doesn't parse dates of the form "2011-01-04 19:00:00.000". These work on Firefox and Chrome. I think we can just add the T and format the dates as "2011-01-04T19:00:00.000" for Safari support.
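For illustration, here is what that reformatting looks like in Python. The actual fix would live in VegaFusion's serialization code; the function name below is hypothetical.

```python
from datetime import datetime


def with_t_separator(naive: str) -> str:
    """Reformat 'YYYY-MM-DD HH:MM:SS.mmm' with a 'T' between date and time."""
    parsed = datetime.strptime(naive, "%Y-%m-%d %H:%M:%S.%f")
    # timespec="milliseconds" keeps exactly three fractional digits.
    return parsed.isoformat(sep="T", timespec="milliseconds")
```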

Extract menu into common package

The logic to generate the dropdown menu currently lives in the Jupyter extension. To make it easier to support other contexts (Dash, Panel, pure JS, etc.), this logic should either be moved into vegafusion-wasm or into a new package such as vegafusion-embed.

A separate pure TS library would probably be easier than generating the menu from Rust/wasm, so I'm leaning toward creating a new vegafusion-embed package that would mirror the functionality of vega-embed. In addition to the menu logic and CSS, this is where theme support would be added (cf. #64).

Determine local timezone on client instead of server

Currently, calculations that involve converting the stored UTC time to local time are performed according to the server's timezone. Instead, this timezone should be determined by the client.

As a new architecture, the client should provide its named timezone as part of the task graph query. Then we can use chrono-tz on the server to convert from UTC to the local time zone. For hashing, it might be easier to add the timezone to each transform/expression that needs it, and then include the timezone string in the hash of expression/transform.
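The hashing part could look roughly like this. It is a sketch only: the function name and the JSON-based canonical form are assumptions for illustration, not how VegaFusion actually hashes transforms.

```python
import hashlib
import json


def transform_fingerprint(transform: dict, local_tz: str) -> str:
    """Hash a transform definition together with the timezone it depends on."""
    # sort_keys gives a stable serialization, so equal inputs hash equally.
    payload = json.dumps(
        {"transform": transform, "local_tz": local_tz}, sort_keys=True
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()
```

With the timezone in the hash, the same timeunit transform requested from two different timezones gets two different cache entries, so one client never sees results computed for another client's timezone.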

"Ambiguous reference to field" error

A few of the test mocks are currently commented out due to an error raised by DataFusion involving duplicate column/field references.

In vegafusion-rt-datafusion/tests/test_image_comparison.rs:

  • vegalite/circle_natural_disasters

In python/vegafusion-jupyter/tests/test_altair_mocks.py:

  • bar/with_negative_values
  • bar/layered
  • casestudy/falkensee
  • casestudy/us_employment

This error may have the same root cause as this DataFusion bug report apache/arrow-datafusion#1411.

Support consistent view of time-based visualizations across viewer timezones

Overview

We would like the appearance of time-based charts to depend only on the local_tz argument provided to VegaFusion, not on the local timezone of the browser displaying the chart.

Background

There are three general places where the local timezone matters:

  1. When loading datetime values from strings (e.g. from JSON or CSV), strings without an explicit timezone (naive datetimes) are interpreted as being in local time and immediately converted into UTC milliseconds.
  2. Certain data transformations can operate in a local timezone mode. In particular, the timeunit transform, and the use of date part functions (e.g. hours, months, etc.) in the formula or filter transforms.
  3. Vega provides separate quantitative scale types for local and utc datetimes.

In practice, (2) and (3) should be either both local or both utc to create sensible charts. When using the timeunit parameter of an encoding channel in Vega-Lite, (2) and (3) are automatically aligned.

If (2) and (3) align, then there are 4 combinations to consider:

  • Load naive datetime values then transform in local mode (Naive-Local)
  • Load timezone-aware datetime values then transform in UTC mode (Aware-UTC)
  • Load naive datetime values then transform in UTC mode (Naive-UTC)
  • Load timezone-aware datetime values then transform in local mode (Aware-Local)

Examples

Here are simple Vega-Lite examples of all four combinations. Images were generated in the America/New_York timezone.

Naive-Local

https://vega.github.io/editor/#/url/vega-lite/N4IgJAzgxgFgpgWwIYgFwhgF0wBwqgegIDc4BzJAOjIEtMYBXAI0poHsDp5kTykBaADZ04JAKyUAVhDYA7EABoQAEzjQATjRyZ289ADkkNUvwAybKEkGKMcGmSxoxABiXKkmFKlDErDNWgA2qDumHBoIABMzpGR-M5i8QAsAAQAjGmoaWKozs4gAL4KIR7h6NGx8YnOqWmRWTl5hcUqpREVcQnJ6QDMDbn5BQC6RSDI6gDWEThsNLKYNnCyUGzKc2RooACemyA6CHAAqrJ0ETBsDOoQNgBmNHCCyhGh4UpIAB401957NFATAGELvM0JFRjpMIIyr8DoUCgUgA


Aware-UTC

https://vega.github.io/editor/#/url/vega-lite/N4IgJAzgxgFgpgWwIYgFwhgF0wBwqgegIDc4BzJAOjIEtMYBXAI0poHsDp5kTykBaADZ04JAKyUAVhDYA7EABoQAEzjQATjRyZ289AEEA7knVx+AVQAqAYUUY4NMljRiADEuVJMKVKGJJBBjU0AG1QT0w4NBAAJlcYmP5XMSSAFgACAEZM1EyxVFdXAC0QAF8FcK8o9DiEpJTXDMyY3PzCkvLKyOjaxOS0rIBmVoLisoBdcpBkdQBraJw2GllMOzhZKDZlZbI0UABPPZAdBDhzWTpohkxYNgZ1CDsAMxo4QWVoiKilJAAPGkevmONCgs2sdxWaBiUx0mEE1WBpzKpVKQA


Naive-UTC

https://vega.github.io/editor/#/url/vega-lite/N4IgJAzgxgFgpgWwIYgFwhgF0wBwqgegIDc4BzJAOjIEtMYBXAI0poHsDp5kTykBaADZ04JAKyUAVhDYA7EABoQAEzjQATjRyZ289ADkkNUvwCqAFQDCijHBpksaMQAYlypJhSpQxJIIZqaADaoO6YcGggAEzOUVH8zmIJACwABACM6ajpYqjOziAAvgqhHhHoMXEJSc5p6VHZuflFJSplkZXxiSkZAMyNeQWFALrFIMjqANaROGw0spg2cLJQbMrzZGigAJ5bIDoIcKaydJEMmLBsDOoQNgBmNHCCypFhEUpIAB40t977NFBJpYrgs0FExjpMIJyv9DkVCoUgA


Aware-Local

https://vega.github.io/editor/#/url/vega-lite/N4IgJAzgxgFgpgWwIYgFwhgF0wBwqgegIDc4BzJAOjIEtMYBXAI0poHsDp5kTykBaADZ04JAKyUAVhDYA7EABoQAEzjQATjRyZ289AEEA7knVx+AGTZQkgxRjg0yWNGIAMS5UkwpUoYjYY1NABtUE9MODQQACZXaOj+VzFEgBYAAgBGDNQMsVRXVwAtEABfBTCvSPRY+MTk13SM6Jy8guKyioiomoSk1MyAZhb8otKAXTKQZHUAayicNhpZTDs4WSg2ZSWyNFAAT12QHQQ4AFVZOiiYNgZ1CDsAMxo4QWUo8MilJAAPGnvfI40KAzADCN2WaGikx0mEEVUBJ1KJRKQA


Current VegaFusion behavior

VegaFusion is capable of evaluating steps (1) and (2) above on the server, and to do so it will use the provided local_tz parameter to determine what is considered to be the local timezone. Datetime values in the transformed dataset are serialized and sent to the client as UTC milliseconds. As long as the provided local_tz matches the browser's local timezone, this approach will exactly match the behavior when using plain Vega (without VegaFusion).

The trouble is when the browser's local timezone does not match the local_tz argument. In this case we would like the chart to match what plain Vega would produce if the browser were in the local_tz timezone. But we currently have inconsistent behavior.

As of VegaFusion 0.3.0, datetime columns that result from (1) and (2) are always serialized as UTC milliseconds. This gives the desired behavior for the Naive-UTC and Aware-UTC scenarios because in these cases step (3) does not use local time at all, and so the same result will be displayed regardless of the browser timezone.

However, this results in inconsistent behavior for the Naive-Local and Aware-Local scenarios. In these cases, step (3) does use the browser's local timezone, and so the serialized UTC values are converted into the browser's local timezone. This means that viewers in different timezones will see results displayed differently. Even worse, in the Naive-Local case, VegaFusion and Vega in the browser will perform UTC conversions relative to different local timezones, resulting in inconsistent behavior.

Proposed updates

My proposal is to update VegaFusion so that datetimes are serialized as naive datetime strings if we can determine that step (3) is in local timezone mode. Otherwise, serialize to UTC milliseconds as we currently do.

Serializing to a naive datetime string involves converting the UTC millisecond value to the local_tz timezone and then converting to a naive datetime string.
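As a minimal sketch of that conversion (in Python for illustration; the function name is hypothetical and the real implementation would be on the Rust side):

```python
from datetime import datetime, timedelta, timezone, tzinfo


def utc_millis_to_naive_string(utc_millis: int, local_tz: tzinfo) -> str:
    """Convert UTC epoch milliseconds to a naive datetime string in local_tz."""
    epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
    utc_dt = epoch + timedelta(milliseconds=utc_millis)
    # Shift to the requested timezone, then drop the timezone information
    # so the browser interprets the string in its own local time.
    local_dt = utc_dt.astimezone(local_tz)
    return local_dt.replace(tzinfo=None).isoformat(sep="T", timespec="milliseconds")
```

Any `tzinfo` works here, including `zoneinfo.ZoneInfo("America/New_York")` for a named timezone. With a fixed UTC-5 offset, 1294185600000 (2011-01-05 00:00 UTC) comes out as "2011-01-04T19:00:00.000".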

I haven't fully worked through the details, but I'm optimistic that we will be able to detect whether (3) is in local mode by checking which scale a datetime value is encoded with and then checking whether that scale is in local or utc mode.

With this one change, I believe we will have the desired behavior in all cases.

Incorrect output with (some?) CSV files

The following code is adapted from https://github.com/vegafusion/demos/blob/main/notebooks/large_altair_examples/flights_crossfilter.ipynb and the final two calls should result in identical behaviour but they don't. Here's the video of what I get for each:

Screen.Recording.2022-04-13.at.12.24.05.mov
import altair as alt
import pandas as pd
import vegafusion as vf

def make_chart(source):
    brush = alt.selection(type='interval', encodings=['x'])

    # Define the base chart, with the common parts of the
    # background and highlights
    base = alt.Chart().mark_bar().encode(
        x=alt.X(alt.repeat('column'), type='quantitative', bin=alt.Bin(maxbins=20)),
        y='count()'
    ).properties(
        width=160,
        height=130
    )

    # gray background with selection
    background = base.encode(
        color=alt.value('#ddd')
    ).add_selection(brush)

    # blue highlights on the transformed data
    highlight = base.transform_filter(brush)

    # layer the two charts & repeat
    return alt.layer(
        background,
        highlight,
        data=source
    ).transform_calculate(
        "time",
        "hours(datum.date)"
    ).repeat(column=["distance", "delay", "time"])

df = pd.read_json("https://vegafusion-datasets.s3.amazonaws.com/vega/flights_200k.json")
df.to_csv("./flights200k.csv")

vf.jupyter.enable(download_source_link="https://github.com/vegafusion/demos")
make_chart(df) # OK!
make_chart("./flights200k.csv") # not OK!

[FeatureRequest] `conda` Python package

Support multiple 'vegafusion' renderers.

The jupyter VegaFusionWidget widget depends on the vegafusion renderer and vegafusion-feather transformer.

image

For panel-vegafusion I want to be able to drop the dependency on vegafusion_jupyter and provide a panel-vegafusion renderer.

Right now it's hard-coded that the vegafusion-transformer supports the vegafusion renderer only.

image

Solution

Change to

image

or support some mechanism for "plugging" in alternative renderers.

I would like to call my renderer 'panel-vegafusion'.

Support classic notebook and Voila

The JupyterWidget extension doesn't currently work in the classic notebook or with Voila (using --enable_nbextensions=True), and it should.

Support categorical/dictionary column types

Arrow provides a categorical column type called DictionaryArray. This is the type that Pandas Categorical columns are converted into by default.

Right now these are expanded into non-categorical form in VegaFusion's feather data transformer. For improved efficiency, VegaFusion should support these directly.
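For context, a dictionary-encoded column stores each distinct value once plus an integer index per row. The pure-Python sketch below shows the idea only; it is not Arrow's API, and the function names are made up for illustration.

```python
def dictionary_encode(values):
    """Return (indices, dictionary) where dictionary holds each distinct value once."""
    dictionary = []
    positions = {}
    indices = []
    for value in values:
        if value not in positions:
            # First time this value is seen: append it to the dictionary.
            positions[value] = len(dictionary)
            dictionary.append(value)
        indices.append(positions[value])
    return indices, dictionary


def dictionary_decode(indices, dictionary):
    """Expand the encoded form back into the original list of values."""
    return [dictionary[i] for i in indices]
```

Expanding into non-categorical form corresponds to `dictionary_decode`, which repeats every string once per row. Supporting the encoded form directly avoids that repetition.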

Support Colab

I tried vegafusion-jupyter==0.0.1a1 in Google Colab. Colab is successfully able to load the extension from a CDN, and the widget is displayed

Screen Shot 2022-01-17 at 11 53 58 AM

But the comm message is not successfully sent from the client to the server. Here is the browser console message

Screen Shot 2022-01-17 at 11 53 40 AM

There's not a lot to go on to debug this, unfortunately.

Provide the VegaFusionRunTime from the vegafusion package

The only dependency the panel-vegafusion Python package has on the vegafusion Python packages is the import of vegafusion_jupyter.runtime.

image

The runtime is an instance of the VegaFusionRunTime defined in vegafusion_jupyter.

image

I believe it's more natural for the VegaFusionRunTime to be defined in the vegafusion package. That way, integration libraries have a simpler and more natural dependency.

If not, it would make sense for me to copy the VegaFusionRunTime implementation into the panel-vegafusion package and drop the dependency on vegafusion_jupyter.

Please let me know what you would prefer @jonmmease ? Thanks.

Write CONTRIBUTING.md

We need a CONTRIBUTING.md file that explains how to build the various pieces of the project, and how to contribute to various parts of the code base:

  • Expression functions
  • Transforms
  • Image mock tests
  • etc

Improve error reporting

Currently, when an error occurs while evaluating a specification on the server, the Python process panics, which hangs the JupyterLab interface.

  1. Do not panic, instead create a VegaFusionResponse message to report errors
  2. Display the error in the Jupyter interface instead of a broken vega chart
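The shape of step 1 can be sketched as follows. This is a Python illustration only: the `Response` class and `evaluate_safely` are hypothetical stand-ins, and on the Rust side the equivalent would be returning an error value instead of panicking.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass
class Response:
    """Hypothetical stand-in for a response message that can carry an error."""

    ok: bool
    value: Optional[Any] = None
    error: Optional[str] = None


def evaluate_safely(evaluate: Callable[[Any], Any], spec: Any) -> Response:
    """Run `evaluate(spec)` and report failures as data instead of crashing."""
    try:
        return Response(ok=True, value=evaluate(spec))
    except Exception as exc:
        # The message is passed to the client, which can display it
        # in place of the chart.
        return Response(ok=False, error=f"{type(exc).__name__}: {exc}")
```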

Announcement blog posts

We'll want to publish a pretty thorough blog post (or more likely a series of blog posts) announcing the project.

Section 1 - tl;dr

There should be a tl;dr section with instructions for how to install and activate VegaFusion in JupyterLab, along with a pretty GIF. This is targeting a user who just wants to copy and paste and see if it works before investing any more effort into understanding why they should care about the project.

Section 2 - Why a Python Data Scientist should care

Then there should be a section explaining why a Python Data Scientist should care about VegaFusion. This would include some brief background on Vega/Vega-Lite and Altair. This would emphasize the inclusion of transforms in the grammar, and how these enable automatic interactive workflows like linked brushing on histograms.

Then some good diagrams explaining that with Vega.js, all of the transforms are performed in the browser, and so all of the raw data must be sent to the browser (either inline in the spec, or loaded from a url).

The magic of VegaFusion is that it automatically extracts as much data processing work as possible and performs it on the server, while still supporting full interactivity.

To use VegaFusion, this is all you need to know. But read on to learn more about how it all works.

Section 3 - How the current system works

Explain that VegaFusion is built in Rust on top of Arrow and DataFusion. The Vega expression language is compiled into the DataFusion expression language. And Vega transforms are compiled into DataFusion queries. The extensibility of DataFusion is used to add support for custom Vega expression and aggregation functions.

A planning stage is used to parse the original Vega spec and identify the signal and data dependencies. Then the spec is split into two valid specs: one that runs on the server using the runtime built on DataFusion, and another that runs in the browser using the standard Vega.js library. The planning phase also identifies a communication plan, which includes all of the signals and datasets that need to be transferred between the two specifications.

The spec parsing and planning logic is compiled to WebAssembly and executed in the browser. The JupyterWidget protocol is used to transfer data between server and client.

Section 4 - How the design will enable additional use cases

The system is designed to be embedded in a variety of web contexts. The initial focus is on the Jupyter Widget use case, but only a relatively small amount of the code is Jupyter-specific. The roadmap includes Dash, Streamlit, and Panel support.

The server uses an efficient caching scheme to avoid duplicate calculations and to support many simultaneous visualizations of the same dataset without increased memory usage.

In the initial Jupyter case, the runtime is embedded in the Python process for convenience. But the runtime could also run in a server configuration, allowing multiple processes to connect to it. This would be the preferred configuration for serving a dashboard to many users using Voila, Dash, Streamlit, or Panel.

Protocol buffers were chosen as the binary serialization format with the intention of hosting the VegaFusion runtime as a gRPC service.

Discussion of the state model.

Like Dash, each client maintains the full user state so the server is not required to keep an active record of the clients that are connected to it, and the server is not obligated to maintain the session state of each user. The difference from Dash is that the client isn't required to store the current value of every node in the Task graph, only a unique fingerprint for that state. The result is that the client can maintain the state of a task graph that includes very large datasets without the requirement to store the datasets itself.

Section 5 - Additional feature roadmap

Support compiling runtime itself to WebAssembly, making it possible to use DataFusion to accelerate calculations in the browser. Also making it possible to mix where calculations take place.

Support scales. Update planner to run encoding logic on the server.

Support rendering

Section 6

Explanation of initial plan to license as AGPL3 with CLA.

Please add an option to `to_image_url` to generate a data URI.

I'm trying to implement a panel-vegafusion component.

Right now I want to support "Save as PNG". But I can see that the to_image_url method does not return an SVG data URI; instead it returns a link to something served by the server. That makes it much more complex for me and anyone else trying to integrate VegaFusion.

Please support returning an SVG data URI as well, so that I don't need the server to serve anything additional. Thanks.

Support Panel/ Does not work with Panel

Hi John

Congrats on VegaFusion.

I just tried it out with Panel. I would expect/hope it would work as an ipywidget, but it does not.

pip install panel vegafusion-jupyter vega-datasets ipywidgets_bokeh
import panel as pn
import altair as alt
from vega_datasets import data

import vegafusion_jupyter as vf
vf.enable()

pn.extension("ipywidgets", template="fast")

ACCENT = "#1f77b4"
PALETTE = [ACCENT, "#ff7f0e", "#2ca02c", "#d62728", "#9467bd", "#8c564b", "#e377c2", "#7f7f7f", "#bcbd22", "#17becf"]

if "panel-vegafusion" not in pn.state.cache:
    seattle_weather = pn.state.cache["panel-vegafusion"]=data.seattle_weather()
else:
    seattle_weather = pn.state.cache["panel-vegafusion"]

def get_chart(seattle_weather):
    brush = alt.selection(type='interval', encodings=['x'])

    bars = alt.Chart().mark_bar().encode(
        x='month(date):O',
        y='mean(precipitation):Q',
        opacity=alt.condition(brush, alt.OpacityValue(1), alt.OpacityValue(0.7)),
    ).add_selection(
        brush
    )

    line = alt.Chart().mark_rule(color='firebrick').encode(
        y='mean(precipitation):Q',
        size=alt.SizeValue(3)
    ).transform_filter(
        brush
    )

    return alt.layer(bars, line, data=seattle_weather)

chart = get_chart(seattle_weather)

pn.panel(chart).servable()

pn.state.template.param.update(
    site="Vegafusion", title="Interactive Big Data Apps with Crossfiltering",
    accent_base_color=ACCENT, header_background=ACCENT
)
panel serve panel_vegafusion_app.py --static-dirs _vegafusion_data=./_vegafusion_data

WARN Loading failed _vegafusion_data/vegafusion-7bd7f9dd903be78e727c1b52e745f4f17d1212dd.feather SyntaxError: Unexpected token A in JSON at position 0

Consider Supporting Julia / VegaLite.jl

As a minimal proof of concept, vegafusion-server binaries are now available for many platforms via Yggdrasil (Julia's binary building infrastructure) and the Julia package registry. Julia already has an Altair equivalent, VegaLite.jl, which could be updated to support a vegafusion-server backend.

Motivation: Julia plotting libraries that can support massive datasets and high processing demands, such as Makie.jl, tend to be more successful. VegaLite.jl has been held back by weaker performance on large-n use cases. Likewise, Julia is home to a community of high performance computing projects (and enthusiasts) who would particularly appreciate the boost a project like vegafusion has to offer.

Add benchmark test suite

In addition to all of the correctness testing (comparing to Vega, Vega-Lite, and Altair), we should have a collection of performance benchmarks on larger datasets.

A common way to do this in the Rust ecosystem is using criterion-rs. For example, here is the DataFusion benchmark suite: https://github.com/apache/arrow-datafusion/tree/master/benchmarks.

My understanding is that the general workflow is that, before merging a PR, a maintainer runs the benchmark suite on the main branch and then on the PR branch (being as careful as possible to have identical test conditions between the runs). Criterion will run benchmarks many times, and it will provide a statistical estimate of whether the PR has significantly (in the statistical sense) changed the runtime of each benchmark. It's not as automated as doing something on CI, but it's really hard to automate benchmarking in a consistent way without a dedicated physical server running somewhere. We might want to invest in this eventually, but the more manual workflow is still a good place to start in the short term.

For actually writing benchmarks, I'm picturing writing a test harness that's a variation of the logic that's currently in use for performing image tests against Vega.js (See https://github.com/jonmmease/VegaFusion/blob/main/vegafusion-rt-datafusion/tests/test_image_comparison.rs). But for the benchmarking case, we wouldn't run with Vega.js. For correctness, we'd compare against baseline images from past runs.

This workflow will make it possible to benchmark sequences of interactions. For example, we'll be able to write a benchmark that effectively drags a crossfilter selection across a histogram in a chart like https://vega.github.io/vega-lite/examples/interactive_layered_crossfilter.html, but with a much larger dataset. And it has the nice property that benchmarks are defined as JSON files with very little boilerplate.

To start with, we could use Vega-Lite/Altair examples, but programmatically generate parquet files that duplicate the input datasets many times. It might even be nice to do some sweeps over input size in order to generate performance curves.
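The duplication and size sweep could look roughly like the sketch below. It works on in-memory records to stay dependency-free; writing each result to a parquet file (for example with pandas or pyarrow) is left out and would be an additional step. The `replicate` helper and the `distance` column are hypothetical.

```python
def replicate(records, factor):
    """Yield the full list of records `factor` times, copying each row."""
    for _ in range(factor):
        for record in records:
            yield dict(record)


# Stand-in for a small gallery dataset (column names are illustrative).
base = [
    {"delay": 5, "distance": 300},
    {"delay": -2, "distance": 1200},
]

# Sweep over duplication factors to get inputs of increasing size.
sweep = {factor: list(replicate(base, factor)) for factor in (1, 10, 100)}
row_counts = {factor: len(rows) for factor, rows in sweep.items()}
```

Timing the same spec against each entry of `sweep` would give one point per input size on a performance curve.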

This is not a 0.1 release blocker, but it's something that would be good to do shortly after the release in order to be confident that the next release hasn't introduced a performance regression.

Share cache across client timezones when local timezone calculations are not used

Follow on to #82

The local timezone string has been added as an optional field of every Task. Currently it is always included, so the fingerprint of a task will always include the client's timezone. A future improvement would be to only include the local timezone in tasks that actually need it. This would allow local timezone agnostic tasks to share cache entries. As it currently stands, the cache will not be shared across clients in different time zones.
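The proposed behaviour can be sketched as follows, assuming a task's fingerprint is a hash of its serialized definition. The `task_fingerprint` helper, the task dictionaries, and the `needs_local_tz` flag are hypothetical; deciding which tasks actually need the local timezone is the hard part and is not addressed here.

```python
import hashlib
import json


def task_fingerprint(task, local_tz, needs_local_tz):
    """Include the timezone in the hash only when the task depends on it."""
    payload = {"task": task, "tz": local_tz if needs_local_tz else None}
    encoded = json.dumps(payload, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


count = {"kind": "aggregate", "op": "count"}  # timezone agnostic
by_hour = {"kind": "timeunit", "unit": "hours", "field": "date"}  # uses local time

# Timezone-agnostic tasks hash identically for every client, so they share a cache entry.
shared = (
    task_fingerprint(count, "America/New_York", False)
    == task_fingerprint(count, "Europe/Oslo", False)
)

# Tasks that use local time still get one cache entry per timezone.
separate = (
    task_fingerprint(by_hour, "America/New_York", True)
    != task_fingerprint(by_hour, "Europe/Oslo", True)
)
```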

Uncaught (in promise) out of memory

If I reload my app several times I get an Uncaught (in promise) out of memory error in the browser console.

I've experienced it since I got something working. If I restart the server it works again.

I don't know where something is wrong, whether it's on "my" end or "vegafusion's" end. I'm just reporting it here as it could potentially be useful for future component developers.

I will investigate. If I can find a minimum reproducible example I will post it here.

uncaught_in_promise.mp4

I believe it happens when the getPanelVegaFusion function below is called.

async function getPanelVegaFusion(){
  const { render_vegafusion } = await import("vegafusion-wasm");
  const { compile } = await import("vega-lite");

  let panelVegaFusion = {
    render_vegafusion: render_vegafusion,
    compile: compile
  };
  return panelVegaFusion
}

window.getPanelVegaFusion = getPanelVegaFusion
export {};

Python + Jupyter integration testing

We need some integration tests that exercise the Jupyter Widget + Python logic.

One idea would be to use Voila to serve a dashboard containing a VegaFusionWidget, and then interact with it and snapshot it using selenium.

Ideally, we'd test every Altair documentation example this way.

Projection Pushdown: Determine fields used by vlSelectionTest

As a follow-on to #113, it would improve the utility of projection pushdown if we were able to precisely determine which columns are used by vlSelectionTest expressions.

For example, consider the Vega-Lite spec:

{
  "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
  "description": "Drag out a rectangular brush to highlight points.",
  "data": {"url": "data/cars.json"},
  "params": [{
    "name": "brush",
    "select": "interval"
  }],
  "mark": "point",
  "encoding": {
    "x": {"field": "Horsepower", "type": "quantitative"},
    "y": {"field": "Miles_per_Gallon", "type": "quantitative"},
    "color": {
      "condition": {"param": "brush", "field": "Cylinders", "type": "ordinal"},
      "value": "grey"
    }
  }
}

Without selection: [screenshot]

With selection: [screenshot]

In the generated Vega, the coloring of the symbols is controlled by the selection with:

          "stroke": [
            {
              "test": "!length(data(\"brush_store\")) || vlSelectionTest(\"brush_store\", datum)",
              "scale": "color",
              "field": "Cylinders"
            },
            {"value": "grey"}
          ],

The question is which columns of datum are used by the expression vlSelectionTest(\"brush_store\", datum). To find out, in general, we need to know the contents of the brush_store dataset.

In this case, the definition of brush_store is empty:

{"name": "brush_store"},

Rows are added to brush_store dynamically in the following signal update expression:

    {
      "name": "brush_modify",
      "on": [
        {
          "events": {"signal": "brush_tuple"},
          "update": "modify(\"brush_store\", brush_tuple, true)"
        }
      ]
    }

The row that is actually added is defined as brush_tuple:

    {
      "name": "brush_tuple",
      "on": [
        {
          "events": [{"signal": "brush_Horsepower || brush_Miles_per_Gallon"}],
          "update": "brush_Horsepower && brush_Miles_per_Gallon ? {unit: \"\", fields: brush_tuple_fields, values: [brush_Horsepower,brush_Miles_per_Gallon]} : null"
        }
      ]
    },

The part we actually care about, though, is the brush_tuple_fields value, which is defined as:

    {
      "name": "brush_tuple_fields",
      "value": [
        {"field": "Horsepower", "channel": "x", "type": "R"},
        {"field": "Miles_per_Gallon", "channel": "y", "type": "R"}
      ]
    },

This means that the fields that vlSelectionTest(\"brush_store\", datum) uses are "Horsepower" and "Miles_per_Gallon".

So the question is: is there a reliable way to determine this based on inspection of the Vega specification?
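One possible heuristic, sketched below, leans on the naming pattern seen in this example: a store named `<name>_store` has a matching `<name>_tuple_fields` signal with a literal `value`. That pattern is an assumption drawn from this one spec, not a documented guarantee, and the sketch only inspects top-level signals. The `selection_fields` function is hypothetical.

```python
def selection_fields(vega_spec, store_name):
    """Return the fields used by a selection store, or None if unknown.

    Relies on the assumed `<name>_store` / `<name>_tuple_fields` naming
    pattern and only looks at top-level signals with a literal value.
    """
    suffix = "_store"
    if not store_name.endswith(suffix):
        return None
    signal_name = store_name[: -len(suffix)] + "_tuple_fields"
    for signal in vega_spec.get("signals", []):
        value = signal.get("value")
        if signal.get("name") == signal_name and isinstance(value, list):
            return [entry["field"] for entry in value if "field" in entry]
    return None


spec = {
    "data": [{"name": "brush_store"}],
    "signals": [
        {
            "name": "brush_tuple_fields",
            "value": [
                {"field": "Horsepower", "channel": "x", "type": "R"},
                {"field": "Miles_per_Gallon", "channel": "y", "type": "R"},
            ],
        }
    ],
}
fields = selection_fields(spec, "brush_store")
```

A `None` result means the columns could not be determined statically, in which case projection pushdown would have to keep every column of the dataset to stay correct.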

Documentation website for Python library

We'll want a main website for VegaFusion. For 0.1, the main focus should be to document how to use the vegafusion-jupyter Python package to enable serverside acceleration for Altair.

Over time, we'll want to include documentation for how VegaFusion works internally, and for how to use it without Python in a Business Intelligence style setting.

As a starting point, I'd like to look at using the PyData sphinx theme and MyST to support writing documentation in Markdown. Hosting on GitHub pages should be sufficient to get started, but eventually it would be nice to have some live hosted examples to demonstrate performance improvements.

Add support for pre-transforming datasets within specifications

Overview

In some situations, it may not be desirable (or even possible) to run vegafusion-wasm/vegafusion-embed in the web browser context. And, it may not be possible to send messages from the client back to the server. This fundamentally limits the possibility of supporting selection/crossfiltering scenarios on the server, but there are still many cases that could benefit from serverside computation.

The motivating example is a static histogram. In this case, it would be possible to perform binning and aggregation on the server, and then generate a new Vega specification that includes a small inline dataset that simply contains the histogram bar heights.

Proposal

We could add a new method to the VegaFusionRuntime (and correspondingly a new gRPC service) named something like pre_transform_spec. This method would accept:

  • A valid Vega specification
  • The named timezone to consider "local"
  • (optional) A row limit. Datasets with more than this number of rows are truncated. If not provided, then no limit is imposed.

The method would return:

  • A new Vega specification with supported data transforms already applied.
  • Zero or more WARNINGS. To start, warnings for:
    • Row limit exceeded for one or more datasets in the resulting spec
    • Broken interactivity. If transforms were applied that are parameterized by signals/datasets that may change in response to user interaction, then this warning is included. In the future we may want to add a specialized spec planning algorithm that specifically avoids this situation, but to start we can reuse the existing planning algorithm and raise this warning if the communication plan includes any client-to-server values.
    • No transforms applied, so returned spec is identical to the input spec.
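To make the proposed return shape concrete, here is a small sketch of just the row-limit part: it truncates inline datasets and reports a warning. It does not apply any transforms. The `PreTransformResult` class, the `apply_row_limit` function, and the warning dictionary layout are hypothetical illustrations of the proposal, not an existing VegaFusion API.

```python
import copy
from dataclasses import dataclass, field


@dataclass
class PreTransformResult:
    """Hypothetical return value: the new spec plus zero or more warnings."""
    spec: dict
    warnings: list = field(default_factory=list)


def apply_row_limit(spec, row_limit=None):
    """Truncate inline datasets to `row_limit` rows and record a warning."""
    result = PreTransformResult(spec=copy.deepcopy(spec))
    if row_limit is None:
        return result  # no limit imposed
    truncated = []
    for dataset in result.spec.get("data", []):
        values = dataset.get("values")
        if isinstance(values, list) and len(values) > row_limit:
            dataset["values"] = values[:row_limit]
            truncated.append(dataset.get("name"))
    if truncated:
        result.warnings.append(
            {"type": "RowLimitExceeded", "datasets": truncated}
        )
    return result


spec = {
    "data": [
        {"name": "hist", "values": [{"bin": i, "count": 2 * i} for i in range(5)]}
    ]
}
limited = apply_row_limit(spec, row_limit=3)
unlimited = apply_row_limit(spec)
```

Returning warnings as data rather than raising lets the caller decide whether a truncated dataset or broken interactivity is acceptable for its use case.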

Demo

It would be good to add a simple motivating demo. One possibility would be to add a variation of the current chart-editor demo that uses vega-embed rather than vegafusion-embed on the client. The client would use gRPC-Web to communicate with a VegaFusion server instance to pre-transform the user-provided spec. It would then display the result using vega-embed and display any warnings.

Python story

This logic path could be useful from Python for situations where the VegaFusion rendering extension is not available. The challenge for integrating it with Altair is that a JavaScript runtime is required in order to convert Vega-Lite specifications to Vega. This could work if a custom Altair renderer were added that called out to nodejs (or maybe kaleido in the future) to perform the Vega-Lite to Vega conversion.

We can expose the new entry point through the Python API without any issue, but useful Altair integration could come later.

Please support events.

Panel 0.13 (to be released in weeks) will provide support for events on the Vega pane.

I would like to be able to provide the same on the Panel VegaFusion pane.

I don't know whether this is currently possible with VegaFusion, or how to do it.

Inspiration

You can find the TypeScript implementation of the Vega pane here: https://github.com/holoviz/panel/blob/master/panel/models/vega.ts

You can find the development version of the Vega pane docs, which describes the event API, here: https://pyviz-dev.github.io/panel/reference/panes/Vega.html

Support parquet

VegaFusion should support reading datasets from parquet files. This is already supported by DataFusion so it's not very difficult to enable.

Unfortunately I don't recall the details, but when I tried enabling it in the past I ran into datetime representation issues. I think the situation was something like this: pandas to_parquet would save datetime values as nanosecond timestamps, but the time unit metadata as read by Rust wasn't consistent with that.
