cldellow / datasette-ui-extras Goto Github PK
View Code? Open in Web Editor NEWAdd editing UI and other power-user features to Datasette.
License: Apache License 2.0
Add editing UI and other power-user features to Datasette.
License: Apache License 2.0
@furilo provided some Figma mockups which surface the size of the tables, which might make it easier for someone to dive in to the "most interesting" tables right away
I hid this in #22
...and now I regret it, as it's useful to see a view definition. eg on https://dux.fly.dev/cooking/questions
If the default HTML had an ID like schema-definitions
, we could expose it via a similar trick as the advanced export, eg make it so https://dux.fly.dev/cooking/questions#schema-definitions showed the schema.
That would require a change in Datasette's default template.
Alternative idea: if we do #54, we'll expose a way to edit views. You'll be able to see the schema there.
Weird, I thought the inner simple
varied based on the facet type, I think maybe it doesn't. Figure out if there's a cleaner solution here.
Show min, p50, p90, p95, p99, max
When you add it, we won't be able to correctly tell that's what you did, unless we do something special. e.g. adding response_time > 500 will change the population on which we're calculating the statistics. We could add a __dux_stats=XXX parameter that we use to freeze the facets.
But then that won't update if you refine the other filters. Maybe that's OK?
Also unclear what the default action should be, eg filter to things higher than it? Lower than it? Wouldn't you just... sort?
Maybe the toggle URL should just be #
and we never return selected = True?
It'd be nice if the facet name was also sticky, at least for the duration of its results being in view
The PR has an attached CSS diff that improves the styling of facets
I'm not sure how achievable this is. #49 explores creating something like updatable views for the turker use case. It'd be handy to have a trigger that can set last_edited_by = current_actor()
, where current_actor()
is evaluated in the context of the current request to get the user's ID.
Can ... can we do that?
Simon mentioned maybe you can abuse asgi_wrapper to get access to the request: https://discord.com/channels/823971286308356157/823971286941302908/1070385463640215702
Build out the MVP of #48:
?_dux_edit=1
query parameter present (in the future we can define via metadata){% extends "default:row.html" %}
<form>
that wraps all the controlsrender_cell
should emit an edit control if the control is editableStringControl
that is just a vanilla text inputsimonw/datasette#596 (comment)
Once you've customized your view, eg https://global-power-plants.datasettes.com/global-power-plants/global-power-plants?_nocol=gppd_idnr&_nocol=other_fuel2&_nocol=owner&_nocol=other_fuel3&_sort_desc=capacity_mw&_nocol=other_fuel1
It'd be neat if you could do two things:
(1) could be done with localStorage if we're willing to be inefficient (we'd load the page; then immediately redirect the user on the client-side); or cookies (but we might be likely to blow through the 4KB limit pretty fast). It would be best to do this with persistent storage. The preference could be linked either to an ephemeral identifier stored in a cookie, or your identity as an actor.
(2) would require persistent storage in the DB, maybe a hidden table
IMO, these should be hidden behind a hamburger menu
https://github.com/simonw/datasette/blob/main/datasette/templates/table.html#L183
Maybe we target wrapped-sql on table pages?
The cog takes up a lot of horizontal space.
Maybe:
display: none
by defaultdisplay:inline-block
on hover (maybe on hover of the whole row?)position:absolute
so it doesn't reflowThere is already an attempt to show a read only summary of them. Editing them is rare, plus you can edit some of them by facets today, and by the omni bar eventually.
Maybe a pencil icon to display them, and it toggles the presence of a _dux_edit_filters params.
Placeholder issue - we'll monkey patch this until simonw/datasette#2008 is merged, then we'll revert it
https://github.com/simonw/datasette/blob/main/datasette/templates/table.html#L158
Alternatively, it could be a modal, and we could put the link to display the modal much higher
Goal is to make this page much more about your data, have fewer boiler plate chunks
from simonw/datasette#1298 (comment)
https://medium.com/neocoast/fixing-a-table-header-on-a-horizontally-scrolling-table-de3364610957 also used this approach
I don't personally see the utility in this, so I likely won't implement it for myself, but tracking it in case others want it
This issue lays out the vision for the editing UI that datasette-ui-extras
will provide. See #54 for the DDL version.
Editing is only for SQLite databases, not DuckDB.
Editing will use the write API introduced in Datasette 1.0, so you'll need to be on Datasette 1.0.
My goal is to bring an automatic, pluggable, user-friendly UI that enables these use cases: traditional data entry, turker mode, and forms.
Non-goals: supporting JavaScript-disabled browsers.
Don't make users define redundant mappings. Lean into the structure that SQL provides us.
Use SQL foreign keys and CHECK constraints to define what is permissible and drive UI control selection.
Because SQLite is untyped, we'll have to sniff rows and/or use heuristics in some cases.
Things we'll aim to support:
x IN (...)
You can declare a SQL VIEW to further control the user experience. Imagine that you have:
CREATE TABLE reviews(
id integer primary key,
url text not null,
review text not null,
rating text (check rating in ('negative', 'positive')),
rated_at text,
rated_by text
)
You'd like to have some contractors fill out the rating
field. They shouldn't have access to the url
field. They should only have access to rows that still need a rating. When they rate something, you'd like to automatically track who rated it and when. Oh, and you'd like to give some instructions.
You can do this by creating a view, and giving them access to that:
CREATE VIEW needs_rating AS
/* set rated_by=current_actor() */
/* set rated_at=datetime() */
SELECT
id,
'What is the sentiment of the review? If unsure, choose positive.' AS instructions,
review AS review_readonly,
rating
FROM reviews
WHERE rating IS NULL
Only two unaliased columns from the base table are present, and thus candidates to be editable. id
, however, is part of the primary key, and so only rating
is editable.
When the user submits their entry, the set
statements are automatically executed, tracking who edited the row, and when.
We'll try to render sensible controls. Sometimes we might get it wrong -- perhaps we'll render an input field that expects a number, when it really ought to have been a checkbox that stored 1 for checked and 0 for unchecked.
You can override us by implementing a plugin hook:
@hookimpl
def edit_control(datasette, database, table, column):
if column == 'name':
return 'ShoutyControl'
ShoutyControl
must be a JavaScript class that is available to the page. This can be a pre-defined one provided by datasette-ui-extras
or one you author via a file loaded by extra_js_urls
or inlined by extra_body_script
The class should conform to this interface:
class ShoutyControl {
constructor(db, table, column, initialValue) {
this.initialValue = initialValue;
this.el = null;
}
// Return a DOM element that will be shown to the user to edit this column's value
createControl() {
this.el = document.createElement('input');
this.el.value = this.initialValue;
return this.el;
}
get value() {
// Be shouty.
return this.el.value.toUpperCase();
}
}
TODO: consider if the interface should have an isValid
function, and a way to signal that its value has changed (for example, to permit "autocommit on blur" modes)
TODO: document how you might reference other columns. eg, say you have text The quick brown fox jumped over the lazy red dog.
in column A, and you want the user to annotate it and have those annotations show up in column B as [{"substring": "jumped", "label": "verb"}]
We'll try to show a good control. For small, closed sets, we'll use a drop-down. For larger sets or open sets, an autocomplete combobox.
For the turker use case, we might render a space-inefficient control that has key-bindings that permit you to quickly advance through a dataset, eg from https://prodi.gy:
The use cases don't assume any particular access scheme. Some scenarios may be only authenticated users, some may permit anonymous users (eg forms). See the Authentication, authorization and auditing section for more.
These are imagined as alternative layouts for the row view, eg the pages located at /db/table/1
, /db/table/2
and so on.
It's the typical vertical layout of column name, UI-control-to-specify value.
You might be in auto commit mode, or you might have to click an explicit Save button.
Clicking Save keeps you on the current row page.
Example: curation of detailed data, ability to deep-link in workflows.
You can submit new rows, but not read, edit or delete existing rows.
After submitting you are redirected to a customizable URL. By default, you are sent to the new form submission page.
Example: collecting feedback from the general public.
You can edit a subset of fields on a subset of existing rows, but can not read or delete other rows.
After submitting an edit, you are advanced to the next row that needs editing.
You might want affordances to enable very fast editing -- for example, focus the first control, permit keyboard shortcuts to auto-select an answer and move on.
Example: contractors who are doing piecemeal data entry, trusted internal staff doing manual annotations.
This is all delegated to other systems.
Authentication is handled by Datasette's actor system.
Authorization is handled by Datasette's permission system.
There's no built-in support for auditing. If you'd like to track which users created/updated rows, use the datasette-current-actor plugin and create suitable DEFAULT values or trigger functions.
You can configure which tables are editable by default in metadata.json, or activate edit mode with ?_dux_edit=1
We'll need a hook to create/update INSTEAD OF
triggers for editable views.
The table view would let you update values in-place, without navigating to the row page for each entry.
Rejected because I think I can't do a sufficiently good job on the UI: it likely won't be an actual spreadsheet view with a seamless grid and resizeable columns. More likely, it would have inline controls that automatically commit changes via ajax.
We compute facets by making JSON requests, one per each facet.
For example, faceting by creation_date and tags results in these two requests:
https://dux-demo.fly.dev/superuser/posts.json
?_sort=id
&_facet_year=creation_date
&_facet_array=tags
&post_type__exact=question
&_size=0
&_nocount=1
&_dux_facet=_facet_year <--
&_dux_facet_column=creation_date <--
https://dux-demo.fly.dev/superuser/posts.json
?_sort=id
&_facet_year=creation_date
&_facet_array=tags
&post_type__exact=question
&_size=0
&_nocount=1
&_dux_facet=_facet_array <--
&_dux_facet_column=tags <--
Even though each request is only for a single facet, we include both _facet_year=creation_date
and _facet_array=tags
in both requests so that the URLs that get constructed are correct.
If you remove the creation_date
facet, the request for the tags
facet will have a different set of parameters:
https://dux-demo.fly.dev/superuser/posts.json
?_sort=id
&_facet_array=tags
&post_type__exact=question
&_size=0
&_nocount=1
&_dux_facet=_facet_array
&_dux_facet_column=tags
This means that even though we have a perfectly good cached response for this facet, we won't be able to use it.
A possible alternative would be to omit all _facet_*
parameters except the facet under consideration, and then teach fixupToggleUrl
how to re-add the other columns.
A good solution would also re-add the deleted parameters in the same location to ensure a high cache rate. This feels pretty involved for what is probably not that big of a problem.
I think we could target the column name in the before pseudo class: https://stackoverflow.com/a/32993828
Then I think we'd generate a synthetic click event on the cog icon
Then have suitable styling so that the menu appears like a modal.
Crazy.
As part of #48, it might be valuable to let users define a subset of a table as a view. For example, if you have:
CREATE TABLE data(
id integer primary key,
value text,
approved integer,
approved_at text,
approved_by text
)
You might want to create a queue of records-to-be-reviewed, like:
CREATE VIEW needs_review AS
SELECT
id,
'Look closely at the value. Do you approve it?' AS instructions,
value AS value_aliased_so_as_not_to_be_editable,
approved
FROM data
WHERE approved IS NULL
That view declares that you should be able to edit the approved
field. id
isn't editable, as it's part of the pkey. The other two columns are aliased, and so aren't editable.
Then, you'd use SQLite's INSTEAD OF
trigger to permit updates against the view. You'd probably want some jazz to create these triggers automatically, so users can get by with basic SQL understanding.
Questions:
approved_at
and approved_by
in a somewhat natural wayRow not found: ['1']
. It looks like these two things need to be fixed:
'View' object has no attribute 'update'
. Compare View vs Table#updateWe currently serve N CSS/JS files. Instead, we should transparently concatenate them into a single, fingerprinted file with a long-lived expiry date.
Maybe deploy to fly?
Resources:
A manual deploy is fine to start, although if we're doing something with DBs, it would be good to script that bit.
Is there an interesting, open source database we can use to demo? Ideally it'd have good faceting and searching... maybe pictures?
https://global-power-plants.datasettes.com/global-power-plants/global-power-plants has a lot of configuration data above the fold. A grumpy pants on the orange site didn't like this.
They're not wrong. It'd be nice to permit facets being on the left
We'll monkey-patch the existing ColumnFacet, DateFacet, ArrayFacet.
A pragmatic way to get started:
ideally, we only compute facets? I think we can pass nocount
, nosuggest
to disable a bunch fos tuff, maybe we can also pass pagesize=0? see code
_size=0&_nocount=1
is a start, although the size gets bumped up to 1 (to discover pagination? special case of 0!)ideally we make 1 call per facet, so we can begin rendering as soon as any data is available
it'd be nice if we could reuse _facet_results.html -- but not necessary for an MVP
toggle_url
- strip .json
, _nocount
, _size
__dux_facets
driven by metadata and qs paramsFacets are really cool. They're complex and present many tradeoffs. There's no one right set of choices that can satisfy everyone. I think Datasette's current approach with them is fairly conservative, which is a sensible choice for Datasette, the platform. For my own tastes, I'd like them to behave a bit differently. And since this is a plug-in, YOLO, let's take a wildly different approach.
What I like:
What I'd like to adjust:
LIMIT
clause applies after the WHERE
clause, see DateFacetThat's a big list! They don't all depend on each other. This ticket is primarily to explore the performance side of things.
Clicking the X next to one of the filters changes the column to - remove filter -
. I think it'd be nicer if it just immediately applied. (Facets are like this currently, so there'd be a nice symmetry.)
The UI for IN requires the user to serialize the options themselves. It'd be nice if instead you entered one option at a time, and previous entries became pills that could be X'd out individually.
https://github.com/simonw/datasette/blob/0b4a28691468b5c758df74fa1d72a823813c96bf/datasette/filters.py#L313-L324 does json_each
to test array membership.
This is, of course, the right way to do it.
...but probably we would get by just fine if we did column like '%"string-value"%
, which seems to be 3x faster:
When we have stats for the table in dux_column_stats
, let's enhance the search box to help users build structured queries.
We'll render it if it's absent -- extra_template_vars
could pass supports_search=True).
The spirit of what I want, using https://dux.fly.dev/superuser/posts as an example:
Typing linu
suggests:
Typing 2022
(or 2022-01-08
, or 2022-03
, with language tweaked as needed) suggests:
creation_date
in 2022creation_date
after 2022creation_date
before 2022Prioritizing the options to show might be hard--probably want max 10 (this is what google.com uses). We'll try to do something reasonable about which columns to include (maybe we should use the currently displayed columns as a starting point?), and we should let the user override the default case.
https://global-power-plants.datasettes.com/global-power-plants/global-power-plants has a search box
Pressing /
should focus it
eg the superuser dataset has 1.4M rows and times out parsing json tags
There are a few things in the linked issue, this ticket is about having a fixed table header on the table page
Split from #21. I'm not convinced this is a good idea yet.
Imagine two facets, gender and country. They start like:
Gender
- Male (90)
- Female (70)
Country
- Canada (100)
- USA (50)
- Mexico (10)
If you pick USA, the facets change to:
Gender
- Male (25)
- Female (25)
Country
X USA (50)
Could they instead change to:
Gender
- Male (25)
- Female (25)
Country
- Canada (100)
X USA (50)
- Mexico (10)
i.e. show what you have selected, but also show the other options within the same facet.
This would let you create ORs, by picking multiple countries, eg USA and Mexico.
This might get pretty weird if you have multiple facets active.
This is half baked, but I'm thinking something like a drawer that you can pull out, then you can swipe left/right to see facet counts.
Maybe you get a preview of the data with a scrim that obscures it unless you click?
Related to #9, although could be done with facets in the ATF position, too.
Have a control that hides the facet pickers. Persist whether it's hidden in localStorage, perhaps per database/table.
proposed in https://discord.com/channels/823971286308356157/996877076982415491/1069147970248904744
Extract the function that sets the body class
Add a click handler that fiddles the URL and prevents default
https://dux.fly.dev/cooking/badges/2?_dux_edit=1 shows the UI. Since the actor doesn't actually have the update-row permission, I would have expected
datasette-ui-extras/datasette_ui_extras/utils.py
Lines 32 to 40 in 7f4a0e7
I'm pretty sure the actor doesn't have the update-row permission, as trying to submit an edit results in a failure, and that checks the same thing (see: https://github.com/simonw/datasette/blob/8b9d7fdbd8de7e74414cc29e3005382669a812dc/datasette/views/row.py#L204-L206)
I don't see it locally if I write a permission_allowed hook that prevents update-row... so what's going on? I actually can see this locally -- I was testing with False
, but if I test with None
, it repros
related to #3 - #3 is about when you have to consult an fkey table for the label, this is about when the raw value is present
When typing in the filter box, it'd be nice if it auto-suggested options
This might require some storage in order to remember candidate values (running SELECT DISTINCT xxx FROM table
is likely slow)
...currently, you can get into a state where you've added a facet that times out and you can't remove it.
Split from #21
If we remove facet suggestions, we no longer have a way to facet by date, or by array.
It would be nice if we could customize the cog to show those options -- right now it always forces column faceting.
I think there isn't an official way to hook column actions yet: simonw/datasette#983 (comment)
We might be able to party in https://github.com/simonw/datasette/blob/main/datasette/static/table.js#L1
...although discovering which column we're connected too will be a pain. I think we'd have to parse the absolute positioning then reverse engineer which column it is
This also means mobile users won't be able to control facets, I'm ok with that for now -- plus we might make the omnibar smarter, which would mitigate this a bit.
I enabled datasette-cluster-map
.
Visiting https://dux.fly.dev/parquet/geonames makes a request to https://dux.fly.dev/parquet/geonames.json?_size=max&_labels=on&_shape=objects
That failed with: {"ok": false, "error": "'NoneType' object has no attribute 'startswith'", "status": 500, "title": null}
https://dux.fly.dev/geonames/geonames.json?_size=max&_labels=on&_shape=objects fails similarly, so probably a bug in the facet patching
I'd like non-programmers to be able to use datasette-ui-extras
successfully to build edit UIs. A challenge with SQLite is that its untyped/flexibly typed.
We'd like to recognize:
2021-12-03
(eg date()
)2021-12-03 01:02:03
(eg datetime()
)2021-01-04T21:26:30Z
2020-05-01T16:10:12.469751
2022-11-11T21:44:21+00:00
1575765142
(question: is this a date or a date time? If it's a date, what TZ is it in?)Proposal:
_
.CREATE TABLE _dux_column_stats(
table text not null,
column text not null,
type text not null,
nullable boolean not null,
min any,
max any,
computed_at text not null default (datetime()),
limit integer, -- Was this based on SELECT *, or SELECT * LIMIT N ?
distinct_limit int not null, -- How many distinct examples were we willing to capture?
distincts text not null, -- eg [{ value: 123.1, count: 123}]
json_each_distincts text not null, -- same shape as distincts, but the contents of JSON arrays
nulls integer not null, -- the output of COUNT(*) FILTER (WHERE TYPEOF(column) == 'null')
integers integer not null, -- as above, but integer
reals integer not null, -- as above, but real
texts integer not null, -- as above, but text
blobs integer not null, -- as above, but blob
primary key (table, column)
);
That ought to be enough for us to determine the serialization format of a column, eg whether it's seconds since the epoch or ISO timestamp with T or with space.
On startup, we'll ensure the table exists for every attached writable database. If the schema isn't exactly what we expect, we'll drop and recreate it. This is super opinionated! Maybe we'll have an opt-out knob later, but even then, it should be on by default so that it's easy to use.
We'll be able to fetch stats on demand. If there's no entry in the table, we'll do a minimal scan based on WITH small AS (SELECT "column" FROM table LIMIT 1000) SELECT ...
In the absence of stats, we'll assume that BOOLEAN
is a checkbox (0/1), DATE
is DATE()
and DATETIME
is DATETIME()
.
Whenever we do a minimal scan, we'll also queue a full scan to happen in a separate thread. Basically, things should trend towards being accurate. To start, we'll just pick a high enough N that this generally works for common end-user scenarios.
This table should be configured as a hidden table so it does not appear for the end-user.
https://dux.fly.dev/parquet/geonames?_facet=updated_at shows a spinner forever
A declarative, efficient, and flexible JavaScript library for building user interfaces.
๐ Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
An Open Source Machine Learning Framework for Everyone
The Web framework for perfectionists with deadlines.
A PHP framework for web artisans
Bring data to life with SVG, Canvas and HTML. ๐๐๐
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
Some thing interesting about web. New door for the world.
A server is a program made to process requests and deliver data to clients.
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
Some thing interesting about visualization, use data art
Some thing interesting about game, make everyone happy.
We are working to build community through open source technology. NB: members must have two-factor auth.
Open source projects and samples from Microsoft.
Google โค๏ธ Open Source for everyone.
Alibaba Open Source for everyone
Data-Driven Documents codes.
China tencent open source team.