pinterest / querybook
Querybook is a Big Data Querying UI, combining collocated table metadata and a simple notebook interface.
Home Page: https://www.querybook.org
License: Apache License 2.0
Make sidebar search and table search use similar UI to control what parameters can be filtered such as:
Story
As a user, I find it confusing when I open DataHub and it does not tell me why I cannot see any environments.
As an admin, I want to give pointers to new users when they first visit DataHub.
Acceptance
Boost tables based on boost score
Story
As a user, I want to view only the query examples for a certain query engine.
Assumption
Acceptance
Story
As a user I want to export multiple query results externally
Acceptance
Currently, private DataDocs are not indexed in Elasticsearch, which keeps the logic simple. Since most DataDocs will be private by default with FGAC, it is essential to make them searchable through Elasticsearch. The new Elasticsearch index for DataDocs should include two more fields: public and readable_user_ids. The second field, readable_user_ids, should list every user who can access the private DataDoc.
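A minimal sketch of how the two new fields could be used, assuming a document shape like the one below. The `DataDocSearchDoc` interface, its `id` and `title` fields, and both function names are hypothetical; only `public` and `readable_user_ids` come from the description above. The filter is built from the standard Elasticsearch `bool`, `should`, and `term` query clauses.

```typescript
// Hypothetical shape of a DataDoc document in the search index.
interface DataDocSearchDoc {
    id: number;
    title: string;
    public: boolean;
    readable_user_ids: number[];
}

// Build a filter that matches public docs, or private docs the user can read.
function buildDataDocAccessFilter(uid: number) {
    return {
        bool: {
            should: [
                { term: { public: true } },
                { term: { readable_user_ids: uid } },
            ],
            minimum_should_match: 1,
        },
    };
}

// Local equivalent of the filter, usable in unit tests without a cluster.
function canUserFindDoc(doc: DataDocSearchDoc, uid: number): boolean {
    return doc.public || doc.readable_user_ids.indexOf(uid) !== -1;
}
```

The filter would be attached to the search request alongside the text query, so a private DataDoc only appears for users listed in `readable_user_ids`.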
This would be useful if a user opens DataHub in a multi-window browser
As a developer, I want to set up data source unit tests quickly with some example data in the database
Acceptance:
expected formatting:
DELETE JAR s3://test-bucket/hadoopusrs/prod/test-0.5-SNAPSHOT/test-0.5-SNAPSHOT.jar;
ADD JAR s3://test-bucket/hadoopusrs/bob/test-0.5-SNAPSHOT/test-0.5-SNAPSHOT.jar;
-> same
actual formatting
DELETE JAR s3://test-bucket/hadoopusrs/prod/test-0.5-SNAPSHOT/test-0.5-SNAPSHOT.jar;
ADD JAR s3://test-bucket/hadoopusrs/bob/test-0.5-SNAPSHOT/test-0.5-SNAPSHOT.jar;
->
DELETE JAR s3: / / test - bucket / hadoopusrs / prod / test -0.5 - SNAPSHOT / test -0.5 - SNAPSHOT.jar;
ADD JAR s3://test-bucket/hadoopusrs/bob/test-0.5-SNAPSHOT/test-0.5-SNAPSHOT.jar;
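One possible way to get the expected formatting is to recognise URI-like text as a single token before any operator splitting happens. The sketch below is not Querybook's formatter; `tokenizePreservingUris` and `formatStatement` are hypothetical names, and the joining step is deliberately simplistic.

```typescript
// Tokenize a statement, trying URI-like text (scheme://...) before anything else
// so that a later formatting pass does not put spaces around "/", ":" or "-".
function tokenizePreservingUris(statement: string): string[] {
    const tokenPattern =
        /[A-Za-z][A-Za-z0-9+.-]*:\/\/[^\s;]+|[A-Za-z_]\w*|\d+(?:\.\d+)?|\S/g;
    return statement.match(tokenPattern) || [];
}

// Re-join tokens with single spaces, attaching ";" to the previous token.
function formatStatement(statement: string): string {
    return tokenizePreservingUris(statement).join(' ').replace(/ ;/g, ';');
}
```

With this tokenization, the `s3://...` path in the `DELETE JAR` statement stays one token, so the statement comes back unchanged.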
This is caused by the URL changing without the current state being taken into account
Currently, picking a table chart does not let the user choose a title, which makes it hard to differentiate between a table chart and a query execution
When searching for xxx.yyy in data doc search, searching yyy returns nothing and users have to search xxx.yyy to find the result. The strategy will be to provide multiple analyzers that analyze code and rich text differently
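To illustrate the difference between the two kinds of analysis, here is a small sketch that tokenizes text locally. It is not an Elasticsearch analyzer definition; `wholeNameTokens` and `codeTokens` are hypothetical helpers, and they only handle ASCII letters, digits, underscores, and dots.

```typescript
// Keeps dotted names whole, which reproduces the behaviour described above:
// "xxx.yyy" is one token, so a search for "yyy" finds nothing.
function wholeNameTokens(text: string): string[] {
    return text
        .toLowerCase()
        .split(/[^a-z0-9_.]+/)
        .filter((token) => token.length > 0);
}

// Code-style analysis: index the dotted name and each of its parts.
function codeTokens(text: string): string[] {
    const tokens: string[] = [];
    wholeNameTokens(text).forEach((token) => {
        tokens.push(token);
        const parts = token.split('.').filter((part) => part.length > 0);
        if (parts.length > 1) {
            parts.forEach((part) => tokens.push(part));
        }
    });
    return tokens;
}
```

Indexing code cells with something like `codeTokens` would let both `yyy` and `xxx.yyy` match, while rich text could keep using a plain-language analyzer.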
The request does not return when you add a filter for start date or end date.
Things to check:
Change it from a table format to a rows format similar to the announcement admin UI
To repro:
Story
As an admin, I want to add the same query engine to different environments without worrying about duplicating the config.
As an admin, I want to be able to order the query engines in the dropdown so that I can present them in a different order to the user.
Assumption
Acceptance
Currently most columns are using varchar or mediumtext
As a user, I want to see who the frequent users of a table are so I can ask them questions.
Assumption:
Use query samples to obtain info about the common query runners
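A rough sketch of the aggregation this assumption implies: count how many query samples each user ran against a table and keep the top few. The `QuerySample` shape and the `topQueryRunners` name are hypothetical, not taken from the Querybook codebase.

```typescript
// Hypothetical minimal shape of a query sample: the user who ran the query.
interface QuerySample {
    uid: number;
}

// Count samples per user and return the most frequent runners first.
// Ties are broken by ascending uid so the output is deterministic.
function topQueryRunners(
    samples: QuerySample[],
    limit: number
): Array<{ uid: number; count: number }> {
    const counts: { [uid: number]: number } = {};
    samples.forEach((sample) => {
        counts[sample.uid] = (counts[sample.uid] || 0) + 1;
    });
    return Object.keys(counts)
        .map((key) => ({ uid: Number(key), count: counts[Number(key)] }))
        .sort((a, b) => b.count - a.count || a.uid - b.uid)
        .slice(0, limit);
}
```

In practice the same grouping would more likely be done in the database query that reads the samples; the function only shows the intended result.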
Acceptance:
Add the ability to
After favoriting a DataDoc, it does not show up on refresh
Problem:
The sql-lexer assumes that anything that is a VARIABLE type following a FROM statement is a table and breaks the suggestions.
Root cause:
Presto allows a FROM clause in front of things other than table names
The types supported by the extract function vary depending on the field to be extracted. Most fields support all date and time types.
extract(field FROM x) → bigint
Returns field from x.
Code where this fails:
while (!stream.eol()) {
    // here the match fails, and because nothing gets consumed it goes off in an infinite loop if the match is handled
    // Maybe the right thing to do is, if there's no match, break out of the stream matching?
    const match = stream.match(/^([_\w\d]+|`.*`)\.?/, true);
    // this fails and kicks you out of the loop, but then the suggestions stop working
    if (match[1]) {
        let part = match[1];
        if (part.charAt(0) === '`') {
            // remove first and last char
            part = part.slice(1, -1);
        }
        parts.push(part);
    }
}
short snippet of what caused this:
SELECT *
FROM table_2
JOIN table_1
ON table_1.field_1 = table_2.field_2
AND extract(YEAR FROM field_1_date) = table_2.field_year
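One way to act on the question raised in the comments above is to break out of the loop as soon as the match fails, and to stop once a part is not followed by a dot. The sketch below assumes a tiny stand-in for the editor's stream object; `MiniStream` and `readQualifiedName` are hypothetical names. It only shows the loop guard. It does not decide whether a FROM inside `extract(...)` should be treated as a table context at all.

```typescript
// Hypothetical stand-in for the editor stream: only what the loop needs.
class MiniStream {
    private pos = 0;
    constructor(private readonly text: string) {}

    eol(): boolean {
        return this.pos >= this.text.length;
    }

    // Match at the current position; advance only when consume is true.
    match(pattern: RegExp, consume: boolean): RegExpMatchArray | null {
        const found = this.text.slice(this.pos).match(pattern);
        if (found === null || found.index !== 0) {
            return null;
        }
        if (consume) {
            this.pos += found[0].length;
        }
        return found;
    }

    rest(): string {
        return this.text.slice(this.pos);
    }
}

// Read a dotted name such as schema.table, never looping without progress.
function readQualifiedName(stream: MiniStream): string[] {
    const parts: string[] = [];
    while (!stream.eol()) {
        const match = stream.match(/^([_\w\d]+|`.*`)\.?/, true);
        if (match === null) {
            break; // nothing was consumed, so looping again would never end
        }
        let part = match[1];
        if (part.charAt(0) === '`') {
            part = part.slice(1, -1); // remove the surrounding backticks
        }
        parts.push(part);
        if (match[0].charAt(match[0].length - 1) !== '.') {
            break; // no trailing dot means the name is complete
        }
    }
    return parts;
}
```

With the guard in place, input that does not start with a name yields an empty list instead of an exception or an endless loop, which leaves the caller free to keep producing suggestions.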
This is useful when there is a large number of series
We can support Snowflake easily through the snowflake-sqlalchemy integration
https://docs.snowflake.com/en/user-guide/sqlalchemy.html
https://pypi.org/project/snowflake-sqlalchemy/
There is a user setting for editor text size. It would be nice to have a similar setting for the text size of the query results.
We can also reuse such a setting for the query results size
Add a table warning system in DataHub where users can put their own warning messages for a table. This warning message will be shown by the linter while the user is writing code.
Create a Notifier plugin model that allows different orgs to add new notification services such as Microsoft Teams. The Notifier will handle sending query completion messages as well as doc permission change messages to DataHub users.
The default schema name is currently always 'default', which does not apply to all cases since it can be overridden in the connection string. This setting also differs between query engines; for example, sqlite's default is actually 'main' instead of 'default'
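A minimal sketch of a per-engine lookup, assuming the engine's language name is available and that an explicitly configured schema should win. The map, the constant names, and `resolveDefaultSchema` are hypothetical; only the 'default' fallback and sqlite's 'main' come from the description above.

```typescript
// Hypothetical per-language defaults; extend as other engines are confirmed.
const DEFAULT_SCHEMA_BY_LANGUAGE: { [language: string]: string } = {
    sqlite: 'main',
};
const FALLBACK_DEFAULT_SCHEMA = 'default';

// Prefer an explicit setting (for example one parsed from the connection
// string), then the per-language default, then the global fallback.
function resolveDefaultSchema(language: string, configuredSchema?: string): string {
    if (configuredSchema) {
        return configuredSchema;
    }
    if (Object.prototype.hasOwnProperty.call(DEFAULT_SCHEMA_BY_LANGUAGE, language)) {
        return DEFAULT_SCHEMA_BY_LANGUAGE[language];
    }
    return FALLBACK_DEFAULT_SCHEMA;
}
```

How the configured schema is obtained is left open here, since it depends on each engine's connection string format.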
Acceptance
Together with #202, they should help with the experience of exporting
Story
As a user, I want to export my entire DataHub query results without worrying about the preview size
Acceptance
Add field selection to row samples; by default, all columns are selected
Users can export the raw query
Users can copy the result to the clipboard as TSV
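The TSV conversion for the clipboard could look roughly like the sketch below. The `Cell` type and both function names are hypothetical, and replacing embedded tabs and line breaks with a space is one possible choice rather than a fixed rule.

```typescript
// Hypothetical cell type for a query result row.
type Cell = string | number | boolean | null;

// Render one cell; tabs and line breaks inside a value would corrupt the
// TSV layout, so each run of them is replaced with a single space.
function cellToTsv(cell: Cell): string {
    if (cell === null) {
        return '';
    }
    return String(cell).replace(/[\t\r\n]+/g, ' ');
}

// Join cells with tabs and rows with newlines.
function rowsToTsv(rows: Cell[][]): string {
    return rows.map((row) => row.map(cellToTsv).join('\t')).join('\n');
}
```

In a browser, the resulting string could then be passed to `navigator.clipboard.writeText`.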
Being able to set which fields to search for cmd+k search
Here are some of the potential fields:
This change will apply to the following views:
Fields such as partition, hive metastore information, and query users should all be hidden if there is no information to show
Expand the dropdown to not just give edit/read permission but also to give ownership. The previous owner should still get write permission afterwards
When sorting the impression count in the impression table in DataDoc/DataTable view, it does not sort from largest to smallest or vice versa.
Story
As a user, I want to use VS Code to develop DataHub with minimal effort
Acceptance