databook's People

Contributors

gtoonstra, jornh

databook's Issues

Extract sql queries from a tableau dashboard

Use an API to connect to Tableau and extract the query or queries used to populate a dashboard. Also try to use the API to find out which datasource the dashboard connects to, so the database can be identified.
I've written a library called "sqlineage" that finds the tables involved in a query, which should make it possible to establish relationships between the Tableau dashboard and the underlying datasources.
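
As a rough illustration of the extraction step, here is a minimal sketch that pulls custom SQL out of a workbook's XML with the standard library. It does not call the Tableau API, and it assumes (this is not confirmed by the issue) that custom SQL is stored in `relation` elements with `type="text"` and that the database name sits in a `dbname` attribute on a `connection` element. `SAMPLE_TWB` and `extract_custom_sql` are hypothetical names, and the sample XML is invented for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed workbook XML; real workbooks carry far more detail.
SAMPLE_TWB = """<workbook>
  <datasources>
    <datasource name="sales_ds">
      <connection class="postgres" dbname="analytics" server="db.example.com">
        <relation name="Custom SQL Query" type="text">SELECT o.id, c.name FROM orders o JOIN customers c ON o.customer_id = c.id</relation>
      </connection>
    </datasource>
  </datasources>
</workbook>"""


def extract_custom_sql(twb_xml):
    """Return one dict per custom SQL relation found in the workbook XML."""
    root = ET.fromstring(twb_xml)
    found = []
    for datasource in root.iter("datasource"):
        # Assumption: the first connection carrying a dbname names the database.
        dbnames = [c.get("dbname") for c in datasource.iter("connection") if c.get("dbname")]
        for relation in datasource.iter("relation"):
            if relation.get("type") == "text" and relation.text:
                found.append({
                    "datasource": datasource.get("name"),
                    "database": dbnames[0] if dbnames else None,
                    "sql": relation.text.strip(),
                })
    return found
```

Each returned `sql` string could then be handed to a table-lineage tool such as sqlineage to link the dashboard to its tables; that call is left out here because the issue does not describe the library's interface.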

Contribute an example airflow template operator

The template operator should implement a method to 'audit' the transfer of data from a source to a destination and write metadata describing the transfer somewhere: at a minimum the source db/schema/table and the destination, and perhaps some statistics such as row count and when it was last run.
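
To make the shape of such an audit record concrete, here is a minimal sketch in plain Python. It is not an Airflow operator; a real operator would presumably call something like this from its `execute` method. The names `TransferAudit` and `audit_transfer`, the `db.schema.table` string format, and the JSON output are all assumptions made for the example.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class TransferAudit:
    """Hypothetical audit record describing one source-to-destination transfer."""
    source_db: str
    source_schema: str
    source_table: str
    dest_db: str
    dest_schema: str
    dest_table: str
    row_count: Optional[int]
    last_run: str


def audit_transfer(source, destination, row_count=None, now=None):
    """Build a JSON audit record from 'db.schema.table' identifiers."""
    src_db, src_schema, src_table = source.split(".")
    dst_db, dst_schema, dst_table = destination.split(".")
    # Record the run time in UTC unless the caller supplies one.
    now = now or datetime.now(timezone.utc)
    record = TransferAudit(
        src_db, src_schema, src_table,
        dst_db, dst_schema, dst_table,
        row_count, now.isoformat(),
    )
    return json.dumps(asdict(record), sort_keys=True)
```

For example, `audit_transfer("erp.public.orders", "dwh.staging.orders", row_count=42)` returns a JSON string that could be written to a file or a metadata table.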

Extract metadata about tables from RDBMS

Use SQLAlchemy to build a generic "metadata extractor" for a database. Allow some tables to be filtered out, and then build a JSON extract file with column names, primary key, comments and data type information. This can be used on a table info page.
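
A minimal sketch of what such an extract could look like. To stay dependency-free it uses the standard-library `sqlite3` module instead of SQLAlchemy, so it only works against SQLite; SQLAlchemy's inspector would be the way to make it generic. The function names and the JSON layout are assumptions, and comments are omitted because SQLite does not store column comments.

```python
import json
import sqlite3


def extract_metadata(conn, exclude=()):
    """Return {table: {"columns": [...]}} for every table not in `exclude`."""
    tables = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        )
    ]
    extract = {}
    for table in tables:
        if table in exclude:
            continue
        columns = []
        # PRAGMA table_info yields: cid, name, type, notnull, dflt_value, pk
        for _cid, name, col_type, notnull, _default, pk in conn.execute(
            f'PRAGMA table_info("{table}")'
        ):
            columns.append({
                "name": name,
                "type": col_type,
                "nullable": not notnull,
                "primary_key": pk > 0,
            })
        extract[table] = {"columns": columns}
    return extract


def to_json(extract):
    """Serialize the extract so it can be written to a file."""
    return json.dumps(extract, indent=2)
```

The `exclude` argument covers the "filtering of some tables" requirement in its simplest form; a pattern-based filter would be a natural extension.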

Document developer usage experiences on Windows 10 with WSL and Docker

Right now https://github.com/gtoonstra/databook#prerequisites clearly states:

"You'll need a Mac or Linux with a docker installation to run the sample deployment of databook."

I don't own a Mac, and I'd like to use a combination of WSL (an Ubuntu-based subsystem that keeps getting closer and closer to being a "capable enough" Linux) and Docker on my Windows machine instead of having to wrangle a VM.

This issue (which I guess I just volunteered to work on fixing 😉) is just to have a place to track what I come across:

  • Today I just got one step closer to it working, as company IT here finally let Win 10 version 1709 (Fall Creators Update) out of the bag. It resolved an issue where Gunicorn couldn't run because WSL was missing /proc/<pid>/status in Win 10 version 1703 (this, BTW, also affected the airflow webserver with Airflow 1.9.0).

    So, long story short: to run the databook webserver in WSL bash, you need to be on v1709.

  • Docker for Windows uses Windows mount point names, which doesn't work well with the current docker-compose files. I hope https://nickjanetakis.com/blog/setting-up-docker-for-windows-and-wsl-to-work-flawlessly#ensure-volume-mounts-work will provide an elegant fix for that.

  • more...? I hope not ...

When I have it working, my plan is to submit a PR on the README or something ...

Github crawler

Write a crawler for a github repository to extract SQL code (DDL) and extract some metadata about it:

  • creator
  • contributors
  • table name
  • database name

Then add this metadata to an input file that enriches the graph database.
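
As a sketch of the DDL-parsing part only, the snippet below pulls database and table names out of `CREATE TABLE` statements with a regular expression. It assumes simple unquoted identifiers in `database.table` or `table` form, and the names `CREATE_TABLE_RE` and `tables_in_ddl` are made up for the example. Creator and contributors would have to come from the repository history, which is not shown here.

```python
import re

# Matches "CREATE TABLE [IF NOT EXISTS] [database.]table" with unquoted identifiers.
CREATE_TABLE_RE = re.compile(
    r"CREATE\s+TABLE\s+(?:IF\s+NOT\s+EXISTS\s+)?"
    r"(?:(?P<database>[A-Za-z_]\w*)\.)?(?P<table>[A-Za-z_]\w*)",
    re.IGNORECASE,
)


def tables_in_ddl(sql):
    """Return one {"database", "table"} dict per CREATE TABLE statement."""
    return [
        {"database": m.group("database"), "table": m.group("table")}
        for m in CREATE_TABLE_RE.finditer(sql)
    ]
```

A regular expression is only a stopgap: quoted identifiers, three-part names and vendor-specific syntax would need a real SQL parser.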

just FYI Amundsen data portal project

@gtoonstra, just FYI that Lyft has built a data portal project named Amundsen, which is also inspired by the Airbnb data portal. Our project is now open source:

The last repo is an extractor/model library repo that is intended to be used in an Airflow DAG. We have included some examples of how to use that library in an Airflow DAG.

Thanks,

Improve the groups web page.

The groups page can visualize a bit of information, but it offers very little interactivity at the moment. The "add link" doesn't work and no information about the group is present (except for the memberships).

Make it possible to share some data with the group or otherwise add some links where this sharing info is stored (confluence, wiki, etc).
