ETL-Synthea-dbt

Uses dbt (data build tool) to convert Synthea synthetic data to the OMOP Common Data Model in PostgreSQL.

Requirements

  • dbt-core >= 1.0.0
  • dbt-postgres >= 1.0.0
  • sqlfluff >= 2.0.0
  • sqlfluff-templater-dbt >= 2.0.0

Set-up

  1. Install dbt with the Postgres adapter, SQLFluff, and sqlfluff-templater-dbt:
pip install dbt-postgres sqlfluff sqlfluff-templater-dbt
  2. If the installation succeeds, the dbt CLI is available globally. To check, run
dbt --version
  3. Go to ~/.dbt/ (in your user home directory, not this repo directory) and create profiles.yml from the example in example.profiles.yml. Fill in the database credentials, following the instructions in Postgres Profile config.
    dbt reads from and writes to one database connection at a time. It is not a full ETL tool: it performs only the transform (T) step.
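
As a rough illustration of the profile step, the sketch below writes a starter profiles.yml for a local PostgreSQL database. The profile name etl_synthea, the database name, the schema and all credentials are placeholders invented for this example, not values from this repo. The profile name has to match the profile key in dbt_project.yml, and example.profiles.yml remains the reference.

```shell
# Sketch: create a starter dbt profile for a local PostgreSQL database.
# ASSUMPTIONS: the profile name "etl_synthea", database "synthea", schema
# "cdm" and all credentials are placeholders; the profile name must match
# the `profile:` key in this repo's dbt_project.yml.
profiles_dir="${DBT_PROFILES_DIR:-$HOME/.dbt}"
profile_file="$profiles_dir/profiles.yml"

mkdir -p "$profiles_dir"

if [ -e "$profile_file" ]; then
  # Never overwrite an existing profile; it may hold working credentials.
  echo "Keeping existing $profile_file"
else
  cat > "$profile_file" <<'EOF'
etl_synthea:
  target: dev
  outputs:
    dev:
      type: postgres
      host: localhost
      port: 5432
      user: postgres
      password: change-me
      dbname: synthea
      schema: cdm
      threads: 4
EOF
  # The file contains a password, so restrict it to the current user.
  chmod 600 "$profile_file"
  echo "Wrote $profile_file"
fi
```

The script honours DBT_PROFILES_DIR, the environment variable dbt uses to locate profiles, and falls back to ~/.dbt when it is unset.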

  4. To test the settings and the database connection, run

dbt debug

  5. Install the packages that provide the macros:
dbt deps
  6. Write your models. To learn more, see the official free online courses.

  7. We use SQLFluff to keep our SQL code clean and consistent.

Install SQLFluff and sqlfluff-templater-dbt with

pip install sqlfluff sqlfluff-templater-dbt

To lint the SQL code, run

sqlfluff lint .

To fix linting issues, run

sqlfluff fix .

or, to apply the fixes without a confirmation prompt, run

sqlfluff fix -f .
  8. To check that the models compile against our database before running them, use
dbt compile

(optionally with --select and the same selector syntax as dbt run below).

  9. To run models:
  • dbt run - regular run of all models
  • Model selection syntax (source). Selecting models can save a lot of time because only the models you think are relevant are run or tested. The risk is forgetting an important upstream dependency, so it pays to understand the syntax thoroughly. Since dbt 0.21 the flag is named --select (-s); the older --models (-m) spelling is kept as an alias for dbt run.
    • dbt run --select modelname - runs only modelname
    • dbt run --select +modelname - runs modelname and all its parents
    • dbt run --select modelname+ - runs modelname and all its children
    • dbt run --select +modelname+ - runs modelname and all its parents and children
    • dbt run --select @modelname - runs modelname, all its parents, all its children, AND all parents of all its children
    • dbt run --exclude modelname - runs all models except modelname
    • All of these work with folder selection syntax too:
      • dbt run --select folder - runs all models in a folder
      • dbt run --select folder.subfolder - runs all models in the subfolder
      • dbt run --select +folder.subfolder - runs all models in the subfolder and all their parents
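
One way to reduce the risk of missing an upstream dependency is to preview what a selector matches before running it. The sketch below loops over the graph operators described above and passes each one to dbt ls, which lists matching resources without executing them. The model name person is a hypothetical placeholder; substitute a model from this project.

```shell
# Sketch: preview which models each selector matches, without running them.
# ASSUMPTION: "person" is a hypothetical model name used for illustration.
model="person"

for selector in "$model" "+$model" "$model+" "+$model+" "@$model"; do
  echo "== dbt ls --resource-type model --select $selector"
  # Only call dbt when it is installed; run this from the project directory.
  if command -v dbt >/dev/null 2>&1; then
    dbt ls --resource-type model --select "$selector"
  fi
done
```

Once the list looks right, the same selector can be passed unchanged to dbt run.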
  10. To run the models together with their tests, use
dbt build

(optionally with --select and the same selector syntax as dbt run above).

  11. To generate the documentation, run dbt docs generate; to serve it, run dbt docs serve. Or run both together:
dbt docs generate && dbt docs serve --port 8080
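
The checks, linting and build steps above can also be chained so that a failure stops the sequence early. This is a minimal sketch, not a script shipped with this repo: run_step and run_workflow are hypothetical helper names, and DRY_RUN is an illustrative variable that makes the script print the commands instead of executing them.

```shell
# Sketch: run the checks, lint and build in order, stopping at the first
# failure. run_step, run_workflow and DRY_RUN are hypothetical names.
run_step() {
  echo "+ $*"
  # With DRY_RUN=1 the command is only printed, not executed.
  if [ "${DRY_RUN:-0}" != "1" ]; then
    "$@"
  fi
}

run_workflow() {
  run_step dbt debug &&        # check profiles.yml and the DB connection
  run_step dbt deps &&         # install the packages the project needs
  run_step sqlfluff lint . &&  # lint before touching the database
  run_step dbt build           # run the models and their tests
}

# Print the planned commands only; call run_workflow alone to execute them.
DRY_RUN=1 run_workflow
```

Linting runs before dbt build so that style problems surface before anything is written to the database.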
