
dbt-labs / jaffle_shop_duckdb


Get started with dbt in less than 1 minute from `git clone` to `dbt docs serve` for free!

Home Page: https://bit.ly/3e0qtxo

License: Apache License 2.0

Dockerfile 5.20% Shell 94.80%
data dbt duckdb sql

jaffle_shop_duckdb's Issues

prevent 'wheel' is not installed errors within requirements.txt

I also got a bunch of "'wheel' is not installed" messages. Would it make sense to add wheel to requirements.txt?

Collecting text-unidecode>=1.3
  Using cached text_unidecode-1.3-py2.py3-none-any.whl (78 kB)
Using legacy 'setup.py install' for dbt-core, since package 'wheel' is not installed.
Using legacy 'setup.py install' for dbt-postgres, since package 'wheel' is not installed.
Using legacy 'setup.py install' for configobj, since package 'wheel' is not installed.

1st Phase Feedback

Notes from working through this:

  • Encountered an error with python3 -m pip install -r requirements.txt and had to both update pip and brew install postgresql. Rerunning after updating pip and installing postgresql worked! Had I not encountered this error, I definitely would have been up and running in 1 min 🙂
  • Forgot to run source venv/bin/activate after installing requirements, so it was defaulting to my local snowflake adapter instead of the duckdb one. Don't be like me: read the instructions carefully!
  • Everything else ran super super smoothly! This is such a wildly fantastic idea and I can't wait to see it in use.
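
The missed activation step in the second note can be caught with a quick check. This is a minimal sketch, assuming the environment was created with `python3 -m venv venv` as in the README steps; the helper name `in_virtualenv` is hypothetical and not part of dbt or this repo.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the running interpreter belongs to a virtual environment."""
    # Inside a venv, sys.prefix points at the venv directory while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


if __name__ == "__main__":
    if in_virtualenv():
        print(f"virtual environment active: {sys.prefix}")
    else:
        print("no virtual environment active; run `source venv/bin/activate` first")
```

Running this with `python3` before `dbt build` shows which interpreter is in use, and therefore which set of installed adapters dbt is likely to pick up.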

Constraint application of primary and foreign keys

Hi, I'm quite new to DBT and DuckDB. I'm still a bit confused about setting up primary and foreign keys in the models.

I saw that this is how it is being implemented in the models:

      - name: order_id
        tests:
          - unique
          - not_null
        description: This is a unique identifier for an order
      - name: customer_id
        description: Foreign key to the customers table
        tests:
          - not_null
          - relationships:
              to: ref('customers')
              field: customer_id

When I open the resulting .db file in DBeaver or Metabase, the constraints don't seem to be implemented.

My question is:

  • Is this something that is expected in DBT or am I missing something?
  • Why don't we implement this in the .sql scripts?
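
For context on the first question: entries under `tests:` are checks that dbt runs as queries against the built tables. They are not declared as constraints in the database, which would explain why DBeaver shows none. The sketch below illustrates roughly what a `relationships` check amounts to. It uses Python's built-in `sqlite3` as a stand-in for DuckDB, and the tables, rows, and `failing_rows` helper are made up for illustration; the exact SQL dbt generates may differ.

```python
import sqlite3

# In-memory stand-in database; no primary or foreign keys are declared,
# so the orphaned order below is inserted without complaint.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table customers (customer_id integer);
    create table orders (order_id integer, customer_id integer);
    insert into customers values (1), (2);
    insert into orders values (10, 1), (11, 2), (12, 99);
""")

# Roughly what a relationships test looks for: child rows whose
# customer_id has no matching row in the parent table.
RELATIONSHIPS_CHECK = """
    select o.order_id, o.customer_id
    from orders as o
    left join customers as c on o.customer_id = c.customer_id
    where o.customer_id is not null
      and c.customer_id is null
"""


def failing_rows(connection):
    """Return the rows that violate the relationship; an empty list means a pass."""
    return connection.execute(RELATIONSHIPS_CHECK).fetchall()


orphans = failing_rows(conn)
print(orphans)  # [(12, 99)]
```

The check reports the bad row after the fact instead of blocking the insert, which is the practical difference from a database-level foreign key.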

PS:
I also found these tools:

github action to prove mach speed works with macos, linux, windows terminals

follow the readme toggles and copy them into github action format.

This will serve as auto checks too.

It'll be cool for new folks coming across this repo for the first time thinking, "Yeah right, no way I can get started that fast" then click through the github actions and go, "Oh they already proved it's that fast..."

Can't get dbt build to work properly

Hi I've attempted to start the project by first copying the following instructions:

git clone https://github.com/dbt-labs/jaffle_shop_duckdb.git
cd jaffle_shop_duckdb
python3 -m venv venv
source venv/bin/activate
python3 -m pip install --upgrade pip
python3 -m pip install -r requirements.txt
source venv/bin/activate
dbt build
dbt docs generate
dbt docs serve

And that led to this error.
Screenshot 2023-02-13 at 15 04 56

Screenshot 2023-02-13 at 15 05 39

So then I tried using the provided VS Code dev container, but it somehow fails to find the CSV seed files (even though they exist in the repo).

Screenshot 2023-02-13 at 15 03 45

VS Code dev container pops up dbt power user error

When running this repo in VS Code (locally or via codespaces), it looks like the tasks in the devcontainer complete okay, but the dbt power user extension pops up an error.

Any idea how to prevent the error? It seems maybe related to needing DBT_PROFILES_DIR set?

image

Update README text

Synced with @sungchun12 on some minor updates to the README to hopefully improve readability :)

  • Linking to documentation on profiles.yml in step 4
  • Removing the "What is a jaffle?" section; I know that's a cornerstone of jaffle shop, but I can't tell if it's just adding dead weight to a semi-long readme
  • Moving Doug's callout/dropdown to the virtual environment reactivation earlier up (maybe at the step it occurs)
  • The note in Step 2 here is potentially confusing: this is by default using a DuckDB database/adapter, so the code should work without having to change any of the model code. I think the note could be reworded along the lines of, "If you decided to use this project in your own data warehouse, make sure....blah blah"
  • Linking to documentation on dbt build in step 5

Can payment_methods be defined once and referred multiple times?

The accepted values for the payment_method field of stg_payments are defined in `models/staging/schema.yml`.

version: 2

models:
  - name: stg_customers
    columns:
      - name: customer_id
        tests:
          - unique
          - not_null

  - name: stg_orders
    columns:
      - name: order_id
        tests:
          - unique
          - not_null
      - name: status
        tests:
          - accepted_values:
              values: ['placed', 'shipped', 'completed', 'return_pending', 'returned']

  - name: stg_payments
    columns:
      - name: payment_id
        tests:
          - unique
          - not_null
      - name: payment_method
        tests:
          - accepted_values:
              values: ['credit_card', 'coupon', 'bank_transfer', 'gift_card']

These values are used in models/orders.sql.

-- ...

order_payments as (

    select
        order_id,

        {% for payment_method in payment_methods -%}
        sum(case when payment_method = '{{ payment_method }}' then amount else 0 end)
            as {{ payment_method }}_amount,
        {% endfor -%}

        sum(amount) as total_amount

    from payments

    group by order_id

),

Can payment_methods be defined once in `models/staging/schema.yml`, so that we can avoid data inconsistencies in the future? If it is possible, how can I do it?
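
The idea behind the question, one list feeding both the pivot columns and the accepted-values check, can be sketched outside dbt. The plain-Python illustration below is not dbt code: `PAYMENT_METHODS`, `pivot_columns`, and `unexpected_values` are hypothetical names. In dbt itself, a project-level variable (a `vars:` entry in `dbt_project.yml`, read with `var()`) is one mechanism that may serve this purpose; whether it fits this project is an assumption, not something confirmed in this thread.

```python
# Single source of truth for the allowed payment methods.
PAYMENT_METHODS = ["credit_card", "coupon", "bank_transfer", "gift_card"]


def pivot_columns(methods):
    """Build one sum(case ...) column per payment method, like the Jinja loop in orders.sql."""
    return [
        f"sum(case when payment_method = '{m}' then amount else 0 end) as {m}_amount"
        for m in methods
    ]


def unexpected_values(observed, methods):
    """Return observed values missing from the list, i.e. what an accepted-values check flags."""
    return sorted(set(observed) - set(methods))


columns = pivot_columns(PAYMENT_METHODS)
print(",\n".join(columns))
print(unexpected_values(["coupon", "cash"], PAYMENT_METHODS))  # ['cash']
```

Because both functions read the same list, adding a new payment method in one place updates the generated columns and the validation together.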

Reduce size of README

Currently, the README is a giant wall of text that is intimidating to me.

As a user, I always appreciate the main focus to be install instructions that help me get hands-on. Everything else is a distraction from that goal, in my opinion.

What good looks like

  • minimum prerequisites
    • minimum software to install
    • minimum skills/knowledge required
  • the user can quickly find all the commands they need to run
    • the relevant commands don't need to be fished out from the middle of sentences (see example below)
  • work for the greatest number of users
    • instructions that are cross-platform (zsh, PowerShell, etc)
  • ideally fits on a single page

Example of fishing out relevant commands

  • pip upgrade command needs to be fished out from here
  • Should just be a stand-alone line that says this instead:
    pip install --upgrade pip

How to Seed or Read Parquet File

Hi Team,

We would like to seed or read Parquet files from local storage and Azure Blob Storage in this project. Can you please guide us on the configuration needed to read the files?

Regards,
Akash
