
dbt_shopify's Introduction

Shopify Transformation dbt Package (Docs)

📣 What does this dbt package do?

This package models Shopify data from Fivetran's connector. It uses data in the format described by this ERD and builds off the output of our Shopify source package.

The main focus of the package is to transform the core object tables into analytics-ready models, including a cohort model to understand how your customers are behaving over time.

The following table provides a detailed list of all models materialized within this package by default.

TIP: See more details about these models in the package's dbt docs site.

| model | description |
| --- | --- |
| shopify__customer_cohorts | Each record represents the monthly performance of a customer (based on customer_id), including fields for the month of their 'cohort'. |
| shopify__customers | Each record represents a distinct customer_id, with additional dimensions like lifetime value and number of orders. |
| shopify__customer_email_cohorts | Each record represents the monthly performance of a customer (based on email), including fields for the month of their 'cohort'. |
| shopify__customer_emails | Each record represents a distinct customer email, with additional dimensions like lifetime value and number of orders. |
| shopify__orders | Each record represents an order, with additional dimensions like whether it is a new or repeat purchase. |
| shopify__order_lines | Each record represents an order line item, with additional dimensions like how many items were refunded. |
| shopify__products | Each record represents a product, with additional dimensions like most recent order date and order volume. |
| shopify__transactions | Each record represents a transaction, with additional calculations to handle exchange rates. |
| shopify__daily_shop | Each record represents a day of activity for each of your shops, conveyed by a suite of daily metrics about customers, orders, abandoned checkouts, fulfillment events, and more. |
| shopify__discounts | Each record represents a unique discount, enriched with information about its associated price_rule and metrics regarding orders and abandoned checkouts. |
| shopify__inventory_levels | Each record represents an inventory level (unique pairing of inventory items and locations), enriched with information about its products, orders, and fulfillments. |
| shopify__line_item_enhanced | This model constructs a comprehensive, denormalized analytical table that enables reporting on key revenue, customer, and product metrics from your billing platform. It's designed to align with the schema of the *__line_item_enhanced model found in Shopify, Recharge, Stripe, Zuora, and Recurly, offering standardized reporting across various billing platforms. To see the kinds of insights this model can generate, explore example visualizations in the Fivetran Billing Model Streamlit App. |

Example Visualizations

Curious what these models can do? Check out example visualizations from the shopify__line_item_enhanced model in the Fivetran Billing Model Streamlit App, and see how you can use these models in your own reporting. Below is a screenshot of an example report—explore the app for more.

[Screenshot: Streamlit Billing Model App]

🎯 How do I use the dbt package?

Step 1: Prerequisites

To use this dbt package, you must have the following:

Step 2: Install the package (skip if also using the shopify_holistic_reporting package)

If you are not using the Shopify Holistic Reporting package, include the following shopify package version in your packages.yml file:

TIP: Check dbt Hub for the latest installation instructions or read the dbt docs for more information on installing packages.

packages:
  - package: fivetran/shopify
    version: [">=0.13.0", "<0.14.0"] # we recommend using ranges to capture non-breaking changes automatically

Do NOT include the shopify_source package in this file. The transformation package itself has a dependency on it and will install the source package as well.

Databricks dispatch configuration

If you are using a Databricks destination with this package, you must add the following (or a variation of the following) dispatch configuration within your dbt_project.yml. This is required for the package to correctly resolve macros, searching the dbt-labs/spark_utils package before dbt-labs/dbt_utils.

dispatch:
  - macro_namespace: dbt_utils
    search_order: ['spark_utils', 'dbt_utils']

Step 3: Define database and schema variables

Single connector

By default, this package runs using your destination and the shopify schema. If this is not where your Shopify data is (for example, if your Shopify schema is named shopify_fivetran), add the following configuration to your root dbt_project.yml file:

# dbt_project.yml

vars:
    shopify_database: your_database_name
    shopify_schema: your_schema_name

Union multiple connectors

If you have multiple Shopify connectors in Fivetran and would like to use this package on all of them simultaneously, we have provided functionality to do so. The package will union all of the data together and pass the unioned table into the transformations. You will be able to see which source each record came from in the source_relation column of each model. To use this functionality, set either the shopify_union_schemas OR the shopify_union_databases variable (not both) in your root dbt_project.yml file:

# dbt_project.yml

vars:
    shopify_union_schemas: ['shopify_usa','shopify_canada'] # use this if the data is in different schemas/datasets of the same database/project
    shopify_union_databases: ['shopify_usa','shopify_canada'] # use this if the data is in different databases/projects but uses the same schema name

Please be aware that the native source.yml connection set up in the package will not function when the union schema/database feature is utilized. Although the data will be correctly combined, you will not observe the sources linked to the package models in the Directed Acyclic Graph (DAG). This happens because the package includes only one defined source.yml.

To connect your multiple schema/database sources to the package models, follow the steps outlined in the Union Data Defined Sources Configuration section of the Fivetran Utils documentation for the union_data macro. This will ensure a proper configuration and correct visualization of connections in the DAG.
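Once unioned, you can verify which connector each record came from via the source_relation column described above. For example (an illustrative query against the shopify__orders output model):

```sql
-- Count records contributed by each unioned connector
select
    source_relation,
    count(*) as record_count
from shopify__orders  -- or any other model in this package
group by 1
```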

Step 4: Enable fulfillment_event data

The package takes into consideration that not every Shopify connector may have fulfillment_event data enabled. However, this table holds valuable information that is leveraged in the shopify__daily_shop model. Modeling of fulfillment_event data is disabled by default.

Add the following variable to your dbt_project.yml file to enable the modeling of fulfillment events:

# dbt_project.yml

vars:
    shopify_using_fulfillment_event: true # false by default

Step 5: Setting your timezone

By default, the data in your Shopify schema is in UTC. However, you may want reporting to reflect a specific timezone for more realistic analysis or data validation.

To convert the timezone of all timestamps in the package, update the shopify_timezone variable to your target zone in IANA tz Database format:

# dbt_project.yml

vars:
    shopify_timezone: "America/New_York" # Replace with your timezone

Note: This will only numerically convert timestamps to your target timezone; they will still have "UTC" appended to them. This is a current limitation of the dbt-date convert_timezone macro we leverage.
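As a rough sketch of what this conversion looks like in compiled model SQL (the column and model names below are illustrative, and the package's actual code may differ):

```sql
-- Illustrative only: roughly how a timestamp column is converted
-- when shopify_timezone is set to "America/New_York"
select
    {{ dbt_date.convert_timezone('created_at', 'America/New_York', 'UTC') }} as created_at
from {{ ref('stg_shopify__orders') }}
```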

(Optional) Step 6: Additional configurations


Enabling Standardized Billing Model

This package contains the shopify__line_item_enhanced model, which constructs a comprehensive, denormalized analytical table that enables reporting on key revenue, subscription, customer, and product metrics from your billing platform. It's designed to align with the schema of the *__line_item_enhanced model found in Recurly, Recharge, Stripe, Shopify, and Zuora, offering standardized reporting across various billing platforms. To see the kinds of insights this model can generate, explore example visualizations in the Fivetran Billing Model Streamlit App. This model is disabled by default. To enable it, set the shopify__standardized_billing_model_enabled variable to true within your dbt_project.yml:

vars:
  shopify__standardized_billing_model_enabled: true # false by default.

Passing Through Additional Fields

This package includes all source columns defined in the macros folder. You can add more columns using our pass-through column variables. These variables allow the pass-through fields to be aliased (alias) and cast (transform_sql) if desired, though neither is required. Datatype casting is configured via a SQL snippet within the transform_sql key. Add the desired SQL while omitting the as field_name at the end, and your custom pass-through fields will be cast accordingly. Use the below format for declaring the respective pass-through variables:

# dbt_project.yml

vars:
  shopify_source:
    customer_pass_through_columns:
      - name: "customer_custom_field"
        alias: "customer_field"
    order_line_refund_pass_through_columns:
      - name: "unique_string_field"
        alias: "field_id"
        transform_sql: "cast(field_id as string)"
    order_line_pass_through_columns:
      - name: "that_field"
    order_pass_through_columns:
      - name: "sub_field"
        alias: "subsidiary_field"
    product_pass_through_columns:
      - name: "this_field"
    product_variant_pass_through_columns:
      - name: "new_custom_field"
        alias: "custom_field"

Adding Metafields

In May 2021 the Shopify connector added support for the metafield resource. If you would like to take advantage of these metafields, this package offers corresponding mapping models that append these metafields to the respective source object for the following tables: collection, customer, order, product_image, product, product_variant, and shop. If enabled, these models will materialize as shopify__[object]_metafields for each supported object. To enable these metafield mapping models, use the following configurations within your dbt_project.yml.

Note: These metafield models will contain all the same records as the corresponding staging models, plus the additional metafield columns.

vars:
  shopify_using_all_metafields: True ## False by default. Will enable ALL metafield models. FYI - This will override all other metafield variables.
  shopify_using_collection_metafields: True ## False by default. Will enable ONLY the collection metafield model.
  shopify_using_customer_metafields: True ## False by default. Will enable ONLY the customer metafield model.
  shopify_using_order_metafields: True ## False by default. Will enable ONLY the order metafield model.
  shopify_using_product_metafields: True ## False by default. Will enable ONLY the product metafield model.
  shopify_using_product_image_metafields: True ## False by default. Will enable ONLY the product image metafield model.
  shopify_using_product_variant_metafields: True ## False by default. Will enable ONLY the product variant metafield model.
  shopify_using_shop_metafields: True ## False by default. Will enable ONLY the shop metafield model.

Changing the Build Schema

By default this package will build the Shopify staging models within a schema titled (<target_schema> + _stg_shopify) and the Shopify final models within a schema titled (<target_schema> + _shopify) in your target database. If this is not where you would like your modeled Shopify data to be written to, add the following configuration to your dbt_project.yml file:

# dbt_project.yml

models:
  shopify:
    +schema: my_new_schema_name # leave blank for just the target_schema
  shopify_source:
    +schema: my_new_schema_name # leave blank for just the target_schema

Change the source table references

If an individual source table has a different name than the package expects, add the table name as it appears in your destination to the respective variable:

IMPORTANT: See this project's dbt_project.yml variable declarations to see the expected names.

# dbt_project.yml

vars:
    shopify_<default_source_table_name>_identifier: your_table_name 
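For example, if the package expects a table named order but yours landed as shop_order (both names here are illustrative; check the project's dbt_project.yml for the actual variable names), the override would look like:

```yaml
# dbt_project.yml

vars:
    shopify_order_identifier: shop_order # hypothetical example
```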

Lookback Window

Records from the source can sometimes arrive late. Since several of the models in this package are incremental, by default we look back 7 days to ensure late arrivals are captured while avoiding the need for frequent full refreshes. While this window can be shortened, we still recommend running dbt run --full-refresh periodically to maintain the data quality of the models. For more information on our incremental decisions, see the Incremental Strategy section of the DECISIONLOG.

To change the default lookback window, add the following variable to your dbt_project.yml file:

vars:
  shopify:
    lookback_window: number_of_days # default is 7
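Conceptually, the lookback window loosens an incremental model's filter so that late-arriving rows within the window are reprocessed on each run. A simplified sketch of the idea, not the package's exact code:

```sql
-- Illustrative incremental filter with a 7-day lookback
-- (dateadd syntax varies by warehouse)
{% if is_incremental() %}
where date_day >= (
    select dateadd('day', -7, max(date_day))
    from {{ this }}
)
{% endif %}
```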

Change the calendar start date

Our date-based models start at 2019-01-01 by default. To customize the start date, add the following variable to your dbt_project.yml file:

vars:
  shopify:
    shopify__calendar_start_date: 'yyyy-mm-dd' # default is 2019-01-01

(Optional) Step 7: Orchestrate your models with Fivetran Transformations for dbt Core™


Fivetran offers the ability for you to orchestrate your dbt project through Fivetran Transformations for dbt Core™. Learn how to set up your project for orchestration through Fivetran in our Transformations for dbt Core setup guides.

🔍 Does this package have dependencies?

This dbt package is dependent on the following dbt packages. Please be aware that these dependencies are installed by default within this package. For more information on the following packages, refer to the dbt hub site.

IMPORTANT: If you have any of these dependent packages in your own packages.yml file, we highly recommend that you remove them from your root packages.yml to avoid package version conflicts.

packages:
    - package: fivetran/shopify_source
      version: [">=0.12.0", "<0.13.0"]

    - package: fivetran/fivetran_utils
      version: [">=0.4.0", "<0.5.0"]

    - package: dbt-labs/dbt_utils
      version: [">=1.0.0", "<2.0.0"]

    - package: calogica/dbt_date
      version: [">=0.9.0", "<1.0.0"]
      
    - package: dbt-labs/spark_utils
      version: [">=0.3.0", "<0.4.0"]

🙌 How is this package maintained and can I contribute?

Package Maintenance

The Fivetran team maintaining this package only maintains the latest version of the package. We highly recommend you stay consistent with the latest version of the package and refer to the CHANGELOG and release notes for more information on changes across versions.

Contributions

A small team of analytics engineers at Fivetran develops these dbt packages. However, the packages are made better by community contributions!

We highly encourage and welcome contributions to this package. Check out this dbt Discourse article on the best workflow for contributing to a package!

🏪 Are there any resources available?

  • If you have questions or want to reach out for help, please refer to the GitHub Issue section to find the right avenue of support for you.
  • If you would like to provide feedback to the dbt package team at Fivetran or would like to request a new dbt package, fill out our Feedback Form.
  • Have questions or want to be part of the community discourse? Create a post in the Fivetran community and our team along with the community can join in on the discussion!

dbt_shopify's People

Contributors

dylanbaker, fivetran-avinash, fivetran-catfritz, fivetran-chloe, fivetran-jamie, fivetran-joemarkiewicz, fivetran-reneeli, fivetran-sheringuyen, fraserharris, jlmendgom5tran, kristin-bagnall, lizdeika, martinguindon


dbt_shopify's Issues

[Bug] Incremental Load in `shopify__customer_cohorts` has NULL fields

Is there an existing issue for this?

  • I have searched the existing issues

Describe the issue

Hi Fivetran folks,

Firstly, thank you for your hard work in maintaining this repository, it has helped me tremendously.

For context, I'm using Fivetran to extract Shopify data, load it into AWS Redshift, and transform it with this repo. I started using this connector last week, and I only have data for July and August 2024.

The problem I'm facing is in the table shopify__customer_cohorts: the lifetime columns total_price_lifetime, order_count_lifetime, line_item_count_lifetime, and cohort_month_number have NULL values. The image below shows the NULLs:

[Screenshot: NULL values in the lifetime columns of shopify__customer_cohorts]

I believe this problem occurs because the incremental load CTE -- backfill_lifetime_sums -- returns NULL values. Here is the compiled dbt script of the CTE:

backfill_lifetime_sums as (
    -- for incremental runs we need to fetch the prior lifetimes to properly continue adding to them
    select source_relation,
        customer_id,
        max(total_price_lifetime) as previous_total_price_lifetime,
        max(order_count_lifetime) as previous_order_count_lifetime,
        max(line_item_count_lifetime) as previous_line_item_count_lifetime,
        max(cohort_month_number) as previous_cohort_month_number
    from "dev"."env_var2_shopify"."shopify__customer_cohorts"
    where date_month < '2024-08-01'
    group by 1,
        2
),

I have identified a potential fix, which is to add a COALESCE function in these lines, like below:

...
coalesce(backfill_lifetime_sums.previous_cohort_month_number, 0) + windows.cohort_month_number as cohort_month_number,
coalesce(backfill_lifetime_sums.previous_total_price_lifetime,0) + windows.total_price_lifetime as total_price_lifetime,
coalesce(backfill_lifetime_sums.previous_order_count_lifetime,0)+ windows.order_count_lifetime as order_count_lifetime,
coalesce(backfill_lifetime_sums.previous_line_item_count_lifetime,0) + windows.line_item_count_lifetime as line_item_count_lifetime,
...

I'm interested to hear your perspective. If you think my potential fix is okay, could I raise a PR? Thanks!

Relevant error log or model output

No response

Expected behavior

The lifetime columns should have a value instead of NULL, as shown below:

[Screenshot: lifetime columns populated with values]

I got this expected result after adding the COALESCE function as mentioned above.

dbt Project configurations


# Name your project! Project names should contain only lowercase characters
# and underscores. A good package name should reflect your organization's
# name or the intended use of these models
name: 'dbt_tutorial'
version: '1.0.0'
config-version: 2

# This setting configures which "profile" dbt uses for this project.
profile: 'dbt_tutorial'

# These configurations specify where dbt should look for different types of files.
# The `model-paths` config, for example, states that models in this project can be
# found in the "models/" directory. You probably won't need to change these!
model-paths: ["models"]
analysis-paths: ["analyses"]
test-paths: ["tests"]
seed-paths: ["seeds"]
macro-paths: ["macros"]
snapshot-paths: ["snapshots"]

clean-targets:         # directories to be removed by `dbt clean`
  - "target"
  - "dbt_packages"


# Configuring models
# Full documentation: https://docs.getdbt.com/docs/configuring-models

vars: # must add
    shopify_database: dev
    shopify_schema: env_var2

# In this example config, we tell dbt to build all models in the example/
# directory as views. These settings can be overridden in the individual model
# files using the `{{ config(...) }}` macro.
models:
  dbt_tutorial:
    # Config indicated by + and applies to all files under models/example/
    example:
      +schema: final
      +materialized: view

Package versions

packages:
- package: fivetran/shopify
  version: [">=0.12.0", "<0.13.0"] 

What database are you using dbt with?

redshift

dbt Version

Core:
  - installed: 1.7.17
  - latest:    1.8.5  - Update available!

  Your version of dbt-core is out of date!
  You can find instructions for upgrading here:
  https://docs.getdbt.com/docs/installation

Plugins:
  - redshift: 1.7.7  - Update available!
  - postgres: 1.7.17 - Update available!

  At least one plugin is out of date or incompatible with dbt-core.
  You can find instructions for upgrading here:
  https://docs.getdbt.com/docs/installation

Additional Context

No response

Are you willing to open a PR to help address this issue?

  • Yes.
  • Yes, but I will need assistance and will schedule time during our office hours for guidance
  • No.

[Feature] Missing field from Orders

Is there an existing feature request for this?

  • I have searched the existing issues

Describe the Feature

Hi,

First off, thank you for creating this package! It's made bringing Shopify data into our data warehouse so much easier than I was expecting.

However, I've encountered a small issue. There is a status tied to orders that shows if an order is open, archived or canceled. It doesn't appear in the dataset I'm getting from the package and is causing some issues with reconciling reporting with what is available in the Shopify front-end. I can't find any way to identify a manually archived order without this status field.

I'm wondering if the field can be added?

Describe alternatives you've considered

I'm pretty new to using dbt, so I'm not fully au fait with the process of altering a package for my own use. I'm more than willing to make the change if you guys are able to provide assistance.

Are you interested in contributing this feature?

  • Yes.
  • Yes, but I will need assistance and will schedule time during your office hours for guidance.
  • No.

Anything else?

Shopify's documentation in the field in question: https://help.shopify.com/en/manual/orders/order-status

BUG - Null issue from MongoDB records

Are you a current Fivetran customer?
yes

Describe the bug
Whenever data is loaded from MongoDB, a record with null values is generated at the data warehouse end for every sync.

Steps to reproduce

  1. Synced data from MongoDB
  2. Once the sync is done, a record with nulls is generated at the DWH
  3. This creates errors in reports

Expected behavior
These null values should be avoided on every sync.

Project variables configuration

copy dbt_project.yml here

Package Version

copy packages.yml here

Warehouse

  • BigQuery
  • Redshift
  • [x] Snowflake
  • Postgres
  • Databricks
  • Other (provide details below)

Additional context

Screenshots

Please indicate the level of urgency

Are you interested in contributing to this package?

  • Yes, I can do this and open a PR for your review.
  • Possibly, but I'm not quite sure how to do this. I'd be happy to do a live coding session with someone to get this fixed.
  • No, I'd prefer if someone else fixed this. I don't have the time and/or don't know what the root cause of the problem is.

[Feature] Shopify customers to be based on email address rather than customer ID

Is there an existing feature request for this?

  • I have searched the existing issues

Describe the Feature

The Fivetran Shopify models for shopify__customers and shopify__customer_cohorts are very helpful packages. However they are based entirely around the Shopify customer_id.

In Shopify, when a customer checks out as a guest, a new customer_id is created every time, even if the customer uses the same email address.

It would be a great improvement in the accuracy of these models to base the models on customer email (or hashed email address if that is used) rather than customer id. This will improve the accuracy of new vs returning customers as well as lifetime value etc.

Describe alternatives you've considered

No response

Are you interested in contributing this feature?

  • Yes.
  • Yes, but I will need assistance and will schedule time during your office hours for guidance.
  • No.

Anything else?

No response

[Bug] shopify__orders.fixed_amount_discount_amount is calculated incorrectly

Is there an existing issue for this?

  • I have searched the existing issues

Describe the issue

The field shopify__orders.fixed_amount_discount_amount filters for and aggregates shipping discounts:

sum(case when type = 'shipping' then amount else 0 end) as fixed_amount_discount_amount,

Relevant error log or model output

No response

Expected behavior

The field shopify__orders.fixed_amount_discount_amount should be calculated by filtering for fixed_amount coupons.

dbt Project configurations

N/A

Package versions

packages:
  - package: dbt-labs/dbt_utils
    version: 1.1.1
  - package: fivetran/shopify
    version: [">=0.11.0", "<0.12.0"]

What database are you using dbt with?

bigquery

dbt Version

Core:
  - installed: 1.5.2
  - latest:    1.7.11 - Update available!

  Your version of dbt-core is out of date!
  You can find instructions for upgrading here:
  https://docs.getdbt.com/docs/installation

Plugins:
  - bigquery: 1.5.3 - Update available!

  At least one plugin is out of date or incompatible with dbt-core.
  You can find instructions for upgrading here:
  https://docs.getdbt.com/docs/installation

Additional Context

https://shopify.dev/docs/api/admin-rest/2024-04/resources/order#resource-object shows there are three discount code types, fixed_amount, percentage and shipping. This looks like a copy/paste error, and shipping should be updated to fixed_amount.

Ex:

Current
sum(case when type = 'shipping' then amount else 0 end) as fixed_amount_discount_amount,
Fix
sum(case when type = 'fixed_amount' then amount else 0 end) as fixed_amount_discount_amount,

Are you willing to open a PR to help address this issue?

  • Yes.
  • Yes, but I will need assistance and will schedule time during our office hours for guidance
  • No.

[Feature] Set Models to Incremental Materializations

Is there an existing feature request for this?

  • I have searched the existing issues

Describe the Feature

Adapter: Snowflake
Would love an easy way to set the output tables to refresh incrementally. Currently, tables are replaced every time a dbt job is run. Because of this, I am unable to create snowflake streams or dynamic tables that reference these tables as a source. If the tables were refreshed incrementally, the links to these dynamic tables would not break.

Describe alternatives you've considered

No response

Are you interested in contributing this feature?

  • Yes.
  • Yes, but I will need assistance and will schedule time during your office hours for guidance.
  • No.

Anything else?

No response

[Feature] Customizable start date for shopify__calendar

Is there an existing feature request for this?

  • I have searched the existing issues

Describe the Feature

Currently the shopify__calendar has a start date of 2019-01-01. I have orders and customers from 2018. Under the current structure, customer_cohorts shows these customers with 0 orders in month 1. To be in a given cohort, you need at least 1 order in month 1, since that is what qualifies you. These customers had their first order in 2018, but their cohort month 1 is calculated starting 2019-01-01. I've attached sample data to illustrate the scenario. I've removed the customer_id; everything else is as created by the dbt package.

customer_cohorts_example_data.csv

Describe alternatives you've considered

I could clone the repository and modify it, but I'm just learning this stuff, so I'd prefer a variable or an earlier start date for the spine.

Are you interested in contributing this feature?

  • Yes.
  • Yes, but I will need assistance and will schedule time during your office hours for guidance.
  • No.

Anything else?

No response

Multiple Shopify Shops

In our setup, we have multiple shopify shops (they all land in different schemas in snowflake) and would like to use this package on all of them. Is there any way to achieve this? Or maybe point me in the right direction and I'd be more than glad to generate a PR?

Thanks for the package guys!

Add refund amounts to this package

Add logic to account for both:

  • Order lines refund amounts
  • Order refund amounts

Complications:

  • Need to figure out how to address partial refunds and shipping refunds. What about discounts?

[Feature] Add postgres compatibility

Is there an existing feature request for this?

  • I have searched the existing issues

Describe the Feature

Add postgres compatibility

Describe alternatives you've considered

No response

Are you interested in contributing this feature?

  • Yes.
  • Yes, but I will need assistance and will schedule time during your office hours for guidance.
  • No.

Anything else?

No response

BUG - accept `null` values in JSON

Are you a current Fivetran customer?
Yes -- Blue Bottle Coffee

Describe the bug
A database error occurs when there are null values in the JSON. Example JSON in the receipt column of stg_shopify__transaction below:

{
  "charges": {
    "data": [
      {
        "balance_transaction": {
          "exchange_rate": null
        }
      }
    ]
  }
}

Steps to reproduce
When running the shopify__transactions model, I see this error:

Database Error in model shopify__transactions (models/shopify__transactions.sql)
  Invalid input syntax for type numeric
  DETAIL:  
    -----------------------------------------------
    error:  Invalid input syntax for type numeric
    code:      8001
    context:   value: ""
    query:     1559843
    location:  cg_util.cpp:621
    process:   query6_100_1559843 [pid=29142]
    -----------------------------------------------

Package Version

  - package: fivetran/shopify
    version: 0.4.1
    # documentation: https://github.com/fivetran/dbt_shopify

Warehouse

  • BigQuery
  • Redshift
  • Snowflake
  • Postgres
  • Databricks
  • Other (provide details below)

Duplicate aggregates for customers

There are duplicate columns for two customer-level facts; one set of columns is provided by the raw data, and another is introduced in the shopify__customers_order_aggregates model.

These two sets of columns are equivalent and return the same data:

| stg_shopify__customer (raw) | shopify__customers_order_aggregates |
| --- | --- |
| orders_count | number_of_orders |
| total_spent | lifetime_total_price |

As discussed with Kristin, I'll issue a PR that will fix this by removing the duplicate aggregates from the shopify__customers_order_aggregates model.

This will be a breaking change, as anyone using this package and using the number_of_orders or lifetime_total_price columns will need to update their models to use the ones provided by the source.

[Feature] Obtain the Exchange Rate field from somewhere else other than the `receipt` field in Transactions

Is there an existing feature request for this?

  • I have searched the existing issues

Describe the Feature

Currently, the package obtains the exchange_rate field from the Transactions object in a JSON field titled receipt. This works to obtain the exchange rate provided within the receipt. However, the receipt for different payment processors is not always the same and also may not be entirely accurate (especially when considering returns).

This FR is to leverage some other method of obtaining the exchange_rate field to be used within the downstream transformation logic in this package.

Describe alternatives you've considered

At the moment we will continue to leverage the receipt field in the Transactions table.

Unfortunately, we are not aware of another solution, but this FR is intended for the team and community to ideally find an alternative for the package to leverage.

Are you interested in contributing this feature?

  • Yes.
  • Yes, but I will need assistance and will schedule time during your office hours for guidance.
  • No.

Anything else?

No response

[Feature] Add metrics to Cohort model like Net Sales & Gross Margin

Is there an existing feature request for this?

  • I have searched the existing issues

Describe the Feature

As an analyst at an ecommerce company, I would like to report on Shopify results using financial metrics like Gross Sales, Net Sales, Gross Margin, etc.

It would be helpful to calculate these metrics in the shopify__customer_cohorts model. Ecommerce companies typically measure lifetime_value using Net Sales & Gross Margin to determine profitability of cohorts. It would be helpful to have each of the above metrics included in the monthly and lifetime columns of this model.

Describe alternatives you've considered

No response

Are you interested in contributing this feature?

  • Yes.
  • Yes, but I will need assistance and will schedule time during your office hours for guidance.
  • No.

Anything else?

See the Shopify Finances Report for examples of metrics that could be added to this Fivetran package.

[Question] After updating to the 0.12.0 package, the orders, order_lines and transactions models fail

Is there an existing issue for this?

  • I have searched the existing issues

Describe the issue

After we updated the dbt package from 0.11.0 to 0.12.0, three tables started failing (an hour earlier, on the older version, dbt run completed successfully).

Relevant error log or model output

10:57:30
10:57:30    Database Error in model shopify__transactions (models/shopify__transactions.sql)
  000904 (42000): SQL compilation error: error line 3 at position 16
  invalid identifier 'TRANSACTIONS_UNIQUE_ID'
  compiled Code at target/run/shopify/models/shopify__transactions.sql
10:57:30  
10:57:30
10:57:30    Database Error in model shopify__order_lines (models/shopify__order_lines.sql)
  000904 (42000): SQL compilation error: error line 3 at position 16
  invalid identifier 'ORDER_LINES_UNIQUE_KEY'
  compiled Code at target/run/shopify/models/shopify__order_lines.sql
10:57:30  
10:57:30
10:57:30    Database Error in model shopify__orders (models/shopify__orders.sql)
  000904 (42000): SQL compilation error: error line 3 at position 16
  invalid identifier 'ORDERS_UNIQUE_KEY'
  compiled Code at target/run/shopify/models/shopify__orders.sql
10:57:30  
10:57:30
10:57:30  Done. PASS=352 WARN=0 ERROR=3 SKIP=7 TOTAL=362

Expected behavior

to run without issues

dbt Project configurations

vars:
  shopify_database: raw
  shopify_union_schemas: ['shopify_uk','shopify_ca','shopify_fr']
  shopify_using_fulfillment_event: true # false by default
  shopify_source:
    order_pass_through_columns:
      - name: "region"
        alias: "region"
        transform_sql: "case when source_relation = 'shopify_uk' then 'UK' when source_relation = 'shopify_fr' then 'FR' else 'CA' end"
    order_line_pass_through_columns:
      - name: "region"
        alias: "region"
        transform_sql: "case when source_relation = 'shopify_uk' then 'UK' when source_relation = 'shopify_fr' then 'FR' else 'CA' end"
    customer_pass_through_columns:
      - name: "region"
        alias: "region"
        transform_sql: "case when source_relation = 'shopify_uk' then 'UK' when source_relation = 'shopify_fr' then 'FR' else 'CA' end"
    order_line_refund_pass_through_columns:
      - name: "region"
        alias: "region"
        transform_sql: "case when source_relation = 'shopify_uk' then 'UK' when source_relation = 'shopify_fr' then 'FR' else 'CA' end"
    daily_shop_pass_through_columns:
      - name: "region"
        alias: "region"
        transform_sql: "case when source_relation = 'shopify_uk' then 'UK' when source_relation = 'shopify_fr' then 'FR' else 'CA' end"
    customer_cohorts_pass_through_columns:
      - name: "region"
        alias: "region"
        transform_sql: "case when source_relation = 'shopify_uk' then 'UK' when source_relation = 'shopify_fr' then 'FR' else 'CA' end"
#    shopify_timezone: "case when source_relation = 'shopify_uk' then 'Europe/London' when source_relation = 'shopify_fr' then 'Europe/Paris' else 'America/Toronto' end"


models:
  shopify:
    +schema: shopify # leave blank for just the target_schema
    intermediate:
      +schema: staging # leave blank for just the target_schema
  shopify_source:
    +schema: staging # leave blank for just the target_schema


Package versions

  • package: fivetran/shopify
    version: [">=0.12.0", "<0.13.0"]

What database are you using dbt with?

snowflake

dbt Version

latest

Additional Context

No response

Are you willing to open a PR to help address this issue?

  • Yes.
  • Yes, but I will need assistance and will schedule time during our office hours for guidance
  • No.

[Bug] shopify__transactions model is failing due to incomplete JSON field in Redshift

Is there an existing issue for this?

  • I have searched the existing issues

Describe the issue

It has come to our attention that for some users leveraging the Shopify connector on Redshift, some receipt records from the transactions table are being inserted into their warehouse as the string "null". This is not a supported JSON object, nor is it a compatible null entry, as it is recorded as a string.

This behavior ultimately results in a failure when trying to parse the receipt field within the shopify__transactions model, as the string "null" is interpreted as neither null nor a JSON field, resulting in the error below.

While we should focus on ensuring the connector syncs complete records, we should also explore how we may make the data model more flexible. We could take a similar approach of catching invalid JSON before parsing, as we did in the recent Fivetran Platform data model release with the inclusion of the fivetran_log_fivetran_parse() macro, to allow for better error handling.

This could also be a good time to consider applying the update to fivetran_utils. However, I would want to consider the testing implications before then.
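As a sketch of what such error handling could look like (hypothetical, not the package's actual code), the literal string "null" could be nulled out before any JSON parsing is attempted:

    -- hypothetical guard: convert the literal string 'null' (and empty
    -- strings) to SQL NULL before the receipt field reaches json_parse;
    -- the staging model name is illustrative
    select
        *,
        nullif(nullif(receipt, 'null'), '') as receipt_cleaned
    from {{ ref('stg_shopify__transaction') }}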

Relevant error log or model output

Error executing query:
SQL Error [XX000]: ERROR: JSON parsing error
  Detail: 
  -----------------------------------------------
  error:  JSON parsing error
  code:      8001
  context:   invalid json object null
  query:     646212249
  location:  loc
  process:   query12345

Expected behavior

The Shopify data model is able to properly parse the compatible JSON fields and ignores the incompatible ones.

dbt Project configurations

Default config for Quickstart

Package versions

v0.10.0

What database are you using dbt with?

redshift

dbt Version

v1.3.1

Additional Context

No response

Are you willing to open a PR to help address this issue?

  • Yes.
  • Yes, but I will need assistance and will schedule time during our office hours for guidance
  • No.

[Feature] Leverage macro to ensure no fan outs

Is there an existing feature request for this?

  • I have searched the existing issues

Describe the Feature

With the removal of dbt_expectations.expect_table_row_count_to_equal_other_table in PR #66, we'll need to come up with an in-house macro that checks that the metafield models have the same row count as the staging models.

Describe alternatives you've considered

No response

Are you interested in contributing this feature?

  • Yes.
  • Yes, but I will need assistance and will schedule time during your office hours for guidance.
  • No.

Anything else?

No response

Error in models/shopify__transactions.sql Related to Calculated Exchange Rates for Refunds

Is there an existing issue for this?

  • I have searched the existing issues

Describe the issue

The exchange rate related columns are calculated via this code:

    select
        *,
        coalesce(cast(nullif({{ fivetran_utils.json_parse("receipt",["charges","data",0,"balance_transaction","exchange_rate"]) }}, '') as {{ dbt.type_numeric() }} ),1) as exchange_rate,
        coalesce(cast(nullif({{ fivetran_utils.json_parse("receipt",["charges","data",0,"balance_transaction","exchange_rate"]) }}, '') as {{ dbt.type_numeric() }} ),1) * amount as currency_exchange_calculated_amount
    from joined

However, the structure of the JSON object in the receipt column is different for refunds. This leads to the exchange_rate field always falling back to the coalesced value of 1 for refund transactions. This is subtle for conversion rates close to 1.0, but becomes very problematic for rates that are not, for example CRC, which converts at roughly 0.0019 USD to 1 CRC.

Relevant error log or model output

Here are examples of the JSON for captures versus refunds (I removed all non-relevant fields, etc.):

capture

{
  "charges": {
    "data": [
      {
        "balance_transaction": {
          "exchange_rate": 0.00184531
        }
      }
    ]
  }
}

refund

{
  "balance_transaction": {
    "exchange_rate": 0.00187629,
    "id": "txn_12345",
    "object": "balance_transaction"
  },
  "charge": {
    "balance_transaction": "txn_12345"
  }
}
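Given the two receipt shapes above, one hedged sketch of a fix would be to probe the capture path first, then the refund path, and only then fall back to 1:

    -- sketch only: try both JSON paths before defaulting to 1;
    -- assumes the two shapes shown above are exhaustive
    coalesce(
        cast(nullif({{ fivetran_utils.json_parse("receipt", ["charges","data",0,"balance_transaction","exchange_rate"]) }}, '') as {{ dbt.type_numeric() }}),
        cast(nullif({{ fivetran_utils.json_parse("receipt", ["balance_transaction","exchange_rate"]) }}, '') as {{ dbt.type_numeric() }}),
        1
    ) as exchange_rate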


Expected behavior

The expected behavior is to calculate the correct exchange rates, etc. for refund transactions.

dbt Project configurations

name: chabi
version: 1.0.0
config-version: 2
profile: '{{ env_var(''DBT_PROFILE'', '''') }}'
model-paths: ["models"]
analysis-paths: ["analyses"]
test-paths: ["tests"]
seed-paths: ["seeds"]
macro-paths: ["macros"]
snapshot-paths: ["snapshots"]
target-path: target
clean-targets:
  - target
  - dbt_packages
models:
  +persist_docs:
    relation: true
    columns: true
  +sql_header: alter session set timezone = 'America/Los_Angeles'; alter session set week_start = 1; alter session set TIMESTAMP_TYPE_MAPPING = TIMESTAMP_LTZ;
  ad_reporting:
    +schema:
    intermediate:
      +schema: stg
  klaviyo:
    +schema:
    intermediate:
      +schema: stg
      +materialized: table
  klaviyo_source:
    +schema: stg
  shopify:
    +schema:
  shopify_source:
    +schema: stg
  shopify_holistic_reporting:
    +schema:
    intermediate:
      +schema: stg
  microsoft_ads:
    +schema: stg
  microsoft_ads_source:
    +schema: stg
  amazon_ads:
    +schema: stg
  amazon_ads_source:
    +schema: stg
  google_ads:
    +schema: stg
  google_ads_source:
    +schema: stg
  tiktok_ads:
    +schema: stg
  tiktok_ads_source:
    +schema: stg
  facebook_ads:
    +schema: stg
  facebook_ads_source:
    +schema: stg
  snapchat_ads:
    +schema: stg
  snapchat_ads_source:
    +schema: stg
vars:
  iana_timezone: America/Los_Angeles
  snapchat_schema: SNAPCHAT_ADS
  pinterest_schema:
  amazon_ads_schema: AMAZON_ADS
  google_ads_schema: GOOGLE_ADS
  tiktok_ads_schema: TIKTOK_ADS
  facebook_ads_schema: FACEBOOK_ADS
  linkedin_ads_schema:
  microsoft_ads_schema: BINGADS1
  pinterest__using_keywords: false
  ad_reporting__amazon_ads_enabled: true
  ad_reporting__google_ads_enabled: true
  ad_reporting__reddit_ads_enabled: false
  ad_reporting__tiktok_ads_enabled: true
  ad_reporting__twitter_ads_enabled: false
  ad_reporting__facebook_ads_enabled: true
  ad_reporting__linkedin_ads_enabled: false
  ad_reporting__snapchat_ads_enabled: true
  ad_reporting__microsoft_ads_enabled: true
  ad_reporting__pinterest_ads_enabled: false
  amazon_ads__portfolio_history_enabled: false
  ad_reporting__apple_search_ads_enabled: false
  klaviyo_schema: KLAVIYO
  shopify_schema: SHOPIFY

Package versions

packages:
  - package: dbt-labs/codegen
    version: 0.9.0
  - package: dbt-labs/dbt_utils
    version: ['>=1.0.0', '<2.0.0']
  - package: fivetran/ad_reporting
    version: 1.5.0
  - package: fivetran/klaviyo
    version: 0.5.0
  - package: fivetran/shopify
    version: 0.8.1
  - package: fivetran/shopify_holistic_reporting
    version: 0.4.0

What database are you using dbt with?

snowflake

dbt Version

Core:

  • installed: 1.5.0
  • latest: 1.7.2 - Update available!

Your version of dbt-core is out of date!
You can find instructions for upgrading here:
https://docs.getdbt.com/docs/installation

Plugins:

  • snowflake: 1.5.0 - Update available!

At least one plugin is out of date or incompatible with dbt-core.
You can find instructions for upgrading here:
https://docs.getdbt.com/docs/installation


Additional Context

No response

Are you willing to open a PR to help address this issue?

- [ ] Yes.
- [X] Yes, but I will need assistance and will schedule time during our [office hours](https://calendly.com/fivetran-solutions-team/fivetran-solutions-team-office-hours) for guidance
- [ ] No.

[Feature] Setting multiple timezones for unioned multiple connectors

Is there an existing feature request for this?

  • I have searched the existing issues

Describe the Feature

We are using multiple Shopify connectors, each of which runs in a different timezone. At the moment I can only assign one timezone for the whole package; it would be great to be able to assign a particular timezone to a particular connector (e.g. Europe/London -> shopify_uk, Europe/Paris -> shopify_fr). Otherwise, unioning multiple connections only makes sense if they are all in the same timezone.
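One possible interface for this (purely hypothetical; the package does not currently support it) would be a mapping variable keyed by connector schema:

    # hypothetical variable shape, not currently supported by the package
    vars:
      shopify_timezones:
        shopify_uk: Europe/London
        shopify_fr: Europe/Paris
        shopify_ca: America/Toronto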

Describe alternatives you've considered

No response

Are you interested in contributing this feature?

  • Yes.
  • Yes, but I will need assistance and will schedule time during your office hours for guidance.
  • No.

Anything else?

No response

[Bug] Compilation error in model int_shopify__order__shipping_aggregates

Is there an existing issue for this?

  • I have searched the existing issues

Describe the issue

When trying to upgrade the fivetran/shopify dbt package from version 0.7 to 0.8, I'm experiencing the following compilation errors in the models.

However, when I updated dbt-bigquery from 1.4.0 to 1.4.5, the errors disappeared.

This probably means that the require-dbt-version needs to be updated from [">=1.3.0", "<2.0.0"] to [">=1.4.5", "<2.0.0"] for the version 0.8.0 to work.

Kind regards,
Kristof.

Relevant error log or model output

Compilation Error in model int_shopify__order__shipping_aggregates (models/intermediate/int_shopify__order__shipping_aggregates.sql)
13:27:09    dbt was unable to infer all dependencies for the model "int_shopify__order__shipping_aggregates".
13:27:09    This typically happens when ref() is placed within a conditional block.
13:27:09    
13:27:09    To fix this, add the following hint to the top of the model "int_shopify__order__shipping_aggregates":
13:27:09    
13:27:09    -- depends_on: {{ ref('stg_shopify__order_shipping_line') }}
13:27:17  20 of 24 OK created sql table model dev_kristof_base.shopify__order_lines ...... [CREATE TABLE (4.8m rows, 2.3 GB processed) in 8.45s]
13:27:17  22 of 24 SKIP relation dev_kristof_base.shopify__products due to ephemeral model error  [ERROR SKIP]
13:27:17  Compilation Error in model int_shopify__products_with_aggregates (models/intermediate/int_shopify__products_with_aggregates.sql)
13:27:17    dbt was unable to infer all dependencies for the model "int_shopify__products_with_aggregates".
13:27:17    This typically happens when ref() is placed within a conditional block.
13:27:17    
13:27:17    To fix this, add the following hint to the top of the model "int_shopify__products_with_aggregates":
13:27:17    
13:27:17    -- depends_on: {{ ref('stg_shopify__collection_product') }}
13:27:17  23 of 24 SKIP relation dev_kristof_entities.products due to ephemeral model error  [ERROR SKIP]
13:27:17  Compilation Error in model int_shopify__products_with_aggregates (models/intermediate/int_shopify__products_with_aggregates.sql)
13:27:17    dbt was unable to infer all dependencies for the model "int_shopify__products_with_aggregates".
13:27:17    This typically happens when ref() is placed within a conditional block.
13:27:17    
13:27:17    To fix this, add the following hint to the top of the model "int_shopify__products_with_aggregates":
13:27:17    
13:27:17    -- depends_on: {{ ref('stg_shopify__collection_product') }}
13:27:17  24 of 24 SKIP relation dev_kristof_bi.product_assortment_analysis due to ephemeral model error  [ERROR SKIP]
13:27:17  Compilation Error in model int_shopify__products_with_aggregates (models/intermediate/int_shopify__products_with_aggregates.sql)
13:27:17    dbt was unable to infer all dependencies for the model "int_shopify__products_with_aggregates".
13:27:17    This typically happens when ref() is placed within a conditional block.
13:27:17    
13:27:17    To fix this, add the following hint to the top of the model "int_shopify__products_with_aggregates":
13:27:17    
13:27:17    -- depends_on: {{ ref('stg_shopify__collection_product') }}
13:27:17  
13:27:17  Finished running 9 view models, 15 table models in 0 hours 0 minutes and 24.21 seconds (24.21s).
13:27:17  
13:27:17  Completed with 4 errors and 0 warnings:
13:27:17  
13:27:17  Compilation Error in model.shopify.shopify__orders, caused by compilation error in referenced ephemeral model model.shopify.int_shopify__order__shipping_aggregates
13:27:17  
13:27:17  Compilation Error in model.shopify.shopify__products, caused by compilation error in referenced ephemeral model model.shopify.int_shopify__products_with_aggregates
13:27:17

Expected behavior

These models come fresh out of the box so I would expect those models to generate without an issue.

dbt Project configurations

vars:
  shopify_union_schemas: [
    'shopify_store_nl',
    'shopify_store_fr',
    'shopify_store_en',
  ] # use this if the data is in different schemas/datasets of the same database/project
  shopify_database: 'pitpit-data-raw'
  shopify_using_fulfillment_event: false

Package versions

packages:

  • package: fivetran/shopify
    version: ['>=0.8.0', '<0.9.0']

What database are you using dbt with?

bigquery

dbt Version

Core:

  • installed: 1.4.1
  • latest: 1.4.5 - Update available!

Your version of dbt-core is out of date!
You can find instructions for upgrading here:
https://docs.getdbt.com/docs/installation

Plugins:

  • bigquery: 1.4.0 - Update available!

At least one plugin is out of date or incompatible with dbt-core.

Additional Context

No response

Are you willing to open a PR to help address this issue?

  • Yes.
  • Yes, but I will need assistance and will schedule time during our office hours for guidance
  • No.

QUESTION - Dynamically Generate Schemas to Union

Are you a Fivetran customer?
Yes. Russ Bradberry, Thrasio

Your Question
How do we dynamically supply schema names to union? We have hundreds of schemas and are adding new ones all the time. We can easily get the list of schemas we need by running the following SQL:

    SELECT LOWER(schema_name) AS schema_name
      FROM source_fivetran.information_schema.schemata
     WHERE schema_name ILIKE 'shopify_%'

So we would like to create a macro and somehow pass that list along without using the vars property in dbt_project.yml

Additional context
we tried creating a macro like the following:

{% macro shopify_schemas(prefix='shopify_') %}

{% set shopify_schema_query %}
    SELECT LOWER(schema_name) AS schema_name
      FROM source_fivetran.information_schema.schemata
     WHERE schema_name ILIKE '{{ prefix }}%'
{% endset %}

{% set results = run_query(shopify_schema_query) %}

{% if execute %}
{%   set results_list = results.columns[0].values() %}
{% else %}
{%   set results_list = [] %}
{% endif %}

{{  log(results_list) }}
{{ return(results_list) }}

{% endmacro %}

and using it like:

vars:
  shopify_database: source_fivetran
  union_schemas: "{{ shopify_schemas() }}"

but it seems that the YAML interprets the list as a string, so we weren't able to make any progress this way.
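That behavior is expected: values under vars: in dbt_project.yml are plain YAML strings, so the macro call is never rendered into a list. A hedged workaround sketch (a custom union model that consumes the macro directly, bypassing the package's variable; the schema and table names are illustrative) would be:

    -- sketch only: consume the macro's result at the point of use
    -- instead of passing it through dbt_project.yml vars
    {% set schemas = shopify_schemas() %}
    {% for s in schemas %}
    select *, '{{ s }}' as source_schema
    from source_fivetran.{{ s }}."order"
    {% if not loop.last %}union all{% endif %}
    {% endfor %}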

Please indicate the level of urgency and business impact of this request
While we can attempt to maintain a static list, it isn't scalable: the team doing the Fivetran integrations is not the same as the team maintaining dbt, so things may get lost. Ideally, we would be able to generate the list dynamically to avoid this requirement.

[Feature] Update readme

Is there an existing feature request for this?

  • I have searched the existing issues

Describe the Feature

The README needs to be updated.

Describe alternatives you've considered

No response

Are you interested in contributing this feature?

  • Yes.
  • Yes, but I will need assistance and will schedule time during your office hours for guidance.
  • No.

Anything else?

No response

[Feature] Shopify v2022-07 -> v2023-04 API Update Schema Change Considerations

Is there an existing feature request for this?

  • I have searched the existing issues

Describe the Feature

Shopify is upgrading from their v2022-07 to v2023-04 API. This is resulting in a number of deprecations and new field additions. For this package (and likely the downstream dbt_shopify package), the following changes will need to be made to reflect the connector changes:

  • OrderLine Resource (ORDER_LINE Table)
    • origin_location (deprecated)
    • destination_location (deprecated)
    • fulfillment_service (deprecated)
  • Order Resource
    • total_price_usd (deprecated)
    • processing_method (2023-04) (unsupported)
  • ProductVariant Resource (PRODUCT_VARIANT Table)
    • title (deprecated)

Describe alternatives you've considered

This will still likely work following the API upgrade, but it would be best to update the package to ensure it uses the latest fields provided by the API and connector.

Are you interested in contributing this feature?

  • Yes.
  • Yes, but I will need assistance and will schedule time during your office hours for guidance.
  • No.

Anything else?

No response

[Feature] explore removing references to the transactions table

Is there an existing feature request for this?

  • I have searched the existing issues

Describe the Feature

We've confirmed that monetary values stored in non-transaction tables (orders, order_lines, etc.) are reported in the shop currency, so perhaps we don't need to use the transaction table's exchange rate data to get order totals.

Describe alternatives you've considered

No response

Are you interested in contributing this feature?

  • Yes.
  • Yes, but I will need assistance and will schedule time during your office hours for guidance.
  • No.

Anything else?

No response
