Coder Social home page Coder Social logo

dbt-assertions's People

Contributors

axelthevenot avatar b-per avatar vvaneeclo avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar

dbt-assertions's Issues

Add integration tests to facilitate adding support for new adapters

It would be great if the package included some integration tests, like in dbt_project_evaluator for example: https://github.com/dbt-labs/dbt-project-evaluator/tree/main/integration_tests

While it would be awesome to have those tests run in a CI/CD pipeline, before that, having tests defined would allow more people to contribute, making sure that their implementation is working.

The way it works in dbt_project_evaluator is that the folder is a dbt project itself, that installs the package from .. and people can then cd to the project and do a dbt build, which run models and also tests that the output of those models matches some predefined seeds.

feat(doc): add script to override dbt docs

Add a script to overide dbt docs generate results (from manifest.json ?) in order to add documentation about the assertions in the .yml definition which are ignored by dbt.

This can be a simple Python script.

I think a resulting Markdown table will be a good idea.

Example

assertion description expression null_as_error
unique Rows must be unique over columns (key_1, key_2) n/a false
open_date_is_past Open date must be in the past open_date <= CURRENT_DATE() false
... ... ... ...

feat(soft_unique): add `__soft_unique__` helper

__unique__ helper will flag duplicated rows but some usages requires to flag rows without having to remove all of them but keeping one version of each.

__soft_unique__ can be a new helper replacing 1 = COUNT(1) OVER(...) by 1 < ROW_NUMBER() OVER(...)
But how to discriminate rows with an ordering ?

Maybe rethink the v2 with

__unique__:
    by:
        - key_1
        - key_2
    soft: true
    order_by: <str> | <List<str>>
__not_null__:
    by:
        - key_1
        - key_2
    layered: true

Add support for Snowflake

First of all, thanks for releasing this package! I really think that it offers a new way to solve specific problems in dbt.

Now, from an adapter point of view, I am keen to help supporting Snowflake.

Arrays functions are so different between different warehouses that there is a bit of work required to make it work.

For now, I have tested snowflake___assertions_expression which seems to work

{%- macro snowflake___assertions_expression(from_column, assertions) -%}

ARRAY_CAT(
    {%- for assertion_id, assertion_config in assertions.items() %}

    {%- set expression      = assertion_config.expression if '\n' not in assertion_config.expression else assertion_config.expression | indent(12) -%}
    {%- set description     = assertion_config.description -%}
    {%- set null_as_error = 'FALSE' if (assertion_config.null_as_error is not defined or assertion_config.null_as_error is true) else 'TRUE' %}

    /* {{ assertion_id }}: {{ description }} */
    {{ 'ARRAY_CAT(' if not loop.last }}
    IFF(
        COALESCE({{ expression }}, {{ null_as_error }}) = FALSE,
        ['{{ assertion_id }}'],
        []
    ){{ ',' if not loop.last }}
    {%- endfor %}
    {{ '[]' if assertions | length == 0}}
    {{ ')' * (assertions | length - 1) ~ ','}}
    []
) AS {{ from_column }}

{%- endmacro %}

I will tackle assertions_filter next week.

Question:

Would it make sense to rename some of the default__assertions_filter macros to bigquery__assertions_filter? The difference is so significant for some macros that I think it makes sense to do so.

feat(contrib): add pull request & issue templates

In order to facilitate contributions create a PR template.

In order to facilitate ideas and issue tracking, open issues template for at least new feature request and bugs.

Don't know yet how, but keep it simple for both

Terminology - rename `__unique__` to `__unique_combination__` (and/or change behaviour of `__unique__`)

I looked at __unique__ and __not_null__ this morning and got a bit confused by the different behaviour when listing multiple columns. The README is accurate but I was surprised by how the 2 macros handle multiple columns.

Example with col1 and col2 checked:

  • __unique__ will check that the combination of col1 and col2 is unique
  • __not_null__ will check that the each of col1 and col2 is unique, creating 2 different assertions

I feel that there is an opportunity to maybe update __unique__ to align it with __not_null__ and maybe create an additional __unique_combination__. With the same use case as above:

  • __unique__ could check the uniqueness of each column, creating 2 assertions
  • __unique_combination__ could check the overall uniqueness of the combination of columns (like __unique__ today)

What do you think?

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.