Contract identity issue (soda-core, open)

tombaeyens commented on May 26, 2024
Contract identity issue

from soda-core.

Comments (5)

tools-soda commented on May 26, 2024

SAS-3274

tombaeyens commented on May 26, 2024

The problem with the current identity generation is that dataset-level checks will often need an identity_suffix. We need to improve the default behavior so that users need to fiddle with the identity in fewer circumstances.

tombaeyens commented on May 26, 2024

Goal: create a better correlation mechanism between checks in the files and checks in Soda Cloud. We want to impose the least possible burden on users and avoid the need for them to configure correlation ids themselves. But we also want users to be able to edit contract files freely and move checks around while still preserving the identity. If the identity is not preserved, a new check is created on Soda Cloud, losing the metric history and historic check results.

Proposed solution 1:

The check identity is composed of

  • scope
    • warehouse_name
    • schema_name
    • dataset_name
    • column_name
    • check type
  • check type specific correlation properties
    • expression_sql hash for metric_expression checks
    • query_sql hash for metric_query checks
  • user defined correlation_id

Users then only have to specify a correlation_id if the scope and check type specific correlation properties do not distinctly identify a check in the source YAML file.
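
A minimal sketch of how the composed identity from solution 1 could be computed. The function names `check_identity_v1` and `_short_hash`, the JSON serialization, and the 8-character truncated SHA-256 are illustrative assumptions, not the soda-core implementation; only the property names come from the list above.

```python
import hashlib
import json
from typing import Optional


def _short_hash(text: str) -> str:
    """Return the first 8 hex characters of the SHA-256 digest of text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()[:8]


def check_identity_v1(
    warehouse_name: str,
    schema_name: str,
    dataset_name: str,
    column_name: Optional[str],
    check_type: str,
    expression_sql: Optional[str] = None,
    query_sql: Optional[str] = None,
    correlation_id: Optional[str] = None,
) -> str:
    """Hypothetical identity for solution 1 (not the soda-core API)."""
    # Scope: where the check lives and what kind of check it is.
    parts = [warehouse_name, schema_name, dataset_name, column_name, check_type]
    # Check type specific correlation property: a hash of the SQL text.
    if check_type == "metric_expression" and expression_sql is not None:
        parts.append(_short_hash(expression_sql))
    elif check_type == "metric_query" and query_sql is not None:
        parts.append(_short_hash(query_sql))
    else:
        parts.append(None)
    # User defined correlation_id, only needed to break remaining ties.
    parts.append(correlation_id)
    # JSON keeps the parts unambiguous even if a name contains a separator.
    return _short_hash(json.dumps(parts))


# Two row count checks on the same dataset collide unless one of them
# carries a correlation_id (example values, chosen for illustration).
plain = check_identity_v1("wh", "public", "orders", None, "row_count")
tagged = check_identity_v1(
    "wh", "public", "orders", None, "row_count", correlation_id="after_load"
)
```

Because the position of the check in the file is not part of the input, moving a check around in the YAML would leave its identity unchanged under this scheme.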

Proposed solution 2:

Check identity is a composition of

  • warehouse_name
  • schema_name
  • dataset_name
  • column_name
  • check type
  • check name

We can use the check name to provide uniqueness. But in that case, users have to know that changing the name potentially will change the check id and hence break history in Soda Cloud.
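
The trade-off of solution 2 can be illustrated with a short sketch. The function `check_identity_v2`, the hashing scheme, and the example check names are assumptions made for illustration; the thread does not specify how the identity is actually derived.

```python
import hashlib
import json
from typing import Optional


def check_identity_v2(
    warehouse_name: str,
    schema_name: str,
    dataset_name: str,
    column_name: Optional[str],
    check_type: str,
    check_name: str,
) -> str:
    """Hypothetical identity for solution 2 (not the soda-core API)."""
    parts = [
        warehouse_name,
        schema_name,
        dataset_name,
        column_name,
        check_type,
        check_name,
    ]
    # Truncated SHA-256 over the JSON-serialized parts (an assumption).
    return hashlib.sha256(json.dumps(parts).encode("utf-8")).hexdigest()[:8]


# Same check, same name: the identity is stable wherever the check sits in the file.
original = check_identity_v2("wh", "public", "orders", "id", "missing", "No missing ids")
unchanged = check_identity_v2("wh", "public", "orders", "id", "missing", "No missing ids")

# Renaming the check yields a different identity, which would break the history.
renamed = check_identity_v2("wh", "public", "orders", "id", "missing", "Ids are never missing")
```

Here `original` and `unchanged` are equal, while `renamed` differs, which is exactly the behavior users would have to be aware of.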

aabf commented on May 26, 2024

In the HelloFresh use case, the name is handled as a unique ID, and the user must fill it in.
I think solution 2 could be the best one. We can explain in our docs that the name is also linked to Soda Cloud and that changing the name loses the history. But I think users expect that, because it is a unique ID for us.

tombaeyens commented on May 26, 2024

More analysis notes:

Solution 2 would also work best on the Soda Cloud backend.
Based on the correlation properties, the contracts lib should create an identity property in the generated SodaCL using a hash (and not the full serialized property text).
We can offer a check-renaming workflow through a name_deprecated property. That should translate to an identity_deprecated in SodaCL. Soda Cloud backend work should be planned to support the identity migration. The changes needed in the Soda Cloud backend should not be too big, as the data model matches this strategy.
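
A rough sketch of how the renaming workflow could map onto hashed identities. The helper names `_identity` and `identity_properties`, the tuple-based scope, and the hashing details are hypothetical; how SodaCL and the Soda Cloud backend would actually represent `identity` and `identity_deprecated` is not specified in this thread.

```python
import hashlib
import json
from typing import Dict, Optional, Tuple


def _identity(scope: Tuple[Optional[str], ...], name: str) -> str:
    """Hash the correlation properties instead of emitting their full text."""
    payload = json.dumps([list(scope), name])
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()[:8]


def identity_properties(
    scope: Tuple[Optional[str], ...],
    name: str,
    name_deprecated: Optional[str] = None,
) -> Dict[str, str]:
    """Hypothetical translation of name / name_deprecated into identity properties."""
    props = {"identity": _identity(scope, name)}
    if name_deprecated is not None:
        # The old name hashes to the old identity, so a backend could use it
        # to find the existing check and migrate its history.
        props["identity_deprecated"] = _identity(scope, name_deprecated)
    return props


# Example values: warehouse, schema, dataset, column, check type.
scope = ("wh", "public", "orders", "id", "missing")
before = identity_properties(scope, "No missing ids")
after = identity_properties(scope, "Ids are never missing", name_deprecated="No missing ids")
```

In this sketch `after["identity_deprecated"]` equals `before["identity"]`, which is the link a migration step would need to carry the history over to the renamed check.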
