checkita-data-quality's People

Contributors

cibaa-team-user, dmrmlvv, gabb1er, semantic-release-bot

checkita-data-quality's Issues

Develop CI scripts for project

Implement the following project functionality using GitHub Actions:

  • Build and test the project in pull requests
  • Verify that the pull request title follows the Angular commit message style
  • Build the project after each push to the main branch (release will be added later)
  • Add a semantic release workflow. It should be started manually to avoid making releases too often
  • Add a workflow to publish documentation (only during a semantic release).
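
The pull request title check from the list above could be sketched roughly as follows. This is only an illustration, not the project's actual CI script: the function name `is_valid_pr_title` is hypothetical, and the list of allowed types is assumed from the Angular contributing guidelines, so it may differ from what the project ends up enforcing.

```python
import re

# Commit types taken from the Angular contributing guidelines (assumption:
# the project may allow a different set, e.g. "chore" or "style").
ANGULAR_TYPES = ("build", "ci", "docs", "feat", "fix", "perf", "refactor", "test")

# Expected shape: <type>(<optional scope>): <summary>
TITLE_PATTERN = re.compile(
    r"^(?P<type>" + "|".join(ANGULAR_TYPES) + r")"
    r"(\((?P<scope>[^()\s]+)\))?"
    r": (?P<summary>\S.*)$"
)


def is_valid_pr_title(title: str) -> bool:
    """Return True if the title looks like an Angular-style commit header."""
    return TITLE_PATTERN.match(title) is not None
```

A workflow step could call such a function with the pull request title and fail the job when it returns `False`.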

Prepare Major Version Change (to 1.0)

Publish the major updates to Checkita Data Quality in the open-source repository. The framework has been significantly refactored, and the configuration API has changed as well.

Update with recent internal releases

Several features have been added internally since this project was published. This issue tracks bringing them into the open-source version of the framework.

Add `duplicateValues` metric

One of the most commonly used checks for data sources is a search for duplicate values. These can be either values from a single column or tuples of values from several columns, depending on what the primary key of the source is.

Currently, this requires calculating two metrics: the number of distinct values and the number of rows. A check that compares these two metrics then indicates whether the data set contains duplicates.
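
The two-metric approach described above can be illustrated with a small sketch in plain Python. It is not Checkita code or configuration: the function name `has_duplicates` and the representation of rows as dictionaries are assumptions made for the example.

```python
from typing import Any, Dict, Iterable, Sequence


def has_duplicates(rows: Iterable[Dict[str, Any]], key_columns: Sequence[str]) -> bool:
    """Compare the row count with the distinct count of the key columns."""
    row_count = 0
    distinct_keys = set()
    for row in rows:
        row_count += 1
        # A key is the tuple of values from the selected columns.
        distinct_keys.add(tuple(row[column] for column in key_columns))
    # Duplicates exist exactly when some key occurred more than once.
    return row_count != len(distinct_keys)
```

The comparison only reports whether duplicates exist; it does not say which rows are affected.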

It would be good to have a separate metric for this case that also allows error collection when duplicate values are found.
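
As a rough sketch of what such a metric might compute, the function below counts repeated keys and collects the offending rows in a single pass. The name `duplicate_values`, the dictionary row format, and the choice to count every occurrence after the first as a duplicate are all assumptions for illustration; the metric actually implemented in the framework may be defined differently.

```python
from typing import Any, Dict, Iterable, List, Sequence, Tuple


def duplicate_values(
    rows: Iterable[Dict[str, Any]], key_columns: Sequence[str]
) -> Tuple[int, List[Tuple[int, tuple]]]:
    """Return the number of duplicate keys and the (row index, key) pairs found."""
    seen = set()
    duplicates = 0
    errors: List[Tuple[int, tuple]] = []
    for index, row in enumerate(rows):
        key = tuple(row[column] for column in key_columns)
        if key in seen:
            # Assumption: each repeat after the first occurrence counts once.
            duplicates += 1
            errors.append((index, key))
        else:
            seen.add(key)
    return duplicates, errors
```

Under this definition the result equals the row count minus the distinct count, so a check on the single metric only needs to compare it with zero, and the collected pairs identify the rows to report.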
