raiffeisen-dgtl / checkita-data-quality
Fast data quality framework for modern data infrastructure
License: GNU Lesser General Public License v3.0
Implement the following project functionality by means of GitHub Actions:
Move the major updates to Checkita Data Quality to open source. The framework has been significantly refactored, and the configuration API has changed as well.
It would be good to have functionality that also allows customisation of email subjects.
Since this project was published, some features have been added internally. This issue addresses their implementation in the open-source version of the framework.
One of the most commonly used checks for data sources is a search for duplicate values. These can be either values from a single column or tuples of values from several columns, depending on what the primary key of the source is.
Currently, doing this requires calculating two metrics: the number of distinct values and the number of rows. A check that compares these two metrics can then indicate the presence of duplicates in the data set.
It would be good to have a separate metric for this case that also allows error collection when duplicate values are found.
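The difference between the two approaches can be illustrated with a minimal sketch in plain Python. It is not the Checkita API: the function names `row_and_distinct_counts` and `duplicate_values`, the representation of rows as dictionaries, and the shape of the collected errors are all assumptions made for illustration only.

```python
from collections import Counter


def row_and_distinct_counts(rows, key_columns):
    """Current approach: two separate metrics that a check then compares."""
    keys = [tuple(row[c] for c in key_columns) for row in rows]
    return len(keys), len(set(keys))


def duplicate_values(rows, key_columns):
    """Hypothetical dedicated metric: counts duplicates and collects them.

    Returns the number of surplus rows (occurrences beyond the first for
    each key) and a mapping from each duplicated key to its occurrence count.
    """
    counts = Counter(tuple(row[c] for c in key_columns) for row in rows)
    errors = {key: n for key, n in counts.items() if n > 1}
    duplicates = sum(n - 1 for n in errors.values())
    return duplicates, errors


rows = [
    {"id": 1, "name": "a"},
    {"id": 2, "name": "b"},
    {"id": 1, "name": "a"},
    {"id": 1, "name": "c"},
]

# Two-metric approach: duplicates exist if the counts differ,
# but the offending keys are not reported.
row_count, distinct_count = row_and_distinct_counts(rows, ["id"])
has_duplicates = row_count != distinct_count

# Dedicated metric: same verdict, plus the duplicated keys themselves.
duplicate_count, duplicated_keys = duplicate_values(rows, ["id"])
```

The key can be a single column (`["id"]`) or a tuple of columns (`["id", "name"]`), matching the two cases described above. Only the second function says which values are duplicated, which is what error collection needs.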