google / data-quality-monitor

Data Quality Monitor (DQM) - Continuously validate your data with easy, customizable rules.

License: Apache License 2.0

Makefile 0.77% Python 34.44% HCL 8.14% Shell 0.33% JavaScript 0.04% TypeScript 54.83% HTML 0.77% CSS 0.67%
bigquery cloudstorage data-quality-checks google-cloud-platform gcp python terraform

data-quality-monitor's Introduction

Data Quality Monitor

Continuously validate your data with easy, customizable rules

Context

Data is the most important part of a modern business strategy. However, it's hard to maintain the robust foundation necessary for supporting data-driven decisions.

Data Quality Monitor (DQM) aims to give clients an easy way to monitor their data. It runs on Google Cloud Platform (GCP) and can act on any data in BigQuery, including exports from various Google Ads & Marketing Platform connectors. The checks/rules are configured with a simple JSON file and managed with scheduled Cloud Workflows. The output is a set of logs that can be visualized and monitored for subsequent action. We also provide templates for common use cases.

Disclaimer

  • DQM is fully owned and managed by you, within your GCP project.
  • DQM is open-source and free - you only pay for the underlying GCP resource usage.
  • DQM has an optional web app for managing configurations. To learn more, see DQM Webapp.

Resources

Join the Google group to:

  • View the slide deck with a high-level pitch, the solution architecture, and example use cases.
  • Receive email notifications about new features and updates.
  • Connect with DQM's developers and other users.

DQM and WebApp Deployment Architecture

DQM is deployed using Terraform, which comes pre-installed on Google Cloud Shell.

Either follow the manual installation guide linked below, or click the following Google Cloud console button and follow the interactive steps to complete the installation in your Google Cloud account.

Open in Cloud Shell

Please follow along in the installation docs.

DQM is configured with simple JSON files, stored on Google Cloud Storage. If you install DQM Webapp, you do not need to edit these JSON configuration files by hand. You can create as many config files as you want via a simple, dynamically generated form.

You can read more in the configuration docs.

DQM can be automated with Cloud Scheduler. Simply set pause_scheduler to false in the .tfvars file during deployment.

It outputs extensive logging, which can be leveraged for notifications or dashboards.

You can read more in the usage docs.

We provide DQM as an open-source solution, so you can contribute to or expand on its features.

You can learn more in the development docs.

License

This is not an officially supported Google product.

Copyright 2023 Google LLC. This solution, including any related sample code or data, is made available on an "as is", "as available", and "with all faults" basis, solely for illustrative purposes, and without warranty or representation of any kind. This solution is experimental, unsupported and provided solely for your convenience. Your use of it is subject to your agreements with Google, as applicable, and may constitute a beta feature as defined under those agreements. To the extent that you make any data available to Google in connection with your use of the solution, you represent and warrant that you have all necessary and appropriate rights, consents and permissions to permit Google to use and process that data. By using any portion of this solution, you acknowledge, assume and accept all risks, known and unknown, associated with its usage, including with respect to your deployment of any portion of this solution in your systems, or usage in connection with your business, if at all.

data-quality-monitor's People

Contributors

achembarpu, psnelg, vishwamitra

data-quality-monitor's Issues

Allow Google workflow.yaml to filter out DRAFT config files

The Google Cloud workflow.yaml should process only those config files whose file names do not start with DRAFT. Whenever the DQM frontend creates a config file as a draft version, it prepends DRAFT to the config file name.
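The requested filter can be sketched in Python to make the intended behavior concrete. This is only an illustration of the rule described in the issue, not the project's workflow.yaml logic: the function names, the sample object names, the `.json` suffix check, and the choice to match DRAFT case-sensitively against the file name (ignoring any folder path) are all assumptions.

```python
from pathlib import PurePosixPath

# Prefix that, per the issue, the DQM frontend adds to draft config file names.
DRAFT_PREFIX = "DRAFT"


def is_draft(object_name: str) -> bool:
    """Return True if the file name (ignoring any folder path) starts with DRAFT.

    Assumption: the match is case-sensitive and applies to the base name only.
    """
    return PurePosixPath(object_name).name.startswith(DRAFT_PREFIX)


def runnable_configs(object_names: list[str]) -> list[str]:
    """Keep only JSON config files that are not marked as drafts."""
    return [
        name
        for name in object_names
        if name.endswith(".json") and not is_draft(name)
    ]


if __name__ == "__main__":
    # Hypothetical object names, for illustration only.
    names = [
        "configs/ads_checks.json",
        "configs/DRAFT_ads_checks.json",
        "DRAFT_new_rules.json",
    ]
    print(runnable_configs(names))
```

Checking the base name rather than the full object path means a draft stored inside a folder is still skipped, while a folder that merely happens to start with DRAFT does not hide the configs inside it.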
