
Databricks Labs CI/CD Templates: Automated Databricks CI/CD pipeline creation and deployment


Demo: https://www.youtube.com/watch?v=Gjns_Z0zxt8&feature=emb_logo

Short instructions:

  1. Install Cookiecutter and the dependencies from requirements.txt.
  2. Run cookiecutter git@github.com:databricks/mlflow-deployments.git (or the HTTPS equivalent).
  3. Create a new GitHub repo and push the created project files there.
  4. Add DATABRICKS_HOST and DATABRICKS_TOKEN as GitHub secrets to the newly created repo.
  5. Implement DEV tests in the dev-tests folder. These pipelines will run on every push.
  6. Implement integration-test pipelines in the integration-tests folder. These pipelines will be used to test each new release.
  7. Implement production pipelines in the pipelines folder.
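
The steps above might look like the following transcript, assuming SSH access to GitHub and the gh CLI for setting secrets; the repo name my-cicd-project and all bracketed values are placeholders, not names from this template:

```shell
# 1. Install Cookiecutter and the project dependencies
pip install cookiecutter
pip install -r requirements.txt

# 2. Generate the project from the template
cookiecutter git@github.com:databricks/mlflow-deployments.git

# 3. Create a new GitHub repo and push the generated files
cd my-cicd-project            # placeholder: use your generated project name
git init
git add .
git commit -m "Initial commit from cicd-templates"
git remote add origin git@github.com:<your-org>/my-cicd-project.git
git push -u origin main

# 4. Add the Databricks credentials as GitHub secrets (requires the gh CLI)
gh secret set DATABRICKS_HOST --body "https://<your-workspace>.cloud.databricks.com"
gh secret set DATABRICKS_TOKEN --body "<personal-access-token>"
```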

Please note: Python 3.8 is not supported yet.

Project Organization

.
├── cicd1
│   └── model.py
├── create_cluster
├── deployment
│   └── databrickslabs_cicdtemplates-0.2.3-py3-none-any.whl
├── deployment.yaml
├── dev-tests
│   ├── pipeline1
│   │   ├── job_spec_aws.json
│   │   ├── job_spec_azure.json
│   │   └── pipeline_runner.py
│   └── pipeline2
│       ├── job_spec_aws.json
│       ├── job_spec_azure.json
│       └── pipeline_runner.py
├── integration-tests
│   ├── pipeline1
│   │   ├── job_spec_aws.json
│   │   ├── job_spec_azure.json
│   │   └── pipeline_runner.py
│   └── pipeline2
│       ├── job_spec_aws.json
│       ├── job_spec_azure.json
│       └── pipeline_runner.py
├── pipelines
│   ├── pipeline1
│   │   ├── job_spec_aws.json
│   │   ├── job_spec_azure.json
│   │   └── pipeline_runner.py
│   └── pipeline2
│       ├── job_spec_aws.json
│       ├── job_spec_azure.json
│       └── pipeline_runner.py
├── requirements.txt
├── run_now
├── run_pipeline
├── runtime_requirements.txt
├── setup.py
└── tests
    └── test_example.py

Project based on the cookiecutter data science project template. #cookiecutterdatascience
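
Each pipeline folder pairs a job specification (job_spec_aws.json or job_spec_azure.json) with a pipeline_runner.py entry point. A minimal sketch of what an AWS job spec might contain, built and sanity-checked in Python; the cluster settings below are illustrative assumptions, not values prescribed by this template:

```python
import json

# Hypothetical contents of pipelines/pipeline1/job_spec_aws.json.
# Spark version, node type, and worker count are illustrative only.
job_spec = {
    "new_cluster": {
        "spark_version": "7.3.x-cpu-ml-scala2.12",
        "node_type_id": "i3.xlarge",
        "num_workers": 1,
    },
    "timeout_seconds": 3600,
    "max_retries": 1,
}

# Serialize as it would appear on disk and verify it round-trips cleanly.
serialized = json.dumps(job_spec, indent=2)
print(serialized)
```

The Azure variant would differ only in cluster fields such as node_type_id, which is why the template keeps one spec file per cloud next to a shared runner script.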

Azure DevOps Cookiecutter instructions

Once you have created your project/repo for Azure DevOps, do the following:

  • Create a new Azure DevOps project/pipeline and link it to the "az_dev_ops/azure-pipelines.yml" file in your repo.

  • Create a variable group named "Databricks-environment"; it is referenced in your az_dev_ops/azure-pipelines.yml pipeline definition.

  • Under that new variable group, create the following variables:

    • DATABRICKS_HOST: Databricks host without the org ID, e.g. "https://uksouth.azuredatabricks.net".
    • DATABRICKS_TOKEN: Databricks personal access token of the user that will run the automated pipelines.
    • MLFLOW_TRACKING_URI: Normally "databricks".
    • DATABRICKS_USERNAME: Username of the system user in the Databricks environment under which the artifacts will be registered.
    • CURRENT_CLOUD: Optional. Overrides the cloud where the data pipelines will run; takes precedence over the "cloud" parameter in the deployment.yaml file.
  • If you want to rename the variable group, rename it in Azure DevOps first and then update the variables/group section of your az_dev_ops/azure-pipelines.yml file accordingly.
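
At run time, the variables in the group surface as environment variables to the pipeline. A sketch of how a runner script might read them; the fallback values below are illustrative defaults for local experimentation, not behavior guaranteed by the template:

```python
import os

# Read the CI variables defined in the "Databricks-environment" variable group.
# The defaults here are placeholders for running outside Azure DevOps.
host = os.environ.get("DATABRICKS_HOST", "https://uksouth.azuredatabricks.net")
tracking_uri = os.environ.get("MLFLOW_TRACKING_URI", "databricks")
username = os.environ.get("DATABRICKS_USERNAME", "cicd-bot@example.com")

print(f"Tracking to {tracking_uri} on {host} as {username}")
```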

Additional

  • Use the "CURRENT_CLOUD" environment variable to override the cloud where the data pipelines will run. It takes precedence over the "cloud" parameter in the deployment.yaml file.
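
The precedence rule can be expressed as follows; the helper function and the config dict shape are hypothetical, standing in for however the deployment tooling parses deployment.yaml:

```python
import os

def resolve_cloud(deployment_config: dict) -> str:
    """Hypothetical helper: CURRENT_CLOUD, if set, wins over deployment.yaml's 'cloud'."""
    return os.environ.get("CURRENT_CLOUD") or deployment_config.get("cloud", "AWS")

config = {"cloud": "Azure"}   # as the 'cloud' parameter might appear in deployment.yaml
print(resolve_cloud(config))  # "Azure" unless CURRENT_CLOUD is set in the environment
```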

Contributors

  • mshtelma
  • dkamotsky
  • miguelperalvo
  • koernigo
