rohit-db / terraform-databricks-lakehouse-blueprints

This project is forked from databricks/terraform-databricks-lakehouse-blueprints


Set of Terraform automation templates and quickstart demos to jumpstart the design of a Lakehouse on Databricks. This project incorporates best practices from across the industries we work with to deliver composable modules for building a workspace that complies with the highest platform security and governance standards.

License: Other

Python 28.09% Scala 5.70% HCL 66.21%


Deploy Your Lakehouse Architecture

Purpose

This set of Terraform templates is designed to let any industry practitioner or DevOps team get started quickly with the canonical Regulated Industries security best practices and governance setup, as well as high-value industry libraries and quickstarts, directly in your environment.

Lakehouse Blueprints


Details on What Is Packaged

What's included in this sequence of Terraform modules?

AWS | Azure

There are four main modules (1-4) which can be composed together. There is also a full end-to-end example of a workspace deployment with governance and industry quickstarts included; see test_aws_full_lakehouse_example for this version.

  1. Creation of a Databricks-compliant VPC in the aws_base or azure_spoke_vnet module (AWS | Azure)
  2. Platform security built into the workspace deployment in the aws_customer_managed_vpc and azure_vnet_injected_databricks_workspace modules (Private Link, VPC endpoints, and secure connectivity) (AWS | Azure)
  3. Unity Catalog installation in the aws_uc and azure_uc modules (AWS | Azure)
  4. Industry quickstarts with a sample job and pre-installed libraries for time series and common domain models (see the aws_fs_lakehouse module)
  5. Full end-to-end example on AWS composing all of the modules above (see below). This composed example is available in the examples folder (test_full_aws_lakehouse_example); the same pattern can be applied to the Azure modules.
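
# 1. Databricks-compliant VPC, subnets, security group, and cross-account role.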
module "aws_base" {
  source                      = "../../modules/aws_base/"
  cidr_block                  = var.cidr_block
  tags                        = var.tags
  region                      = var.region
  databricks_account_password = var.databricks_account_password
  databricks_account_id       = var.databricks_account_id
  databricks_account_username = var.databricks_account_username
}

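# Look up the VPC created by aws_base so its CIDR block can be subdivided below.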
data "aws_vpc" "prod" {
  id = module.aws_base.vpc_id
}


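# 2. Workspace deployment into the customer-managed VPC, with Private Link,
#    VPC endpoints, and secure connectivity.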
module "aws_customer_managed_vpc" {
  source                      = "../../modules/aws_customer_managed_vpc/"
  databricks_account_id       = var.databricks_account_id
  databricks_account_username = var.databricks_account_username
  databricks_account_password = var.databricks_account_password
  region                      = var.region
  relay_vpce_service          = var.relay_vpce_service
  workspace_vpce_service      = var.workspace_vpce_service
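  # cidrsubnet adds 3 bits to the VPC prefix and selects block index 3,
  # carving out a dedicated subnet range for the VPC endpoints
  # (e.g. a /16 VPC CIDR yields the fourth /19).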
  vpce_subnet_cidr            = cidrsubnet(data.aws_vpc.prod.cidr_block, 3, 3)
  vpc_id                      = module.aws_base.vpc_id
  subnet_ids                  = module.aws_base.subnets
  security_group_id           = module.aws_base.security_group[0]
  cross_account_arn           = module.aws_base.cross_account_role_arn

  providers = {
    databricks = databricks.mws
  }
  depends_on = [module.aws_base]
}


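# 3. Unity Catalog installation and association with the workspace created above.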
module "aws_uc" {
  source                      = "../../modules/aws_uc/"
  databricks_account_id       = var.databricks_account_id
  databricks_account_username = var.databricks_account_username
  databricks_account_password = var.databricks_account_password
  region                      = var.region
  workspaces_to_associate     = [split("/", module.aws_customer_managed_vpc.workspace_id)[1]]
  databricks_workspace_url    = module.aws_customer_managed_vpc.workspace_url
}


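# 4. Industry quickstart (FS Lakehouse): sample job, pre-installed libraries,
#    and optional IP access lists.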
module "aws_fs_lakehouse" {
  source                      = "../../modules/aws_fs_lakehouse/"
  workspace_url               = module.aws_customer_managed_vpc.workspace_url
  databricks_account_username = var.databricks_account_username
  databricks_account_password = var.databricks_account_password
  crossaccount_role_name      = split("/", module.aws_base.cross_account_role_arn)[1]
  allow_ip_list               = var.allow_ip_list
  use_ip_access_list          = var.use_ip_access_list

  providers = {
    databricks = databricks.workspace
  }

  depends_on = [module.aws_uc]
}
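
The composed example above references two Databricks provider aliases: databricks.mws, pointed at the Databricks accounts API, and databricks.workspace, pointed at the workspace created by aws_customer_managed_vpc. A minimal sketch of how these aliases might be declared is shown below; the exact authentication arguments are assumptions and should be adapted to your environment.

provider "databricks" {
  alias      = "mws"
  host       = "https://accounts.cloud.databricks.com"
  account_id = var.databricks_account_id
  username   = var.databricks_account_username
  password   = var.databricks_account_password
}

provider "databricks" {
  alias    = "workspace"
  host     = module.aws_customer_managed_vpc.workspace_url
  username = var.databricks_account_username
  password = var.databricks_account_password
}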

Azure-Specific Changes

  • Hub-and-spoke architecture, with an Azure Databricks workspace created per spoke. The deployed infrastructure matches the design in the Data Exfiltration Prevention blog published by Databricks. A minimal composition sketch follows below.
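
The sketch below shows how the Azure modules listed above might be composed, mirroring the AWS example. The source paths and commented inputs are assumptions; consult each module's variables for the actual arguments.

module "azure_spoke_vnet" {
  source = "../../modules/azure_spoke_vnet/"
  # spoke VNet address space, resource group, and region inputs
  # (assumed; required inputs omitted for brevity)
}

module "azure_vnet_injected_databricks_workspace" {
  source     = "../../modules/azure_vnet_injected_databricks_workspace/"
  depends_on = [module.azure_spoke_vnet]
  # VNet-injected workspace with Private Link and secure connectivity
}

module "azure_uc" {
  source     = "../../modules/azure_uc/"
  depends_on = [module.azure_vnet_injected_databricks_workspace]
  # Unity Catalog metastore and workspace assignment
}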

GCP

  • Bring-your-own-VPC configuration with GCP (see GCP folder)

Contributors

flaviomalavazi, nathanknox, nkvuong, rportilla-databricks
