confluentinc / confluent-hybrid-cloud-workshop

Confluent Hybrid Cloud Workshop

License: Apache License 2.0

CSS 28.37% Shell 6.96% Dockerfile 0.21% Python 11.03% HCL 44.54% HTML 5.98% Smarty 0.18% JavaScript 2.73%

confluent-hybrid-cloud-workshop's Introduction

Confluent Hybrid-Cloud Workshop

Overview

This repository allows you to configure and provision a cloud-based workshop on your preferred cloud provider (GCP, AWS, or Azure). Each workshop participant connects to their own virtual machine, which acts as a pseudo on-premises datacenter. A single Confluent Cloud cluster is shared by all workshop participants.

For a single workshop participant, the logical architecture looks like this.

(Diagram: logical architecture for a single workshop participant)

From a physical architecture point of view, each component, except for Confluent Cloud, is hosted on the participant's virtual machine.

Each workshop participant works through a series of labs to build the following ksqlDB Supply & Demand application.

(Diagram: the ksqlDB Supply & Demand application built during the labs)

Prerequisites

Things to know before starting

This repository creates the required Confluent Cloud resources (environment, cluster, API keys, and so on). The cluster type defaults to Basic. If you need a Standard or Dedicated cluster, see the Terraform provider documentation for the required changes: https://registry.terraform.io/providers/confluentinc/confluent/latest/docs/resources/confluent_kafka_cluster

Provisioning a Workshop

Create an empty directory somewhere that will contain your workshop configuration file.

mkdir ~/myworkshop

Copy workshop-example-<cloud provider>.yaml to workshop.yaml in the directory you just created.

cp workshop-example-<cloud provider>.yaml ~/myworkshop/workshop.yaml

Edit ~/myworkshop/workshop.yaml and make the required changes.
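
If you want to sanity-check workshop.yaml before provisioning, a minimal sketch along these lines can help (illustrative only; the key names follow the GCP example used by the quick-create script further down this page and may differ by cloud provider and repository version):

import sys
import yaml

# keys expected at the top of workshop.yaml (GCP example; adjust for your provider)
REQUIRED = ["name", "participant_count", "participant_password", "core"]
REQUIRED_CORE = ["cloud_provider", "region", "vm_type"]

with open(sys.argv[1]) as fh:
    workshop = yaml.safe_load(fh)["workshop"]

missing = [k for k in REQUIRED if k not in workshop]
missing += [f"core.{k}" for k in REQUIRED_CORE if k not in workshop.get("core", {})]
print("Missing keys:", ", ".join(missing) if missing else "none")

Run it as, for example, python3 check_workshop.py ~/myworkshop/workshop.yaml (the script name is just an example).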

Change your current directory to the root of the repository

cd ~/confluent-hybrid-cloud-workshop

To start provisioning the workshop, run workshop-create.py and pass in your workshop directory

python3 workshop-create.py --dir ~/myworkshop

You may need to install the following Python dependencies before running the script:

python3 -m pip install boto3
python3 -m pip install google
python3 -m pip install google-api-python-client
python3 -m pip install azure-cli
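
If you are unsure whether these SDKs are already installed, here is a quick illustrative check from Python (the module names correspond to boto3, google-api-python-client, and azure-cli respectively):

# verify that the cloud SDK modules import cleanly
for module in ("boto3", "googleapiclient.discovery", "azure.cli.core"):
    try:
        __import__(module)
        print(f"{module}: OK")
    except ImportError as exc:
        print(f"{module}: missing ({exc})")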

When you are finished with the workshop, you can destroy it using workshop-destroy.py:

python3 workshop-destroy.py --dir ~/myworkshop

Troubleshooting

If you ever need root access inside the Docker containers, use:

docker exec --interactive --tty --user root --workdir / kafka-connect-ccloud bash

See this blog post for more info

License

This project is licensed under the Apache License 2.0; see the LICENSE.md file for details.

confluent-hybrid-cloud-workshop's People

Contributors

eric-asuncion, gianlucanatali, justinrlee, lyoung-confluent, moliele, tjunderhill, tmcgrath


confluent-hybrid-cloud-workshop's Issues

Cannot re-run apply if the BigQuery extension is used

If you use the BigQuery extension, re-running terraform apply will fail with the following error:
Error: Error updating Dataset "projects/XXX/datasets/test_tf": googleapi: Error 400: Cannot remove all owners

You might find yourself in this situation if the creation of the workshop environments failed for some reason. Re-running ./workshop-create.py works if you are not using the BigQuery extension.
This issue is caused by the problem described here: hashicorp/terraform-provider-google#5152

A new approach to handling access has been released, so we should switch to it now: https://www.terraform.io/docs/providers/google/r/bigquery_dataset_access.html
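
Until the Terraform configuration is switched over, one possible workaround (a sketch only, not part of this repository; the project, dataset, and email below are placeholders) is to inspect the dataset's access entries and re-add an OWNER with the google-cloud-bigquery client before re-running terraform apply:

# requires: pip install google-cloud-bigquery
from google.cloud import bigquery

client = bigquery.Client(project="<your-project>")
dataset = client.get_dataset("<your-project>.test_tf")

# list the current access entries, then append an owner so the update
# no longer tries to remove all owners
for entry in dataset.access_entries:
    print(entry.role, entry.entity_type, entry.entity_id)

entries = list(dataset.access_entries)
entries.append(bigquery.AccessEntry(role="OWNER",
                                    entity_type="userByEmail",
                                    entity_id="<owner-email>"))
dataset.access_entries = entries
client.update_dataset(dataset, ["access_entries"])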

Provisioner SSH Access Failed (Connection Timeout)

During the provisioning process, it seems that the Terraform "null_resource" provisioner is not able to connect to the EC2 instance.

This is the error that I get:
"Error: timeout - last error: SSH authentication failed (dc01@18...**:22): ssh: handshake failed: ssh: unable to authenticate, attempted methods [none], no supported methods remain"

Any ideas?
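
One way to narrow this down is to confirm that password-based SSH to the VM works at all from outside Terraform. A minimal sketch (the host and password are placeholders, the dc01 user is taken from the error above, and it assumes paramiko is installed):

# quick SSH reachability/authentication check against a workshop VM
import paramiko

client = paramiko.SSHClient()
client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
client.connect("<vm-public-ip>", username="dc01",
               password="<participant_password>", timeout=10)
_, stdout, _ = client.exec_command("hostname")
print(stdout.read().decode().strip())
client.close()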

Wrong screenshot in lab 15

(Screenshot from lab 15)
This screenshot shows the cluster as CC (Confluent Cloud), which is confusing because the new topic should be created on the on-prem cluster.

Default is latest for KSQL

In the workshop documentation we say the default should be earliest, but people in the lab found that the default was latest.
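
If participants run into this, the offset reset can also be pinned per request. A minimal sketch against the ksqlDB REST API (the server URL and stream name are placeholders, assuming the workshop's ksqlDB server listens on its default port 8088):

# run a push query with auto.offset.reset forced to 'earliest'
import requests

resp = requests.post(
    "http://localhost:8088/query",
    headers={"Content-Type": "application/vnd.ksql.v1+json"},
    json={
        "ksql": "SELECT * FROM <your_stream> EMIT CHANGES LIMIT 5;",
        "streamsProperties": {"ksql.streams.auto.offset.reset": "earliest"},
    },
    stream=True,
)
for line in resp.iter_lines():
    if line:
        print(line.decode())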

Customers, Products, and Suppliers do not flow to Google BigQuery

At the end of the lab, I see transactional data in BigQuery, but not the customers, products, or suppliers data.

In my local Confluent Control Center, I see this data in the respective topics, and Control Center properly shows the values (so it is schema-aware).

In Confluent Cloud (which is where the connector is configured to pull from), I see the data, but for the values I see the binary representation of the Avro-encoded data. I suspect that, for whatever reason, the Confluent Cloud cluster is unable to deserialize the data?

The connector is running


Name                 : DC01_GCS_SINK
Class                : io.confluent.connect.gcs.GcsSinkConnector
Type                 : sink
State                : RUNNING
WorkerId             : kafka-connect-ccloud:18084

 Task ID | State   | Error Trace
---------------------------------
 0       | RUNNING |
---------------------------------

with no errors.

Could you comment on why I don't see any errors? How can I view messages that the connector "skipped" when running in KSQL mode?
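
One way to verify that the customer/product/supplier topics really contain Schema Registry-encoded Avro is to consume a few records with a schema-aware consumer. A minimal sketch using confluent-kafka-python (the endpoints, credentials, and topic name are placeholders):

# consume one record from Confluent Cloud and deserialize it via Schema Registry
from confluent_kafka import DeserializingConsumer
from confluent_kafka.schema_registry import SchemaRegistryClient
from confluent_kafka.schema_registry.avro import AvroDeserializer

schema_registry = SchemaRegistryClient({
    "url": "https://<sr-endpoint>",
    "basic.auth.user.info": "<sr-api-key>:<sr-api-secret>",
})

consumer = DeserializingConsumer({
    "bootstrap.servers": "<ccloud-bootstrap-servers>",
    "security.protocol": "SASL_SSL",
    "sasl.mechanisms": "PLAIN",
    "sasl.username": "<ccloud-api-key>",
    "sasl.password": "<ccloud-api-secret>",
    "group.id": "avro-sanity-check",
    "auto.offset.reset": "earliest",
    "value.deserializer": AvroDeserializer(schema_registry),
})

consumer.subscribe(["<customers-topic>"])
msg = consumer.poll(10.0)
if msg is not None and msg.error() is None:
    print(msg.value())   # prints a dict if the value is valid Schema Registry Avro
consumer.close()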

Leverage of Cloud Shell on GCP

I know that Azure also has an equivalent, and AWS recently released one, but I'm familiar with GCP and built a script that helps the general public quickly deploy this workshop using Cloud Shell. It's a free service and it comes pre-loaded with everything you need for this workshop.

It requires going to console.cloud.google.com and running Cloud Shell with a selected project. I extended workshop-create.py and named it workshop-quickcreate-gcp.py.

It creates the myworkshop directory, copies the YAML file, creates a key for the Compute Engine default service account (if one doesn't exist), and re-creates the YAML file for GCP using easy-to-follow prompts. After cloning the repo, I placed the script in the confluent-hybrid-cloud-workshop directory.

Ran it:
python3 workshop-quickcreate-gcp.py --dir ~/myworkshop

Here is the code:


import argparse
import os
import yaml
import json
import shutil
import fileinput
import re
import glob
from subprocess import run, PIPE
from pathlib import Path

print()
print()

parser = argparse.ArgumentParser()
parser.add_argument('--dir', help="Workshop directory", required=True)
args = parser.parse_args()

key_location = f'{Path.home()}/key.json'

def create_sa_key():
    s_a = run(['gcloud iam service-accounts list | grep "Compute Engine default"'],
              shell = True,
              stdout = PIPE,
              stderr = PIPE).stdout.decode('utf-8').split()[-2].strip()

    proc = run([f'gcloud iam service-accounts keys create ~/key.json --iam-account {s_a}'],
               shell = True,
               stdout = PIPE,
               stderr = PIPE)
    print(proc.stdout.decode('utf-8'))

if os.path.exists(key_location):
    with open(key_location, 'r') as fh:
        ky = json.load(fh)
        try:
            print('Using service account: ' + ky['client_email'])
        except:
            create_sa_key()
else:
    create_sa_key()

print()

# Create the workshop directory and seed it with the GCP example config
run([f'mkdir -p {args.dir}'], shell = True, stdout = PIPE, stderr = PIPE)
run([f'cp workshop-example-gcp.yaml {args.dir}/workshop.yaml'], shell = True, stdout = PIPE, stderr = PIPE)

with open(f'{args.dir}/workshop.yaml', 'r') as fh:
    wkshp_yaml = yaml.full_load(fh)


wkshp_yaml['workshop']['core']['project'] = os.environ['DEVSHELL_PROJECT_ID']
wkshp_yaml['workshop']['core']['credentials_file_path'] = key_location

try:
    wkshp_yaml['workshop'].pop('extensions')
except:
    pass

def iterate_config_variables(var_name, var_type, var_value = None, gcp_list = None, gcp_filter = None, gcp_zone = None):

    print()
    message = f'Select a value for {var_name} (default = "{var_value}"): '
    if gcp_list in ['regions', 'zones', 'machine-types']:
        print(f'Fetching {gcp_list}...')
        command = f'gcloud compute {gcp_list} list'

        if gcp_zone:
            command += f' --zones {gcp_zone}'

        if gcp_filter:
            command += f' | grep "{gcp_filter}"'
        opts = run([command],
                    shell = True,
                    stdout = PIPE,
                    stderr = PIPE).stdout.decode('utf-8').split('\n')[1:-1]
        opts_dict = {i+1:item.split()[0].strip() for i, item in enumerate(opts)}
        for i, item in opts_dict.items():
            print(f'{i}\t{item}')
        message = f'Select a number from the options listed above (default = {var_value}): '
    while(True):
        var_input = input(message) or var_value
        if gcp_list:
            try:
                var_input = opts_dict[int(var_input)]
                break
            except:
                print('Not a valid option. Try again...')

        elif var_input:
            try:
                var_input = var_type(var_input)
                break
            except:
                print('Not a valid option. Try again...')
    return var_input

wkshp_yaml['workshop']['name'] = iterate_config_variables('workshop name',
                                                          str,
                                                          var_value = 'dc')
wkshp_yaml['workshop']['participant_count'] = iterate_config_variables('number of participants',
                                                                       int,
                                                                       var_value = 1)
wkshp_yaml['workshop']['participant_password'] = iterate_config_variables('workshop password',
                                                                          str,
                                                                          var_value = 'workshop123!')
reg = iterate_config_variables('region',
                               str,
                               var_value = 18,
                               gcp_list = 'regions')
wkshp_yaml['workshop']['core']['region'] = reg

zne = iterate_config_variables('zone',
                               str,
                               var_value = 1,
                               gcp_list = 'zones',
                               gcp_filter = reg)
wkshp_yaml['workshop']['core']['region_zone'] = zne

wkshp_yaml['workshop']['core']['vm_type'] = iterate_config_variables('machine type',
                                                                     str,
                                                                     var_value = 4,
                                                                     gcp_list = 'machine-types',
                                                                     gcp_filter = 'n1-standard',
                                                                     gcp_zone = zne)

wkshp_yaml['workshop']['core']['ccloud_bootstrap_servers'] = iterate_config_variables('ccloud bootstrap servers',
                                                                                      str)
wkshp_yaml['workshop']['core']['ccloud_api_key'] = iterate_config_variables('ccloud api key',
                                                                                      str)
wkshp_yaml['workshop']['core']['ccloud_api_secret'] = iterate_config_variables('ccloud api secret',
                                                                                      str)

print()
print('Writing this configuration to YAML file...')
print(json.dumps(wkshp_yaml, indent = 4))

with open(f'{args.dir}/workshop.yaml', 'w') as fh:
    yaml.dump(wkshp_yaml, fh)

docker_staging=os.path.join(args.dir, ".docker_staging")
terraform_staging=os.path.join(args.dir, ".terraform_staging")

# Open and parse configuration file
with open( os.path.join(args.dir, "workshop.yaml"), 'r') as yaml_file:
    try:
        config = yaml.safe_load(yaml_file)
    except yaml.YAMLError as exc:
        print(exc)

def copytree(src, dst):
  if not os.path.exists(dst):
    os.makedirs(dst)
    shutil.copystat(src, dst)
  lst = os.listdir(src)
  for item in lst:
    s = os.path.join(src, item)
    d = os.path.join(dst, item)
    if os.path.isdir(s):
      copytree(s, d)
    else:
      shutil.copy2(s, d)

if int(config['workshop']['participant_count']) > 35:
  print()
  print("*"*70)
  print("WARNING: Make sure your Confluent Cloud cluster has enough free partitions")
  print("to support this many participants. Each participant uses ~50 partitions.")
  print("*"*70)
  print()
  while True:
    val = input('Do You Want To Continue (y/n)? ')
    if val == 'y':
      break
    elif val =='n':
      exit()

#----------------------------------------
# Create the Terraform staging directory
#----------------------------------------

# Copy core terraform files to terraform staging
copytree(os.path.join("./core/terraform", config['workshop']['core']['cloud_provider']), terraform_staging)
copytree("./core/terraform/common", os.path.join(terraform_staging, "common"))

# Copy extension terraform files to terraform staging
if 'extensions' in config['workshop'] and config['workshop']['extensions'] != None:
    for extension in config['workshop']['extensions']:
        if os.path.exists(os.path.join("./extensions", extension, "terraform")):
            copytree(os.path.join("./extensions", extension, "terraform"), terraform_staging)

# Create Terraform tfvars file
with open(os.path.join(terraform_staging, "terraform.tfvars"), 'w') as tfvars_file:

    # Process high level
    for var in config['workshop']:
        if var not in ['core', 'extensions']:
            tfvars_file.write(str(var) + '="' + str(config['workshop'][var]) + "\"\n")
    for var in config['workshop']['core']:
        if var != 'cloud_provider':
            tfvars_file.write(str(var) + '="' + str(config['workshop']['core'][var]) + "\"\n")
    if 'extensions' in config['workshop'] and config['workshop']['extensions'] != None:
        for extension in config['workshop']['extensions']:
            if os.path.exists(os.path.join("./extensions", extension, "terraform")):
                if config['workshop']['extensions'][extension] != None:
                    for var in config['workshop']['extensions'][extension]:
                        tfvars_file.write(str(var) + '="' + str(config['workshop']['extensions'][extension][var]) + "\"\n")

#----------------------------------------------------------------------------
# Create the Docker staging directory, this directory is uploaded to each VM
#----------------------------------------------------------------------------
# remove stage directory
if os.path.exists(docker_staging):
    shutil.rmtree(docker_staging)
# Create staging directory and copy the required docker files into it
os.mkdir(docker_staging)
os.mkdir(os.path.join(docker_staging, "extensions"))
copytree("./core/docker/", docker_staging)
# Copy asciidoc directory to .docker_staging
copytree(os.path.join("./core/asciidoc"), os.path.join(docker_staging, "asciidoc"))
# Deal with extensions
if 'extensions' in config['workshop'] and config['workshop']['extensions'] != None:
    # Add each extensions asciidoc file as an include in the main workshop.adoc file
    includes = []
    include_str=""
    for extension in config['workshop']['extensions']:
        if os.path.isdir(os.path.join("./extensions", extension, "asciidoc")):
            includes.append(glob.glob(os.path.join("./extensions", extension, "asciidoc/*.adoc"))[0])
    # Build extension include string
    for include in includes:
        include_str += 'include::.' + include + '[]\n'

    # Add extension includes to core workshop.adoc
    for line in fileinput.input(os.path.join(docker_staging, "asciidoc/workshop.adoc"), inplace=True):
        line=re.sub("^#EXTENSIONS_PLACEHOLDER#",include_str,line)
        print(line.rstrip())
    # Copy extension asciidoc files to docker staging
    for extension in config['workshop']['extensions']:
        if os.path.isdir(os.path.join("./extensions", extension, "asciidoc")):
            copytree(os.path.join("./extensions", extension, "asciidoc"), os.path.join(docker_staging, "extensions", extension, "asciidoc"))
    # Copy extension images to docker staging
    for extension in config['workshop']['extensions']:
        if os.path.isdir(os.path.join("./extensions", extension, "asciidoc/images")):
            copytree(os.path.join("./extensions", extension, "asciidoc/images"), os.path.join(docker_staging, "asciidoc/images"))
    # Copy extension docker files to docker staging and create docker .env file
    for extension in config['workshop']['extensions']:
        if os.path.isdir(os.path.join("./extensions", extension, "docker")):
            copytree(os.path.join("./extensions", extension, "docker"), os.path.join(docker_staging, "extensions", extension, "docker"))
            # Create .env file for docker
            if config['workshop']['extensions'][extension] != None:
                for var in config['workshop']['extensions'][extension]:
                    with open(os.path.join(docker_staging, "extensions", extension, "docker/.env"), 'a') as env_file:
                        env_file.write(var + '=' + config['workshop']['extensions'][extension][var] + "\n")
else:
    for line in fileinput.input(os.path.join(docker_staging, "asciidoc/workshop.adoc"), inplace=True):
        line=re.sub("^#EXTENSIONS_PLACEHOLDER#","",line)
        print(line.rstrip())


#-----------------
# Create Workshop
#-----------------

os.chdir(terraform_staging)

# Terraform init
os.system("terraform init")

# Terraform plan
os.system("terraform plan")

# Terraform apply
os.system("terraform apply -auto-approve")

# Show workshop details
os.system("terraform output -json external_ip_addresses > workshop_details.out")
if os.path.exists("workshop_details.out"):
    with open('workshop_details.out') as wd:
        ip_addresses = json.load(wd)
        print("*" * 65)
        print("\n WORKSHOP DETAILS\n Copy & paste into Google Sheets and share with the participants\n")
        print("*" * 65)
        print('=SPLIT("SSH USERNAME,GETTING STARTED URL,PARTICIPANT NAME/EMAIL",",")')
        for id, ip_address in enumerate(ip_addresses, start=1):
            print('=SPLIT("dc{:02d},http://{}", ",")'.format(id, ip_address))
        #print('=SPLIT("{}-{},http://{}", ",")'.format(config['workshop']['name'], id, ip_address))

    os.remove("workshop_details.out")
