picatz / terraform-google-nomad

📗 Terraform Module for Nomad clusters with Consul on GCP

Home Page: https://registry.terraform.io/modules/picatz/nomad/google

License: MIT License

HCL 66.35% Shell 13.05% Go 11.11% Makefile 9.50%
nomad terraform gcp mtls ssh acls consul-connect packer consul

terraform-google-nomad's Introduction

Nomad Cluster

Terraform Module for Nomad clusters with Consul on GCP.

Module Features

  • Includes HashiCorp's Consul service mesh
  • Gossip encryption, mTLS, and ACLs enabled for Nomad and Consul
  • Optional load balancer and DNS configuration
  • Optional SSH bastion host
  • Only the Docker task driver is enabled
  • Installs the gVisor container runtime (runsc)
  • Installs the Falco runtime security monitor

Cloud Shell Interactive Tutorial

For a full interactive tutorial to get started using this module:

Open in Cloud Shell

Infrastructure Diagram

[Image: infrastructure diagram]

Logs

Logs are centralized using GCP's Cloud Logging. You can use the following filter to see all Nomad agent logs:

$ gcloud logging read 'resource.type="gce_instance" jsonPayload.ident="nomad"'
...
$ gcloud logging read 'resource.type="gce_instance" jsonPayload.ident="nomad" jsonPayload.host="server-0"' --format=json | jq -r '.[] | .jsonPayload.message' | less
...

Logs can also be collected within the cluster using Promtail and Loki, then visualized using Grafana (optionally exposed using a public load balancer and DNS name).

$ DNS_ENABLED=true PUBLIC_DOMAIN="nomad.your-domain.com" make terraform/apply
...
$ export CONSUL_HTTP_TOKEN=$(terraform output -json | jq -r .consul_master_token.value)
$ make consul/metrics/acls
...
🔑 Creating Consul ACL Token to Use for Prometheus Consul Service Discovery
AccessorID:       15b9a51d-7af4-e8d4-7c09-312c594a5907
SecretID:         2a1c7926-b6e3-566e-ddf5-b19279fa134e
Description:
Local:            false
Create Time:      2021-04-11 16:16:03.90231.6.1 +0000 UTC
Roles:
   6ae941.6.1c07-49a7-fa95-8ce14aa8a75e - metrics

$ consul_acl_token=2a1c7926-b6e3-566e-ddf5-b19279fa134e make nomad/metrics
$ make nomad/logs
$ make nomad/ingress
$ GRAFANA_PUBLIC_DOMAIN="grafana.your-domain.com" GRAFANA_LOAD_BALANCER_ENABLED=true DNS_ENABLED=true PUBLIC_DOMAIN="nomad.your-domain.com" make terraform/apply
$ open http://public.grafana.your-domain.com:3000/login

Bootstrap ACL Token

If the cluster is started with ACLs enabled, which is the default behavior of this module, you may see this:

$ export NOMAD_ADDR="https://$(terraform output -json | jq -r .load_balancer_ip.value):4646"
$ nomad status
Error querying jobs: Unexpected response code: 403 (Permission denied)

We can bootstrap ACLs to get the bootstrap management token like so:

$ nomad acl bootstrap
Accessor ID  = a1495889-37ce-6784-78f3-31.6.1984bca
Secret ID    = dc8c0349-c1fd-dc2c-299c-d513e5dd6df2
Name         = Bootstrap Token
Type         = management
Global       = true
Policies     = n/a
Create Time  = 2020-04-27 05:24:43.734587566 +0000 UTC
Create Index = 7
Modify Index = 7

Then we can use that token (Secret ID) to perform the rest of the ACL bootstrapping process:

$ export NOMAD_TOKEN="dc8c0349-c1fd-dc2c-299c-d513e5dd6df2"
$ nomad status
No running jobs
$ ...

Use ssh-mtls-terminating-proxy to access the Nomad UI

When using the SSH bastion, you can use the ssh-mtls-terminating-proxy.go helper script to tunnel a connection from localhost to the Nomad server API:

$ make ssh/proxy/mtls
2021/04/11 16:18:28 getting terraform output
2021/04/11 16:18:29 Bastion IP: "34.73.106.60"
2021/04/11 16:18:29 Server IP: "1.6.168.2.8"
2021/04/11 16:18:29 Setting up SSH agent
2021/04/11 16:18:29 connecting to the bastion
2021/04/11 16:18:29 connecting to the server through the bastion
2021/04/11 16:18:30 wrapping the server connection with SSH through the bastion
2021/04/11 16:18:30 tunneling a new connection for Consul to the server with SSH through the bastion
2021/04/11 16:18:30 loading Consul TLS data
2021/04/11 16:18:30 tunneling a new connection for Nomad to the server with SSH through the bastion
2021/04/11 16:18:30 loading Nomad TLS data
2021/04/11 16:18:30 starting Consul local listener on localhost:8500
2021/04/11 16:18:30 starting Nomad local listener on localhost:4646
...

Then open your browser at http://localhost:4646/ui/ to securely access the Nomad UI.

terraform-google-nomad's People

Contributors

allisson, crankycoder, dependabot-preview[bot], dependabot[bot], picatz, pljeff, strixbe, tgross

terraform-google-nomad's Issues

Install the Stackdriver monitoring agent

This is needed to start getting the built-in observability features working.

curl -sSO https://dl.google.com/cloudagents/add-monitoring-agent-repo.sh && sudo bash add-monitoring-agent-repo.sh --also-install && sudo service stackdriver-agent start

This will likely require adding a new scope to the vm module:

https://www.googleapis.com/auth/monitoring.write
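
A minimal sketch of what that could look like, assuming the vm module sets scopes through the service_account block of a google_compute_instance resource. The existing scope list is not shown in this issue, so only the addition appears here.

```hcl
# Fragment of a google_compute_instance resource (not a complete resource).
service_account {
  scopes = [
    # ... existing scopes stay as they are ...

    # Lets the monitoring agent write metrics to Cloud Monitoring.
    "https://www.googleapis.com/auth/monitoring.write",
  ]
}
```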

Interpolation-only Expressions Are Deprecated

$ terraform validate .

Warning: Interpolation-only expressions are deprecated

  on modules/open-port/firewall.tf line 6, in resource "google_compute_firewall" "open_port":
   6:     protocol = "${var.protocol}"

Terraform 0.11 and earlier required all non-constant expressions to be
provided via interpolation syntax, but this pattern is now deprecated. To
silence this warning, remove the "${ sequence from the start and the }"
sequence from the end of this expression, leaving just the inner expression.

Template interpolation syntax is still used to construct strings from
expressions when the template includes multiple interpolation sequences or a
mixture of literal strings and interpolations. This deprecation applies only
to templates that consist entirely of a single interpolation sequence.

(and 4 more similar warnings elsewhere)
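
A minimal sketch of the fix the warning describes, assuming line 6 of modules/open-port/firewall.tf sits inside an allow block. Only the protocol line comes from the warning; the name, network, source_ranges, ports, and the variable default are hypothetical placeholders.

```hcl
variable "protocol" {
  type    = string
  default = "tcp" # hypothetical default
}

resource "google_compute_firewall" "open_port" {
  name          = "open-port-example" # hypothetical
  network       = "default"           # hypothetical
  source_ranges = ["10.0.0.0/8"]      # hypothetical

  allow {
    # Before (deprecated): protocol = "${var.protocol}"
    # After: reference the variable directly, without quotes or "${ }".
    protocol = var.protocol
    ports    = ["4646"] # hypothetical
  }
}
```

Strings that mix literal text with expressions, such as "port-${var.protocol}", still need the interpolation syntax.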

Does not work on `darwin_arm64`

$ terraform init 
╷
│ Error: Incompatible provider version
│ 
│ Provider registry.terraform.io/hashicorp/template v2.2.0 does not have a package available for your current platform, darwin_arm64.
│ 
│ Provider releases are separate from Terraform CLI releases, so not all providers are available for all platforms. Other versions of
│ this provider may have different platforms supported.
╵

The last release was two years ago.
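
One possible way around this, assuming the module uses the template provider only for template_file data sources, is Terraform's built-in templatefile function (available since Terraform 0.12), which needs no provider package. The data source name, file path, and template variable below are hypothetical.

```hcl
# Before: depends on the hashicorp/template provider.
# data "template_file" "server_startup" {
#   template = file("${path.module}/templates/server.sh")
#   vars = {
#     datacenter = "dc1"
#   }
# }

# After: the built-in function renders the same template without a provider.
locals {
  server_startup_script = templatefile("${path.module}/templates/server.sh", {
    datacenter = "dc1"
  })
}
```

With this change, references to data.template_file.server_startup.rendered would become local.server_startup_script.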

Prevent Plaintext Secrets in the Compute Metadata Service

As of v2.0.0/#18, this module deploys a Consul cluster in tandem with the Nomad cluster. It uses the metadata service for most of the dynamic server configuration, which exposes many secrets to malicious or compromised workloads on Nomad client instances.

All secrets should be removed from the metadata service, or at least not stored in plaintext.

Permission error when setting up a server

I get this error when trying to set up the Consul server, and I'm not sure how to add the required permissions:

Cannot discover address: cluster=LAN address="provider=gce project_name=<project>  tag_value=server" error="discover-gce: googleapi: Error 403: Required 'compute.zones.list' permission for 'projects/<project-name>'"
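
The message suggests that the service account attached to the instances lacks the compute.zones.list IAM permission, which Consul's GCE auto-join needs in order to find servers by tag. A hedged sketch of one way to grant it, assuming the predefined roles/compute.viewer role (which includes that permission) is acceptable for your project. The variable names are hypothetical and not part of this module.

```hcl
variable "project" {
  type        = string
  description = "GCP project ID (hypothetical variable name)."
}

variable "service_account_email" {
  type        = string
  description = "Email of the service account attached to the cluster VMs (hypothetical variable name)."
}

# Grants read-only Compute Engine access, including compute.zones.list.
resource "google_project_iam_member" "consul_auto_join" {
  project = var.project
  role    = "roles/compute.viewer"
  member  = "serviceAccount:${var.service_account_email}"
}
```

A narrower custom role containing only the permissions auto-join uses would follow least privilege more closely; the viewer role is used here only to keep the sketch short.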

Stackdriver Logging agent error, VMs have insufficient authentication scopes

All nodes within the cluster are configured with the google-fluentd Stackdriver Logging agent. When looking at the logs from the agent, I'm getting the following errors:

$ cat /var/log/google-fluentd/google-fluentd.log 
2020-07-12 18:15:56 +0000 [warn]: Failed to extract log entry errors from the error details: "Request had insufficient authentication scopes.". error_class=JSON::ParserError error="String"
...

☝️ This also means the VM instance logs aren't available in the GCP console.

I think I need to expand the service_accounts.scopes for VMs to also include a logging scope to fix this error.
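
A sketch of that change, assuming the scopes are set in the service_account block of the instance resource. The google provider accepts both full OAuth2 scope URLs and gcloud short names; the short name is used here.

```hcl
# Fragment of a google_compute_instance resource (not a complete resource).
service_account {
  scopes = [
    # ... existing scopes stay as they are ...

    # Short name for https://www.googleapis.com/auth/logging.write
    "logging-write",
  ]
}
```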

Run nomad as non root user

Hi!

I was referencing this repo while deploying a similar stack on AWS. I came across install_nomad.sh#L27 and I think it contains a typo. Shouldn't the directory be owned by the nomad user instead of the root user?

sudo chown --recursive nomad:nomad /nomad

Errors and other alerts when following the tutorial steps

In step 4, right after running the 'terraform plan...' command, I get the output below. How can I solve this problem?

Terraform v0.12.28

  • provider.google v3.65.0
  • provider.local v2.1.0
  • provider.random v3.1.0
  • provider.template v2.2.0
  • provider.tls v3.1.0

Warning: registry.terraform.io: This version of Terraform has an outdated GPG key and is unable to verify new provider releases. Please upgrade Terraform to at least 0.12.31 to receive new provider updates. For details see: https://discuss.hashicorp.com/t/hcsec-2021-12-codecov-security-event-and-hashicorp-gpg-key-exposure/23512

Error: Reference to undeclared output value

on outputs.tf line 4, in output "ca_cert":
4: value = module.nomad.ca_cert

An output value with the name "ca_cert" has not been declared in module.nomad.

Error: Reference to undeclared output value

on outputs.tf line 10, in output "cli_cert":
10: value = module.nomad.cli_cert

An output value with the name "cli_cert" has not been declared in
module.nomad.

Error: Reference to undeclared output value

on outputs.tf line 16, in output "cli_key":
16: value = module.nomad.cli_key

An output value with the name "cli_key" has not been declared in module.nomad.

Error: Reference to undeclared output value

on outputs.tf line 38, in output "nomad_server_ip":
38: value = module.nomad.nomad_server_ip

An output value with the name "nomad_server_ip" has not been declared in
module.nomad.


Terraform v0.12.31

  • provider.google v3.65.0
  • provider.local v2.1.0
  • provider.random v3.1.0
  • provider.template v2.2.0
  • provider.tls v3.1.0

Error: Reference to undeclared output value

on outputs.tf line 4, in output "ca_cert":
4: value = module.nomad.ca_cert

An output value with the name "ca_cert" has not been declared in module.nomad.

Error: Reference to undeclared output value

on outputs.tf line 10, in output "cli_cert":
10: value = module.nomad.cli_cert

An output value with the name "cli_cert" has not been declared in
module.nomad.

Error: Reference to undeclared output value

on outputs.tf line 16, in output "cli_key":
16: value = module.nomad.cli_key

An output value with the name "cli_key" has not been declared in module.nomad.

Error: Reference to undeclared output value

on outputs.tf line 38, in output "nomad_server_ip":
38: value = module.nomad.nomad_server_ip

An output value with the name "nomad_server_ip" has not been declared in
module.nomad.


Terraform v1.1.9
on linux_amd64

  • provider registry.terraform.io/hashicorp/google v4.20.0
  • provider registry.terraform.io/hashicorp/local v2.2.2
  • provider registry.terraform.io/hashicorp/random v3.1.3
  • provider registry.terraform.io/hashicorp/template v2.2.0
  • provider registry.terraform.io/hashicorp/tls v3.3.0

│ Warning: Argument is deprecated
│
│ with module.nomad.tls_self_signed_cert.consul-ca,
│ on .terraform/modules/nomad/consul_tls_ca.tf line 10, in resource "tls_self_signed_cert" "consul-ca":
│ 10: key_algorithm = tls_private_key.consul-ca.algorithm
│
│ This is now ignored, as the key algorithm is inferred from the private_key_pem.
│
│ (and 13 more similar warnings elsewhere)
╵
╷
│ Error: Unsupported attribute
│
│ on outputs.tf line 4, in output "ca_cert":
│ 4: value = module.nomad.ca_cert
│ ├────────────────
│ │ module.nomad is a object, known only after apply
│
│ This object does not have an attribute named "ca_cert".
╵
╷
│ Error: Unsupported attribute
│
│ on outputs.tf line 10, in output "cli_cert":
│ 10: value = module.nomad.cli_cert
│ ├────────────────
│ │ module.nomad is a object, known only after apply
│
│ This object does not have an attribute named "cli_cert".
╵
╷
│ Error: Unsupported attribute
│
│ on outputs.tf line 16, in output "cli_key":
│ 16: value = module.nomad.cli_key
│ ├────────────────
│ │ module.nomad is a object, known only after apply
│
│ This object does not have an attribute named "cli_key".
╵
╷
│ Error: Unsupported attribute
│
│ on outputs.tf line 38, in output "nomad_server_ip":
│ 38: value = module.nomad.nomad_server_ip
│ ├────────────────
│ │ module.nomad is a object, known only after apply
│
│ This object does not have an attribute named "nomad_server_ip".

Allow more configuration of Consul integration

Currently, there is no way to really tune the Consul integration outside of enabling/disabling Consul ACLs and the default policy.

variable "consul_acls_enabled" {
  type        = bool
  default     = true
  description = "If ACLs should be enabled for the Consul cluster."
}

variable "consul_acls_default_policy" {
  type        = string
  default     = "deny"
  description = "The default policy to use for Consul ACLs (allow/deny)."
}

But there are many options available. These should be exposed as Terraform variables with secure defaults.

consul {
  ssl        = true
  verify_ssl = true
  address    = "127.0.0.1:8501"
  ca_file    = "/consul/config/consul-ca.pem"
  cert_file  = "/consul/config/server.pem"
  key_file   = "/consul/config/server-key.pem"
  token      = "{CONSUL-TOKEN}"
}

Especially important ones to consider are allow_unauthenticated and share_ssl. Consider disabling these by default, with matching adjustments to the documentation and examples.
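
As a rough sketch of what exposing these could look like, the two options might become module variables that default to the stricter setting. The variable names are hypothetical, and how they would be wired into the rendered Nomad consul block is not shown here.

```hcl
# Hypothetical variable names; not part of the module today.
variable "nomad_consul_allow_unauthenticated" {
  type        = bool
  default     = false
  description = "Value for allow_unauthenticated in Nomad's consul block. False requires a Consul token for jobs that use Consul."
}

variable "nomad_consul_share_ssl" {
  type        = bool
  default     = false
  description = "Value for share_ssl in Nomad's consul block. False keeps Nomad's Consul TLS settings from being shared with tasks."
}
```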

Consul configuration defines unused options in server and client

Client agents do not need to enable Connect; this is only required on servers:

Enabling Connect requires changing the configuration of only your Consul servers (not client agents).

connect {
  enabled = true
}

Servers do not need to enable the gRPC port; it is only used on clients.

There might be others, but these should definitely be removed.
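
A rough sketch of the split described above, assuming plain Consul agent configuration files written in HCL. The file names are hypothetical, all other settings are omitted, and 8502 is the port conventionally used for Consul's gRPC listener.

```hcl
# server.hcl (hypothetical file name)
# Servers enable Connect but leave the gRPC port unset.
server = true

connect {
  enabled = true
}
```

```hcl
# client.hcl (hypothetical file name)
# Clients open the gRPC port for sidecar proxies and omit the connect block.
server = false

ports {
  grpc = 8502
}
```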

Falco Installation Broken

$ make packer/build
2021-05-30T11:51:05-04:00: ==> client: E: Failed to fetch https://dl.bintray.com/falcosecurity/deb/dists/stable/InRelease  403  Forbidden [IP: 54.185.54.139 443]
2021-05-30T11:51:05-04:00: ==> client: E: The repository 'https://dl.bintray.com/falcosecurity/deb stable InRelease' is not signed.
