national-parks-demo's Issues

SUDO timeout hit in azure terraform apply

It looks like the terraform/azure config will sometimes take long enough in the initial yum update that its sudo session times out, and it continually prompts for the sudo password, halting the deploy.

I'm not sure yet if it's just a matter of updating an image somewhere, but at the very least, as a workaround, the following seemed to work:

In terraform/azure/main.tf, replace all instances of:

"sudo groupadd hab",

with:

"echo ${var.azure_image_password} | sudo -S groupadd hab"

I'll poke around later in the week and see about getting an official fix in place. It might be worth managing sudoers directly -- will report back.
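
For reference, a minimal sketch of what the substitution looks like in context -- the surrounding resource block and its other arguments are assumptions about the repo's layout, not verbatim from main.tf:

    resource "azurerm_virtual_machine" "app" {
      # ... existing arguments unchanged ...

      provisioner "remote-exec" {
        inline = [
          # Piping the password to sudo -S avoids the interactive prompt that
          # appears when the long yum update outlives the sudo session.
          "echo ${var.azure_image_password} | sudo -S groupadd hab",
        ]
      }
    }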

if us-west-2d is selected terraform fails

I'm sure we've all seen this; normally a destroy/apply loop will pick a different AZ and get you past this error.

It is very annoying, and we should refactor at some point to remove this issue. Putting this out here so we remember, and so folks don't think it's an issue unique to them.


* aws_instance.permanent_peer: Error launching source instance: Unsupported: Your requested instance type (m4.xlarge) is not supported in your requested Availability Zone (us-west-2d). Please retry your request by not specifying an Availability Zone or choosing us-west-2c, us-west-2a, us-west-2b.
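
One possible shape for the refactor, sketched against Terraform 0.11 syntax -- the variable and resource names below are placeholders, not what the repo actually uses:

    variable "aws_availability_zones" {
      # m4.xlarge is not offered in us-west-2d, so only list AZs known to carry it.
      default = ["us-west-2a", "us-west-2b", "us-west-2c"]
    }

    resource "aws_subnet" "default" {
      vpc_id            = "${aws_vpc.default.id}"
      cidr_block        = "10.0.1.0/24"
      availability_zone = "${element(var.aws_availability_zones, 0)}"
    }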

Documentation Updates for newcomers

Ran through some suggestions for the README.md provided by @alainlubin and @danf425. Some high-level topics discussed included:

  • Make sure the instructions call out running terraform init on a user's first spin-up, as terraform apply will fail if this is not done (see the sketch after this list)
  • Update the sup-log section of the local instructions to note that Ctrl+C exits the supervisor log when you're done
  • Guidance on working with branches for contributing or working with PRs (this can likely go in a separate CONTRIBUTING.md)
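
A minimal sketch of the first-spin-up sequence the README could show (the directory is whichever environment you're targeting):

    # terraform init downloads the providers and modules; apply fails without
    # it on a fresh checkout.
    cd terraform/aws   # or terraform/azure
    terraform init
    terraform apply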

Thinking of using this as a good opportunity to get some folks together for a quick contributing session and walk through the git-fu required to get changes pushed to the repo.

Update tfvars example for azure

Some folks have seen behavior in the terraform/azure config wherein if the azure_image_password contains special characters, launching instances can fail with an unhelpful error.

Propose we remove the example password entirely to dissuade folks from using a secret that lives in the GitHub repo, and instead replace it with a comment suggesting folks create their own secure alphanumeric password.
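
Something like the following in tfvars.example would do it (the wording is just a suggestion):

    # azure_image_password: choose your own secure *alphanumeric* password.
    # Special characters can make instance launches fail with an unhelpful
    # error, and the value should never live in the GitHub repo.
    azure_image_password = ""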

See this issue for more background on launch failures.

Chef Automate ALB -> ELB

The Chef Automate configuration in terraform for AWS makes use of an application load balancer (ALB) to act as an endpoint for a dynamically generated route53 domain. While this works well for forwarding web requests, it presents challenges for forwarding non-web TCP requests, as with the event stream configuration needed for the EAS dashboard.

Since TLS is not yet supported in the EAS event stream, this is currently addressed by pointing supervisors to the Automate server's IP address directly over port 4222. This works, but should be considered a stopgap solution.

Long term, we want to be able to point things to the proper hostname, as with data collection. Per a conversation with @jvogt, this can be accomplished by using an elastic load balancer (ELB) in place of the current ALB setup.

Here is a reference from one of his repos to help guide development on this change: https://github.com/jvogt/2019-demo-terraform/blob/4901d9a10f6be198062a9b00e5984e8327e4771a/automate/aws/chef_automate_elb.tf
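
As a rough sketch of where this is headed -- resource names, subnets, and security groups below are placeholders rather than the repo's actual identifiers -- an ELB lets us add a plain TCP listener for the event stream alongside the web traffic:

    resource "aws_elb" "automate" {
      name            = "automate-elb"
      subnets         = ["${aws_subnet.default.id}"]
      security_groups = ["${aws_security_group.chef_automate.id}"]
      instances       = ["${aws_instance.chef_automate.id}"]

      listener {
        instance_port     = 443
        instance_protocol = "tcp"
        lb_port           = 443
        lb_protocol       = "tcp"
      }

      # Event-stream traffic for the EAS dashboard, passed straight through on 4222.
      listener {
        instance_port     = 4222
        instance_protocol = "tcp"
        lb_port           = 4222
        lb_protocol       = "tcp"
      }
    }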

Project should support all 0.11.x versions

Terraform configs are explicitly tied to version 0.11.11 of terraform, since there were some breaking changes introduced in 0.12. However, 0.11.14 is the latest release in the 0.11 line and should also work, so we'll need to find where the version is specified and make the rule a bit more lenient (e.g. < 0.12 or similar).
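
If the pin lives in a terraform block, loosening it might look like this (exactly where it's specified in the repo still needs to be confirmed):

    terraform {
      # Accept any 0.11.x at or above the version we've tested, but not 0.12.
      required_version = ">= 0.11.11, < 0.12.0"
    }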

Performance issues with latest mongodb package

Recent builds of the core/mongodb package are significantly larger than previous builds had been, which causes a number of unintended consequences that affect the national parks demo:

  1. The time it takes to run a hab svc load core/mongodb can be quite long
  2. Running the package in a local studio (particularly the Docker studio on Mac) can fail in a difficult-to-diagnose way -- the supervisor never starts the service or any constituent services until the supervisor is terminated and relaunched.

The second item is the real killer -- particularly because it's inconsistent. Sometimes bumping Docker's memory allocation fixes it, and sometimes it doesn't.

As a short-term solution, @jvogt called out that pinning to an earlier release, particularly core/mongodb/3.2.10/20171016003652, solves this issue.

The problem is on the hab team's radar, but low priority. Recommend we update configs to use this release for the time being to make demo life more manageable.
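
For anyone hitting this interactively, the pin from the issue can be loaded directly:

    # A fully-qualified ident pins the service to the known-good release
    # instead of whatever core/mongodb latest currently resolves to.
    hab svc load core/mongodb/3.2.10/20171016003652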

DCA - Allow the demo to apply effortless-config separately

@ChefRycar
The demo is great but it scans using effortless-audit and then automatically applies the effortless-config habitat service, remediating the CentOS nodes very quickly.

This does not allow time for an Architect to talk through the DCA concept with the customer and then show the remediation being applied (effortless-config).

It would be great if we could apply the remediation via a flag in Terraform, or simply run 'hab svc load effortless-config' on the node command line, and then show the CentOS nodes being updated.
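
Sketch of the manual variant -- the origin below is a placeholder, since the issue doesn't say which origin the demo's effortless-config package lives in:

    # Run on a CentOS node once the audit results are on screen and you're
    # ready to show remediation kicking in.
    hab svc load <your-origin>/effortless-config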

Update tfvars.example files with explanatory comments

While our documentation instructs folks to create a terraform.tfvars file from the examples provided in each folder's tfvars.example, it's not immediately obvious what some of the variables are used for without parsing the README for details.

To make things less opaque, suggest updating the example files with comments briefly describing what each variable or group of variables is used for.
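
For example (only variables already mentioned in these issues are shown; the real files have more):

    # Produced when the Chef Automate instance is provisioned -- copy them
    # from that run's outputs before applying the application environments.
    automate_hostname = ""
    automate_token    = ""
    automate_password = ""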

write a health-check for national-parks

It'd be good to have a meaningful health check for national-parks to better visualize the status in Automate.

It'd also be good to use the health check in the demo in some meaningful way -- for example, start with a failing health check, then promote a package or make a change that corrects the issue, ending with an "ok" status.
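
A minimal hooks/health_check sketch -- the port template variable and URL path are assumptions about how national-parks serves requests, not taken from the plan:

    #!/bin/sh
    # Habitat convention: exit 0 = ok, 1 = warning, 2 = critical, 3 = unknown.
    if curl -sf "http://localhost:{{cfg.port}}/national-parks/" > /dev/null; then
      exit 0
    else
      exit 2
    fi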

Problem Provisioning in Azure

When performing a terraform apply in the National Parks Demo ...terraform/azure directory, the following error occurs:


4 error(s) occurred:

* azurerm_virtual_machine.permanent-peer: error executing "/tmp/terraform_1759793539.sh": Process exited with status 1
* azurerm_virtual_machine.mongodb: error executing "/tmp/terraform_1114131904.sh": Process exited with status 1
* azurerm_virtual_machine.haproxy: error executing "/tmp/terraform_1395668471.sh": Process exited with status 1
* azurerm_virtual_machine.app: error executing "/tmp/terraform_1564002431.sh": Process exited with status 1

Logging into each machine and manually executing the appropriate scripts allowed us to repair and complete the provisioning, but this is not ideal.
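
For the record, the manual repair amounted to something like the following on each failed VM (script names differ per run, and the admin user is whatever the tfvars specify):

    ssh <admin-user>@<vm-public-ip>
    sudo bash /tmp/terraform_1759793539.sh   # the script named in that VM's error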

Make API token and Password configurable at launch for A2

Spinning up the full National Parks demo at present is a multi-step process. Before you can launch any application instances, you must first launch chef-automate, and use the automate_hostname, automate_token, and automate_password produced by that process to seed the terraform.tfvars file for any national parks instances. This is problematic, because it means you have a built-in waiting period after launching your automate server before you can begin provisioning your application instances.

Per a conversation with @jvogt, we can better automate this process by making the token and password user configurable at provisioning time, which allows us to pre-seed them in our app configs, and potentially launch the entire environment with a single terraform apply as needed.

Some details on implementing this can be found in one of Jeff's repos here: https://github.com/jvogt/2019-demo-terraform/blob/4901d9a10f6be198062a9b00e5984e8327e4771a/automate/aws/chef_automate_elb.tf
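
At the Terraform level the change is roughly this -- a sketch only, and it deliberately leaves out how the values get injected into the Automate setup scripts:

    variable "automate_token" {
      description = "API token to configure on the Automate server at provision time"
    }

    variable "automate_password" {
      description = "Admin password to configure on the Automate server at provision time"
    }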
