googlecloudplatform / anthos-service-mesh-packages

Packaged configuration for setting up a Kubernetes cluster with Anthos Service Mesh features enabled

Home Page: https://cloud.google.com/anthos/service-mesh

License: Apache License 2.0

Dockerfile 0.25% Shell 99.48% Starlark 0.27%

anthos-service-mesh-packages's Introduction

Anthos Service Mesh Config Packages

This repository contains packaged configuration for setting up a GKE cluster with Anthos Service Mesh features enabled.

Package Descriptions:

  • asm: Contains various manifests to help configure and install ASM
  • docs: Contains useful reading material around design, development and release
  • samples: Contains sample application and gateway manifests
  • scripts: Contains the executables to install or modify ASM

anthos-service-mesh-packages's People

Contributors

aryan16, backward-compatible, bianpengyuan, cloud-pharaoh, cody-clark, davidebbo, davidhauck, elfinhe, gargnupur, hemendrateli, howardjohn, jimmycyj, jtrbs, lei-tang, maasen, mathieu-benoit, nan-yu, patchworkguilt, richardwxn, ruigulala, sergii-ssh, shankgan, stewartbutler, tangiel, williamaronli, wonjekang, xulingqing, yangminzhu, yanyuan-meiyi, zhengzheyang

anthos-service-mesh-packages's Issues

bind_user_to_iam_policy times out

Hi, I am following this GCP guide https://cloud.google.com/service-mesh/docs/scripted-install/gke-asm-onboard-1-7#running_the_script to deploy Anthos Service Mesh using this script.

The function bind_user_to_iam_policy times out with the following:

./install_asm \
>   --project_id $PROJECT_ID \
>   --cluster_name $CLUSTER_NAME \
>   --cluster_location $CLUSTER_LOCATION \
>   --mode install \
>   --option cloud-tracing \
>   --option egressgateways \
>   --option envoy-access-log \
>   --option iap-operator
install_asm: Setting up necessary files...
install_asm: Creating temp directory...
install_asm: Generating a new kubeconfig...
install_asm: Checking installation tool dependencies...
install_asm: Downloading ASM..
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 45.6M  100 45.6M    0     0  29.2M      0  0:00:01  0:00:01 --:--:-- 29.2M
install_asm: Downloading ASM kpt package...
fetching package /asm from https://github.com/GoogleCloudPlatform/anthos-service-mesh-packages to asm
automatically set 28 field(s) for setter "gcloud.core.project" to value "redacted" in package "asm" derived from gcloud config
automatically set 0 field(s) for setter "gcloud.project.projectNumber" to value "redacted" in package "asm" derived from gcloud config
install_asm: Checking for project redacted...
install_asm: Confirming cluster information...
install_asm: Confirming node pool requirements...
install_asm: Fetching/writing GCP credentials to kubeconfig file...
install_asm: Verifying connectivity (20s)...
install_asm: Checking Istio installations...
install_asm: Checking required APIs...
install_asm: Successfully validated all requirements to install ASM from this computer.
install_asm: Getting account information...
install_asm: Binding redacted to required IAM roles...
install_asm: Failed, retrying...(1 of 3)
install_asm: Failed, retrying...(2 of 3)
install_asm: Failed, retrying...(3 of 3)

Switch mesh id to the new format

Soon, mesh id will switch to a format that uses the project number instead of id. Let's figure out the new format, and make the change in master.

Fix type of gcloud.container.nodepool.max-nodes

Hi folks!

When I set max-nodes to 5 using:
kpt cfg set asm gcloud.container.nodepool.max-nodes 5

I have an error:
Error: The input value doesn't validate against provided OpenAPI schema: validation failure list: gcloud.container.nodepool.max-nodes in body must be of type int: "integer"

The same error occurs if I try to list the values using:
kpt cfg list-setters asm/

Don't bind IAM permissions by default

Right now the install script will always bind the current user/service user to the required IAM permissions. We shouldn't do that by default, and instead do the same thing we're doing with APIs--add a flag to enable it. There's been a lot of feedback about this from various customers.

validate_cli_dependencies doesn't include dependencies version check

Scriptaro requires a fairly recent, if not the most recent, kubectl. Without it, I encountered the following error.

install_asm: Running: 'kubectl apply -f istio-1.8.2-asm.2/manifests/charts/base/files/gen-istio-cluster.yaml --record=false --overwrite=false --force-conflicts=true --server-side'
install_asm: -------------
Error: unknown flag: --force-conflicts
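
A dependency check of this kind could compare the installed version against a minimum before running anything else. The following is only a sketch: version_ge and check_min_version are hypothetical helper names, the version strings are assumed to be purely numeric dot-separated fields (optionally prefixed with "v"), and the minimum shown in the usage comment is a placeholder, not a verified requirement of install_asm.

```shell
#!/usr/bin/env bash

# Return 0 if version $1 >= version $2.
# Assumes purely numeric, dot-separated fields; a leading "v" is stripped.
version_ge() {
  local IFS=.
  local -a have=(${1#v}) want=(${2#v})
  local i h w
  for ((i = 0; i < ${#have[@]} || i < ${#want[@]}; i++)); do
    h="${have[i]:-0}"   # missing fields count as 0, so 1.16 == 1.16.0
    w="${want[i]:-0}"
    if ((10#$h > 10#$w)); then return 0; fi
    if ((10#$h < 10#$w)); then return 1; fi
  done
  return 0
}

# Print an error and return 1 if tool $1 at version $2 is older than $3.
check_min_version() {
  local tool="$1" have="$2" want="$3"
  if ! version_ge "${have}" "${want}"; then
    echo "${tool} ${have} is too old; at least ${want} is required" >&2
    return 1
  fi
}

# Hypothetical usage (the minimum is a placeholder):
#   check_min_version kubectl "${KUBECTL_CLIENT_VERSION}" "1.16.0" || exit 1
```

How the client version string is obtained is left out on purpose, since the output format of the version command differs between kubectl releases.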

install_asm tests leak load balancers on GCP

Creating a k8s ingress on GCP creates a forwarding rule and target pool for load balancing, but deleting the cluster doesn't free up the resources. As integration tests create/delete clusters, the load balancing rules pile up, and eventually a quota will prevent any more from being created.

We should fix the tests to clean these up properly before deleting the cluster.

Installer sets wrong account type for iam_user when querying IAM policy

If I activate an SA independently of the installer, it fails to get the IAM roles because it uses the wrong account type. It runs
gcloud projects get-iam-policy PROJECT_ID --flatten='bindings[].members' --filter="bindings.members:user:SA@PROJECT_ID.iam.gserviceaccount.com" --format="value(bindings.role)"
as opposed to
gcloud projects get-iam-policy PROJECT_ID --flatten='bindings[].members' --filter="bindings.members:serviceAccount:SA@PROJECT_ID.iam.gserviceaccount.com" --format="value(bindings.role)"

Running gcloud auth list:

gcloud auth list
                          Credentialed Accounts
ACTIVE  ACCOUNT
*       SA@PROJECT_ID.iam.gserviceaccount.com

A fix could be in local_iam_user to detect if the current account is of form *.iam.gserviceaccount.com to determine account type.
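
The suggested detection could look roughly like the sketch below. iam_member_for_account is a hypothetical helper name, not a function from install_asm, and it only recognises the *.iam.gserviceaccount.com suffix mentioned above; service accounts under other domains (for example developer.gserviceaccount.com) would need extra patterns.

```shell
#!/usr/bin/env bash

# Print the IAM member string ("user:..." or "serviceAccount:...") for an
# account name as shown by `gcloud auth list`.
iam_member_for_account() {
  local account="$1"
  case "${account}" in
    *.iam.gserviceaccount.com)
      echo "serviceAccount:${account}"
      ;;
    *)
      echo "user:${account}"
      ;;
  esac
}

# Hypothetical usage when building the filter for get-iam-policy:
#   member="$(iam_member_for_account "${ACCOUNT}")"
#   gcloud projects get-iam-policy "${PROJECT_ID}" \
#     --flatten='bindings[].members' \
#     --filter="bindings.members:${member}" \
#     --format='value(bindings.role)'
```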

Allow script to export variables when run with source (e.g. revision, istioctl location)

If the script can be run with source then it will be possible to set environment variables in the current shell. That would make it easier to build the script into other scripts and to use it in documentation without having to ask the user to copy and paste or run additional commands. It would be really useful to set the following:

  • ISTIO_BIN (absolute)
  • ISTIOCTL (absolute)
  • REVISION (control plane revision to be used when labelling namespaces, or editing Istio config map, e.g. asm-172-3)

The outro should show the variables that have been set and their values and could include a few examples of how they can be used, e.g.

  • kubectl label namespace NAMESPACE istio.io/rev=${REVISION} --overwrite
  • $ISTIOCTL analyze
  • PATH=$PATH:${ISTIO_BIN}

If the compiled YAML could be written to a file it would also be useful to set a variable with the path to that file.
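
One way this could work is sketched below, under the assumption that the script knows the Istio release directory and the revision by the time it finishes. is_sourced and export_install_outputs are hypothetical names, and the paths in the usage comment are only examples.

```shell
#!/usr/bin/env bash

# Return 0 if this file is being sourced rather than executed.
is_sourced() {
  [[ "${BASH_SOURCE[0]:-}" != "${0}" ]]
}

# Export the variables requested above.
#   $1: absolute path of the unpacked Istio release directory
#   $2: control plane revision
export_install_outputs() {
  local istio_dir="$1" revision="$2"
  export ISTIO_BIN="${istio_dir}/bin"
  export ISTIOCTL="${ISTIO_BIN}/istioctl"
  export REVISION="${revision}"
}

# Hypothetical use at the end of the script:
#   if is_sourced; then
#     export_install_outputs "/tmp/example/istio-release" "asm-172-3"
#     echo "ISTIOCTL=${ISTIOCTL} REVISION=${REVISION}"
#   fi
```

Exports only reach the caller's shell when the file is sourced, which is why the sketch checks for that before setting anything.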

Allow GitOps integration

Currently, the script generates and installs ASM config, which is not suitable for GitOps integration. Ideally, users should treat their config as code, and review a diff of all changes before deploying to production via GitOps. I think all that is needed here is a way to dump the config to stdout instead of applying it directly.

When --managed is enabled, can_modify_gcp_iam_roles always returns true

can_modify_gcp_iam_roles() {
if [[ "${ENABLE_ALL}" -eq 0 && "${ENABLE_GCP_IAM_ROLES}" -eq 0 ]] && \
! is_managed ||
! can_modify_at_all; then false; fi
}

This makes it impossible to run the script if the user prefers to set IAM rules on her own.

Modifying the function as follows solves the issue.

can_modify_gcp_iam_roles() {
  if [[ "${ENABLE_ALL}" -eq 0 && "${ENABLE_GCP_IAM_ROLES}" -eq 0 ]] || \
    ! can_modify_at_all; then false; fi
}
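
The difference between the two versions can be reproduced in isolation. In the sketch below, is_managed and can_modify_at_all are stubs standing in for the real functions (both hard-coded to succeed), and the two variants are renamed so they can sit side by side; none of this is the script's actual code beyond the two conditions quoted above.

```shell
#!/usr/bin/env bash

ENABLE_ALL=0
ENABLE_GCP_IAM_ROLES=0

is_managed() { return 0; }         # stub: behave as if --managed was passed
can_modify_at_all() { return 0; }  # stub: modification is allowed in general

# Condition as currently written in the script.
can_modify_original() {
  if [[ "${ENABLE_ALL}" -eq 0 && "${ENABLE_GCP_IAM_ROLES}" -eq 0 ]] && \
    ! is_managed ||
    ! can_modify_at_all; then false; fi
}

# Condition as proposed in this issue.
can_modify_proposed() {
  if [[ "${ENABLE_ALL}" -eq 0 && "${ENABLE_GCP_IAM_ROLES}" -eq 0 ]] || \
    ! can_modify_at_all; then false; fi
}

report() {
  if "$1"; then echo "$1: allowed"; else echo "$1: not allowed"; fi
}

report can_modify_original   # prints "can_modify_original: allowed"
report can_modify_proposed   # prints "can_modify_proposed: not allowed"
```

With neither enable flag set, the original condition still reports that IAM roles may be modified because "! is_managed" is false, while the proposed one does not.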

Add Github Issue creation link on an unexpected failure

If the scripts exit with an unexpected error, it'd be nice if the user is presented with a link to one-click create an issue (or print some command that does it) - ideally by automatically populating all the available fields. Also, if the scripts kept the logs in a temp file, that could be uploaded to the issue as well. This will ease error reporting.
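
A pre-filled link could be built with GitHub's issues/new URL and its title and body query parameters. The sketch below assumes ASCII input, and urlencode and issue_url are hypothetical helper names; attaching a log file this way is not covered.

```shell
#!/usr/bin/env bash

# Percent-encode a string for use in a URL query (ASCII input assumed).
urlencode() {
  local LC_ALL=C
  local input="$1" out="" c i
  for ((i = 0; i < ${#input}; i++)); do
    c="${input:i:1}"
    case "${c}" in
      [a-zA-Z0-9.~_-]) out+="${c}" ;;
      *) printf -v c '%%%02X' "'${c}"; out+="${c}" ;;
    esac
  done
  printf '%s\n' "${out}"
}

# Print a "new issue" link for repo $1 with title $2 and body $3.
issue_url() {
  local repo="$1" title="$2" body="$3"
  printf 'https://github.com/%s/issues/new?title=%s&body=%s\n' \
    "${repo}" "$(urlencode "${title}")" "$(urlencode "${body}")"
}

# Hypothetical usage in an error handler:
#   issue_url "GoogleCloudPlatform/anthos-service-mesh-packages" \
#     "install_asm failed unexpectedly" "Last log lines: ..."
```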

Creating internal ingress gateway

Is there a way of setting the ingress gateway component load balancer type in istio-operator.yaml to 'internal' when using asm-patch and kpt?

I see it's set up in asm/istio/options/internal-load-balancer.yaml.

How does one deploy this following these installation instructions?

Better logic around downloading kpt package

Since the script itself is in the kpt package, we shouldn't re-download it if we already have the package. install_asm should try to figure out whether it's in the kpt pkg or not, and if it is, use the existing cfg.
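
One possible way to detect this is to walk up from the script's directory looking for a Kptfile, which kpt places at the root of a fetched package. This is a sketch only: find_kpt_root is a hypothetical helper, and whether the script actually lives inside the fetched package depends on how it was obtained.

```shell
#!/usr/bin/env bash

# Print the nearest ancestor directory of $1 (inclusive) that contains a
# Kptfile, or return 1 if none is found.
find_kpt_root() {
  local dir
  dir="$(cd "$1" && pwd)" || return 1
  while :; do
    if [[ -f "${dir}/Kptfile" ]]; then
      echo "${dir}"
      return 0
    fi
    if [[ "${dir}" == "/" ]]; then
      return 1
    fi
    dir="$(dirname "${dir}")"
  done
}

# Hypothetical usage inside install_asm:
#   script_dir="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
#   if pkg_root="$(find_kpt_root "${script_dir}")"; then
#     echo "Using existing kpt package at ${pkg_root}"
#   else
#     echo "Downloading ASM kpt package..."
#   fi
```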

Suggestions on installation script improvement

  1. Could we show the location of the temporary directory right after the log line
install_asm: Creating temp directory...

Currently all of this is only shown after a successful installation. If a step fails midway, especially during the installation of ASM, it is hard for people to find and check the configured kpt files.

  2. Currently "run one-command" hides the error from the failed command "one-command". Surfacing that error, or adding something like a debug mode to the script, would help.

Error: accumulating resource after #124 when installing via Anthos CLI

We have been receiving this error during kustomize build step when installing to existing cluster via Anthos CLI
This has started with #124

+ kustomize build -o ../asm-base-dir/all.yaml
Error: accumulating resources: accumulateFile "accumulating resources from 'resources': '/workspace/test/fixtures/simple_zonal_with_asm/asm-dir/asm-patch/resources' must resolve to a file", accumulateDirector: "recursed accumulation of path '/workspace/test/fixtures/simple_zonal_with_asm/asm-dir/asm-patch/resources': accumulating resources: accumulateFile \"accumulating resources from 'namespace.yaml': evalsymlink failure on '/workspace/test/fixtures/simple_zonal_with_asm/asm-dir/asm-patch/resources/namespace.yaml' : lstat /workspace/test/fixtures/simple_zonal_with_asm/asm-dir/asm-patch/resources/namespace.yaml: no such file or directory\", loader.New \"Error loading namespace.yaml with git: url lacks orgRepo: namespace.yaml, dir: evalsymlink failure on '/workspace/test/fixtures/simple_zonal_with_asm/asm-dir/asm-patch/resources/namespace.yaml' : lstat /workspace/test/fixtures/simple_zonal_with_asm/asm-dir/asm-patch/resources/namespace.yaml: no such file or directory, get: invalid source string: namespace.yaml\""

Quick workaround was to use asm-patch@6941cf9f714485518f3f87eb0eda0cd47ffb96e4
Happy to provide additional logs if needed.

Add multicluster overlay for scriptoro by default

Add the multicluster overlay for scriptoro install by default.
https://github.com/GoogleCloudPlatform/anthos-service-mesh-packages/blob/release-1.7-asm/asm/istio/options/multicluster.yaml

This overlay requires kpt cfg set for project id, location, and cluster name.

Background

In ASM 1.7, we moved the multicluster overlay out of the original base istio-operator.yaml because that saves some steps when users manually install ASM for a single cluster. It also reduces the chance of a broken installation caused by a mismatch between the configured location and cluster name and the actual ones.

But scriptoro won't make that kind of human mistake in the location or cluster name, and its validation step also guards against this issue. So I'd suggest adding the multicluster overlay by default, so that when users want to use the multicluster feature they do not have to re-apply the overlay to each cluster. Following this guide would then be enough for multicluster to work: https://cloud.google.com/service-mesh/docs/gke-install-multi-cluster

Get rid of eastwest gateway script

Instead of requiring people to run this script, we should just put the different options as discrete files in istio/options and then add some logic to combine them when running the kubectl command to set up the gateway.

Track releases with tags

In order for installs to be reproducible, we should use tags to represent patch releases in each minor branch. Docs will also need to be updated to point to the tags, rather than the branch.

Default node-pool created by asm/anthoscli has no gcr access

The default node pool does not include the necessary oauthScopes to access gcr.
There is also no convenient setter for kpt available to enable this.

As a workaround we manually patched our nodepool.yaml to be able to access the images in GCR belonging to the same GCP project.

validate-asm errors and multiple nodepools

Hi,

I have a regional GKE cluster with multiple nodepools and while validating asm install using kpt fn source ${BASE_DIR} | kpt fn run --image gcr.io/kustomize-functions/validate-asm:v0.1.0 produces the following errors/warnings:

2 warning(s) occurred:
        * Warning - spec.nodeCount is 2 in ContainerNodePool asm-node-pool (all.yaml [8]). ASM requires at least four nodes. If you need to add nodes, see https://bit.ly/2RnVL2T
        * Warning - spec.initialNodeCount missing in ContainerNodePool default-pool (all.yaml [9])

2 error(s) occurred:
        * Error - unsupported machine type: n1-standard-1 in ContainerCluster foo-cluster (all.yaml [7]). The minimum machine type is n1-standard-4, which has four vCPUs. If the machine type for your cluster doesn't have at least four vCPUs, change the machine type as described here https://bit.ly/2V0KPdu
        * Error - unsupported machine type: n1-standard-1 in ContainerNodePool default-pool (all.yaml [9]). The minimum machine type is n1-standard-4, which has four vCPUs. If the machine type for your cluster doesn't have at least four vCPUs, change the machine type as described here https://bit.ly/2V0KPdu

My default-pool has 0 nodes and asm-node-pool has 6 (2 per zone) of type e2-standard-4. Based on the documentation my understanding was that the requirement was

  • At least four nodes.
  • The minimum machine type is e2-standard-4

Any insight into these errors or if I am missing something would be great!

add_cluster_labels function fails in install_asm script

The install_asm script retrieves the existing labels and applies them back together with some additional labels. The labels retrieved by the gcloud container clusters describe command are separated by semicolons. When the script then runs gcloud container clusters update with semicolon-separated values, it fails. The values should be separated by commas.
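
The conversion itself is a one-line parameter expansion. A minimal sketch, assuming the labels arrive as a single semicolon-separated string of key=value pairs (normalize_labels is a hypothetical helper name):

```shell
#!/usr/bin/env bash

# Turn "a=1;b=2" into "a=1,b=2" and drop a trailing separator if present.
normalize_labels() {
  local labels="$1"
  labels="${labels//;/,}"   # replace every semicolon with a comma
  echo "${labels%,}"        # strip one trailing comma, if any
}

# Hypothetical usage:
#   existing="$(normalize_labels "${LABELS_FROM_DESCRIBE}")"
#   new_labels="${existing:+${existing},}extra-label=value"
```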

Add anthos.googleapis.com to required services

Anthos Service Mesh dashboards are not visible in the Cloud Console until anthos.googleapis.com is enabled. Since enabling the API alone does not trigger an Anthos subscription, the script could enable the Anthos API by default.

install_asm script is failing on Mac

The install_asm script fails on Mac with the following error when I use the --output_dir option. Is there any plan to make this script work on Mac?

install_asm: Setting up necessary files...
readlink: illegal option -- f
usage: readlink [-n] [file ...]
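
The error comes from readlink on macOS not accepting -f. A portable substitute for the common case could look like the sketch below; abs_path is a hypothetical helper, it resolves symlinks in the directory part only (not a symlink in the final path component), and it expects the parent directory to exist.

```shell
#!/usr/bin/env bash

# Print an absolute path for $1 without relying on `readlink -f`.
abs_path() {
  local target="$1" parent
  if [[ -d "${target}" ]]; then
    (cd "${target}" && pwd -P)
  else
    parent="$(cd "$(dirname "${target}")" && pwd -P)" || return 1
    # ${parent%/} avoids a double slash when the parent is "/".
    printf '%s/%s\n' "${parent%/}" "$(basename "${target}")"
  fi
}

# Hypothetical usage for the --output_dir option:
#   OUTPUT_DIR="$(abs_path "${OUTPUT_DIR}")"
```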

GCP_METADATA is not set correctly when environ Id is different from project Id

This affects configuring multi-cluster across multiple projects. Since we use the same environ ID for all the clusters in the different projects, clusters that were not created in the environ project end up with GCP_METADATA misconfigured.

Reproduce steps:

  1. Download package:
    kpt pkg get https://github.com/GoogleCloudPlatform/anthos-service-mesh-packages.git/[email protected] .
  2. Set environment variables. Here PROJECT_ID_2 is the environ project:
export PROJECT_ID_1=<project 1>
export PROJECT_ID_2=<project 2>
export CLUSTER_1=<cluster 1>
export CLUSTER_2=<cluster 2>
export LOCATION_1=<location 1>
export LOCATION_2=<location 2>
export PROJECT_NUMBER=$(gcloud projects describe ${PROJECT_ID_2} --format="value(projectNumber)") 
  3. Set kpt package:
kpt cfg set asm gcloud.container.cluster ${CLUSTER_1}
kpt cfg set asm gcloud.compute.location ${LOCATION_1}
kpt cfg set asm gcloud.core.project ${PROJECT_ID_1}
kpt cfg set asm gcloud.project.projectNumber ${PROJECT_NUMBER}
kpt cfg set asm anthos.servicemesh.trustDomainAliases ${PROJECT_ID_2}.svc.id.goog

After the steps, we will see GCP_METADATA is configured as
<project 1>|<project number 2>|<cluster 1>|<location 1>

install_asm fails to call meshconfig API

Hello! I'm using install_asm as downloaded from the storage bucket here:

https://storage.googleapis.com/csm-artifacts/asm/install_asm_1.7

I've got a very simple setup. Just trying to install ASM into a single GKE cluster in one project.

I'm using a service account created specifically for this purpose.

The validation succeeds (i.e. running install_asm with --only_validate).

However, when I run the script to actually install ASM, I get a "401 Unauthorized" error when the script tries to access the meshconfig API. Here's how I'm invoking the script:

    ./install_asm -v \
          --project_id myprojectid \
          --cluster_name myclustername \
          --cluster_location us-central1-a \
          --mode install \
          --enable_apis \
          --service_account [email protected] \
          --key_file ./sa.json

It gets all the way down to "Initializing Mesh CA", and then fails with a "401 Unauthorized" as below:

install_asm: Initializing Mesh CA...
install_asm: Running: 'curl --request POST --fail --data  -o /dev/null https://meshconfig.googleapis.com/v1alpha1/projects/myprojectid:initialize --header @-'
install_asm: -------------
curl: (22) The requested URL returned error: 401 Unauthorized

I notice that the script is granting the following roles to the service account:

editor
compute.admin
container.admin
resourcemanager.projectIamAdmin
iam.serviceAccountAdmin
iam.serviceAccountKeyAdmin
gkehub.admin

However, the script does not grant meshconfig.admin. I'm guessing that that might be required here?

Also interesting: I see no roles related to the meshconfig API in the Google Cloud web console for my project, not even when I search for the meshconfig string in the Roles section.

The meshconfig API is fully enabled for my project.

I'm at a bit of a loss here. Any help would be appreciated. Thank you!

RAW_YAML: unbound variable for ASM MCP installation

The installation printed an error message in the informational output at the end.

curl https://storage.googleapis.com/csm-artifacts/asm/install_asm_1.8 > install_asm
install_asm: *****************************
install_asm: The ASM control plane installation is now complete.
install_asm: To enable automatic sidecar injection on a namespace, you can use the following command:
install_asm: kubectl label namespace <NAMESPACE> istio-injection- istio.io/rev=asm-managed --overwrite
install_asm: If you use 'istioctl install' afterwards to modify this installation, you will need
install_asm: to specify the option '--set revision=asm-managed' to target this control plane
install_asm: instead of installing a new one.
install_asm: To finish the installation, enable Istio sidecar injection and restart your workloads.
install_asm: For more information, see:
install_asm: https://cloud.google.com/service-mesh/docs/proxy-injection
install_asm: The ASM package used for installation can be found at:
install_asm: /tmp/tmp.DP5ltEPsWS/asm
install_asm: The version of istioctl that matches the installation can be found at:
install_asm: /tmp/tmp.DP5ltEPsWS/istio-1.8.1-asm.5/bin/istioctl
install_asm: The combined configuration generated for installation can be found at:
./install_asm: line 1831: RAW_YAML: unbound variable
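
The message is what bash prints when a script running with "set -u" expands a variable that was never assigned. One way to guard the outro is to expand the variable with an empty default, as in this sketch (print_config_location is a hypothetical function name; only the variable name and the log text come from the output above):

```shell
#!/usr/bin/env bash
set -u

# Print the location of the generated configuration only if it was written.
print_config_location() {
  # "${RAW_YAML:-}" expands to an empty string instead of aborting
  # when RAW_YAML is unset under `set -u`.
  if [[ -n "${RAW_YAML:-}" ]]; then
    echo "The combined configuration generated for installation can be found at:"
    echo "${RAW_YAML}"
  fi
}
```

Whether the managed control plane path should produce such a file at all is a separate question that this guard does not answer.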

question: how to use install_asm to upgrade an existing installation?

I've used install_asm with the default approach (gcp profile).

Now, if I want to add an egressgateway or other options that do not exist in the asm/istio/options files (such as multiple ingresses), how can I achieve that?
Can I rerun install_asm with a custom overlay?
It looks like it will reinstall everything rather than upgrade.

Refactor install_asm tests

There's a lot of repeated logic, poor variable names, and magic constants that should be cleaned up in scripts/asm_installer/tests. We should go through and apply the same style standards that the script itself follows to the tests as well.

Too many roles are assigned on istiod service account

The following line

bind_user_to_iam_policy "$(required_iam_roles)" "$(iam_user)"

attempts to add the following IAM roles on the managed istiod service account:

roles/servicemanagement.admin
roles/serviceusage.serviceUsageAdmin
roles/meshconfig.admin
roles/compute.admin
roles/container.admin
roles/resourcemanager.projectIamAdmin
roles/iam.serviceAccountAdmin
roles/iam.serviceAccountKeyAdmin
roles/gkehub.admin
roles/privateca.admin

These roles are the same roles that the local IAM user needs to do many things. I highly doubt the managed istiod SA requires such extensive roles on user's project like resourcemanager.projectIamAdmin or compute.admin. In fact I was able to verify that without at least resourcemanager.projectIamAdmin, things work fine. Can these roles be reviewed?

Save config when installing

As a follow-up to #169, it makes sense to always dump this YAML so that users can keep it around in case they ever need to roll back to it. We already specify an output directory, so just write asm-${DATE}.yaml. It probably also makes sense to put a comment at the top with the invocation used. (i.e. echo "# ${*}" > file; print_config >> file)
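
Spelled out, the idea could look like the sketch below. print_config is stubbed here because its real implementation is not shown in this issue, save_config is a hypothetical name, and the timestamp format used for ${DATE} is an assumption.

```shell
#!/usr/bin/env bash

# Stub standing in for the script's real print_config.
print_config() {
  printf 'apiVersion: v1\nkind: Namespace\n'
}

# Write the generated config to $1/asm-<timestamp>.yaml, preceded by a
# comment recording the invocation (all remaining arguments).
save_config() {
  local output_dir="$1"
  shift
  local file
  file="${output_dir}/asm-$(date +%Y%m%d-%H%M%S).yaml"
  echo "# ${*}" > "${file}"
  print_config >> "${file}"
  echo "${file}"
}

# Hypothetical usage from the main script, passing its own name and args:
#   saved="$(save_config "${OUTPUT_DIR}" "${0}" "${@}")"
```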

ASM Install Script - asm package not accessible from GCS bucket

I suspect this issue is a result of not having a git release based off 1.7 tag.

install_asm: Setting up necessary files...
install_asm: Creating temp directory...
install_asm: Generating a new kubeconfig...
install_asm: Checking installation tool dependencies...
install_asm: Downloading ASM..
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   214  100   214    0     0   6903      0 --:--:-- --:--:-- --:--:--  6903

gzip: stdin: not in gzip format
tar: Child returned status 1
tar: Error is not recoverable: exiting now
curl -L https://storage.googleapis.com/gke-release/asm/istio-1.7.3-asm.3-linux-amd64.tar.gz
<?xml version='1.0' encoding='UTF-8'?><Error><Code>NoSuchKey</Code><Message>The specified key does not exist.</Message><Details>No such object: gke-release/asm/istio-1.7.3-asm.3-linux-amd64.tar.gz</Details></Error>
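
In this log, curl saved the 214-byte XML error body and tar then failed on it. A check before unpacking would give a clearer message. The sketch below tests for the gzip magic bytes (1f 8b); is_gzip is a hypothetical helper, and passing --fail to curl so that HTTP errors become a non-zero exit status would be a complementary measure.

```shell
#!/usr/bin/env bash

# Return 0 if file $1 starts with the gzip magic bytes 1f 8b.
is_gzip() {
  local magic
  magic="$(head -c 2 "$1" | od -An -tx1 | tr -d ' \n')"
  [[ "${magic}" == "1f8b" ]]
}

# Hypothetical usage after the download step:
#   if ! is_gzip "${TARBALL}"; then
#     echo "Downloaded file is not a gzip archive; the release may not exist." >&2
#     exit 1
#   fi
```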

Validate ASM image

We should validate that the Docker image exists/is fetchable before attempting to install ASM, even if the config packages are available.
