stolostron / submariner-addon

An addon of submariner in ocm to provide network connectivity and service discovery among clusters.

License: Apache License 2.0


submariner-addon's Introduction

submariner-addon

An integration between ACM and Submariner. Submariner enables direct networking between Pods and Services in different Kubernetes clusters.

Community, discussion, contribution, and support

Check the CONTRIBUTING Doc for how to contribute to the repo.

Test Locally with kind

The steps below can be used for testing on a local environment:

Note: kind, kubectl, and imagebuilder are required.

  1. Clone this repository by using git clone.

  2. Build the submariner-addon image locally by running make images.

  3. Prepare clusters by running make clusters. This will:

    • Create two clusters: cluster1 and cluster2. cluster1 will be used as the Hub.
    • Load the local Docker images into the kind cluster cluster1.
    • Deploy the operator-lifecycle-manager.
    • Deploy the ClusterManager and submariner-addon on cluster1. This includes the required Hub cluster components.
    • Deploy the Klusterlet on cluster1 and cluster2. This includes the required managed cluster agents.
    • Join cluster1 and cluster2 to the Hub cluster cluster1; cluster1 and cluster2 become the managed clusters.
  4. Run the demo by issuing make demo. This will:

    • Label the managed clusters with cluster.open-cluster-management.io/clusterset: clusterset1.
    • Create a ClusterSet.
    • Create a ManagedClusterAddOn in each managed cluster namespace.
    • Deploy the Submariner Broker on the Hub cluster and the required Submariner components on the managed clusters.
    • Interconnect cluster1 and cluster2 using Submariner.
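
After make demo completes, you can check the add-on status from the Hub. A minimal sketch, assuming kind's default kind-cluster1 context name for the Hub:

    $ kubectl --context kind-cluster1 -n cluster1 get managedclusteraddons
    $ kubectl --context kind-cluster1 -n cluster2 get managedclusteraddons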

To delete the kind environment, use make clean.

Test with OCP

Note: minimum supported version is OpenShift 4.4/Kubernetes 1.17

The steps below can be used to test with OpenShift Container Platform (OCP) clusters on AWS:

Setup of Cluster Manager and Klusterlet

  1. Prepare 3 OCP clusters (1 Hub cluster and 2 managed clusters) on AWS for Submariner. Please refer to this section for detailed instructions.

  2. On the Hub cluster, install Cluster Manager Operator and instance (version >= 0.2.0) from OperatorHub.

  3. On the managed clusters, install Klusterlet Operator and instance (version >= 0.2.0) from OperatorHub.

  4. Approve the ManagedClusters on the hub cluster.

    $ oc get managedclusters
    $ oc get csr | grep <managedcluster name> | grep Pending
    $ oc adm certificate approve <managedcluster csr>
    
  5. Accept the ManagedClusters on the Hub cluster.

    $ oc patch managedclusters <managedcluster name> --type merge --patch '{"spec":{"hubAcceptsClient":true}}'
    
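
At this point the managed clusters should report as accepted, joined, and available; a minimal check (the condition type is an assumption based on standard OCM conventions):

    $ oc get managedclusters
    $ oc get managedclusters <managedcluster name> -o jsonpath='{.status.conditions[?(@.type=="ManagedClusterConditionAvailable")].status}'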

Install the Submariner-addon on the Hub cluster

  1. Apply the manifests of submariner-addon.

    $ oc apply -k deploy/config/manifests
    
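
To confirm the add-on controller is running on the Hub, a minimal check (the deployment name and namespace follow the ones referenced in the issues further down this page):

    $ oc -n open-cluster-management get deployment submariner-addon
    $ oc -n open-cluster-management get pods | grep submariner-addon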

Setup Submariner on the Hub cluster

  1. Create a ManagedClusterSet.

    apiVersion: cluster.open-cluster-management.io/v1beta1
    kind: ManagedClusterSet
    metadata:
      name: pro
    
  2. Join the ManagedClusters into the ManagedClusterSet.

    $ oc label managedclusters <managedcluster name> "cluster.open-cluster-management.io/clusterset=pro" --overwrite
    
  3. Create a ManagedClusterAddOn in the managed cluster namespace to deploy Submariner on the managed cluster.

    apiVersion: addon.open-cluster-management.io/v1alpha1
    kind: ManagedClusterAddOn
    metadata:
      name: submariner
      namespace: <managedcluster name>
    spec:
      installNamespace: submariner-operator
    

    Note: the name of the ManagedClusterAddOn must be submariner. A consolidated example of these steps follows.
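
Putting steps 1–3 together, a minimal sketch for a managed cluster named cluster1 (the YAML file names are illustrative; they contain the manifests from steps 1 and 3):

    $ oc apply -f managedclusterset.yaml        # ManagedClusterSet manifest from step 1
    $ oc label managedclusters cluster1 "cluster.open-cluster-management.io/clusterset=pro" --overwrite
    $ oc get managedclusters -l cluster.open-cluster-management.io/clusterset=pro
    $ oc apply -f managedclusteraddon.yaml      # ManagedClusterAddOn manifest from step 3, with namespace: cluster1
    $ oc -n cluster1 get managedclusteraddons submariner   # check the status conditions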

Test with ACM

The add-on has been integrated into ACM 2.2 as a default component:

  1. Install ACM following the deploy repo.

  2. Import or create OCP clusters as managed clusters through the ACM console UI.

    Note: The managed clusters must meet the Prerequisites for Submariner.

  3. To test an in-development version of the addon, build an image and push it to one of your repositories on Quay, then edit the ClusterServiceVersion resource:

    kubectl edit ClusterServiceVersion -n open-cluster-management
    

    to replace all instances of the addon image with your image tag (see the example after this list for locating the CSV).

    You can find the appropriate digest on Quay by clicking on the “download” button and choosing “Docker Pull (by digest)”.

  4. Start deploying Submariner to managed clusters following the Setup Submariner on the Hub cluster section above.
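
For step 3, a sketch of locating the ClusterServiceVersion to edit (the grep pattern is illustrative; the CSV name varies by ACM version):

    $ oc -n open-cluster-management get clusterserviceversions
    $ oc -n open-cluster-management get clusterserviceversions -o yaml | grep 'image:.*submariner-addon'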

To use a different version of Submariner itself, edit submariner.io-submariners-cr.yaml and rebuild your image.

submariner-addon's People

Contributors

aswinsuryan, cloudbehl, dependabot[bot], dfarrell07, dislbenn, giannisalinetti, jaanki, ldpliu, maayanf24, maxbab, mkolesnik, nyechiel, openshift-ci[bot], openshift-merge-robot, qiujian16, ruromero, sataqiu, skeeey, skitt, sridhargaddam, tpantelis, vthapar, yboaron, zhiweiyin318, zhujian7


submariner-addon's Issues

Switch m5n.large to c5d.large as default instance type for Submariner gateways in AWS

In AWS, if we move from m5n.large to c5d.large instances, the gateway-to-gateway bandwidth jumps from 1 Gbps to 2.3 Gbps 🚀 using IPsec.

This is because we get a better CPU (Cascade Lake architecture) but a worse network interface, and since we know the bottleneck is the CPU for both IPsec and WireGuard, this performs better.

Finish upgrading to Submariner 0.12

The code dependencies have been handled, but we still need to bump the version references:

  • in pkg/hub/submarineragent/manifests/operator/submariner.io-submariners-cr.yaml (the repository also needs to be changed to registry.redhat.io/rhacm2)
  • submver in scripts/deploy.sh

Use cloud-prepare for cloud setup

All the cloud-specific setup in the addon is also implemented in Submariner’s cloud-prepare; using the latter instead would simplify development.

Missing ClusterRole for clustermanagementaddons

In submariner-addon Pod the following errors are noticed.

E0124 10:17:48.202163       1 reflector.go:138] k8s.io/[email protected]+incompatible/tools/cache/reflector.go:167: Failed to watch *v1alpha1.ClusterManagementAddOn: failed to list *v1alpha1.ClusterManagementAddOn: clustermanagementaddons.addon.open-cluster-management.io is forbidden: User "system:serviceaccount:open-cluster-management:submariner-addon" cannot list resource "clustermanagementaddons" in API group "addon.open-cluster-management.io" at the cluster scope
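
A minimal sketch of the missing RBAC implied by the error above (the ClusterRole name is illustrative; the service account, API group, and resource come from the error message):

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: submariner-addon-clustermanagementaddons   # illustrative name
rules:
- apiGroups: ["addon.open-cluster-management.io"]
  resources: ["clustermanagementaddons"]
  verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: submariner-addon-clustermanagementaddons   # illustrative name
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: submariner-addon-clustermanagementaddons
subjects:
- kind: ServiceAccount
  name: submariner-addon
  namespace: open-cluster-management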

Add support for Azure cloud prepare

Add support for Azure cloud prepare in the submariner addon:

  • By opening the required ports for Submariner
  • By creating a dedicated gateway node for Submariner

Add lifecycle plugin

The plugin closes, reopens, flags and/or unflags an issue or PR as frozen/stale/rotten.

After 90 days with no activity, an issue is automatically labeled as lifecycle/stale. The issue will be automatically closed if the lifecycle is not manually reverted using the /remove-lifecycle stale command.

An issue with lifecycle/frozen label will not become stale after 90 days of inactivity. A user manually adds this label to issues that need to remain open for much longer than 90 days.

Move AWS cloud preparation to the spoke

GCP cloud preparation is done in the spoke so, for consistency, simplicity, and maintainability, we should perform all cloud preparation in the spoke. This would make the controllers platform agnostic and put all platform decision logic in the cloud provider factory.

A connection between the clusters is not working with kind deployment

After deploying a test environment with kind, the connection between the clusters does not work.
The test deployment shows errors in the submariner-operator pod:

{"level":"error","ts":1606133404.3297968,"logger":"controller_submariner","msg":"failed to update the Submariner status","Request.Namespace":"submariner-operator","Request.Name":"submariner","error":"Submariner.submariner.io "submariner" is invalid: [status.engineDaemonSetStatus.currentNumberScheduled: Required value, status.engineDaemonSetStatus.desiredNumberScheduled: Required value, status.engineDaemonSetStatus.numberMisscheduled: Required value, status.engineDaemonSetStatus.numberReady: Required value, status.routeAgentDaemonSetStatus.currentNumberScheduled: Required value, status.routeAgentDaemonSetStatus.desiredNumberScheduled: Required value, status.routeAgentDaemonSetStatus.numberMisscheduled: Required value, status.routeAgentDaemonSetStatus.numberReady: Required value, status.globalnetDaemonSetStatus.currentNumberScheduled: Required value, status.globalnetDaemonSetStatus.desiredNumberScheduled: Required value, status.globalnetDaemonSetStatus.numberMisscheduled: Required value, status.globalnetDaemonSetStatus.numberReady: Required value]","stacktrace":"github.com/go-logr/zapr.(*zapLogger).Error\n\tsubmariner-operator/vendor/github.com/go-logr/zapr/zapr.go:128\ngithub.com/submariner-io/submariner-operator/pkg/controller/submariner.(*ReconcileSubmariner).Reconcile\n\tsubmariner-operator/pkg/controller/submariner/submariner_controller.go:250\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\tsubmariner-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:246\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\tsubmariner-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:222\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).worker\n\tsubmariner-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:201\nk8s.io/apimachinery/pkg/util/wait.JitterUntil.func1\n\tsubmariner-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:152\nk8s.io/apimachinery/pkg/util/wait.JitterUntil\n\tsubmariner-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:153\nk8s.io/apimachinery/pkg/util/wait.Until\n\tsubmariner-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:88"}
{"level":"error","ts":1606133404.3300195,"logger":"controller-runtime.controller","msg":"Reconciler error","controller":"submariner-controller","name":"submariner","namespace":"submariner-operator","error":"Submariner.submariner.io "submariner" is invalid: [status.engineDaemonSetStatus.currentNumberScheduled: Required value, status.engineDaemonSetStatus.desiredNumberScheduled: Required value, status.engineDaemonSetStatus.numberMisscheduled: Required value, status.engineDaemonSetStatus.numberReady: Required value, status.routeAgentDaemonSetStatus.currentNumberScheduled: Required value, status.routeAgentDaemonSetStatus.desiredNumberScheduled: Required value, status.routeAgentDaemonSetStatus.numberMisscheduled: Required value, status.routeAgentDaemonSetStatus.numberReady: Required value, status.globalnetDaemonSetStatus.currentNumberScheduled: Required value, status.globalnetDaemonSetStatus.desiredNumberScheduled: Required value, status.globalnetDaemonSetStatus.numberMisscheduled: Required value, status.globalnetDaemonSetStatus.numberReady: Required value]","stacktrace":"github.com/go-logr/zapr.(*zapLogger).Error\n\tsubmariner-operator/vendor/github.com/go-logr/zapr/zapr.go:128\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\tsubmariner-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:248\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\tsubmariner-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:222\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).worker\n\tsubmariner-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:201\nk8s.io/apimachinery/pkg/util/wait.JitterUntil.func1\n\tsubmariner-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:152\nk8s.io/apimachinery/pkg/util/wait.JitterUntil\n\tsubmariner-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:153\nk8s.io/apimachinery/pkg/util/wait.Until\n\tsubmariner-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:88"}

Move resources creation logic to submariner-operator

submariner-addon should not bother about the details of the resources Submariner needs; it should just be able to CRUD resources when needed. The how and why of those resources should be handled by submariner-operator. Offload the resource creation functionality to submariner-operator. The creation of the following resources will be moved:

  • Broker namespace
  • submariner namespace
  • submariner CRDs
  • RBAC

Advantages:
Any changes Submariner makes to any of the above resources will not need to be duplicated in submariner-addon.

gcp: Gateway nodes not labelled due to resource limits

Installing OCP 4.10 with ACM 2.5.1 on GCP failed on one of the clusters with error Gateway Nodes not labelled.

apiVersion: submarineraddon.open-cluster-management.io/v1alpha1
kind: SubmarinerConfig
metadata:
  creationTimestamp: "2022-06-22T08:41:18Z"
  finalizers:
  - submarineraddon.open-cluster-management.io/config-cleanup
  generation: 2
  name: submariner
  namespace: blue
  resourceVersion: "101685"
  uid: 8d8db683-912c-4447-857b-63c1c14d6ba3
spec:
  IPSecIKEPort: 500
  IPSecNATTPort: 4500
  NATTDiscoveryPort: 4900
  NATTEnable: true
  cableDriver: vxlan
  credentialsSecret:
    name: blue-gcp-creds
  gatewayConfig:
    aws:
      instanceType: m5n.large
    gateways: 1
  imagePullSpecs: {}
  loadBalancerEnable: false
  subscriptionConfig:
    source: redhat-operators
    sourceNamespace: openshift-marketplace
status:
  conditions:
  - lastTransitionTime: "2022-06-22T08:41:24Z"
    message: SubmarinerConfig was applied
    reason: SubmarinerConfigApplied
    status: "True"
    type: SubmarinerConfigApplied
  - lastTransitionTime: "2022-06-22T08:41:31Z"
    message: Submariner cluster environment was prepared
    reason: SubmarinerClusterEnvPrepared
    status: "True"
    type: SubmarinerClusterEnvironmentPrepared
  - lastTransitionTime: "2022-06-22T08:41:31Z"
    message: The 0 worker nodes labeled as gateways ("") does not match the desired
      number 1
    reason: InsufficientNodes
    status: "False"
    type: SubmarinerGatewaysLabeled
  managedClusterInfo:
    clusterName: blue
    infraId: vthapar-blue-vr7gj
    platform: GCP
    region: us-east1
    vendor: OpenShift

UPDATE: This was due to resource quota limit issues, but that was not clear from the logs. There is nothing to fix other than improving the logging and error messages. Instead of telling the user "gateways not labelled", it would be better to report "Failed to create gateway nodes" along with the relevant resource limit error message.

We need a job to verify that the catalog is kept up-to-date

Currently, nothing checks that the catalog manifests are kept up-to-date, leading to discrepancies between developers’ expectations and what actually ends up on clusters (see #291 and #292).

This should be enforced by CI to make sure that the catalog is updated whenever anything feeding into it is changed.

Descope the addon

Epic Description

The Submariner addon currently has wide-ranging privileges. It doesn’t need to be able to access anything outside the namespaces it manages, so this should be reduced. See https://hackmd.io/wVfLKpxtSN-P0n07Kx4J8Q for background.

This might not be appropriate if the addon needs to be able to manage namespaces which aren’t known ahead of time. If so, the justification for its cluster-wide privileges needs to be documented.

Acceptance Criteria

The operator is de-scoped, ideally with no ClusterRole, at minimum with justifications for every permission in its ClusterRole.

See also submariner-io/enhancements#75 for the Submariner operator.

Definition of Done (Checklist)

  • Code complete
  • Relevant metrics added
  • The acceptance criteria met
  • Unit/e2e test added & pass
  • CI jobs pass
  • Deployed using cloud-prepare+subctl
  • Deployed using ACM/OCM addon
  • Deploy using Helm
  • Deployed on supported platforms (e.g. kind, OCP on AWS, OCP on GCP)
  • Run subctl verify, diagnose and gather
  • Uninstall
  • Troubleshooting (gather/diagnose) added
  • Documentation added
  • Release notes added

Work Items

Updating the images in ClusterServiceVersion is not restarting the pods

Following the readme, I modified the submariner-addon image by updating the ClusterServiceVersion. However, the submariner-addon pods in the open-cluster-management namespace are not restarted, and the corresponding deployment is not updated either.

Manually updating the submariner-addon deployment in open-cluster-management namespace fixed the issue.
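
As a workaround, the image can be updated directly on the deployment; a sketch, assuming the deployment and container are both named submariner-addon:

$ oc -n open-cluster-management set image deployment/submariner-addon \
    submariner-addon=quay.io/<your repo>/submariner-addon@sha256:<digest>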

Add support for pre-existing gateway nodes

Sometimes people will want full control over how the gateway node is created. If deploying via Hive, for example, it's very easy to declaratively add a new MachineSet with full control over the machine characteristics. It should be possible to specify that the gateway node should not be created, and perhaps to specify a node selector to select the gateway nodes. If the selector is fixed (I believe it's hard-coded to submariner.io/gateway: 'true'), then the system might just check whether such nodes already exist and create new ones only if they don't.
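
For reference, marking a pre-existing node as a gateway uses the label mentioned above:

$ oc label node <node name> submariner.io/gateway=true --overwrite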

Add support for VPC peering

When deploying Submariner to a set of clusters on the same cloud provider, it makes sense to peer the clusters' VPCs. This provides both an optimized and secure network path.
It would be nice to have the submariner addon take care of the peering. Here is how it can be done in AWS:

export infrastructure_id1=$(oc --context cluster1 get infrastructure cluster -o jsonpath='{.status.infrastructureName}')
export infrastructure_id2=$(oc --context cluster2 get infrastructure cluster -o jsonpath='{.status.infrastructureName}')
export infrastructure_id3=$(oc --context cluster3 get infrastructure cluster -o jsonpath='{.status.infrastructureName}')
export region1="us-east-1"
export region2="us-east-2"
export region3="us-west-2"
export vpc_id1=$(aws --region ${region1} ec2 describe-vpcs  --filter Name=tag:Name,Values=${infrastructure_id1}-vpc | jq -r .Vpcs[0].VpcId)
export vpc_id2=$(aws --region ${region2} ec2 describe-vpcs  --filter Name=tag:Name,Values=${infrastructure_id2}-vpc | jq -r .Vpcs[0].VpcId)
export vpc_id3=$(aws --region ${region3} ec2 describe-vpcs  --filter Name=tag:Name,Values=${infrastructure_id3}-vpc | jq -r .Vpcs[0].VpcId)
export vpc1_main_route_table_id=$(aws --region ${region1} ec2 describe-route-tables --filter Name=tag:Name,Values=${infrastructure_id1}-public | jq -r .RouteTables[0].RouteTableId)
export vpc2_main_route_table_id=$(aws --region ${region2} ec2 describe-route-tables --filter Name=tag:Name,Values=${infrastructure_id2}-public | jq -r .RouteTables[0].RouteTableId)
export vpc3_main_route_table_id=$(aws --region ${region3} ec2 describe-route-tables --filter Name=tag:Name,Values=${infrastructure_id3}-public | jq -r .RouteTables[0].RouteTableId)
export cluster1_node_cidr=$(aws --region ${region1} ec2 describe-vpcs  --filter Name=tag:Name,Values=${infrastructure_id1}-vpc | jq -r .Vpcs[0].CidrBlock)
export cluster2_node_cidr=$(aws --region ${region2} ec2 describe-vpcs  --filter Name=tag:Name,Values=${infrastructure_id2}-vpc | jq -r .Vpcs[0].CidrBlock)
export cluster3_node_cidr=$(aws --region ${region3} ec2 describe-vpcs  --filter Name=tag:Name,Values=${infrastructure_id3}-vpc | jq -r .Vpcs[0].CidrBlock)

# make all the peering requests
export peering_connection1_2=$(aws --region ${region1} ec2 create-vpc-peering-connection --vpc-id ${vpc_id1} --peer-vpc-id ${vpc_id2} --peer-region ${region2} | jq -r .VpcPeeringConnection.VpcPeeringConnectionId)
export peering_connection1_3=$(aws --region ${region1} ec2 create-vpc-peering-connection --vpc-id ${vpc_id1} --peer-vpc-id ${vpc_id3} --peer-region ${region3} | jq -r .VpcPeeringConnection.VpcPeeringConnectionId)
export peering_connection2_3=$(aws --region ${region2} ec2 create-vpc-peering-connection --vpc-id ${vpc_id2} --peer-vpc-id ${vpc_id3} --peer-region ${region3} | jq -r .VpcPeeringConnection.VpcPeeringConnectionId)

# accept peering requests

aws --region ${region2} ec2 accept-vpc-peering-connection --vpc-peering-connection-id ${peering_connection1_2}
aws --region ${region3} ec2 accept-vpc-peering-connection --vpc-peering-connection-id ${peering_connection1_3}
aws --region ${region3} ec2 accept-vpc-peering-connection --vpc-peering-connection-id ${peering_connection2_3}

#modify peering requests

aws --region ${region1} ec2 modify-vpc-peering-connection-options --vpc-peering-connection-id ${peering_connection1_2} --requester-peering-connection-options AllowDnsResolutionFromRemoteVpc=true
aws --region ${region1} ec2 modify-vpc-peering-connection-options --vpc-peering-connection-id ${peering_connection1_3} --requester-peering-connection-options AllowDnsResolutionFromRemoteVpc=true
aws --region ${region2} ec2 modify-vpc-peering-connection-options --vpc-peering-connection-id ${peering_connection2_3} --requester-peering-connection-options AllowDnsResolutionFromRemoteVpc=true

# accept peering request modification

aws --region ${region2} ec2 accept-vpc-peering-connection --vpc-peering-connection-id ${peering_connection1_2}
aws --region ${region3} ec2 accept-vpc-peering-connection --vpc-peering-connection-id ${peering_connection1_3}
aws --region ${region3} ec2 accept-vpc-peering-connection --vpc-peering-connection-id ${peering_connection2_3}

# create routing tables

aws --region ${region1} ec2 create-route --destination-cidr-block ${cluster2_node_cidr} --vpc-peering-connection-id ${peering_connection1_2} --route-table-id ${vpc1_main_route_table_id}
aws --region ${region2} ec2 create-route --destination-cidr-block ${cluster1_node_cidr} --vpc-peering-connection-id ${peering_connection1_2} --route-table-id ${vpc2_main_route_table_id}
aws --region ${region1} ec2 create-route --destination-cidr-block ${cluster3_node_cidr} --vpc-peering-connection-id ${peering_connection1_3} --route-table-id ${vpc1_main_route_table_id}
aws --region ${region3} ec2 create-route --destination-cidr-block ${cluster1_node_cidr} --vpc-peering-connection-id ${peering_connection1_3} --route-table-id ${vpc3_main_route_table_id}
aws --region ${region2} ec2 create-route --destination-cidr-block ${cluster3_node_cidr} --vpc-peering-connection-id ${peering_connection2_3} --route-table-id ${vpc2_main_route_table_id}
aws --region ${region3} ec2 create-route --destination-cidr-block ${cluster2_node_cidr} --vpc-peering-connection-id ${peering_connection2_3} --route-table-id ${vpc3_main_route_table_id}

Here is an example with Google Cloud:

export gcp_project_id=$(cat ~/.gcp/osServiceAccount.json | jq -r .project_id)
export network_1=$(oc --context cluster1 get infrastructure cluster -o jsonpath='{.status.infrastructureName}')-network
export network_2=$(oc --context cluster2 get infrastructure cluster -o jsonpath='{.status.infrastructureName}')-network
export network_3=$(oc --context cluster3 get infrastructure cluster -o jsonpath='{.status.infrastructureName}')-network

# 1-2
gcloud compute networks peerings create peer-12 --network ${network_1} --peer-project ${gcp_project_id} --peer-network ${network_2} --import-custom-routes --export-custom-routes
gcloud compute networks peerings create peer-21 --network ${network_2} --peer-project ${gcp_project_id} --peer-network ${network_1} --import-custom-routes --export-custom-routes

# 1-3
gcloud compute networks peerings create peer-13 --network ${network_1} --peer-project ${gcp_project_id} --peer-network ${network_3} --import-custom-routes --export-custom-routes
gcloud compute networks peerings create peer-31 --network ${network_3} --peer-project ${gcp_project_id} --peer-network ${network_1} --import-custom-routes --export-custom-routes

# 2-3
gcloud compute networks peerings create peer-23 --network ${network_2} --peer-project ${gcp_project_id} --peer-network ${network_3} --import-custom-routes --export-custom-routes
gcloud compute networks peerings create peer-32 --network ${network_3} --peer-project ${gcp_project_id} --peer-network ${network_2} --import-custom-routes --export-custom-routes

export infrastructure_1=$(oc --context cluster1 get infrastructure cluster -o jsonpath='{.status.infrastructureName}')
export infrastructure_2=$(oc --context cluster2 get infrastructure cluster -o jsonpath='{.status.infrastructureName}')
export infrastructure_3=$(oc --context cluster3 get infrastructure cluster -o jsonpath='{.status.infrastructureName}')
gcloud compute firewall-rules create --network ${infrastructure_1}-network --target-tags ${infrastructure_1}-worker --direction Ingress --source-ranges 0.0.0.0/0 --allow udp:500,udp:4500,udp:4800,esp ${infrastructure_1}-submariner-in
gcloud compute firewall-rules create --network ${infrastructure_2}-network --target-tags ${infrastructure_2}-worker --direction Ingress --source-ranges 0.0.0.0/0 --allow udp:500,udp:4500,udp:4800,esp ${infrastructure_2}-submariner-in
gcloud compute firewall-rules create --network ${infrastructure_3}-network --target-tags ${infrastructure_3}-worker --direction Ingress --source-ranges 0.0.0.0/0 --allow udp:500,udp:4500,udp:4800,esp ${infrastructure_3}-submariner-in

gcloud compute firewall-rules create --network ${infrastructure_1}-network --direction OUT --destination-ranges 0.0.0.0/0 --allow udp:500,udp:4500,udp:4800,esp ${infrastructure_1}-submariner-out
gcloud compute firewall-rules create --network ${infrastructure_2}-network --direction OUT --destination-ranges 0.0.0.0/0 --allow udp:500,udp:4500,udp:4800,esp ${infrastructure_2}-submariner-out
gcloud compute firewall-rules create --network ${infrastructure_3}-network --direction OUT --destination-ranges 0.0.0.0/0 --allow udp:500,udp:4500,udp:4800,esp ${infrastructure_3}-submariner-out

When using VPC peering there is no need to use external IP addresses or to encrypt the tunnel. Internal IPs should be used with no NAT, and it should be possible to select a cable driver option with no encryption in this case.

Use Shipyard to build and deploy KIND clusters

Shipyard provides common tooling (Makefiles, scripts, etc.) and deploys KIND clusters that can be used for development and e2e testing. Currently, the submariner-addon repo uses its own files for building images and deploying clusters. Adding Shipyard to submariner-addon would give us a consistent experience with the other Submariner repos.

Run make demo as part of CI

make demo is used for dev-testing changes locally in a kind setup, but it is not run as part of CI/CD. This makes it difficult to catch bugs that break local kind deployments.

Create a ClusterRole to allow brokers.submariner.io object

Starting with ACM 2.5, a user who is installing Submariner Addon on the ManagedCluster should have permissions to create/update/read brokers.submariner.io object in the ManagedClusterSet broker namespace. This object needs to be created whether Globalnet is enabled or not.

A cluster admin has the necessary privileges to create this object, but a non-cluster-admin needs an explicit role assigned.
This issue is to track the ClusterRole creation as part of Submariner addon initialization.

The following PR installed the ClusterRole, but we had to revert it as it was causing issues. So we have to identify the appropriate approach and implement the necessary changes in the code to install this ClusterRole.
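
A rough sketch of the ClusterRole this issue describes (the name and verbs are illustrative; how and when to bind it is the open question):

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: submariner-broker-access   # illustrative name
rules:
- apiGroups: ["submariner.io"]
  resources: ["brokers"]
  verbs: ["get", "list", "watch", "create", "update"]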

Support Load-Balancer mode when deploying Submariner

What would you like to be added:
Submariner supports a load-balancer mode for the tunnels. It can be used instead of creating external IPs and associating them with the instances.

Why is this needed:
This will ease the deployment process.
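
The SubmarinerConfig example earlier on this page already includes a loadBalancerEnable field; a sketch of enabling it (field semantics assumed from its name):

apiVersion: submarineraddon.open-cluster-management.io/v1alpha1
kind: SubmarinerConfig
metadata:
  name: submariner
  namespace: <managedcluster name>
spec:
  loadBalancerEnable: true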

running `make clusters` fails

When running make clusters, the kind clusters are created in parallel, which causes a network ambiguity:

ERROR: failed to create cluster: docker run error: command "docker run --hostname cluster1-control-plane --name cluster1-control-plane --label io.x-k8s.kind.role=control-plane --privileged --security-opt seccomp=unconfined --security-opt apparmor=unconfined --tmpfs /tmp --tmpfs /run --volume /var --volume /lib/modules:/lib/modules:ro --detach --tty --label io.x-k8s.kind.cluster=cluster1 --net kind --restart=on-failure:1 --publish=127.0.0.1:45585:6443/TCP kindest/node:v1.18.0" failed with error: exit status 125
Command Output: f12cc0db5f1ebbb04fb13aa305c82e79ef5f9a01174c2b11d61bae75b80719a5
docker: Error response from daemon: network kind is ambiguous (3 matches found on name).
ERROR: failed to create cluster: docker run error: command "docker run --hostname cluster3-worker --name cluster3-worker --label io.x-k8s.kind.role=worker --privileged --security-opt seccomp=unconfined --security-opt apparmor=unconfined --tmpfs /tmp --tmpfs /run --volume /var --volume /lib/modules:/lib/modules:ro --detach --tty --label io.x-k8s.kind.cluster=cluster3 --net kind --restart=on-failure:1 kindest/node:v1.18.0" failed with error: exit status 125
Command Output: 4909026c5336555aed3c8af047fab0cf13d47b7b6af729f471029814743bf27b
docker: Error response from daemon: network kind is ambiguous (3 matches found on name).
ERROR: failed to create cluster: docker run error: command "docker run --hostname cluster2-control-plane --name cluster2-control-plane --label io.x-k8s.kind.role=control-plane --privileged --security-opt seccomp=unconfined --security-opt apparmor=unconfined --tmpfs /tmp --tmpfs /run --volume /var --volume /lib/modules:/lib/modules:ro --detach --tty --label io.x-k8s.kind.cluster=cluster2 --net kind --restart=on-failure:1 --publish=127.0.0.1:40165:6443/TCP kindest/node:v1.18.0" failed with error: exit status 125
Command Output: 462b2973e7dcd7560b5df67a98ef3546be9b256c55d25002a661672eece81788
docker: Error response from daemon: network kind is ambiguous (3 matches found on name).
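
One possible workaround (an assumption, not a confirmed fix) is to remove the duplicate kind Docker networks before re-running make clusters:

$ docker network ls --filter name=kind
$ docker network rm <duplicate network IDs>   # leave at most one network named "kind"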

Settings should be carried over from one cluster to another

When installing the addon on multiple clusters, credentials, instance types, etc. need to be specified for each cluster in succession. It would be useful if one cluster's information could be carried over to the next, pre-populating the entries (but allowing them to be overridden).

Perform auto uninstall-add-on when cluster is deleted

When a cluster that is part of the clusterset is deleted without invoking "Uninstall add-on", we should internally clean up the associated endpoints from the hub cluster broker namespace. Otherwise, stale endpoints show up on the member clusters.

Add OVN support

When using OVNKubernetes as the CNI we don't need to open all the VXLAN ports in the cloud, so only open the VXLAN ports required based on which CNI is in use during the cloud prepare step:

  1. Detect if OVNKubernetes is the CNI
  2. If yes, open the minimal set of ports.
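
A minimal sketch of step 1, detecting the cluster CNI on OpenShift:

$ oc get network.config cluster -o jsonpath='{.status.networkType}'
# prints the CNI in use, e.g. OVNKubernetes or OpenShiftSDN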

Submariner changes before ACM 2.3 release

This issue is for tracking the changes for Submariner v0.9 before releasing ACM 2.3:

  • Update the default channel to alpha-0.8 and alpha-0.9 respectively
  • Update the submariner CRDs (get synced with submariner-io projects on branch release-0.9).
  • Update the version field in the Submariner CR to v0.9 (currently it is set to v0.8).

Support VXLAN connectivity

What would you like to be added:
Support VXLAN connectivity

Why is this needed:
This would help to use the vxlan cable driver in ACM deployments.
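
For reference, the SubmarinerConfig example earlier on this page already carries a cableDriver field; a sketch of requesting VXLAN:

apiVersion: submarineraddon.open-cluster-management.io/v1alpha1
kind: SubmarinerConfig
metadata:
  name: submariner
  namespace: <managedcluster name>
spec:
  cableDriver: vxlan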

PRs should be merged without squashing

Currently, PRs are squashed and merged, which renders moot any work done to split them up into multiple sensible commits. They should instead be merged without squashing.

CRDs and CSVs should be checked during the build

make verify-scripts should check that regenerating the CRDs (make update-crds) and the CSV (make update-csv) doesn’t result in any change.

CRDs are already supposed to be verified using hack/verify-crds.sh, but it doesn’t compare the content of the CRDs with their theoretical content. A similar script should be written for the CSV.
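
A sketch of the kind of check this implies, assuming the make targets behave as described:

#!/bin/bash
# Regenerate the CRDs and CSV, then fail if the committed files differ.
set -e
make update-crds update-csv
git diff --exit-code || {
    echo "Generated CRDs/CSV are out of date; run 'make update-crds update-csv' and commit the result"
    exit 1
}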

ACM setup fails to resolve DNS queries with NODATA results

Description

When the submariner-addon is installed on RHACM with one hub and two managed clusters, DNS resolution fails with NODATA results, preventing correct communication among services.

Environment

Hub Cluster

  • Cluster name: local-cluster
  • OCP version: 4.6.4
  • RHACM version: 2.1
  • Provider: AWS
  • Control plane: 3x m5.xlarge
  • Worker nodes: 3x m5.xlarge
  • Installation method: IPI
  • Networking: OpenShiftSDN
  • Service Network: 172.30.0.0/16

Managed Cluster 1

  • Cluster name: demo-cluster-01
  • OCP version: 4.6.4
  • Provider: AWS
  • Control plane: 3x m5.xlarge
  • Worker nodes: 3x m5.large
  • Installation method: IPI
  • Networking: OpenShiftSDN
  • Service Network: 172.31.0.0/16

Managed Cluster 2

  • Cluster name: demo-cluster-02
  • OCP version: 4.6.4
  • Provider: AWS
  • Control plane: 3x m5.xlarge
  • Worker nodes: 3x m5.large
  • Installation method: IPI
  • Networking: OpenShiftSDN
  • Service Network: 172.32.0.0/16

Steps to reproduce

  1. Install RHACM operator and create MulticlusterHub resource.
  2. Join the managed clusters using the generated command from the RHACM web console
  3. Wait for the clusters to be correctly joined:
    $ oc get managedclusters
    NAME              HUB ACCEPTED   MANAGED CLUSTER URLS   JOINED   AVAILABLE   AGE  
    demo-cluster-01   true                                  True     True        2d12h
    demo-cluster-02   true                                  True     True        2d12h
    local-cluster     true                                  True     True        3d7h
    
  4. Apply the submariner-addon manifests using kustomize and the resources provided in this repo
    $ oc apply -k deploy/config/manifests
    
  5. Wait for the submariner-addon components to be deployed on both clusters
  6. Check that DNS forwarding is configured correctly on the managed clusters
    $ oc get dnses.operator.openshift.io/default -o jsonpath='{.spec.servers[]}'
    map[forwardPlugin:map[upstreams:[172.31.92.81]] name:lighthouse zones:[clusterset.local]]
    
    $ oc get dnses.operator.openshift.io/default -o jsonpath='{.spec.servers[]}'
    map[forwardPlugin:map[upstreams:[172.32.169.237]] name:lighthouse zones:[clusterset.local]]
    
  7. Create a ManagedClusterSet on the hub cluster
    cat << EOF | kubectl apply -f -
    apiVersion: cluster.open-cluster-management.io/v1alpha1
    kind: ManagedClusterSet
    metadata:
      name: pro
    EOF
  8. Enable Submariner on the managed clusters
    $ oc label managedclusters <managedcluster name> "cluster.open-cluster-management.io/submariner-agent=true" --overwrite
    
  9. Join the managed clusters to the ManagedClusterSet
    $ oc label managedclusters <managedcluster name> "cluster.open-cluster-management.io/clusterset=pro" --overwrite
    
  10. Create a test application on demo-cluster-02:
kubectl -n default create deployment nginx --image=nginx
kubectl -n default expose deployment nginx --port=80
kubectl -n default get svc nginx
  11. Export the application service on demo-cluster-02:
cat << EOF | kubectl -n default apply -f -
apiVersion: lighthouse.submariner.io/v2alpha1
kind: ServiceExport
metadata:
  name: nginx
  namespace: default
EOF
  12. Wait for the creation of the associated ServiceImport on the ManagedClusterSet:
$ kubectl get serviceimports.lighthouse.submariner.io --all-namespaces
NAMESPACE                          NAME                                  AGE
submariner-clusterset-pro-broker   nginx-default-demo-cluster-02         10h
  13. Install a dnstools pod to test connectivity on demo-cluster-01:
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: dnstools
  labels:
    app: dnstools
spec:
  containers:
  - name: dnstools
    image: infoblox/dnstools:latest
    command: ["/bin/sleep", "3650d"]
EOF
  14. Test the name resolution:
$ kubectl exec -it dnstools -- dig nginx.default.svc.clusterset.local

Expected results

The dig command should return NOERROR from the queried DNS (lighthouse) and a non-empty set of A records.

Actual results

The dig command returns NOERROR with an empty set of records (which is how dig reports NODATA).

$ kubectl exec -it dnstools -- dig A nginx.default.svc.clusterset.local

; <<>> DiG 9.11.3 <<>> A nginx.default.svc.clusterset.local
;; global options: +cmd
;; Got answer:
;; WARNING: .local is reserved for Multicast DNS
;; You are currently testing what happens when an mDNS query is leaked to DNS
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 44834
;; flags: qr aa rd; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1
;; WARNING: recursion requested but not available

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
; COOKIE: 30b5d7ca2e073162 (echoed)
;; QUESTION SECTION:
;nginx.default.svc.clusterset.local. IN	A

;; Query time: 3 msec
;; SERVER: 172.31.0.10#53(172.31.0.10)
;; WHEN: Sun Dec 13 00:44:09 UTC 2020
;; MSG SIZE  rcvd: 75

Notice that the query is directed to the default cluster DNS service (dns-default), whose endpoints are the CoreDNS pods running on each node.

Additional considerations

The DNS requests appear to be correctly forwarded to the submariner-lighthouse-coredns pods. To verify this, simply query an NXDOMAIN (i.e. a non-exported service) and the submariner-lighthouse-coredns pods will print logs similar to this:

[ERROR] plugin/errors: 3 nxdomain.default.svc.clusterset.local. A: plugin/lighthouse: record not found

This is enough to assume that the traffic is correctly forwarded from the clusters' CoreDNS pods to the Lighthouse CoreDNS pods.

The same results are produced by calling the lighthouse-coredns service directly:

$ kubectl exec -it dnstools -- dig @$(oc get svc/submariner-lighthouse-coredns -n submariner-operator -o jsonpath={.spec.clusterIP}) nginx.default.svc.clusterset.local
; <<>> DiG 9.11.3 <<>> @172.31.92.81 nginx.default.svc.clusterset.local
; (1 server found)
;; global options: +cmd
;; Got answer:
;; WARNING: .local is reserved for Multicast DNS
;; You are currently testing what happens when an mDNS query is leaked to DNS
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 61178
;; flags: qr aa rd; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1
;; WARNING: recursion requested but not available

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
; COOKIE: 55d541eabac4035d (echoed)
;; QUESTION SECTION:
;nginx.default.svc.clusterset.local. IN	A

;; Query time: 0 msec
;; SERVER: 172.31.92.81#53(172.31.92.81)
;; WHEN: Sun Dec 13 00:59:24 UTC 2020
;; MSG SIZE  rcvd: 75

This demonstrates that the issue should not be related to DNS forwarding.

Upgrade the addon to Submariner 0.12.0-rc0

This involves a number of changes.

  • Upgrade the dependencies in go.mod and deal with the fallout.
  • Ensure that 0.12.0-rc0 images are available in the appropriate repository.
  • Change the default deployed version in pkg/hub/submarineragent/manifests/operator/submariner.io-submariners-cr.yaml.
  • Change the deployed version in scripts/deploy.sh.

Support NAT Traversal

  • What would you like to be added:
    Support enabling/disabling the NAT Traversal feature of Submariner

  • Why is this needed:
    NAT Traversal is always enabled. Submariner supports an option to disable it, but this option is missing in submariner-addon.
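
If added, this would presumably surface through the existing SubmarinerConfig API; a sketch, assuming the NATTEnable field shown earlier on this page controls it:

apiVersion: submarineraddon.open-cluster-management.io/v1alpha1
kind: SubmarinerConfig
metadata:
  name: submariner
  namespace: <managedcluster name>
spec:
  NATTEnable: false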
