Comments (10)
I would like to work on a PR for this specific issue, but in case I am not able to work on it in the short term, I'd like to lay down some recommendations for whoever would like to propose a PR before I do:
Limitations
There should be a check on the Kubernetes version: as mentioned above, this feature was introduced only starting from version 1.17.6-gke.11. I am pretty confident that in Helm it is possible to verify the current version and disable features on unsupported releases.
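A minimal sketch of such a version gate, assuming the `backendConfig.enabled` toggle proposed further down. Helm exposes the cluster version as `.Capabilities.KubeVersion.Version`, and `semverCompare` can compare it against a constraint. How the `-gke.N` suffix is ordered (semver treats it as a pre-release tag) is something I have not verified on a real cluster, so treat the constraint string as an assumption:

```yaml
{{- /* Render the BackendConfig only when enabled and the cluster is new enough. */ -}}
{{- if and .Values.backendConfig.enabled (semverCompare ">=1.17.6-gke.11" .Capabilities.KubeVersion.Version) }}
apiVersion: cloud.google.com/v1
kind: BackendConfig
metadata:
  name: {{ include "hydra.fullname" . }}
spec:
  healthCheck:
    type: HTTP
    requestPath: /health/ready
    port: 4444
{{- end }}
```

On an older cluster the resource is then silently skipped. Whether a hard failure with a clear message would be friendlier is a design choice for the chart maintainers.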
Annotations
The following annotations should be configured automatically IMO as they are not trivial and it takes quite some time to find them in the GKE documentation (they are buried in some exotic pages about Load Balancers and Ingresses):
cloud.google.com/neg: '{"ingress": true}'
cloud.google.com/backend-config: '{"default": "http-public"}'
BackendConfig
The following resource should be added to the resources of the chart and should be toggled by the typical enabled flag:
apiVersion: cloud.google.com/v1
kind: BackendConfig
metadata:
  name: http-public
  namespace: hydra
spec:
  healthCheck:
    checkIntervalSec: 5
    timeoutSec: 3
    healthyThreshold: 1
    unhealthyThreshold: 3
    type: HTTP
    requestPath: /health/ready
    port: 4444
Template
The values.yaml should probably implement something along the lines of:
# The following configuration enables a custom BackendConfig and HealthCheck on GKE.
# This configuration *must* be enabled if you want to use an Ingress on the "public" endpoint on GKE.
# If you want to enable TLS on this port, please change the protocol to "HTTPS". Additionally, you will need to add the annotation "cloud.google.com/app-protocols: '{"4444": "HTTPS"}'" to the Service "public".
# If you are running a VPC-native cluster, please check the issue https://github.com/ory/k8s/issues/113 for current limitations.
backendConfig:
  enabled: false
  path: /health/ready
  port: 4444
  protocol: HTTP
  interval: 60
  timeout: 60
  healthyThreshold: 1
  unhealthyThreshold: 10
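For illustration, a user would then opt in with a small override file. The file name below is hypothetical, and the probe timings simply mirror the hand-written BackendConfig above rather than any recommended values:

```yaml
# my-values.yaml (hypothetical override file, passed with `helm install -f my-values.yaml ...`)
backendConfig:
  enabled: true
  path: /health/ready
  port: 4444
  protocol: HTTP
  # Tighter probe settings than the proposed defaults, mirroring the manual BackendConfig above.
  interval: 5
  timeout: 3
  healthyThreshold: 1
  unhealthyThreshold: 3
```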
The backend-config.yaml should probably look like this:
{{- if .Values.backendConfig.enabled }}
apiVersion: cloud.google.com/v1
kind: BackendConfig
metadata:
  name: {{ include "hydra.fullname" . }}
spec:
  healthCheck:
    checkIntervalSec: {{ .Values.backendConfig.interval }}
    timeoutSec: {{ .Values.backendConfig.timeout }}
    healthyThreshold: {{ .Values.backendConfig.healthyThreshold }}
    unhealthyThreshold: {{ .Values.backendConfig.unhealthyThreshold }}
    type: {{ .Values.backendConfig.protocol }}
    requestPath: {{ .Values.backendConfig.path }}
    port: {{ .Values.backendConfig.port }}
{{- end }}
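To configure the annotations from the Annotations section automatically, the Service template for the public endpoint could be gated on the same flag. This is only a sketch: the file name, the `-public` suffix of the Service name and the rest of the Service are assumptions, not the chart's actual layout. Note that the `backend-config` annotation has to reference the name of the BackendConfig that was actually rendered, which in the template above is `hydra.fullname` rather than `http-public`:

```yaml
# templates/service-public.yaml (excerpt; file name and Service name are assumptions)
apiVersion: v1
kind: Service
metadata:
  name: {{ include "hydra.fullname" . }}-public
  {{- if .Values.backendConfig.enabled }}
  annotations:
    cloud.google.com/neg: '{"ingress": true}'
    # Must match metadata.name of the BackendConfig rendered by backend-config.yaml.
    cloud.google.com/backend-config: '{"default": "{{ include "hydra.fullname" . }}"}'
  {{- end }}
```

If the chart already lets users pass their own Service annotations through values, the two sources would need to be merged under a single `annotations:` key.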
Additional information
As mentioned before, this CRD does not work properly for VPC-native clusters. I therefore find it appropriate to point to this issue or to Google's documentation, or to add a warning that mentions the necessary workaround, i.e. an additional firewall rule that has to be configured manually:
gcloud compute firewall-rules create hydra-http-public --source-ranges=130.211.0.0/22,35.191.0.0/16 --network=default --allow=tcp:4444
Finally, if you want to enable TLS, you have to follow these steps:
a) The spec.healthCheck.type in the BackendConfig must be set to HTTPS
b) The service requires an additional annotation: cloud.google.com/app-protocols: '{"4444": "HTTPS"}' where 4444 is the port number where TLS has been enabled.
The second point cost me an entire day of reading GKE documentation and testing.
Hopefully, this comment will save some time for somebody else who wants to enable TLS with a custom BackendConfig and health checks.
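Put together, steps a) and b) would look roughly like the sketch below. The Service name, the selector label and the port name are assumptions for illustration only. The `app-protocols` key is copied as given in step b). GKE's documentation describes the keys of this annotation as Service port names, so it is worth checking which form your cluster expects:

```yaml
apiVersion: cloud.google.com/v1
kind: BackendConfig
metadata:
  name: http-public
  namespace: hydra
spec:
  healthCheck:
    type: HTTPS                 # step a)
    requestPath: /health/ready
    port: 4444
---
apiVersion: v1
kind: Service
metadata:
  name: hydra-public            # hypothetical Service name
  namespace: hydra
  annotations:
    cloud.google.com/neg: '{"ingress": true}'
    cloud.google.com/backend-config: '{"default": "http-public"}'
    cloud.google.com/app-protocols: '{"4444": "HTTPS"}'   # step b), key as given above
spec:
  selector:
    app.kubernetes.io/name: hydra   # assumed label
  ports:
    - name: http-public             # assumed port name
      port: 4444
      targetPort: 4444
```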
from k8s.
The problem here is a little more complex than that. GCE / GKE ingress has many limitations, and among them is the problem that it often doesn't correctly pick up the readinessProbe path. This bit me more times than I would like to admit; debugging this issue is extremely time-consuming and the information on GCP is not transparent at all.
Now, the issue is the following:
- it is not possible to provide multiple readinessProbe rules
- GCE ingress correctly picks up the health check path /health/ready for http-admin
- GCE doesn't find any readinessProbe rule for http-public and uses the default / path
- the default / path returns 404 and not 200; this makes the health check fail, the backend is reported as UNHEALTHY, and therefore the ingress won't come online.
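The situation in the list above can be pictured with the following container excerpt. It is illustrative only and not copied from the chart; the port numbers 4444 and 4445 are assumed to be Hydra's public and admin ports:

```yaml
# Excerpt of a container spec (illustrative; not taken from the chart).
containers:
  - name: hydra
    ports:
      - name: http-public
        containerPort: 4444
      - name: http-admin
        containerPort: 4445
    # A container accepts a single readinessProbe, so only one port can be covered.
    readinessProbe:
      httpGet:
        path: /health/ready
        port: http-admin
```

With this probe, the backend for http-admin gets /health/ready as its health check path, while the backend for http-public has no matching probe and falls back to /.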
I am afraid that switching the readiness probe to http-public will cause http-admin to fail. The same problem will happen, but with the ports switched: one succeeds and the other fails.
Merging the two ingresses into a single ingress doesn't fix the issue.
A solution to this issue might be health check configuration via the BackendConfig CRD, see kubernetes/ingress-gce#1010 and https://cloud.google.com/kubernetes-engine/docs/concepts/backendconfig, which will be included in version 1.10 of the GCE ingress.
Another solution would be to make two separate deployments for hydra-public and hydra-admin.
Wow thank you for the detailed write-up! That sounds really frustrating and should definitely be fixed. I think we can deploy two instances of Hydra to resolve this issue on GKE.
However, this would not work with the in-memory database, which is what some deployments are currently using. That said, we're thinking about removing in-memory in favor of SQLite (which also supports in-memory but would use a mount in helm).
Is there any other way we can work around this for GKE?
Personally, I have to say that I had so many issues with the GKE Ingress, from being very slow to update to not supporting basic features like path rewrites, that we ended up using the NGINX ingress on GKE. While this doesn't support some features like Global Forwarding Rules (I think that's the name?), it doesn't actually cause 20-minute downtimes when the GCE ingress is updating :D
Hello @aeneasr, I'm glad my insights about this issue were useful!
I agree that GCE ingress isn't where it should be, it's an obsolete piece of software and its development is going forward at a very slow pace. On the other hand, as you also already mentioned, it is the default ingress on GCP and it supports some Google-specific features that NGINX and other ingresses do not.
I am looking forward to the SQLite solution and I think it's a step in the right direction for this specific GKE-related issue.
> Is there any other way we can work around this for GKE?
There is this issue kubernetes/ingress-gce#647 that describes a problem similar to this one. Maybe quickly going through the ticket might give some ideas on how to deal with it.
A quick workaround to solve this issue could be what was described in this ticket kubernetes/ingress-gce#674, which is to return a 200 HTTP status on the root path / of the application when the User-Agent has the prefix GoogleHC. It's not too ugly nor complex and it's highly unlikely (if not impossible) that this User-Agent will be used by your customers for other purposes.
As an additional note, kubernetes/ingress-gce#42 might also be relevant here.
Yeah, I considered the status change on / too, but it felt wrong. I'm thinking 2 hydra instances might be the simplest approach.
Thanks for the insights on GCE load balancers! I didn't really know there was a thing (but recall other issues 🤷‍♀️).
Thanks
PS: Hydra is awesome!
A working solution is now available.
/ping @NoelJames @aeneasr
GKE supported versions
NOTE: This solution works only from GKE version 1.17.6-gke.11 according to https://cloud.google.com/kubernetes-engine/docs/how-to/ingress-features - BackendConfig - Custom load balancer health check configuration.
VPC-native and Network Endpoint Group (NEG)
If Network Endpoint Group (NEG), aka VPC-native, is enabled in your cluster, you first need to execute the following command to create a new firewall rule:
gcloud compute firewall-rules create hydra-http-public --source-ranges=130.211.0.0/22,35.191.0.0/16 --network=default --allow=tcp:4444
According to Google, this is a short-term workaround:
The short-term workaround (until automated NEG-based health checks are supported) is to manually deploy a firewall rule to allow Google Cloud health check probes to access NEG IP:port endpoints directly.
Solution
The following resource has to be manually created before the deployment of the Service / Ingress:
apiVersion: cloud.google.com/v1
kind: BackendConfig
metadata:
  name: http-public
  namespace: hydra
spec:
  healthCheck:
    checkIntervalSec: 5
    timeoutSec: 3
    healthyThreshold: 1
    unhealthyThreshold: 3
    type: HTTP
    requestPath: /health/ready
    port: 4444
The following annotations have to be added to the Service of the public endpoint:
cloud.google.com/neg: '{"ingress": true}'
cloud.google.com/backend-config: '{"default": "http-public"}'
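For context, this is roughly where the annotations sit on the Service. The Service name, selector label and port name below are assumptions for illustration. The Service lives in the same namespace as the BackendConfig it references:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: hydra-public              # hypothetical Service name
  namespace: hydra                # same namespace as the BackendConfig
  annotations:
    cloud.google.com/neg: '{"ingress": true}'
    # "default" applies the BackendConfig named http-public to all ports of this Service.
    cloud.google.com/backend-config: '{"default": "http-public"}'
spec:
  selector:
    app.kubernetes.io/name: hydra # assumed label
  ports:
    - name: http-public           # assumed port name
      port: 4444
      targetPort: 4444
```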
Awesome, thank you for the update! Does that mean that we need to change something in the chart?
Awesome, thank you for the great write-up! This will certainly help with implementation. I also won't be able to work on this in the near future, so if anyone wants to pick this up, please do :)
My only suggestion would be to make it obvious that Values.backendConfig is GKE-only, maybe by naming it Values.gkeBackendConfig or something along those lines.
Thank you @christian-roggia for the detailed answer and follow-ups! I can tell it saved me a bunch of time :)
I am closing this issue as it has not received any engagement from the community or maintainers in a long time. That does not imply that the issue has no merit. If you feel strongly about this issue:
- open a PR referencing and resolving the issue;
- leave a comment on it and discuss ideas how you could contribute towards resolving it;
- open a new issue with updated details and a plan on resolving the issue.
We are cleaning up issues every now and then, primarily to keep the 4000+ issues in our backlog in check and to prevent maintainer burnout. Burnout in open source maintainership is a widespread and serious issue. It can lead to severe personal and health issues as well as enabling catastrophic attack vectors.
Thank you to anyone who participated in the issue! 🙏✌️