
caraml-dev / turing


Fast, scalable and extensible system to deploy and evaluate ML experiments in production

License: Apache License 2.0

Dockerfile 0.32% Makefile 0.50% Go 48.78% Shell 0.47% Smarty 0.47% HTML 0.06% JavaScript 12.72% SCSS 0.20% Python 36.40% Jinja 0.03% Mustache 0.05% Ruby 0.01%

turing's People

Contributors

ariefrahmansyah, ashwinath, davidheryanto, deadlycoconuts, krithika369, leonlnj, mbruner, pradithya, romanwozniak, shydefoo, terryyylim, tiopramayudi, zenovore


turing's Issues

Propagate traffic_rule label to custom metrics when the router timeout is reached

A traffic rule label (traffic_rule) has been introduced in #280, which captures the selected rule on the mlp_turing_route_request_duration_ms and mlp_turing_turing_comp_request_duration_ms metrics, when the router has traffic rules enabled.

However, in a scenario where the overall timeout is reached on the router, the mlp_turing_turing_comp_request_duration_ms failure metric will not always capture the traffic_rule label. This is because, during context cancellation, the parent Fiber components may return first without waiting for the child component(s) to return and thus, the label set in the nested components may not be propagated upstream.

helm install turing turing/turing Error: INSTALLATION FAILED

I got this error during installation while following the guide at https://github.com/gojek/turing/blob/main/infra/charts/turing/README.md.

Steps

  1. helm install incubator/sparkoperator --generate-name

WARNING: This chart is deprecated
W1128 22:23:02.075723 56032 warnings.go:70] apiextensions.k8s.io/v1beta1 CustomResourceDefinition is deprecated in v1.16+, unavailable in v1.22+; use apiextensions.k8s.io/v1 CustomResourceDefinition
W1128 22:23:02.080260 56032 warnings.go:70] apiextensions.k8s.io/v1beta1 CustomResourceDefinition is deprecated in v1.16+, unavailable in v1.22+; use apiextensions.k8s.io/v1 CustomResourceDefinition
W1128 22:23:02.314421 56032 warnings.go:70] apiextensions.k8s.io/v1beta1 CustomResourceDefinition is deprecated in v1.16+, unavailable in v1.22+; use apiextensions.k8s.io/v1 CustomResourceDefinition
W1128 22:23:02.324651 56032 warnings.go:70] apiextensions.k8s.io/v1beta1 CustomResourceDefinition is deprecated in v1.16+, unavailable in v1.22+; use apiextensions.k8s.io/v1 CustomResourceDefinition
NAME: sparkoperator-1638112978
LAST DEPLOYED: Sun Nov 28 22:23:01 2021
NAMESPACE: default
STATUS: deployed
REVISION: 1
TEST SUITE: None

  2. helm install turing-init turing/turing-init --namespace infrastructure --create-namespace

NAME: turing-init
LAST DEPLOYED: Sun Nov 28 22:23:53 2021
NAMESPACE: infrastructure
STATUS: deployed
REVISION: 1
TEST SUITE: None

  3. helm install turing turing/turing

Error: INSTALLATION FAILED: unable to build kubernetes objects from release manifest: [unable to recognize "": no matches for kind "Gateway" in version "networking.istio.io/v1beta1", unable to recognize "": no matches for kind "VirtualService" in version "networking.istio.io/v1beta1"]

My local setup

Error in handling `Accept-Encoding`

We found an issue with how Turing handles Accept-Encoding. Turing adds Accept-Encoding: gzip to outbound requests but does not decode the gzipped response. As a result, clients calling Turing receive a gzipped response even when they did not request one.

Reproduction:

  1. Deploy a service that supports gzip response encoding.
  2. Configure Turing to route to that endpoint.
  3. Make a request through Turing without setting Accept-Encoding.

cc @peterjrichens
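
A minimal sketch of the behaviour a caller would expect here, written in JavaScript (Node.js) purely for illustration. It is not how Turing's router is implemented, and the `relayResponse` helper, the `clientAcceptsGzip` flag and the shape of the `upstream` object are all hypothetical. The idea: if the proxy itself asked the upstream for gzip, it decodes the body before relaying it to a client that did not ask for gzip.

```javascript
const zlib = require('zlib');

// Hypothetical helper: decide what to hand back to the client.
// `upstream` is { headers: {lower-cased names}, body: Buffer }.
function relayResponse(upstream, clientAcceptsGzip) {
  const encoding = (upstream.headers['content-encoding'] || '').toLowerCase();
  if (encoding !== 'gzip' || clientAcceptsGzip) {
    // Nothing to undo: pass the response through unchanged.
    return upstream;
  }
  // The client never asked for gzip, so decode before relaying.
  const body = zlib.gunzipSync(upstream.body);
  const headers = { ...upstream.headers, 'content-length': String(body.length) };
  delete headers['content-encoding'];
  return { headers, body };
}

// Simulate an upstream that honoured the proxy's own Accept-Encoding: gzip.
const payload = Buffer.from('{"ok":true}');
const upstream = {
  headers: { 'content-type': 'application/json', 'content-encoding': 'gzip' },
  body: zlib.gzipSync(payload),
};

const plain = relayResponse(upstream, false);
console.log(plain.body.toString());
```

When the client did send Accept-Encoding: gzip, the sketch passes the compressed body and its Content-Encoding header through untouched.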

helm install turing-init turing/turing-init failed - secret "turing-init-spark-operator-webhook-certs" not found

Hi, I'm following the guide at https://github.com/gojek/turing/blob/main/infra/charts/turing-init/README.md, but the turing-init installation still fails.

Steps I executed:
$ helm install turing-init turing/turing-init

$ kubectl get pod

NAME                                               READY   STATUS             RESTARTS       AGE
turing-init-spark-operator-webhook-init--1-vs9zw   0/1     Completed          0              9m9s
turing-init-init--1-xcdnl                          0/1     Error              0              9m9s
turing-init-init--1-4bk9l                          1/1     Running            0              3m36s
turing-init-spark-operator-5b89d6567d-6vtwx        0/1     CrashLoopBackOff   6 (3m2s ago)   9m9s

$ kubectl describe pod turing-init-spark-operator-5b89d6567d-6vtwx

Type     Reason       Age                From               Message
  Normal   Scheduled    58s                default-scheduler  Successfully assigned default/turing-init-spark-operator-5b89d6567d-6vtwx to k3d-mycluster-server-0
  Warning  FailedMount  52s (x5 over 59s)  kubelet            MountVolume.SetUp failed for volume "webhook-certs" : secret "turing-init-spark-operator-webhook-certs" not found
  Normal   Pulled       1s (x4 over 43s)   kubelet            Container image "gcr.io/spark-operator/spark-operator:v1beta2-1.2.3-3.1.1" already present on machine
  Normal   Created      1s (x4 over 43s)   kubelet            Created container spark-operator
  Normal   Started      1s (x4 over 43s)   kubelet            Started container spark-operator
  Warning  BackOff      1s (x5 over 41s)   kubelet            Back-off restarting failed container

$ kubectl logs turing-init-spark-operator-5b89d6567d-6vtwx

I0218 08:29:59.335140      11 main.go:144] Starting the Spark Operator
I0218 08:29:59.335336      11 main.go:177] Enabling metrics collecting and exporting to Prometheus
I0218 08:29:59.335425      11 metrics.go:142] Started Metrics server at localhost:10254/metrics
I0218 08:29:59.336345      11 webhook.go:218] Starting the Spark admission webhook server
I0218 08:29:59.350034      11 webhook.go:412] Creating a MutatingWebhookConfiguration for the Spark pod admission webhook
F0218 08:29:59.355648      11 main.go:208] the server could not find the requested resource

Based on the output above, the crash seems to be caused by: MountVolume.SetUp failed for volume "webhook-certs" : secret "turing-init-spark-operator-webhook-certs" not found.
However, the logs of the webhook init pod (kubectl logs turing-init-spark-operator-webhook-init--1-vs9zw) show that turing-init-spark-operator-webhook-certs was created successfully.

How can I fix this?

Inefficient regular expression

https://github.com/gojek/turing/blob/9d6415bbe9b8e3efe29c3f0199411be3cc8f6396/ui/src/utils/validation.js#L2-L2

const urlRegex =
  /^(?:(?:(?:https?|ftp):)?\/\/)(?:\S+(?::\S*)?@)?(?:(?!(?:10|127)(?:\.\d{1,3}){3})(?!(?:169\.254|192\.168)(?:\.\d{1,3}){2})(?!172\.(?:1[6-9]|2\d|3[0-1])(?:\.\d{1,3}){2})(?:[1-9]\d?|1\d\d|2[01]\d|22[0-3])(?:\.(?:1?\d{1,2}|2[0-4]\d|25[0-5])){2}(?:\.(?:[1-9]\d?|1\d\d|2[0-4]\d|25[0-4]))|(?:(?:[a-z\u00a1-\uffff0-9]-*)*[a-z\u00a1-\uffff0-9]+)(?:\.(?:[a-z\u00a1-\uffff0-9]-*)*[a-z\u00a1-\uffff0-9]+)*(?:\.(?:[a-z\u00a1-\uffff]{2,})).?)(?::\d{2,5})?(?:[/?#]\S*)?$/;

This part of the regular expression may cause exponential backtracking on strings starting with '//0.' and containing many repetitions of '00.'.

export const isValidUrl = (value) => {
  return value.match(urlRegex) !== null;
};

Some regular expressions take a long time to match certain input strings, to the point where the time it takes to match a string of length n is proportional to n^k or even 2^n. Such regular expressions can negatively affect performance, or even allow a malicious user to perform a Denial of Service ("DoS") attack by crafting an expensive input string for the regular expression to match.

The regular expression engines provided by many popular JavaScript platforms use backtracking non-deterministic finite automata to implement regular expression matching. While this approach is space-efficient and allows supporting advanced features like capture groups, it is not time-efficient in general. The worst-case time complexity of such an automaton can be polynomial or even exponential, meaning that for strings of a certain shape, increasing the input length by ten characters may make the automaton about 1000 times slower.

Typically, a regular expression is affected by this problem if it contains a repetition of the form r* or r+ where the sub-expression r is ambiguous in the sense that it can match some string in multiple ways. More information about the precise circumstances can be found in the references.

Recommendation

Modify the regular expression to remove the ambiguity, or ensure that the strings matched with the regular expression are short enough that the time-complexity does not matter.
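
As a rough sketch of the second option (bounding the input), the check below rejects over-long strings before any parsing is attempted. It also swaps the regular expression for the built-in WHATWG `URL` parser, which is a different technique from the one in `validation.js`. The `MAX_URL_LENGTH` limit and the `isValidUrlBounded` name are assumptions made for illustration.

```javascript
// Hypothetical upper bound; pick a limit that suits the form field.
const MAX_URL_LENGTH = 2048;

function isValidUrlBounded(value) {
  // Reject non-strings and over-long input before any parsing happens.
  if (typeof value !== 'string' || value.length > MAX_URL_LENGTH) {
    return false;
  }
  try {
    // The built-in WHATWG URL parser throws a TypeError on invalid input.
    const url = new URL(value);
    return ['http:', 'https:', 'ftp:'].includes(url.protocol);
  } catch (err) {
    return false;
  }
}

console.log(isValidUrlBounded('https://example.com/path?q=1')); // true
console.log(isValidUrlBounded('not a url'));                    // false
console.log(isValidUrlBounded('https://example.com/' + 'a'.repeat(5000))); // false
```

Unlike the original pattern, this sketch does not accept protocol-relative URLs such as `//example.com` and does not filter out private IP ranges, so it is not a drop-in replacement.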

Example

Consider this regular expression:

  /^_(__|.)+_$/

Its sub-expression "(__|.)+" can match the string "__" either by the first alternative "__" to the left of the "|" operator, or by two repetitions of the second alternative "." to the right. Thus, a string consisting of an odd number of underscores followed by some other character will cause the regular expression engine to run for an exponential amount of time before rejecting the input.

This problem can be avoided by rewriting the regular expression to remove the ambiguity between the two branches of the alternative inside the repetition:

  /^_(__|[^_])+_$/
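
The difference between the two patterns can be observed with a small Node.js script. This is only a sketch: the `timeMatch` helper and the input length are arbitrary choices, and the measured times depend on the engine and machine, so no particular numbers are claimed.

```javascript
const ambiguous = /^_(__|.)+_$/;
const unambiguous = /^_(__|[^_])+_$/;

// Time a single match attempt in milliseconds.
function timeMatch(regex, input) {
  const start = process.hrtime.bigint();
  const matched = regex.test(input);
  const elapsedMs = Number(process.hrtime.bigint() - start) / 1e6;
  return { matched, elapsedMs };
}

// Underscores followed by a character that makes the overall match fail.
// Kept short on purpose: in a backtracking engine, every extra underscore
// increases the work done by the ambiguous pattern considerably.
const hostile = '_'.repeat(25) + '!';

const slow = timeMatch(ambiguous, hostile);
const fast = timeMatch(unambiguous, hostile);
console.log('ambiguous:  ', slow);
console.log('unambiguous:', fast);
```

Both patterns reject the hostile input and both accept a string such as `_ab_`. They are not strictly equivalent, though: `___` matches the first pattern but not the second, so a rewrite like this should be checked against the inputs the application needs to accept.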

References

OWASP: Regular expression Denial of Service - ReDoS.
Wikipedia: ReDoS.
Wikipedia: Time complexity.
James Kirrage, Asiri Rathnayake, Hayo Thielecke: Static Analysis for Regular Expression Denial-of-Service Attack.
Common Weakness Enumeration: CWE-1333.
Common Weakness Enumeration: CWE-730.
Common Weakness Enumeration: CWE-400.

Add more robust checks in Turing API config validation

Currently, validation of the Config object in the Turing API mostly checks only for the presence of a value, using validate:required:
https://github.com/gojek/turing/blob/c0e775eb2b08c1ab4077a4ac252d186ff3ecbdd1/api/turing/config/config.go#L36-L40

However, there are other validations we need to perform as well, for instance making sure that URL values follow valid URL syntax, or that the log level is one of the accepted values (info, debug, warn, etc.).

More thorough validation ensures that any invalid config value is reported to the user as early as possible, with a clear error message.
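
The kinds of checks described in this issue can be sketched as follows. This is an illustration in JavaScript, not the Go struct-tag validation used in `config.go`, and the field names (`logLevel`, `mlpUrl`), the list of accepted log levels and the `validateConfig` helper are all hypothetical.

```javascript
// Hypothetical set of accepted levels; the real list may differ.
const LOG_LEVELS = ['debug', 'info', 'warn', 'error'];

function isHttpUrl(value) {
  try {
    const url = new URL(value);
    return url.protocol === 'http:' || url.protocol === 'https:';
  } catch (err) {
    return false;
  }
}

// Collect every problem instead of stopping at the first one, so the
// user sees all config errors in a single pass.
function validateConfig(config) {
  const errors = [];
  if (!config.mlpUrl) {
    errors.push('mlpUrl is required');
  } else if (!isHttpUrl(config.mlpUrl)) {
    errors.push(`mlpUrl is not a valid http(s) URL: ${config.mlpUrl}`);
  }
  if (!LOG_LEVELS.includes(config.logLevel)) {
    errors.push(`logLevel must be one of ${LOG_LEVELS.join(', ')}`);
  }
  return errors;
}

console.log(validateConfig({ mlpUrl: 'localhost:8080', logLevel: 'verbose' }));
console.log(validateConfig({ mlpUrl: 'http://mlp.example:8080', logLevel: 'info' }));
```

The first call reports two errors (a URL without an http(s) scheme and an unknown log level); the second returns an empty list.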

Improper Neutralization of Special Elements used in a Command in Shell-quote

Upgrade shell-quote to fix 1 alert in ui/yarn.lock

shell-quote@^1.7.3:
  version "1.7.3"

The shell-quote package before 1.7.3 for Node.js allows command injection. An attacker can inject unescaped shell metacharacters through a regex designed to support Windows drive letters. If the output of this package is passed to a real shell as a quoted argument to a command with exec(), an attacker can inject arbitrary commands. This is because the Windows drive letter regex character class is [A-z] instead of the correct [A-Za-z]. Several shell metacharacters exist in the space between capital letter Z and lower case letter a, such as the backtick character.
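
The character-class mistake is easy to reproduce in isolation. The snippet below is a standalone illustration and does not use shell-quote itself; the variable names are made up for the example.

```javascript
const tooWide = /^[A-z]$/;        // spans code points 0x41 ('A') to 0x7A ('z')
const lettersOnly = /^[A-Za-z]$/; // two separate ranges, letters only

// Code points 0x5B to 0x60 sit between 'Z' and 'a'.
const inBetween = [];
for (let code = 0x5b; code <= 0x60; code++) {
  inBetween.push(String.fromCharCode(code));
}

console.log(inBetween.join(' '));                        // [ \ ] ^ _ `
console.log(inBetween.every((c) => tooWide.test(c)));    // true
console.log(inBetween.some((c) => lettersOnly.test(c))); // false
```

The wider class accepts all six of these characters, including the backtick, while the corrected class accepts none of them.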
