
prebid-cache's People

Contributors

austinb, bretg, dbemiller, dbrain, dependabot[bot], ducchau, dukevenkat, erikdubbelboer, grossjo, guscarreon, hhhjort, jbartek25, mansinahar, mkendall07, onkarvhanumante, pubmatic-openwrap, sebmil-daily, sonali-more-xandr, syntaxnode, vamsiikrishna, weej89


prebid-cache's Issues

Prometheus namespace and subsystem should be allowed to be empty, like in Prebid Server

There is currently a check in prebid-cache that validates that the Prometheus namespace and subsystem are non-empty: https://github.com/prebid/prebid-cache/blob/0.21.0/config/config.go#L301-L306

However, prebid-server allows empty namespaces (https://github.com/prebid/prebid-server/blob/master/docs/developers/metrics-configuration.md), defaulting to empty without any validation: https://github.com/prebid/prebid-server/blob/v0.225.0/config/config.go#L808-L809

We already prefix our Datadog metrics ingested from Prometheus with a pb_cache prefix, and we would like metrics in the format pb_cache.connection_closed rather than pb_cache.{namespace}_{subsystem}_connection_closed, to maintain parity with our existing pb_server.connections_closed metrics, which do not require a namespace.
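
For illustration only, relaxing the check could look something like the sketch below; the type and function names are placeholders rather than the actual ones in config/config.go, and the idea is simply to warn instead of failing, mirroring prebid-server's behaviour:

package config

import log "github.com/sirupsen/logrus"

// Hypothetical config struct; the real definition in config/config.go may differ.
type PrometheusMetrics struct {
    Namespace string
    Subsystem string
}

// Instead of calling log.Fatalf when either value is empty, only log, so an
// empty namespace/subsystem is allowed just as it is in prebid-server.
func (m *PrometheusMetrics) validateAndLog() {
    if m.Namespace == "" || m.Subsystem == "" {
        log.Info("config.metrics.prometheus: namespace and/or subsystem left empty; metric names will not be prefixed")
    }
}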

Potential race condition

It is possible that two requests will result in the same UUID key. Given how UUIDs are generated, this won't happen in practice. But we now also have a PR in place to allow a key to be generated externally, which will only increase the possibility of a key collision. The key-set code does check whether a key already exists, so the window to overwrite is small: two parallel requests must both run the key-exists check before either of them executes the key write.

It behoves us to consider the potential consequences of the race condition actually hitting. The worst case of writing two pieces of data to the same location concurrently is corrupted data stored in that location. However, any backend worth using in production should prevent this, leaving one of the values stored intact and the other value(s) discarded. So corrupt data should not be a concern.

So the real concern with this issue is having the wrong payload stored under a key, from the perspective of a particular user of the cache. I see two vectors where this can happen with any frequency.

  1. A bad actor is sniffing traffic to the cache server and pushing malicious content into the cache under the same key as a legitimate app.

  2. A poorly coded app is using the set-key feature but is not choosing keys unique enough to avoid key collisions in practice.

The first vector requires sniffing traffic and generating cache requests in real time so that they hit the cache server at the same time as the initial request. (If the bad actor can conduct a man-in-the-middle attack in order to sync up the initial request and the duplicate-key request, they already have a better attack vector than this issue provides.) It is a daunting feat, but perhaps not impossible. If such an attack were to take place then, due to latency factors and others outside the attacker's control, the "replacement" payload would only have a 50% chance of landing before the intended payload. Also, since network latency noise is much greater than the race-condition window, in almost every case the app will get a failed-to-cache message rather than a successful cache response, so the attack will be very noticeable and will still fail in nearly all attempts, unless the app is poorly coded to ignore the cache return value and just assume the store succeeded.

As for the second vector: given that network latencies have an order of magnitude more noise than the race window, nearly all collisions will hit a failed-to-cache condition rather than a successful return plus an overwrite. So again, only poorly coded apps that do not check the cache return values will suffer from this race condition.

Fixing this race condition would have to be done in the backend modules, by making "check for key + write value" an atomic operation. Not all backends may support such functionality, and those would not be fixable. The result of fixing it is that we would always return a failure message and never overwrite (rather than only doing so 99.9% of the time). So even if we fix the issue, the poorly written apps that might suffer from the race condition will continue to suffer.
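
For what it's worth, this is roughly what the atomic form looks like in a backend that supports it, using Redis SETNX via the go-redis client. This is an illustrative sketch, not the project's actual backend code, and the client version is an assumption:

package main

import (
    "context"
    "fmt"
    "time"

    "github.com/go-redis/redis/v8"
)

func main() {
    ctx := context.Background()
    client := redis.NewClient(&redis.Options{Addr: "localhost:6379"})

    // SetNX writes the value only if the key does not already exist, so the
    // "check for key + write value" pair becomes a single atomic operation.
    stored, err := client.SetNX(ctx, "prebid_abc123", `{"type":"xml"}`, 10*time.Minute).Result()
    if err != nil {
        fmt.Println("backend error:", err)
        return
    }
    if !stored {
        fmt.Println("key already exists; refusing to overwrite")
    }
}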

Given all this, it does not seem worth fixing. It is worth recording so people remain aware of it, and as a warning to app writers to not write poorly coded apps. :)

non-empty UUID with backend PUT error

On line 305 in endpoints/put.go, you don't set the uuid to be empty even though there was an error performing the PUT, making the uuid itself useless since it can't actually be used to fetch from the backend. The solution is to do resp.UUID = "" in both branches of the if/else. I would create a pull request for this, but I don't have permission to push a branch to this repository.
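
Roughly, the change I have in mind looks like the fragment below (a sketch only; the variable and helper names are illustrative, not copied from endpoints/put.go):

// Hypothetical fragment of the put handler's error path.
if err := backend.Put(ctx, resp.UUID, toCache, ttlSeconds); err != nil {
    if isClientError(err) {
        w.WriteHeader(http.StatusBadRequest)
    } else {
        w.WriteHeader(http.StatusInternalServerError)
    }
    // Clear the UUID in both branches so the caller never receives a key
    // that was not actually stored.
    resp.UUID = ""
    return
}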

Misleading metrics when custom key used

From README:

...a cache key can be specified for the cached object. If an entry already exists for "ArbitraryKeyValueHere", it will not be overwritten, and "" will be returned for the uuid value of that entry. This is to prevent bad actors from trying to overwrite legitimate caches...

To achieve the above, the code first calls the Get method to check whether the key already exists in the cache. The downside of this approach is that the check produces a large number of Get errors (the PBC prebid_cache_gets_backend{status="error"} metric) and Get misses (a backend storage metric, in my case Memcached). There is also a question of efficiency: Get + Put is certainly more expensive than a single operation that only writes the item if the key does not already exist in the cache.

I'm currently testing with Memcached, removing the check for key existence and replacing the Memcached Set method used by Put with the Add method, which only writes the item if the key hasn't been set already. With no other code change, an attempt to write a value for an existing key ends up with a 500 response and the message memcache: item not stored. It works fine for my use case, but it's not acceptable as a general fix, as it changes the API (no error + empty uuid for an existing entry).
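
For reference, this is the shape of the approach with the bradfitz/gomemcache client: Add stores the item only when the key is absent and returns memcache.ErrNotStored otherwise, which the put endpoint could translate back into the current "no error + empty uuid" behaviour instead of a 500. Illustrative sketch only, not a proposed patch:

package main

import (
    "errors"
    "fmt"

    "github.com/bradfitz/gomemcache/memcache"
)

func main() {
    mc := memcache.New("127.0.0.1:11211")
    item := &memcache.Item{Key: "ArbitraryKeyValueHere", Value: []byte(`{"type":"json"}`), Expiration: 300}

    // Add only writes when the key does not exist yet, avoiding the extra Get.
    err := mc.Add(item)
    switch {
    case errors.Is(err, memcache.ErrNotStored):
        fmt.Println("key already exists; return an empty uuid instead of a 500")
    case err != nil:
        fmt.Println("backend error:", err)
    default:
        fmt.Println("stored")
    }
}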

Restrict access to POST method

I understand that the GET method for prebid-cache should be publicly accessible, but there's no reason that the POST method should be public. Some mechanism to secure the POST method needs to be added to prevent abuse.

Currently, anyone can write to the Appnexus prebid-cache instance, since it doesn't require any kind of access key to POST:

curl -X "POST" "https://prebid.adnxs.com/pbc/v1/cache" \
     -H 'Content-Type: application/json' \
     -d $'{
  "puts": [
    {
      "type": "xml",
      "value": "<tag>Your XML content goes here!</tag>"
    },
    {
      "type": "json",
      "value": [
        1,
        true,
        "JSON value of any type can go here."
      ]
    }
  ]
}'

Expected result: a 400 error (Invalid request), since it should require an API key of some kind to be in the request.

Actual result: a 200 response, including the uuids issued for each element stored to the cache.
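
As one possible sketch of such a mechanism (the header name and the pre-shared key are hypothetical, not an existing prebid-cache feature), a host could wrap the handler and reject unauthenticated writes while leaving GET traffic untouched:

package main

import "net/http"

// requireAPIKey is a hypothetical wrapper that rejects POSTs lacking a
// pre-shared key and passes everything else through.
func requireAPIKey(next http.Handler, key string) http.Handler {
    return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
        if r.Method == http.MethodPost && r.Header.Get("X-Api-Key") != key {
            http.Error(w, "Invalid request", http.StatusBadRequest)
            return
        }
        next.ServeHTTP(w, r)
    })
}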

400 bad request, request header or cookie too large

Type of issue

bug

Description

I have observed the Prebid Cache endpoint (https://prebid.adnxs.com/pbc/v1/cache) returning error status 400 and a message along the lines of "bad request, request header or cookie too large". This was in turn causing the pbjs exception: "Failed to save to the video cache", and no video bids were being considered.

Please note this is distinct from the "Post cache element n exceeded max size" error. It seems to be coming from the HTTP server itself, i.e. NGINX.

I checked the HTTP requests and observed at least 2 very large (multi-kilobyte) cookies set in the header. The error subsided after clearing adnxs.com cookies. It occurred to me that there could be circumstances in which large cookies are set on this domain, which then causes errors with the cache endpoint. However, I did not save the cookies prior to troubleshooting, so I cannot verify precisely which cookies were causing the issue.

I thought it might be worth doing a sanity check to ensure that no API endpoints on prebid.adnxs.com are setting abnormally large cookies, which could then be causing failed requests on the cache endpoint. I don't know exactly what other codebases use this domain, so I'm not sure how to check this myself.

Also it might be worth checking the NGINX logs to discover how many 400 errors are being emitted with this particular reason.

Steps to reproduce

While I cannot reproduce the origin of the large cookies, the problem was replicated
while loading a page containing Prebid with configured video ad units, such as https://www.si.com/nfl/2020/01/13/nfl-playoffs-divisional-round-49ers-titans-chiefs-packers-head-coaching-carousel-mmqb?pbjs_debug=true

Expected results

Prebid cache requests should return 200 status codes.

Actual results

Prebid cache returned 400 status codes due to abnormally large cookies on the http request.

Platform details

Prebid 2.44.1
Windows

Failed validation / build on clean clone of master

I'm building a prebid-cache for testing in conjunction with the bidder I'm working on for prebid-server, and I'm getting a validation error and a failed build on the current master. Here's what I'm getting:

zroop:prebid jim$ git clone https://github.com/prebid/prebid-cache.git
Cloning into 'prebid-cache'...
remote: Counting objects: 237, done.
remote: Compressing objects: 100% (4/4), done.
remote: Total 237 (delta 0), reused 0 (delta 0), pack-reused 233
Receiving objects: 100% (237/237), 60.30 KiB | 76.00 KiB/s, done.
Resolving deltas: 100% (123/123), done.
zroop:prebid jim$ cd prebid-cache/
zroop:prebid-cache jim$ ./validate.sh 
# github.com/prebid/prebid-cache/backends
backends/azure_table_test.go:11:18: multiple-value uuid.NewV4() in single-value context
backends/azure_table_test.go:46:18: multiple-value uuid.NewV4() in single-value context
# github.com/prebid/prebid-cache/endpoints
endpoints/put.go:89:40: multiple-value uuid.NewV4() in single-value context
# github.com/prebid/prebid-cache/endpoints
endpoints/put.go:89:40: multiple-value uuid.NewV4() in single-value context

zroop:prebid-cache jim$ go build .
# github.com/prebid/prebid-cache/endpoints
endpoints/put.go:89:40: multiple-value uuid.NewV4() in single-value context
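
For context, the error comes from the satori/go.uuid dependency: at some point its NewV4() signature changed to return (UUID, error), so code written against the single-value form stops compiling when a newer copy of the package is picked up. Pinning the dependency or updating the call sites fixes it; a hedged sketch of the latter (not the actual patch):

package endpoints

import uuid "github.com/satori/go.uuid"

// generateKey handles the two-value form of NewV4 used by newer copies of
// the satori/go.uuid package; names here are illustrative.
func generateKey() (string, error) {
    u, err := uuid.NewV4()
    if err != nil {
        return "", err
    }
    return u.String(), nil
}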

Default settings result in unacceptable combination of CORS response headers

By default, the CORS object in prebid-cache is initialised as follows with AllowCredentials set to true:

coresCfg := cors.New(cors.Options{AllowCredentials: true})

Due to this, when the Origin request header is passed, for example http://foo/bar, prebid-cache endpoints return the following response headers:

Access-Control-Allow-Origin: *
Access-Control-Allow-Credentials: true

This results in a CORS error: Credential is not supported if the CORS header ‘Access-Control-Allow-Origin’ is ‘*’

Ideally, the response header should contain the Origin value from the request header, in this case: http://foo/bar. The CORS module should generate the correct combination of headers by default.
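
Assuming the snippet above is the rs/cors package, one way to get that behaviour (a sketch only; option names should be checked against the vendored version) is to supply an origin check instead of relying on the wildcard default, which makes the middleware echo the request's Origin back:

package routing

import "github.com/rs/cors"

// newCors builds the CORS handler so that the request's Origin is reflected
// in Access-Control-Allow-Origin instead of "*", avoiding the invalid
// "*" + credentials combination.
func newCors() *cors.Cors {
    return cors.New(cors.Options{
        AllowCredentials: true,
        AllowOriginFunc:  func(origin string) bool { return origin != "" },
    })
}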

Unit Test `TestNewAerospikeBackend` fails

For a couple of weeks now, one of the unit tests has not been passing anymore:

$ go test ./... 
?       github.com/prebid/prebid-cache  [no test files]
?       github.com/prebid/prebid-cache/compression      [no test files]
?       github.com/prebid/prebid-cache/endpoints/routing        [no test files]
?       github.com/prebid/prebid-cache/metrics  [no test files]
?       github.com/prebid/prebid-cache/metrics/metricstest      [no test files]
?       github.com/prebid/prebid-cache/version  [no test files]
time="2024-02-08T14:10:11+01:00" level=fatal msg="Error creating Aerospike backend: ResultCode: INVALID_NODE_ERROR, Iteration: 0, InDoubt: false, Node: <nil>: Failed to connect to hosts: [34.206.39.153:8888 45.223.56.75:8888]\nResultCode: NETWORK_ERROR, Iteration: 0, InDoubt: false, Node: <nil>: network error. Checked the wrapped error for detail\nResultCode: COMMON_ERROR, Iteration: 0, InDoubt: false, Node: <nil>: common, none-aerospike error. Checked the wrapped error for detail\nRequested new buffer size is invalid. Requested: 92703365869105, allowed: 0..1048576"
time="2024-02-08T14:10:11+01:00" level=info msg="config.backend.aerospike.host is being deprecated in favor of config.backend.aerospike.hosts"
time="2024-02-08T14:10:11+01:00" level=fatal msg="Error creating Aerospike backend: ResultCode: INVALID_NODE_ERROR, Iteration: 0, InDoubt: false, Node: <nil>: Failed to connect to hosts: [34.206.39.153:8888 45.223.56.75:8888]\nResultCode: NETWORK_ERROR, Iteration: 0, InDoubt: false, Node: <nil>: network error. Checked the wrapped error for detail\nResultCode: COMMON_ERROR, Iteration: 0, InDoubt: false, Node: <nil>: common, none-aerospike error. Checked the wrapped error for detail\nRequested new buffer size is invalid. Requested: 92703365869105, allowed: 0..1048576"
time="2024-02-08T14:10:11+01:00" level=info msg="config.backend.aerospike.host is being deprecated in favor of config.backend.aerospike.hosts"
time="2024-02-08T14:10:41+01:00" level=fatal msg="Error creating Aerospike backend: ResultCode: TIMEOUT, Iteration: 0, InDoubt: false, Node: <nil>: command execution timed out on client: See `Policy.Timeout`"
--- FAIL: TestNewAerospikeBackend (30.48s)
    aerospike_test.go:97: 
                Error Trace:    aerospike_test.go:97
                Error:          Not equal: 
                                expected: "Error creating Aerospike backend: ResultCode: TIMEOUT, Iteration: 0, InDoubt: false, Node: <nil>: command execution timed out on client: See `Policy.Timeout`"
                                actual  : "Error creating Aerospike backend: ResultCode: INVALID_NODE_ERROR, Iteration: 0, InDoubt: false, Node: <nil>: Failed to connect to hosts: [34.206.39.153:8888 45.223.56.75:8888]\nResultCode: NETWORK_ERROR, Iteration: 0, InDoubt: false, Node: <nil>: network error. Checked the wrapped error for detail\nResultCode: COMMON_ERROR, Iteration: 0, InDoubt: false, Node: <nil>: common, none-aerospike error. Checked the wrapped error for detail\nRequested new buffer size is invalid. Requested: 92703365869105, allowed: 0..1048576"
                            
                                Diff:
                                --- Expected
                                +++ Actual
                                @@ -1 +1,4 @@
                                -Error creating Aerospike backend: ResultCode: TIMEOUT, Iteration: 0, InDoubt: false, Node: <nil>: command execution timed out on client: See `Policy.Timeout`
                                +Error creating Aerospike backend: ResultCode: INVALID_NODE_ERROR, Iteration: 0, InDoubt: false, Node: <nil>: Failed to connect to hosts: [34.206.39.153:8888 45.223.56.75:8888]
                                +ResultCode: NETWORK_ERROR, Iteration: 0, InDoubt: false, Node: <nil>: network error. Checked the wrapped error for detail
                                +ResultCode: COMMON_ERROR, Iteration: 0, InDoubt: false, Node: <nil>: common, none-aerospike error. Checked the wrapped error for detail
                                +Requested new buffer size is invalid. Requested: 92703365869105, allowed: 0..1048576
                Test:           TestNewAerospikeBackend
                Messages:       Unable to connect hosts fakeTestUrl panic and log fatal error when passed additional hosts
    aerospike_test.go:97: 
                Error Trace:    aerospike_test.go:97
                Error:          Not equal: 
                                expected: "Error creating Aerospike backend: ResultCode: TIMEOUT, Iteration: 0, InDoubt: false, Node: <nil>: command execution timed out on client: See `Policy.Timeout`"
                                actual  : "Error creating Aerospike backend: ResultCode: INVALID_NODE_ERROR, Iteration: 0, InDoubt: false, Node: <nil>: Failed to connect to hosts: [34.206.39.153:8888 45.223.56.75:8888]\nResultCode: NETWORK_ERROR, Iteration: 0, InDoubt: false, Node: <nil>: network error. Checked the wrapped error for detail\nResultCode: COMMON_ERROR, Iteration: 0, InDoubt: false, Node: <nil>: common, none-aerospike error. Checked the wrapped error for detail\nRequested new buffer size is invalid. Requested: 92703365869105, allowed: 0..1048576"
                            
                                Diff:
                                --- Expected
                                +++ Actual
                                @@ -1 +1,4 @@
                                -Error creating Aerospike backend: ResultCode: TIMEOUT, Iteration: 0, InDoubt: false, Node: <nil>: command execution timed out on client: See `Policy.Timeout`
                                +Error creating Aerospike backend: ResultCode: INVALID_NODE_ERROR, Iteration: 0, InDoubt: false, Node: <nil>: Failed to connect to hosts: [34.206.39.153:8888 45.223.56.75:8888]
                                +ResultCode: NETWORK_ERROR, Iteration: 0, InDoubt: false, Node: <nil>: network error. Checked the wrapped error for detail
                                +ResultCode: COMMON_ERROR, Iteration: 0, InDoubt: false, Node: <nil>: common, none-aerospike error. Checked the wrapped error for detail
                                +Requested new buffer size is invalid. Requested: 92703365869105, allowed: 0..1048576
                Test:           TestNewAerospikeBackend
                Messages:       Unable to connect host and hosts panic and log fatal error when passed additional hosts
time="2024-02-08T14:10:41+01:00" level=fatal msg="Error creating Ignite backend: configuration is missing ignite.schema, ignite.host, ignite.port or ignite.cache.name"
time="2024-02-08T14:10:41+01:00" level=fatal msg="Error creating Ignite backend: configuration is missing ignite.schema, ignite.host, ignite.port or ignite.cache.name"
time="2024-02-08T14:10:41+01:00" level=fatal msg="Error creating Ignite backend: configuration is missing ignite.schema, ignite.host, ignite.port or ignite.cache.name"
time="2024-02-08T14:10:41+01:00" level=fatal msg="Error creating Ignite backend: configuration is missing ignite.schema, ignite.host, ignite.port or ignite.cache.name"
time="2024-02-08T14:10:41+01:00" level=fatal msg="Error creating Ignite backend: error parsing Ignite host URL parse \":invalid:://127.0.0.1:8080/ignite?cacheName=myCache\": missing protocol scheme"
time="2024-02-08T14:10:41+01:00" level=info msg="Prebid Cache will write to Ignite cache name: myCache"
time="2024-02-08T14:10:41+01:00" level=info msg="Prebid Cache will write to Ignite cache name: myCache"
time="2024-02-08T14:10:41+01:00" level=fatal msg="Discovery polling duration is invalid"
2024/02/08 14:10:41 Warning: First poll for discovery service failed due to memcache: no servers configured or available
FAIL
FAIL    github.com/prebid/prebid-cache/backends 30.488s
ok      github.com/prebid/prebid-cache/backends/config  (cached)
ok      github.com/prebid/prebid-cache/backends/decorators      (cached)
ok      github.com/prebid/prebid-cache/config   (cached)
ok      github.com/prebid/prebid-cache/endpoints        (cached)
ok      github.com/prebid/prebid-cache/metrics/influx   (cached)
ok      github.com/prebid/prebid-cache/metrics/prometheus       (cached)
ok      github.com/prebid/prebid-cache/server   (cached)
ok      github.com/prebid/prebid-cache/utils    (cached)
FAIL

This still works:

$ go test ./... --skip TestNewAerospikeBackend

I don't know anything about AeroSpike, so I'm a bit lost here. Can you help me out?

Greets & Thanks

Default aerospike connection

config.backend.aerospike.host is deprecated but has the default value "aerospike.prebid.com". As a result, if prebid-cache is started with something like PBC_BACKEND_AEROSPIKE_HOSTS=127.0.0.1, we see the message "Connected to Aerospike host(s) [127.0.0.1 aerospike.prebid.com] on port 3000". I think the default value should be removed, or moved to a default for "hosts" instead of a default for "host".

Hitting limit on prebid cache server for video responses

The prebid cache server has a limit of 10kb per value passed to the server; if a payload goes over the limit, the server throws a 400 error. For video publishers, the XML file can exceed this limit pretty frequently, leaving bidders out of the competition. We should add compression or raise the server's limit so temporary storage can be used for all video programmatic partners. I can add
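
On the compression option: prebid-cache already has a compression.type: "snappy" setting for the backend, and for anyone weighing compression against a limit increase, here is a rough, illustrative look at snappy in Go. Using github.com/golang/snappy is an assumption about the underlying library; this is not prebid-cache's actual code path:

package main

import (
    "fmt"

    "github.com/golang/snappy"
)

func main() {
    vast := []byte(`<VAST version="3.0"><Ad>...a large video XML payload...</Ad></VAST>`)

    // VAST XML tends to compress reasonably well because of its repeated tags.
    compressed := snappy.Encode(nil, vast)
    fmt.Printf("original %d bytes, compressed %d bytes\n", len(vast), len(compressed))

    original, err := snappy.Decode(nil, compressed)
    if err != nil {
        panic(err)
    }
    fmt.Println(string(original) == string(vast))
}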

Amazon Keyspaces as a backend

Hi,

I'm wondering if anyone has successfully run prebid-cache using an AWS Keyspaces managed cache backend?

My first try results in this error: Error creating Cassandra backend: gocql: unable to create session: unable to discover protocol version: dial tcp 3.234.248.205:9042: i/o timeout

Hosts and keyspace settings were defined in the config.yaml.

Any insights would be helpful. Thanks.
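
For what it's worth, the i/o timeout on port 9042 is consistent with Amazon Keyspaces only accepting TLS connections on port 9142. A bare gocql client typically needs something like the following to reach it; this is a sketch of the gocql side only, with placeholder endpoint, keyspace, and credentials, and whether prebid-cache's Cassandra backend configuration actually exposes the port, TLS, and auth options would need to be checked:

package backends

import "github.com/gocql/gocql"

// newKeyspacesSession shows the gocql settings Amazon Keyspaces typically needs.
func newKeyspacesSession() (*gocql.Session, error) {
    cluster := gocql.NewCluster("cassandra.us-east-1.amazonaws.com")
    cluster.Port = 9142                     // Keyspaces only speaks TLS on 9142, not the default 9042
    cluster.Keyspace = "prebid"
    cluster.Consistency = gocql.LocalQuorum // Keyspaces requires LOCAL_QUORUM for writes
    cluster.SslOpts = &gocql.SslOptions{
        CaPath: "/path/to/sf-class2-root.crt", // the Starfield root CA published by AWS
    }
    cluster.Authenticator = gocql.PasswordAuthenticator{
        Username: "service-specific-username",
        Password: "service-specific-password",
    }
    return cluster.CreateSession()
}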

Monitoring friendly index route

Hi,

When we open the prebid-cache index route, it returns a 404 response.
Most monitoring systems will treat a 404 response as a failure and mark the application as down.

Can we add an index page to display some message, like prebid-server does?

We can just reuse this from the README:

This application stores short-term data for use in Prebid. It exists to support Video Ads from Prebid.js, as well as prebid-native.
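
If it helps, a minimal sketch of such an index handler, assuming the routes are registered with julienschmidt/httprouter (which the project appears to use); the response text is just the README sentence above and the port matches the default config:

package main

import (
    "net/http"

    "github.com/julienschmidt/httprouter"
)

// index answers "/" with a 200 and a short description so uptime checks pass.
func index(w http.ResponseWriter, _ *http.Request, _ httprouter.Params) {
    w.WriteHeader(http.StatusOK)
    w.Write([]byte("This application stores short-term data for use in Prebid."))
}

func main() {
    router := httprouter.New()
    router.GET("/", index)
    http.ListenAndServe(":2424", router)
}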

Thank you.

Redis running full after update

We were running prebid-cache 0.13.0 and tried an update. But after updating to 0.14.0, our Redis (ElastiCache) gets full after about an hour. Normally we are at about 30% capacity at most.
I tried updating one commit at a time, and the one that breaks it is this one: #89

As far as I understand, the expiration parameter for Redis is replaced by config.request_limits.max_ttl_seconds but will still work, since the lower of the two values is used. The desired expiration time is 10 minutes, so I tried setting this using both parameters, using config.request_limits.max_ttl_seconds without config.backend.redis.expiration, and setting one lower than the other, but the result is always the same: Redis starts filling up, and when it reaches 100%, the cache hit rate collapses.

Am I misconfiguring something? This is the config that works on 0.13.0:

port: 2424
admin_port: 2525
log:
  level: "info"
rate_limiter:
  enabled: false
  num_requests: 100
request_limits:
  allow_setting_keys: false
  max_size_bytes: 40960
  max_num_values: 10000
  max_ttl_seconds: 3600
backend:
  type: "redis"
  redis:
    host: "censored"
    port: 6379
    password: "censored"
    db: 1
    expiration: 10
compression:
  type: "snappy"
metrics:
  type: "none"

Redis backend

Hi,
Can I submit a PR which adds Redis as a backend?
Thanks!

doesn't build correctly with golang version < 1.8

Had to upgrade go to 1.9 to get it to compile.

[dchau@ip-172-31-35-81 prebid-cache]$ /usr/bin/go version
go version go1.7.5 linux/amd64
[dchau@ip-172-31-35-81 prebid-cache]$ /usr/bin/go build .
# github.com/prebid/prebid-cache
./azure_table.go:205: unknown httptrace.ClientTrace field 'TLSHandshakeDone' in struct literal   
[dchau@ip-172-31-35-81 prebid-cache]$ /usr/local/go/bin/go version
go version go1.9 linux/amd64
[dchau@ip-172-31-35-81 prebid-cache]$ /usr/local/go/bin/go build .

We should update the README to state which version of Go the code is compatible with.
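
For context, the failing field comes from the net/http/httptrace package, and TLSHandshakeDone was only added in Go 1.8, which is why older toolchains reject the struct literal. Roughly (illustrative, not the code from azure_table.go):

package main

import (
    "crypto/tls"
    "fmt"
    "net/http/httptrace"
)

func newTrace() *httptrace.ClientTrace {
    return &httptrace.ClientTrace{
        // This field does not exist before Go 1.8, so the struct literal
        // fails to compile on go1.7.x.
        TLSHandshakeDone: func(state tls.ConnectionState, err error) {
            fmt.Println("TLS handshake done:", state.Version, err)
        },
    }
}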
