From RFC 2109:
4.2.3 Controlling Caching
An origin server must be cognizant of the effect of possible caching
of both the returned resource and the Set-Cookie header. Caching
"public" documents is desirable. For example, if the origin server
wants to use a public document such as a "front door" page as a
sentinel to indicate the beginning of a session for which a Set-Cookie
response header must be generated, the page should be stored
in caches "pre-expired" so that the origin server will see further
requests. "Private documents", for example those that contain
information strictly private to a session, should not be cached in
shared caches.
If the cookie is intended for use by a single user, the Set-cookie
header should not be cached. A Set-cookie header that is intended to
be shared by multiple users may be cached.
The origin server should send the following additional HTTP/1.1
response headers, depending on circumstances:
* To suppress caching of the Set-Cookie header: Cache-control:
no-cache="set-cookie".
and one of the following:
* To suppress caching of a private document in shared caches:
Cache-control: private.
* To allow caching of a document and require that it be validated
before returning it to the client: Cache-control: must-revalidate.
* To allow caching of a document, but to require that proxy caches
(not user agent caches) validate it before returning it to the
client: Cache-control: proxy-revalidate.
* To allow caching of a document and request that it be validated
before returning it to the client (by "pre-expiring" it):
Cache-control: max-age=0. Not all caches will revalidate the
document in every case.
HTTP/1.1 servers must send Expires: old-date (where old-date is a
date long in the past) on responses containing Set-Cookie response
headers unless they know for certain (by out of band means) that
there are no downstream HTTP/1.0 proxies. HTTP/1.1 servers may send
other Cache-Control directives that permit caching by HTTP/1.1
proxies in addition to the Expires: old-date directive; the
Cache-Control directive will override the Expires: old-date for HTTP/1.1
proxies.
So we could add a generic mechanism to remove the cookie based on Cache-Control: no-cache="set-cookie". Then in the case where origin servers aren't cooperating, the "override" fix would be something like:
ledge.bind("origin_fetched", function(req, res)
    if res.cacheable() and res.header["Set-Cookie"] ~= nil then
        res.header["Cache-Control"] = "no-cache=\"set-cookie\""
    end
end)
Thoughts?
from ledge.
So the origin server should be setting 'Cache-control: no-cache="set-cookie"' if it's sending a cookie and wants the page to be cached? Or is it the other way around?
We just need to be sure that the Set-Cookie header is passed through on cache misses, otherwise nobody will ever be able to get a session.
from ledge.
No, the origin server can send Expires headers and so on with Set-Cookie, and by default it is therefore assumed that the cookie can be stored in a shared cache along with the page content.
The origin should be setting Cache-control: no-cache="set-cookie" if it wants the page to be cached, but Set-Cookie to be removed from shared cache. I doubt many origins do this, but that's the official mechanism so if we start by honouring this, you can force removal of the cookie by setting that header (dictating the ledge behaviour).
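As a rough sketch of what "honouring this" could look like, the helper below drops any header named in a no-cache="..." directive before the response is stored. The function names no_cache_fields and headers_for_cache are hypothetical and not part of Ledge; the sketch also assumes headers live in a plain Lua table keyed by name, with Cache-Control stored as a single string under exactly that key.

```lua
-- Hypothetical helpers, not Ledge's API.
-- Returns the lowercased field names listed in no-cache="a, b".
local function no_cache_fields(cache_control)
    local fields = {}
    if type(cache_control) ~= "string" then
        return fields
    end
    -- Capture the quoted argument of the no-cache directive, if any.
    local list = cache_control:lower():match('no%-cache%s*=%s*"([^"]*)"')
    if list then
        for name in list:gmatch("[^,%s]+") do
            fields[#fields + 1] = name
        end
    end
    return fields
end

-- Returns a copy of `headers` without the fields named by no-cache="...".
-- The copy is what would be written to the shared cache.
local function headers_for_cache(headers)
    local skip = {}
    for _, name in ipairs(no_cache_fields(headers["Cache-Control"])) do
        skip[name] = true
    end
    local out = {}
    for name, value in pairs(headers) do
        if not skip[name:lower()] then
            out[name] = value
        end
    end
    return out
end
```

A bare no-cache (without a field list) is deliberately ignored here, since that makes the whole response uncacheable rather than a single header.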
On your second point, I'm thinking of adding another event to bind to, perhaps before_save. This way you can explicitly process cacheable items before they are saved, and keep the origin_fetched path clean - i.e. allow non-cacheable requests/responses to pass through the proxy transparently.
Does that seem better?
Basically to avoid ending up with lots of:
ledge.bind("origin_fetched", function(req, res)
    if res.cacheable() then
        ...
    end
end)
from ledge.
Ok so Ledge will be modified to cache the cookie header if Cache-control: no-cache="set-cookie" is set.
Then we bind to before_save and override that for 99% of origin servers?
This way the Set-Cookie header is passed through on the initial Miss response but not saved into cache.
Sounds good to me!
Worth keeping this issue in mind when doing the collapsed forwarding stuff in the future too. We don't want to pass the same Set-Cookie header on to all the collapsed requests.
from ledge.
No, Ledge will be modified to not cache the cookie header if Cache-control: no-cache="set-cookie" is set :) Otherwise it'll cache everything based on res.cacheable().
Re the first MISS versus subsequent HITs, I think they should be identical save for the X-Cache / X-Cache-State headers. So if something was cacheable but fetched from the origin and saved, the user should get the same response which was saved, not the pure one from the origin. I think that's cleaner - so we have a cacheable path, and a non cacheable path. Inside the cacheable path, you might get a HIT or a MISS, but it's still the cacheable path.
If the response isn't cacheable, I think perhaps we just omit the X-Cache / X-Cache-State headers? That neatly distinguishes between MISS and NOTCACHEABLE?
Good point on collapsed forwarding. It needs to behave the same. If it was cacheable, give everyone the "shared" copy (presumably with Set-Cookie removed - i.e. whatever we saved). If it's not cacheable, send everyone to the origin and keep metadata to avoid requests waiting for collapse in the future.
from ledge.
So this seems to be working. You need this in your config to ensure Ledge marks cookies as non cacheable (really it's the origin's job):
ledge.bind("before_save", function(req, res)
    res.header["Cache-Control"] = 'no-cache="set-cookie"'
end)
Note if the config is written inside nginx.conf with single quotes, you need to escape slightly weirdly (backslash is double escaped I guess).
res.header["Cache-Control"] = "no-cache=\\"set-cookie\\""
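One thing to watch with the handler above is that it replaces whatever Cache-Control the origin sent, so directives such as max-age are lost. A minimal sketch of appending the directive instead, assuming the header value is a single string (or nil); add_no_cache_set_cookie is a hypothetical helper name, not something Ledge provides:

```lua
-- Hypothetical helper: append no-cache="set-cookie" to an existing
-- Cache-Control value without discarding the origin's other directives.
local function add_no_cache_set_cookie(cache_control)
    local directive = 'no-cache="set-cookie"'
    if cache_control == nil or cache_control == "" then
        return directive
    end
    -- Plain (non-pattern) search so the hyphen and quotes are literal.
    if cache_control:lower():find(directive, 1, true) then
        return cache_control
    end
    return cache_control .. ", " .. directive
end

-- Possible use inside the config, following the bind() form shown above:
-- ledge.bind("before_save", function(req, res)
--     if res.header["Set-Cookie"] ~= nil then
--         res.header["Cache-Control"] =
--             add_no_cache_set_cookie(res.header["Cache-Control"])
--     end
-- end)
```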
from ledge.
The spec around this is quite confusing but I'll try to create a test case(s) that covers the following cases and behaviour. I'm not confident at all that I got this right, so, please correct me if you spot anything that's wrong...
Case1:
- Cache Miss
- Headers from the origin:
Cache-Control: no-cache="set-cookie"
Set-Cookie: name=value; path=/; expires="Wdy, DD-Mon-YYYY HH:MM:SS GMT"
The cookie will not be cached, but it will remain in the response header so that it is passed to the browser along with Cache-Control.
Case2:
- Cache Miss
- Headers from the origin:
Set-Cookie: name=value; path=/; expires="Wdy, DD-Mon-YYYY HH:MM:SS GMT"
The cookie will be stored in cache and remain in the response header.
Case3:
- Cache Miss
- Headers from the origin:
Cache-Control: no-cache="set-cookie"
There is no Set-Cookie header and so no cookie to store in cache or send to the browser. Only Cache-Control remains in the response header.
Case4:
- Cache Hit
- Headers in the cache:
Cache-Control: no-cache="set-cookie"
There shouldn't be a Set-Cookie header in the cache entry. The response header only contains Cache-Control.
Case5:
- Cache Hit
- Headers in the cache:
Set-Cookie: name=value; path=/; expires="Wdy, DD-Mon-YYYY HH:MM:SS GMT"
The Cache-Control header shouldn't be 'no-cache="set-cookie"'. The response header will contain the cookie.
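To make the five cases easier to check against each other, here is a small model of them in plain Lua. It is only a sketch of the behaviour as listed, not Ledge's implementation: on_miss is a hypothetical function, headers are assumed to be a plain table keyed by exact name, and a cache hit is assumed to return the stored headers unchanged.

```lua
-- True when Cache-Control carries no-cache="set-cookie".
local function has_no_cache_set_cookie(cache_control)
    if type(cache_control) ~= "string" then
        return false
    end
    return cache_control:lower():find('no-cache="set-cookie"', 1, true) ~= nil
end

-- Models a cache miss (Cases 1-3): returns the headers sent to the
-- browser and the headers stored in cache. A later hit (Cases 4-5)
-- would simply serve the second table.
local function on_miss(origin_headers)
    local response, cached = {}, {}
    local strip = has_no_cache_set_cookie(origin_headers["Cache-Control"])
    for name, value in pairs(origin_headers) do
        response[name] = value
        if not (strip and name == "Set-Cookie") then
            cached[name] = value
        end
    end
    return response, cached
end
```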
from ledge.
In Case 4 do we modify the Cache-Control headers that we pass on?
So if we've already stripped the Set-Cookie header, should the Cache-Control header still be no-cache="set-cookie"?
from ledge.
This looks correct to me.
In Case 4 do we modify the Cache-Control headers that we pass on?
So if we've already stripped the Set-Cookie header, should the Cache-Control header still be no-cache="set-cookie"?
I don't think so. I mean it's a pretty odd case, but the spec just says if that header is there then don't cache the cookie. I guess technically something downstream could see that header and infer that a cookie may have been removed by an upstream proxy.
I think we should just do what the spec says and no more. In reality, if the origin is setting cookies on cacheable responses, then a plugin to remove them blind (a la Squid) probably makes more sense anyway.
from ledge.
TEST2 - TEST7 cover the cases above. Don't forget 'TEST_NGINX_NO_SHUFFLE=1' when you run the tests; otherwise the execution order that these tests depend on is not guaranteed.
example command:
PATH=/usr/local/squiz_edge/nginx/sbin:$PATH TEST_NGINX_NO_SHUFFLE=1 prove t/regression.t
Case1 & Case 4: both cache-control & set-cookie in the header
- TEST2: clears the db and sends a request with cache-control & set-cookie, then checks both cache-control & set-cookie are in the response (Case1)
- TEST3: the request does not reach the origin but reads from cache, and checks the response only has cache-control (Case1 & Case4)
Case2 & Case5: only set-cookie in the header
- TEST4: clears the db and sends a request with set-cookie, without cache-control, then checks the set-cookie exists in the response (Case2)
- TEST5: the request does not reach the origin but reads from cache, and checks the set-cookie exists in the response but not cache-control (Case2 & Case5)
Case3: only cache-control in the header
- TEST6: clears the db and sends a request with cache-control, without set-cookie, then checks the cache-control exists in the response but not set-cookie (Case3)
- TEST7: the request does not reach the origin but reads from cache, and checks the cache-control exists in the response but not set-cookie (Case3)
from ledge.