
Review feedback about stapled (10 comments, closed)

jsha commented on June 13, 2024
Review feedback

from stapled.

Comments (10)

rolandshoemaker commented on June 13, 2024

This is great, thanks!

Why even have a dont-die-on-stale-response? Seems like that should always be true.

dont-cache also seems unnecessary. You could make the decision to cache or not based on whether disk.cache-folder is set. Similarly, dont-seed-cache-from-disk seems unnecessary and unused. Also, as a general naming thing, I'd use "no" instead of "dont" because it doesn't have the ambiguity introduced by eliding the apostrophe.

Agreed, some of this is cruft I forgot to remove.

Some inconsistency about what's exported vs what's not. This is meant to be a command, so probably nothing should be exported. Alternately: Maybe you want to break it into subpackages for better maintenance, and then some things will need to be exported?

Yup, I think splitting into subpackages is the best plan and will help enforce a cleaner internal API.

As a stress test, it would be fun to seed this with all currently-issued Let's Encrypt certificates. I've been meaning to write an OCSP-aware monitoring daemon; this would effectively be that, if it could log warnings when responses are stale, don't parse, or similar. You would probably need to define an alternate way to bring in certificates, and an alternate way to store responses, since 1M files in a directory is bad news. Even if you split among directories, you would have to make sure the host filesystem has enough inodes.

Great idea. My original plan was to implement a 'stable' cache interface so you could use an on-disk or SQL cache backing the in-memory one, which would be nice for this.

I've done some work on a mode where, if you configure an upstream responder and make a request that isn't in the cache, it will try to fetch it from upstream and seed the cache from the response. That could be a simple-ish way to push all 1M entries (if you knew all their serials) into the cache. Or maybe just an issuer cert + a file of serial numbers...

  • maybe break out the "request related" and "response related" sections into their own structs?
  • don't create an http.Client for each Entry, to save memory. The http client should probably be a property of the cache. They're safe for concurrent use.

👍, I've been meaning to restructure these sections to do this. One problem I was trying to figure out was whether to support per-certificate proxy definitions (which would require some hackery), although that might not actually be very important.

  • do you need to keep the x509.Certificate around for the lifetime? When the number of Entries is large, the memory savings from discarding it could be large.

Entry.loadCertificate: There should be a cache for issuing certificates. Simple version: in-memory cache. Fancier version: config list of PEM files to load (or wildcard of PEM files to load) containing issuer certificates, pre-seeding in-memory cache.

Do you mean the issuer cert? The crypto/ocsp functions require the issuer cert to validate the signatures on responses (if the certificate issuer is used for OCSP and an intermediate isn't returned in the response). We could probably just pull out the key and do the signature validation ourselves to save some memory. If most entries come from the same issuer, though, caching the issuers and storing just a pointer to the cached issuer on each entry would also save a lot of memory.

Also 👍 on caching the issuers/seeding them, this is one of my current TODOs.

Proxy URI: It should be possible to list multiple upstream proxies to be randomly selected, like the responders.

👍

NewEntry / FromDef / Init: I'm trying to better understand the breakdown between these. It looks like NewEntry is everything that can't error, FromDef is everything that does file or network I/O, and Init is everything that doesn't do file or network I/O, but can error. Is that right? If so I might recommend merging NewEntry and Init. Also, comments. :-)

Soooo, the whole entry creation/initialization is super gross in my opinion, this is mainly due to me wanting to add multiple different ways of adding entries (via config, cert folder, forwarding requests) and just hacking each common-ish bit out so I didn't have to duplicate much code. I'm not super sure about how to clean it up exactly right now but it's high on my priority list.

Entry.monitor: This seems like a mix of techniques: You have a constant-interval ticker, but you also pick an exact amount of time to sleep based on the response. Probably better to pick one. Recommendation: Define a single ticker on cache that loops through all entries and refreshes in a goroutine if it's time to do so. That way you only need a single long-running goroutine versus a goroutine per entry, which will make your stack traces easier to read.

Agreed, this also allows moving a bunch of static stuff out of the Entry struct that didn't need to be there in the first place and should reduce overall size.

HashNameAndPKI: Can't you use x509.MarshalPKIXPublicKey(key)? Also, why do you create an ocsp.Request manually rather than using ocsp.CreateRequest?

So I totally pulled this from the ocsp.CreateRequest method since I was trying to mimic that format, but looking at the source of x509.MarshalPKIXPublicKey now I'm really confused about if/why they produce different encodings of the public key info. I need to go try this out...


jsha commented on June 13, 2024

Entry.loadCertificate: There should be a cache for issuing certificates. Simple version: in-memory cache. Fancier version: config list of PEM files to load (or wildcard of PEM files to load) containing issuer certificates, pre-seeding in-memory cache.
Do you mean the issuer cert? The crypto/ocsp functions require the issuer cert to validate the signatures on responses (if the certificate issuer is used for OCSP and an intermediate isn't returned in the response). We could probably just pull out the key and do the signature validation ourselves to save some memory.

Yeah, I meant issuer cert here. Largely I was referring to the fact that it does a network fetch, so that should be de-duplicated. Also, loadCertificate is called on the blocking path at startup, right? It's probably a good design principle that startup will never block on network requests. So you want a way to provide a set of already-fetched issuer certificates. I see you can do that on a per-certificate basis in CertDefinition. It might be easier to define a single path that contains a list of known issuers, load them, and make them available to all entries based on subject / skid matching.

To solve the "block on startup" problem for certificates that don't have an issuer certificate loaded from disk, you could move the "fetch from AIA" logic into updateResponse (and cache the result).

Also worth noting that the issuer argument to ocsp.ParseResponse is optional. If you leave it out, it will still parse, it just won't validate the sig.


cpu commented on June 13, 2024

Entry.monitor: This seems like a mix of techniques: You have a constant-interval ticker, but you also pick an exact amount of time to sleep based on the response. Probably better to pick one. Recommendation: Define a single ticker on cache that loops through all entries and refreshes in a goroutine if it's time to do so. That way you only need a single long-running goroutine versus a goroutine per entry, which will make your stack traces easier to read.

This is close-ish to what I had implemented (I haven't moved the OCSP lookup to a separate goroutine off of the update one). I was letting myself overthink it a little and used a heap of the certs ordered by nextUpdate, sleeping for an appropriate interval based on the heap peek.

Soooo, the whole entry creation/initialization is super gross in my opinion, this is mainly due to me wanting to add multiple different ways of adding entries (via config, cert folder, forwarding requests) and just hacking each common-ish bit out so I didn't have to duplicate much code. I'm not super sure about how to clean it up exactly right now but it's high on my priority list.

FWIW I think my codebase suffers from this too. My awkward type flailing with MonitoredCerts stems from the same place: wanting to allow some other ways of adding certs in the future that aren't disk-backed.

Do you mean the issuer cert? The crypto/ocsp functions require the issuer cert to validate the signatures on responses (if the certificate issuer is used for OCSP and a intermediate isn't returned in the response).

One of the differences in our approaches was that I made an early decision to exclusively work with subject/issuer chainfiles (two PEM-encoded x509 certs per file). I didn't want to have to deal with reading an issuer PEM from disk and keeping it in sync and associated with the subject PEM. I have to admit I wasn't thinking about the resources (disk, memory) wasted by having a lot of subject certificates that all share the same issuer. Separating the intermediate cache from the subjects makes sense to me. I wish I'd thought of that :-)


rolandshoemaker commented on June 13, 2024

@jsha: Thinking about your load-testing/monitor comment, it seems this would be a perfect case for having a dont-die-on-stale-response config flag. If you wanted to log when one (or more) responses went stale, you'd probably want to stay alive afterwards to keep monitoring?


jsha commented on June 13, 2024

My thought was that you should never die on a stale response.


rolandshoemaker commented on June 13, 2024

Ah! Okay, I was misunderstanding you. So this came from one of Sleevi's points about what to do in the situation where you might end up serving a stale response back to the consumer. One of his ideas was if you really wanted to let a monitoring system know something was wrong it'd be better to just die.

I'm not sure I 100% agree with this but I don't really know an ideal behavior in this situation. cfssl will currently just serve stale responses, which it probably shouldn't (cloudflare/cfssl#530).


jsha commented on June 13, 2024

Are you talking about this part?

What happens when it's been 7 days, no new OCSP response can be obtained, and the current response is about to expire? Do you:

  • Stop the (web/email/ftp/xmpp) service?
  • Stop serving stapled OCSP responses?

Especially in a world where Must-Staple becomes more prevalent, what action should be taken when things go awful? If it's a Must-Staple cert, it might be more beneficial to fully stop the service (thus causing monitoring to really flip out) rather than serve bad responses or no response, both of which may result in even worse user experiences.

I think that ignores the likelihood that stapled will be keeping state for a large number of certificates. If one of them is broken, you don't want it to break everybody.


rolandshoemaker commented on June 13, 2024

I've opened #3 - #13 covering most of what was discussed here, feel free to open more if I've missed anything.

I think that ignores the likelihood that stapled will be keeping state for a large number of certificates. If one of them is broken, you don't want it to break everybody.

That's a good point, I guess there are extremes in the use cases here. In terms of monitoring, I guess we could just provide an HTTP endpoint you can hit (health or something) that can tell you if/when there are stale responses.


jsha commented on June 13, 2024

Probably you want an exported stat: numStaleResponses.



rolandshoemaker commented on June 13, 2024

I think all of the initial points brought up here are either fixed or represented by other tickets, so I'm going to close this. Feel free to open more tickets if I missed anything.


