draft-irtf-cfrg-hpke's Introduction

Hybrid Public Key Encryption

This is the working area for the individual Internet-Draft, "Hybrid Public Key Encryption".

Building the Draft

Formatted text and HTML versions of the draft can be built using make.

$ make

This requires that you have the necessary software installed. See the instructions.

Existing HPKE implementations

| Implementation  | Language                             | Version  | Modes                 |
|:----------------|:-------------------------------------|:---------|:----------------------|
| go-hpke         | Go                                   | RFC9180  | All                   |
| CIRCL           | Go                                   | RFC9180  | All but "Export Only" |
| hpke-compact    | Go                                   | RFC9180  | All                   |
| rust-hpke       | Rust                                 | RFC9180  | All                   |
| BoringSSL       | C                                    | RFC9180  | Base                  |
| NSS             | C                                    | RFC9180  | Base, PSK             |
| hpke-rs         | Rust                                 | RFC9180  | All                   |
| happykey        | C/OpenSSL                            | RFC9180  | All                   |
| hpke-wrap       | C/OpenSSL                            | RFC9180  | All                   |
| zig-hpke        | Zig                                  | RFC9180  | All                   |
| libhpke         | C++ (OpenSSL)                        | RFC9180  | All                   |
| hacl-star-hpke  | F*                                   | draft-05 | All                   |
| hpke-py         | Python (cryptography.io)             | RFC9180  | Base, Auth            |
| hpke-js         | Javascript (built on WebCrypto API)  | RFC9180  | All                   |
| Tink            | Java, Python, Go, C++                | RFC9180  | Base                  |
| hpke-rb         | Ruby (OpenSSL)                       | RFC9180  | All                   |
| Apple CryptoKit | Swift                                | RFC9180  | All                   |
| BouncyCastle    | Java                                 | RFC9180  | Base, Auth            |

Submit a PR if you would like your implementation to be added!

Contributing

See the guidelines for contributions.

draft-irtf-cfrg-hpke's Issues

Clarify "hybrid" in the introduction

From https://mailarchive.ietf.org/arch/msg/cfrg/ZcTCJkilzCDshxsIj7MwKHNlNuM/

Asymmetric and symmetric algorithms have been combined since the 1980s, e.g., in Privacy-Enhanced Mail [RFC1113], so a hybrid approach (in the sense of combining the two) can by now be considered the "tradition" of public-key cryptography. We would therefore suggest replacing the first sentence with the following:

Encryption schemes that combine asymmetric and symmetric algorithms have been specified and practiced since the early days of public-key cryptography (e.g., [RFC1113]). Combining the two brings the "best of both worlds": the key management advantages of asymmetric cryptography and the performance benefits of symmetric cryptography. However, the traditional combination has been "encrypt the symmetric key with the public key." "Hybrid" public-key encryption schemes (HPKE), specified here, take a different combination, "generate the symmetric key and its encapsulation with the public key."

Ambiguous Nzz definition, possible wrong value for P-521

Table 7.1 specifies an Nzz value of 64 for DHKEM(P-521, HKDF-SHA512). The test vectors are using a value of 66, which seems right given:

Nzz: The length in bytes of a shared secret produced by the algorithm.

(in context of KEM identifiers, not KDF).

But, 4.1 also states:

For the variants of DHKEM defined in this document, Ndh is equal to Npk, and the output length of the KDF's Extract function is Nzz bytes.

We should clarify whether Nzz is based on the ECDH or HKDF-Extract output length, and fix the test vectors if necessary.
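
The two candidate values can be reproduced with a little arithmetic. This is only an illustrative sketch; the variable names are made up here and which length the draft intends for Nzz is exactly the open question.

```python
import hashlib

# Length in bytes of the raw ECDH output on P-521 (one field element,
# i.e. the x-coordinate): ceil(521 / 8).
ecdh_len = (521 + 7) // 8

# Output length in bytes of HKDF-Extract with SHA-512 (the hash length).
extract_len = hashlib.sha512().digest_size

print(ecdh_len, extract_len)  # 66 64
```

So 66 corresponds to the ECDH output and 64 to the HKDF-Extract output.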

Clarify additional key material in authenticated modes

From https://mailarchive.ietf.org/arch/msg/cfrg/ZcTCJkilzCDshxsIj7MwKHNlNuM/

Section 5:
"we include two authenticated variants .": We would also suggest mentioning that these variants also contribute additional keying material to the encryption operation. See also discussion in Section 8.1.

After the sentence, "the constructions described here presume .", mention that the recipient also needs a way to determine which of its public keys was used for the encapsulation operation (if the recipient has more than one public key). Also add a reference to Section 9 which addresses the corresponding issues for message encoding.

Updated test vectors do not match the spec

While updating labels for draft-03, I noticed that the test vectors added in 5bc57ba seem to be incorrect, matching an implementation that:

  1. Uses the label "info_hash" to extract info_hash, when draft-03 specifies "info".
  2. Uses the label "psk" to extract psk, when "psk_hash" is specified.

With the "_hash" suffix swapped as above, my implementation generates matching outputs.

KeySchedule notation issues

In the current draft (looking at branch master):

   def KeySchedule(mode, pkRm, zz, enc, info, psk, pskID, pkIm):
     VerifyMode(mode, psk, pskID, pkI)

     pkRm = Marshal(pkR)

  • pkRm is given as a parameter to KeySchedule but is also calculated from pkR inside it (suggestion: remove the line calculating it).
  • pkI is passed to VerifyMode; it should be pkIm.

From @dwd and @blipp

Nits from Riad

  • Section 5.1.3: it would be nice to include a reference or citation for
    unknown key share attacks.

  • Section 5.2: is there a reason to put the word "amortize" in quotes?

  • Section 7.1.2: it might be worth mentioning here that [keyagreement] also
    includes checking that the public key is not the identity point.

  • Section 7.1.2: is there a reason to recommend either checking for a nonzero
    scalar or checking for a non-identity DH output? Checking the latter covers
    the former and also covers the check from my prior comment. Moreover, it is
    not clear to me that checking the scalar is useful for the recipient, since
    this is essentially just checking that their long-term secret is nonzero.

  • Section 8.1: the sentence "In particular, the KDFs and DH groups..." might
    want to clarify that this statement is true only when these primitives are
    used as specified. The concern is that HKDF is only indifferentiable under
    some restrictions on salt length (for reasons noted in Section 8.3).

Bind DHKEM labels to the group

  • Section 4.1: this may be paranoia, but it would be slightly nicer to include
    the DH group name in the label arguments of LabeledExtract and LabeledExpand
    to ensure that invocations from different DHKEM instantiations are orthogonal.

pkSm does nothing in KeySchedule

I was looking through a code coverage map of my implementation and realized that default_pkSm is never actually used anywhere. KeySchedule takes in a pkSm and uses it to sanity-check the given mode, and then uses it nowhere in the key schedule itself. Should pkSm be removed as an argument?

Add an exporter

It may be desirable to export a secret, as with the TLS exporter. Adding such a feature would add a bit of complexity, and dilute the focus on PKE.

Fix some nits

From https://mailarchive.ietf.org/arch/msg/cfrg/ZcTCJkilzCDshxsIj7MwKHNlNuM/

"assumed that the sender" --> "assured that the sender"

Section 8.2:

"KEM public key pkR" --> "KEM public key "pkR""

"ciphertext enc" --> "encapsulated key enc" (two occurrences). "Ciphertext" is used elsewhere in the draft to refer to the AEAD output.

Section 8.3: There is a non-normative (lower-case) "should" in the first sentence. (Contrasting against a normative/upper-case "SHOULD" in the first sentence of 8.4.) Should this "should" be "SHOULD"?

Section 8.7: There are missing quotes around "(enc2, ciphertext2, enc, ciphertext)".

Add acknowledgements

Benjamin Lipp, David Benjamin, Benjamin Beurdouche, Riad Wahby, Kevin Jacobs, Michael Rosenberg, Michael Scott, Raphael Robert, and probably more!

Clarify directionality of HPKE with multiple encryptions

There's currently a single sequence number space that's incremented by 1 for each message encrypted. This implies that only the initiator can encrypt messages to the receiver, else we risk key/nonce re-use. We should be clear about this in the draft!
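
To make the risk concrete, here is a minimal sketch, assuming the nonce for each message is the base nonce XORed with the sequence number. The `Direction` class and `compute_nonce` helper are hypothetical names for illustration, not taken from the draft.

```python
def compute_nonce(base_nonce: bytes, seq: int) -> bytes:
    """XOR the big-endian sequence number into the base nonce."""
    seq_bytes = seq.to_bytes(len(base_nonce), "big")
    return bytes(a ^ b for a, b in zip(base_nonce, seq_bytes))


class Direction:
    """Hypothetical per-party state that shares one key and base nonce."""

    def __init__(self, base_nonce: bytes):
        self.base_nonce = base_nonce
        self.seq = 0

    def next_nonce(self) -> bytes:
        nonce = compute_nonce(self.base_nonce, self.seq)
        self.seq += 1
        return nonce


base = bytes(range(12))  # placeholder base nonce
initiator, responder = Direction(base), Direction(base)

# Both parties start at sequence number 0, so if both encrypt under the
# same key their first messages use the same nonce.
collision = initiator.next_nonce() == responder.next_nonce()
print(collision)  # True
```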

Issues in test vectors

Two issues I found in the test vectors:

  • kemID is incorrect, e.g., for Curve25519 the vectors use kemID: 1, but the draft specifies 2.
  • Sequence numbers for generating nonces look off by one.
    For DHKEM(Curve25519), HKDF-SHA256, AES-GCM-128, the initial nonce is 0d8e01f89fa5abab107f7fe9, but the nonce used in the first encryption (sequence number 0) is 0d8e01f89fa5abab107f7fe8 - the initial one XOR 1.
    As I understand the spec says it should be XOR 0.
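
The off-by-one can be checked directly from the two values quoted above. This sketch assumes the per-message nonce is the initial nonce XORed with the sequence number; `nonce_for` is a hypothetical helper name.

```python
base_nonce = bytes.fromhex("0d8e01f89fa5abab107f7fe9")


def nonce_for(seq: int) -> bytes:
    """Return base_nonce XOR seq, keeping the nonce length."""
    value = int.from_bytes(base_nonce, "big") ^ seq
    return value.to_bytes(len(base_nonce), "big")


print(nonce_for(0).hex())  # 0d8e01f89fa5abab107f7fe9
print(nonce_for(1).hex())  # 0d8e01f89fa5abab107f7fe8
```

The nonce in the test vector matches sequence number 1, not 0.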

Add some color to post quantum proof discussion

From https://mailarchive.ietf.org/arch/msg/cfrg/ZcTCJkilzCDshxsIj7MwKHNlNuM/

Section 8.1: "A full proof of post-quantum security .". Although we understand that a full proof of post-quantum security may not be achievable within the timeline of this draft's publication, we would nevertheless recommend some additional discussion on what might be desirable to prove. In the draft, the PSK is employed as an authentication factor, so presumably the proof being contemplated would be that authentication in the modes involving PSKs remains secure against a quantum computer. A stronger property would be more attractive: that encryption in the PSK modes remains secure against a quantum computer, whether the KEM itself is post-quantum or not. If the authors consider this property plausible, then it should be mentioned here as a goal for security analysis. If not, then the reasons for not targeting this property should also be given.

Nenc and Npk for P-521 are inconsistent within the draft

In Section “DH-Based KEM”:

* P-521: The X-coordinate of the point, encoded as a 66-octet
  big-endian integer

In “Algorithm Identifiers” > “Key Encapsulation Mechanisms”:

| Value  | KEM               | Nenc | Npk | Reference      |
|:-------|:------------------|:-----|:----|:---------------|
| 0x0012 | DHKEM(P-521)      | 65   | 65  | {{NISTCurves}} |
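
For reference, a quick sketch of the sizes the common encodings would give. The helper names are hypothetical, and which encoding the draft means for Nenc and Npk is for the authors to decide; the point is only that neither encoding yields 65 for P-521.

```python
def coordinate_len(field_bits: int) -> int:
    """Octets needed for one coordinate of a curve over a field of this size."""
    return (field_bits + 7) // 8


def uncompressed_point_len(field_bits: int) -> int:
    """SEC 1 uncompressed point: 0x04 || X || Y."""
    return 1 + 2 * coordinate_len(field_bits)


sizes = {
    name: (coordinate_len(bits), uncompressed_point_len(bits))
    for name, bits in {"P-256": 256, "P-521": 521}.items()
}
print(sizes)  # {'P-256': (32, 65), 'P-521': (66, 133)}
```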

Inconsistent use of X25519 vs Curve25519

The HPKE draft refers to "Curve25519" and "DHKEM(Curve25519, HKDF-SHA256)" throughout the draft, but then section 8.8 mentions DHKEM-X25519.

I believe X25519 is correct here. RFC7748 defines "curve25519" as a particular Montgomery curve. It then defines "X25519" as a Diffie-Hellman primitive on top of curve25519, with particular encodings and everything else. HPKE is using the Diffie-Hellman primitive, so it should use X25519. As a bonus, it's shorter and "DHKEM(Curve25519, HKDF-SHA256)" is already a mouthful. :-)

Shared secret size for P-256

It's currently 32 bytes, i.e., just the x-coordinate of the point. But Npk suggests it should be a fully-encoded public key. Which do we prefer?

cc @blipp @bifurcation

(Thanks to Michael Scott for raising this!)

Harmonize label values

From https://mailarchive.ietf.org/arch/msg/cfrg/ZcTCJkilzCDshxsIj7MwKHNlNuM/

The "label" argument to LabeledExtract is being used in some cases to identify the output, in one case to identify the input, and in one case to identify the intent. We suggest harmonizing on the former, and also consistently suffixing the output variable name with "_hash" when the purpose of the extraction is to produce a hash of the input. This would result in the following statements being updated:

info_hash = LabeledExtract(zero(Nh), "info_hash", info) // new label

psk_hash = LabeledExtract(zero(Nh), "psk_hash", psk) // new output name

secret = LabeledExtract(psk_hash, "secret", zz) // new input name and label
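
A minimal sketch of what the harmonized calls could look like, assuming LabeledExtract is HKDF-Extract over a prefix, the label, and the input. The `PREFIX` value and the exact concatenation layout are assumptions made for illustration; the draft's own encoding may differ.

```python
import hashlib
import hmac

HASH = hashlib.sha256
NH = HASH().digest_size
PREFIX = b"HPKE-sketch "  # hypothetical domain-separation prefix, not from the draft


def zero(n: int) -> bytes:
    return bytes(n)


def extract(salt: bytes, ikm: bytes) -> bytes:
    """HKDF-Extract (RFC 5869): PRK = HMAC-Hash(salt, IKM)."""
    return hmac.new(salt, ikm, HASH).digest()


def labeled_extract(salt: bytes, label: str, ikm: bytes) -> bytes:
    """Assumed layout: Extract(salt, PREFIX || label || ikm)."""
    return extract(salt, PREFIX + label.encode() + ikm)


# Placeholder inputs, for illustration only.
info, psk, zz = b"example info", b"example psk", b"\x01" * 32

info_hash = labeled_extract(zero(NH), "info_hash", info)
psk_hash = labeled_extract(zero(NH), "psk_hash", psk)
secret = labeled_extract(psk_hash, "secret", zz)
```

Because the label is part of the extracted input, changing a label such as "info" to "info_hash" changes every downstream value, so implementations and test vectors have to agree on it exactly.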

Caveat in test vectors

The plaintext is always the same, but the nonces and AADs differ by just one bit, which is hard to spot and easily missed.

Ambiguity about Secret Export

Section 5.3 explains secrets are exported with the KDF Expand function but the included code in the same section now calls LabeledExpand with a "sec" label.

The JSON test vectors contain sample results for the export function, but they match an unlabeled implementation with Expand and not LabeledExpand.
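
A sketch of why the two readings cannot both match the vectors. `expand` follows HKDF-Expand from RFC 5869; the layout inside `labeled_expand` (length prefix, `b"HPKE-sketch "` prefix) is an assumption for illustration and not the draft's definition. The secret and context values are placeholders.

```python
import hashlib
import hmac


def expand(prk: bytes, info: bytes, length: int, hash_fn=hashlib.sha256) -> bytes:
    """HKDF-Expand as defined in RFC 5869."""
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hash_fn).digest()
        okm += block
        counter += 1
    return okm[:length]


def labeled_expand(prk: bytes, label: str, info: bytes, length: int) -> bytes:
    """Assumed layout: the label is folded into the info string before Expand."""
    labeled_info = length.to_bytes(2, "big") + b"HPKE-sketch " + label.encode() + info
    return expand(prk, labeled_info, length)


exporter_secret = b"\x02" * 32       # placeholder
exporter_context = b"test context"   # placeholder

unlabeled = expand(exporter_secret, exporter_context, 32)
labeled = labeled_expand(exporter_secret, "sec", exporter_context, 32)
print(unlabeled != labeled)  # True
```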

Inconsistent naming of mode AuthPSK

Both the constant mode_psk_auth and function names like SetupAuthPSKR are used. That's inconsistent because the order of psk and auth is different. Suggestion: change the constant to be mode_auth_psk, because there are more function names that would need to be changed otherwise.

Unnecessary return value in Decap(), AuthDecap()

The spec definition of Decap() includes taking enc as a parameter, and returning it unmodified:

   def Decap(enc, skR):
     pkE = Unmarshal(enc)
     dh = DH(skR, pkE)

     pkRm = Marshal(pk(skR))
     kemContext = concat(enc, pkRm)

     zz = ExtractAndExpand(dh, kemContext)
     return zz, enc

Only return zz is needed. The same applies to AuthDecap.

Cite Shoup for identity misbinding prevention in 8.2

From https://mailarchive.ietf.org/arch/msg/cfrg/ZcTCJkilzCDshxsIj7MwKHNlNuM/

"avoid identity mis-binding issues": Perhaps also note that including the public key and the encapsulated key as inputs to key derivation can help with the security proof. [Shoup] makes this observation in Section 15.6.1.

[Shoup] @Article{shoup2001proposal,
title={A proposal for an ISO standard for public key encryption (version 2.1)},
author={Shoup, Victor},
journal={IACR e-Print Archive},
volume={112},
year={2001}
}

Limits on Inputs to LabeledExtract and LabeledExpand

I'm trying to remove all allocation from my implementation, and there's really only 1 snag I'm hitting: LabeledExtract and LabeledExpand do a concat operation before passing to their respective HKDF functions, and there isn't always an upper bound on the size of the concatenated result. Specifically, there's

KeySchedule(info, psk, pskID):
    LabeledExtract(..., info)
    LabeledExtract(..., psk)
    LabeledExtract(..., pskID)
Context.Export(exporter_context):
    LabeledExpand(..., exporter_context, ...)

If there were a (reasonably small) upper bound on the sizes of info, psk, pskID, and exporter_context, then it would be trivial to implement HPKE without allocation.

I've thought about "streaming" the input into the above functions, instead of sending a concatenated bytestring. This could theoretically work for HKDF-Extract with SHA256, since it's an MD hash, but this doesn't work generically. Also the definition of HKDF-Expand does not admit a way to stream in the info string.
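
For the Extract side, a sketch of the streaming idea, assuming the HMAC implementation exposes an incremental update interface (Python's `hmac` module does). It only illustrates that feeding the pieces one at a time gives the same result as concatenating first; it says nothing about HKDF-Expand or about KDFs without such an interface. The function names and sample inputs are hypothetical.

```python
import hashlib
import hmac


def extract_concat(salt: bytes, parts: list) -> bytes:
    """Extract over the concatenation of all parts (allocates the joined buffer)."""
    return hmac.new(salt, b"".join(parts), hashlib.sha256).digest()


def extract_streamed(salt: bytes, parts: list) -> bytes:
    """Extract by feeding each part to HMAC in turn, without joining them."""
    mac = hmac.new(salt, digestmod=hashlib.sha256)
    for part in parts:
        mac.update(part)
    return mac.digest()


parts = [b"prefix ", b"info_hash", b"an info string of arbitrary length"]
same = extract_concat(bytes(32), parts) == extract_streamed(bytes(32), parts)
print(same)  # True
```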

Clarify pseudocode and define undefined operands

From https://mailarchive.ietf.org/arch/msg/cfrg/ZcTCJkilzCDshxsIj7MwKHNlNuM/

Section 5.2.:

The symbol "<<" isn't defined, but assuming it means "shift left by a specified number of bits", the number of bits to shift should be "8*Nn" rather than "Nn".
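
The difference is easy to see numerically. This sketch assumes a 12-byte nonce (Nn = 12), matching the 12-byte nonces quoted in the test vectors; the variable names are illustrative.

```python
Nn = 12  # nonce length in bytes (assumed)

wrong_limit = 1 << Nn        # treats Nn as a number of bits
right_limit = 1 << (8 * Nn)  # number of distinct Nn-byte values

print(wrong_limit)             # 4096
print(right_limit == 2 ** 96)  # True
```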

Does "overflow" in the third paragraph refer to the same condition as "wrap" in the fifth paragraph? If so, the text should be combined and a single term used for consistency. If not, the differences between the two requirements should be explained. We would also suggest adding a note indicating that the reference code assumes the sequence number is the same length as the nonce.

The use of "Nonce" (capitalized) as a function and "nonce" (lower case) as a value may be confusing. We suggest instead that the function be named "ComputeNonce" or similar.

On a similar object-oriented programming note, it should be stated that the underlying "Seal" and "Open" functions are the ones determined by the "aead_id" property.

Guidance for future KEMs

From https://mailarchive.ietf.org/arch/msg/cfrg/ZcTCJkilzCDshxsIj7MwKHNlNuM/

As guidance for future revisions, we would recommend adding a section about the issues that need to be considered when adding support for other KEMs. There will presumably be industry interest in including post-quantum KEMs (as anticipated in Sec. 8.1), and there may also be interest in including RSA-based KEMs, for legacy support. The technical subtleties in adding such mechanisms include:

  • Assumptions about the relationship between the private key and the public key and the definition of the "pk()" function. For instance, GenerateKeyPair, listed as part of a KEM in Section 4, doesn't really need to be part of one (it's not part of RSA-KEM).

  • Assumptions about the length of the public key. It may not always be a fixed value, "Npk", for a KEM with a given set of parameters. The other (and unrelated) "hybrid" draft, draft-ietf-tls-hybrid-design, Section 3.2, makes accommodation for public keys associated with a given set of parameters to vary in size.

Consider static DH oracles

Do we need to be concerned about them? If receivers don't validate ephemeral keys (point on the curve, and in the right subgroup), what can go wrong?
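
For context, a sketch of the kind of check being asked about, written for P-256 only. The function name is hypothetical and this is not the draft's validation procedure. P-256 has cofactor 1, so for this curve an affine point that satisfies the curve equation is already in the prime-order group; other groups may need an additional subgroup check.

```python
# NIST P-256 domain parameters: y^2 = x^3 - 3x + B over GF(P).
P = 2**256 - 2**224 + 2**192 + 2**96 - 1
B = 0x5AC635D8AA3A93E7B3EBBD55769886BC651D06B0CC53B0F63BCE3C3E27D2604B
# Base point, used here only as a known-good example.
GX = 0x6B17D1F2E12C4247F8BCE6E563A440F277037D812DEB33A0F4A13945D898C296
GY = 0x4FE342E2FE1A7F9B8EE7EB4A7C0F9E162BCE33576B315ECECBB6406837BF51F5


def is_valid_p256_point(x: int, y: int) -> bool:
    """Check coordinate ranges and the curve equation for an affine point."""
    if not (0 <= x < P and 0 <= y < P):
        return False
    return (y * y - (x * x * x - 3 * x + B)) % P == 0


print(is_valid_p256_point(GX, GY))      # True
print(is_valid_p256_point(GX, GY + 1))  # False
```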

Contents of a context aren't well-defined

This is just an editorial comment. The spec doesn't seem to define the contents of a "context" very clearly. Section 5 says:

A "context" encodes the AEAD algorithm and key in use, and manages the nonces used so that the same nonce is not used with multiple plaintexts.

But it also has an exporter secret. Then section 5.1 says:

return Context(key, nonce, exporter_secret)

But we haven't defined the Context function yet. I'm guessing the intent is that Context produces some sort of record type with those field names? But that wouldn't initialize Context.seq used later in 5.2. (Confusingly, this Context is distinct from the context variable which contains an HPKEContext structure. Maybe the latter could be renamed?)

Then 5.2 lists out the contents of a "context" more explicitly, including the first mention of a sequence number. But it omits the exporter secret again. (Should "The sender's context MUST be used for encryption only. Similarly, the recipient's context MUST be used for decryption only." be rephrased? One could read that as saying export is also not okay.)

Then 5.3 mentions a context having an "exporter secret", but this is actually the only instance of that phrase in the document.
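
One possible reading of the intent, sketched as a record type. The field names follow the call quoted from Section 5.1 plus the sequence number from 5.2; the default of 0 for `seq` and the placeholder field lengths are assumptions, not something the spec states.

```python
from dataclasses import dataclass


@dataclass
class Context:
    key: bytes
    nonce: bytes
    exporter_secret: bytes
    seq: int = 0  # assumed initial value; not passed in the Section 5.1 call


# Mirrors "return Context(key, nonce, exporter_secret)" with placeholder values.
ctx = Context(bytes(16), bytes(12), bytes(32))
print(ctx.seq)  # 0
```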

Outdated reference

In "DH-Based KEM" the paragraph

The GenerateKeyPair, Marshal, and Unmarshal functions are the same as for the underlying DH group. The Marshal functions for the curves referenced in {#ciphersuites} are as follows:

references the #ciphersuites section that no longer seems to exist.

Clarify KEM shared secret for AuthEncap/Decap

  • Section 4: the definitions of AuthEncap and AuthDecap contain words to the
    effect, 'the KEM shared secret key is known only to the holder of the
    private key "skS".' It would be more accurate to say, 'the KEM shared
    secret key was generated by the holder of the private key "skS"'.

ExtractAndExpand input parameters

From Michael Scott:

A minor observation. In ExtractAndExpand the salt parameter is zero(Nh).

In fact this is the same as using zero(0), as HMAC internally pads this up to a blocksize of zeros.

So for example if using SHA512 and Nh=64, the hash blocksize is 128, and zero(0) gets padded up to 128 zeros, as does zero(64). In fact the parameter to zero(.) is irrelevant.
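
This can be checked with any HMAC implementation. The sketch below uses Python's `hmac` module with SHA-512; the `tag` helper and its input string are made up for illustration.

```python
import hashlib
import hmac


def tag(key_len: int) -> bytes:
    """HMAC-SHA512 of a fixed message under an all-zero key of the given length."""
    return hmac.new(bytes(key_len), b"example input", hashlib.sha512).digest()


block_size = hashlib.sha512().block_size  # 128

# All-zero keys up to the block size are zero-padded to the same block.
same = tag(0) == tag(64) == tag(block_size)
# A key longer than the block size is hashed first, so the result changes.
differs = tag(block_size + 1) != tag(0)

print(same, differs)  # True True
```

Note that for SHA-512 a 128-byte zero key is exactly one block, so it also produces the same output as the empty key.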

We might consider zero(2*Nh) or zero(0). What do you think, @blipp?
