sipa / bips
This project forked from bitcoin/bips
Bitcoin Improvement Proposals
Home Page: bitcoin.org
In the implementation of taproot_tweak_pubkey():
def taproot_tweak_pubkey(pubkey, h):
t = int_from_bytes(tagged_hash("TapTweak", pubkey + h))
if t >= SECP256K1_ORDER:
raise ValueError
Q = point_mul(point(pubkey), t)
return bytes_from_int(x(Q)), has_square_y(Q)
The tweaked pubkey is calculated as Q = point_mul(point(pubkey), t), which feels (at least to me) like point multiplication, and that would be Q = t * P.
But the Script Validation Rules state:
If Q ≠ P + int(t)G, fail.
This is the same as Q = P + (t * G).
These are two distinct operations that give distinct results, and this can be a source of confusion.
Assuming the protocol document is correct, I suggest changing the implementation to:
Q = point(pubkey) + point_mul(G, t)
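For illustration, here is a minimal affine secp256k1 sketch (not the reference code) showing that the validation rule's Q = P + t*G and the quoted code's Q = t*P are different points:

```python
# Minimal affine secp256k1 arithmetic, illustrating that Q = P + t*G
# (the validation rule) differs from Q = t*P (what the quoted
# point_mul(point(pubkey), t) computes). Toy scalars for readability.
p = 2**256 - 2**32 - 977
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)

def point_add(A, B):
    if A is None: return B
    if B is None: return A
    (x1, y1), (x2, y2) = A, B
    if x1 == x2 and (y1 + y2) % p == 0:
        return None                       # P + (-P) = point at infinity
    if A == B:
        lam = 3 * x1 * x1 * pow(2 * y1, p - 2, p)
    else:
        lam = (y2 - y1) * pow(x2 - x1, p - 2, p)
    lam %= p
    x3 = (lam * lam - x1 - x2) % p
    return (x3, (lam * (x1 - x3) - y1) % p)

def point_mul(A, k):
    R = None
    while k:
        if k & 1:
            R = point_add(R, A)
        A = point_add(A, A)
        k >>= 1
    return R

P = point_mul(G, 5)                       # a toy internal pubkey, d = 5
t = 11                                    # a toy tweak
Q_rule = point_add(P, point_mul(G, t))    # Q = P + t*G, per the validation rule
Q_quoted = point_mul(P, t)                # Q = t*P, per the quoted code
assert Q_rule == point_mul(G, 5 + t)      # (d + t)*G
assert Q_quoted == point_mul(G, 5 * t)    # (d * t)*G -- a different point
assert Q_rule != Q_quoted
```

The group-law identity makes the difference explicit: tweaking is scalar addition in the exponent, (d + t)*G, not scalar multiplication, (d * t)*G.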
Splitting discussions out. This one is about the performance impact on signing without precomputed public key data.
With the squareness tie-breaker, signing needs to compute privkey*G, compute its affine X coordinate (= computing inv(Z), squaring it, and multiplying with X), compute the squaredness of its Y coordinate (= computing Y*Z, and the Jacobi symbol of it), normalize X, and then conditionally negate the privkey.
With the oddness tie-breaker, signing needs to compute privkey*G, compute its affine X coordinate (= computing inv(Z), squaring it, and multiplying with X), compute the affine Y coordinate (= computing 1/Z^3, and multiplying with Y), normalize X, normalize Y, and then conditionally negate the privkey.
The current PR does one unnecessary multiplication, as it needlessly computes the affine Y coordinate. After fixing that, the speed gain from going from the square to the even tie-breaker is trading 1 field multiplication and normalization for 1 Jacobi computation - which is dominated by the Jacobi symbol.
I don't know if we care about this ~5us speedup at signing time (because when you do care, you perhaps want signing with precomputed public key data, which avoids this cost in both cases), but there seems to be basically no gain apart from consistency with R. That case is arguably very different, as handling of R is generally limited in scope to the signing code, while public keys are obviously exposed to applications (and things like the taproot tweak break their perfect black-boxiness).
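The cost difference comes down to the final step on a Jacobian point (X, Y, Z), where affine x = X/Z^2 and y = Y/Z^3: squareness of y can be read off the Jacobi symbol of Y*Z (the Z^2 factor is always a square), while oddness requires fully normalizing Y. A toy-field sketch (field size and coordinates are illustrative only):

```python
# Sketch of the two tie-breaker checks on a Jacobian point (X, Y, Z),
# where affine x = X/Z^2 and y = Y/Z^3. Toy prime field, illustrative only.
q = 10007  # small prime with q % 4 == 3, standing in for the secp256k1 field

def jacobi(a):
    return pow(a, (q - 1) // 2, q)  # Euler's criterion: 1 if QR, q-1 if not

def y_is_square(X, Y, Z):
    # y = Y/Z^3 is a square iff Y*Z^3 is, iff Y*Z is (Z^2 is always a square):
    # one field multiplication plus one Jacobi computation, no inversion.
    return jacobi((Y * Z) % q) == 1

def y_is_even(X, Y, Z):
    # needs the fully normalized affine Y: inv(Z), then 1/Z^3, then an
    # extra field multiplication with Y
    zi = pow(Z, q - 2, q)
    return ((Y * zi * zi * zi) % q) % 2 == 0

# lift an affine pair (x, y) to Jacobian form with an arbitrary Z
x, y, Z = 5, 30, 7
X, Y = (x * Z * Z) % q, (y * Z * Z * Z) % q
assert y_is_square(X, Y, Z) == (jacobi(y) == 1)
assert y_is_even(X, Y, Z) == (y % 2 == 0)
```

Both checks agree with the affine point regardless of the Z used for the lift, which is exactly why the squareness check can skip the inversion that the oddness check needs.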
Originally posted by @sipa in bitcoin-core/secp256k1#558
The note [7] seems inappropriate after the reference to Euler's criterion. It seems better to:
Regarding TapTree construction based on tapscript probability weights (e.g. Huffman):
There seems to be a privacy trade-off when optimizing for expected script-path spending cost: standard outputs will likely feature standard probability weightings, especially in protocols such as Lightning. Spending a script branch will reveal its depth in the tree and may imply the contract type.
If the authors agree, is this worth mentioning in the taproot BIP?
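For concreteness, the kind of construction being discussed can be sketched as a standard Huffman build over (weight, script) pairs; the tree shape, and hence each leaf's depth (which a script-path spend reveals via its merkle path length), follows directly from the weights. This is an illustration, not BIP text:

```python
# Sketch of Huffman-style TapTree construction from (weight, script) pairs.
# Higher weight = more likely to be spent; frequent scripts end up shallower.
import heapq
from itertools import count

def huffman_taptree(weighted_scripts):
    # returns a nested tuple tree built by repeatedly merging lightest nodes
    tiebreak = count()
    heap = [(w, next(tiebreak), s) for w, s in weighted_scripts]
    heapq.heapify(heap)
    while len(heap) > 1:
        w1, _, a = heapq.heappop(heap)
        w2, _, b = heapq.heappop(heap)
        heapq.heappush(heap, (w1 + w2, next(tiebreak), (a, b)))
    return heap[0][2]

def depth_of(tree, leaf, d=0):
    # depth of a leaf = length of its merkle path in the TapTree
    if tree == leaf:
        return d
    if isinstance(tree, tuple):
        for child in tree:
            r = depth_of(child, leaf, d + 1)
            if r is not None:
                return r
    return None

tree = huffman_taptree([(50, "likely"), (30, "sometimes"), (1, "rare")])
assert depth_of(tree, "likely") <= depth_of(tree, "rare")
```

The privacy observation follows: if most wallets use the same standard weightings, revealing a leaf's depth narrows down which branch of which standard contract was spent.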
For public keys pk_1, ..., pk_u, messages m_1, ..., m_u, and signatures sig_1, ..., sig_u, the probability that BatchVerify(pk_1, ..., pk_u, m_1, ..., m_u, sig_1, ..., sig_u) with 128-bit uniform randomizers succeeds while there exists an i in [1, u] such that Verify(pk_i, m_i, sig_i) fails is less than 2^-128.
This speeds up batch verification in libsecp by up to about 9%. If people agree that this is a good idea, I'll open a PR upstream.
For i in [1, u] let
P_i = lift_x(int(pk_i)),
r_i = int(sig_i[0:32]),
R_i = lift_x(r_i),
s_i = int(sig_i[32:64]).
If there exists an i such that lift_x for P_i or R_i fails, or r_i >= p, or s_i >= n, then both Verify(pk_i, m_i, sig_i) and BatchVerify(pk_1, ..., pk_u, m_1, ..., m_u, sig_1, ..., sig_u) fail.
Otherwise Verify(pk_i, m_i, sig_i) fails if and only if C_i := s_i*G - R_i - e_i*P_i != 0. We let c_i be the discrete logarithm of C_i with respect to a fixed group generator and define the polynomial f_u(a_2, ..., a_u) = c_1 + a_2*c_2 + ... + a_u*c_u. BatchVerify succeeds if and only if f_u evaluated on uniform randomizers a_2, ..., a_u is 0. Assume there exists i in [1, u] such that Verify(pk_i, m_i, sig_i) fails; then f_u is not the zero polynomial. We can make the following inductive argument (similar to the Schwartz-Zippel lemma with d = 1, a_1 = 1):
u = 1: Pr(f_1() = c_1 = 0) = 0
By assumption.
u > 1: Pr(f_u(a_2, ..., a_u) = 0) <= 2^-128
We can write the polynomial as f_u(a_2, ..., a_u) = a_u*c_u +
f_{u-1}(a_2, ..., a_{u-1}). If c_u != 0 then there is at most a single
value for a_u such that f_u(a_2, ..., a_u) = 0. Otherwise, c_u = 0,
f_{u-1} is a non-zero polynomial and we can apply the induction
hypothesis.
qed
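The argument can be illustrated with a toy version of the batch check, replacing the group elements C_i by their discrete logs c_i and working modulo a small prime (the randomizers here are ~31-bit rather than 128-bit, so the failure bound is correspondingly weaker):

```python
# Toy illustration of the induction: batch-check c_1 + a_2*c_2 + ... + a_u*c_u
# over Z_n with uniform randomizers a_i (a_1 fixed to 1). The prime n stands
# in for the secp256k1 group order; c_i = 0 means Verify passes for index i.
import secrets

n = (1 << 31) - 1  # small Mersenne prime stand-in for the group order

def batch_check(c):
    a = [1] + [secrets.randbelow(n) for _ in c[1:]]
    return sum(ai * ci for ai, ci in zip(a, c)) % n == 0

all_valid = [0, 0, 0]       # every individual Verify passes: c_i = 0
one_invalid = [0, 5, 0]     # Verify fails for i = 2: c_2 != 0
assert batch_check(all_valid)
# passing would require a_2*5 = 0 mod n, i.e. a_2 = 0: probability 1/n
assert not batch_check(one_invalid)
```

This mirrors the inductive step: once some c_i is nonzero, at most one value of the corresponding randomizer makes the sum vanish, giving the 2^-128 bound with 128-bit randomizers.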
Ping @sipa @real-or-random
Review comment by @gmaxwell:
Even without MuSig, because the signature hash commits to the tapleaf hash, signers must commit up front to the exact subset that will be signing. That means you can't just hand a signature to the other signers without caring about who will complete it. A workaround is providing signatures for all matching leaves, but this may be unreasonable for large multisig.
As far as I can see, the taproot BIP currently lacks any discussion of why the taproot construction is secure. There's @apoelstra's security proof, which we could link to. The paper itself could be expanded (introduction) and clarified a bit. I have trouble figuring out which properties are proven and which are missing every time I look at it.
ECDSA as currently used is already non-malleable. See also
https://nbn-resolving.de/urn:nbn:de:hbz:294-60803
For the nonce generation step of the signing algorithm, I see:
Let rand = hashBIPSchnorrDerive(bytes(d) || m).
Wouldn't the proper primitive to use be a PRF such as HMAC-SHA-256?
Also, shortly after, in the Synthetic nonces section,
rand = hashBIPSchnorrDerive(bytes(d) || m || get_32_bytes_from_rng())
appears like it should be a PRF / HMAC-SHA-256 invocation, with get_32_bytes_from_rng() as the key to the HMAC invocation.
The resource linked from the BIP, fault injection attacks and side-channel attacks, says: "With synthetic nonces, nonce derivation is a PRF keyed by the random input Z."
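To make the comparison concrete, here is a sketch of the two shapes side by side. The tag string "BIPSchnorrDerive" and the treatment of hashBIPSchnorrDerive as a tagged SHA256 are assumptions based on the draft's conventions, and the HMAC variant is the suggestion above, not the BIP:

```python
# Sketch contrasting the draft's tagged-hash nonce derivation with the
# suggested HMAC-SHA-256 formulation keyed by fresh randomness Z.
# Tag string and names are assumptions for illustration, not BIP text.
import hashlib, hmac, os

def tagged_hash(tag: str, msg: bytes) -> bytes:
    th = hashlib.sha256(tag.encode()).digest()
    return hashlib.sha256(th + th + msg).digest()

d_bytes, m = os.urandom(32), os.urandom(32)

# draft-style derivation: rand = hash_tag(bytes(d) || m)
rand_tagged = tagged_hash("BIPSchnorrDerive", d_bytes + m)

# suggested PRF shape: HMAC keyed by the fresh randomness (synthetic nonce)
Z = os.urandom(32)
rand_hmac = hmac.new(Z, d_bytes + m, hashlib.sha256).digest()

assert len(rand_tagged) == len(rand_hmac) == 32
```

The linked resource's framing ("a PRF keyed by the random input Z") corresponds to the second shape, while the draft reaches a similar effect by feeding Z into the tagged hash alongside the secret key.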
The image under "Constructing and spending Taproot outputs" is unfortunately(!) broken.
It is probably a GitHub issue, as it initially shows up as an SSL certificate error. But even doing wget https://raw.githubusercontent.com/sipa/bips/bip-schnorr/bip-taproot/tree.png --no-check-certificate gives a 404 error.
Maybe it can be fixed with a re-upload?
@roconnor brought this up, but I'm not sure what exactly his issue was (perhaps he wants to comment here). Our current paragraph on this is quite general and therefore not too bad imo. Maybe we can make the following improvements:
Review comment by @gmaxwell.
I believe the reason is that turning them into successes would complicate some forms of debugging. Script disassemblers will likely continue to disassemble these opcodes as CHECKMULTISIG. Furthermore, unconditional successes may make debugging harder, as things in manual tests may appear to continue working as expected.
Split out discussion. This one is about avoiding the EC multiplication by using precomputed pubkey data.
It seems one question is whether we need to protect against the situation where the public key (or data derived from it) passed to the signing function is actually computed by the signer itself, or is untrusted.
In the case of signing devices with a fixed master BIP32 key, the public key is likely going to be computed by the device itself at signing time anyway, so it's wasteful to have the signing algorithm redo it.
On the other hand, @real-or-random and @jonasnick correctly point out that if the public key data provided to the signing algorithm is untrusted, this public key data must also be an input to the derivation function. I don't think we have a proof that even with this modified derivation function, the resulting signature scheme is secure.
So I think that's an unfortunate situation to be in. We shouldn't standardize an (alternative) signing algorithm that takes untrusted pubkey data as input (because we have no proof it is secure), but simultaneously, having an API that takes in trusted pubkey data is scary (because if the API is abused with untrusted pubkey data, we have a known attack).
FWIW, the ed25519-donna implementation seems to happily take the public key as input to the signer. I wonder if the concern of an untrusted public key has been analysed in that context.
Originally posted by @sipa in bitcoin-core/secp256k1#558
@sipa points out that we should cite these:
https://eprint.iacr.org/2020/1244
https://hdevalence.ca/blog/2020-10-04-its-25519am (cc @hdevalence)
Review comment by @gmaxwell: instead of describing the tapscript signature message as a "patch" of the taproot one, describe a common "base signature message" in bip-taproot (in a separate section), and then refer to it in both bip-taproot and bip-tapscript sighashing.
My idea to do this would be to define a function that takes as input the flags value (without annex or not), and returns a byte array. In bip-taproot sighashing is then described as hash_...(base_message(0)) and in bip-tapscript as hash_...(base_message(1) || ... the extra tapscript specific fields).
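A minimal structural sketch of that refactoring; all field contents below are placeholders, not the actual sighash fields:

```python
# Hypothetical sketch of the "base signature message" refactoring. The tag
# "TapSighash" follows the draft's tagged-hash convention; the message field
# contents are invented placeholders, not the BIP's actual sighash fields.
import hashlib

def tagged_hash(tag: str, msg: bytes) -> bytes:
    th = hashlib.sha256(tag.encode()).digest()
    return hashlib.sha256(th + th + msg).digest()

def base_message(ext_flag: int, tx_data: bytes = b"(common sighash fields)") -> bytes:
    # ext_flag = 0: plain taproot spend; 1: tapscript spend
    return bytes([ext_flag]) + tx_data

def taproot_sighash() -> bytes:
    # bip-taproot: hash of the base message alone
    return tagged_hash("TapSighash", base_message(0))

def tapscript_sighash(extra: bytes = b"(tapscript-specific fields)") -> bytes:
    # bip-tapscript: base message plus the extra tapscript-specific fields
    return tagged_hash("TapSighash", base_message(1) + extra)
```

The point of the structure is that both documents reference one shared definition instead of bip-tapscript describing a "patch" of the bip-taproot message.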
Review comment by @gmaxwell.
The text is clear about it having precedence over the 520 byte limit, but not that it also overrides the 1000 stack element limit and other things.
Review comment by @gmaxwell.
Also mention GGM+preimage resistance based proof.
See https://eprint.iacr.org/2017/985.pdf, which describes an attack against Ed25519 by doing power analysis on the nonce generation function.
It relies on having a single SHA512 message block that contains both attacker-controlled data, and secret data that the attacker wishes to learn. We use SHA256 instead of SHA512, but it is sufficiently similar that we should perhaps at least consider if the same kind of attack could apply.
With #194, our nonce generation function would effectively become H(priv || pub || msg [|| randomness]). All the secret data goes into the first compression, and all attacker-controlled data goes into the second compression, so I think the attack does not apply directly already, but maybe related attacks do. I'll email the authors of the paper about this.
The paper however suggests having at least 128 bits of randomness (synthetic nonce!) in the first hash block. I wonder if that means we should move the randomness to right after the priv, i.e. H(priv || randomness || pub || msg). That means that if the randomness has at least 128 bits of entropy, we've effectively implemented their countermeasure, but at the cost of losing the ability to precompute the midstate after H(priv || pubkey || ...).
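The midstate precomputation at stake can be sketched with hashlib, whose copy() lets one cache the state after the first compression block; with H(priv || pub || msg), the 64-byte (priv || pub) prefix fills exactly one SHA256 block:

```python
# Sketch of the midstate precomputation: with H(priv || pub || msg), the
# 64-byte (priv || pub) prefix fills exactly one SHA256 compression block,
# so its state can be computed once and reused across messages.
# hashlib's copy() emulates caching that midstate.
import hashlib, os

priv, pub = os.urandom(32), os.urandom(32)

prefix = hashlib.sha256()
prefix.update(priv + pub)    # 64 bytes = one full compression block

def nonce_hash(msg: bytes) -> bytes:
    h = prefix.copy()        # reuse the cached midstate
    h.update(msg)
    return h.digest()

msg = os.urandom(32)
assert nonce_hash(msg) == hashlib.sha256(priv + pub + msg).digest()
# Inserting 32 bytes of randomness between priv and pub pushes pub into the
# second block, so the reusable prefix would only cover priv || randomness,
# which changes per signature when the randomness is fresh.
```

This is what "losing the ability to precompute the midstate" means in practice: the cacheable prefix must be constant across signatures, and fresh per-signature randomness in the first block breaks that.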
Anyone have any thoughts?
@trevp proposes "synthetic nonces" [1], which means generating the nonce from deterministically derived "randomness" combined with real randomness. This has hardly any disadvantages over deterministic nonces, but comes with the benefit of protection against some kinds of fault injection attacks: e.g., if an attacker can inject a fault into the computation of the main signature hash, you can end up leaking the private key by signing the same message twice (same deterministic nonce, different hash). Synthetic nonces provide some other minor benefits too; see Trevor's email for details.
[1] https://moderncrypto.org/mail-archive/curves/2017/000925.html (see in particular the section on "nonce randomization")
[2] https://research.kudelskisecurity.com/2017/10/04/defeating-eddsa-with-faults/
I don't understand what "digest computation avoids unnecessary hashing as opposed to BIP143 digests in which parts may be set zero and before hashing them" means. The last part of the sentence seems to have a syntactical error.
Usually EC signature test vectors also contain the ephemeral key (k), so that developers can use the tests no matter how they compute k (e.g. randomly, or using RFC 6979 instead of a tagged hash).
Context: bitcoin-core/secp256k1#558 (comment)
Right now, BIP340 specifies that inputs are strictly 32-byte messages. This implies that for typical use cases the message needs to be pre-hashed, and the GGM/rpp/rpsp proof doesn't exactly apply (collision resistance is required too). This is fine for our intended use case, as the message contains unavoidable hashes anyway (the signature data contains the txid of outputs being spent), and thus the security is already indirectly based on collision resistance.
Apart from that, there are few technical reasons not to support variable-length messages instead (which in some cases might not have this problem), as the message is always the last input to a hash function anyway.
So we should either:
Discussion group 6 largely agreed with these notes I'd made about this section of the BIP. I would PR, but I don't think that's appropriate, since it's kind of central to what the author(s) are writing:
Do the goals make sense?
I don't like that the Motivation section of the BIP talks about the motivation for not adding other goals. It should just state the goal of the BIP instead. This is context which many fresh readers will not have, and it's unreasonable to expect them to read everything about graftroot in order to understand the motivation of the taproot BIP.
The actual goal of this BIP, afaict, appears only in the first line of the "Design" section:
This proposal focuses on improvements to privacy, efficiency, and flexibility of Bitcoin's smart contracts, subject to two restrictions:
Not adding any new strong security assumptions
Not combining into the proposal any functionality which could be simply implemented independently.
The following "Specifically" section could additionally mention the support for much larger scripts than previously possible (the other huge change, along with the mentioned privacy changes).
The following bullet points I like, they clarify at a high level each of the functional effects of the new design.
Another point on the Design section: it would surely make sense to write the approximate structure of the scriptPubKey (Q = P + H(P||S)G) here for those who want to understand it in broad terms, instead of requiring them to read the detailed algorithm in "Script Validation Rules". Perhaps debatable, but it seems that way to me.
I find the paragraph about key prefixing and BIP32 slightly confusing. It should separate these two concerns more clearly:
Review comment by @gmaxwell
It's probably better to specify these as a diff against bip-taproot (and then point out that this informally means changes XYZ compared to P2WSH).
After we have switched to recommending synthetic randomness for nonces, I propose to do the same for the randomness used in batch verification. Currently we derive the randomness deterministically from the inputs to the batch verification algorithm.
Mixing in real randomness avoids the result of the batch verification algorithm being entirely predictable to an attacker. There will be sets of signatures that contain invalid signatures but still pass batch verification. If our hash function is good, these can be found only with negligible probability, so this is not an issue per se. But adding real randomness provides defense in depth and makes this attack impossible. It also prevents attempts to exploit such an attack to split the chain between one set of nodes with some PRG X and another set of nodes with some different PRG Y.
Moreover, when I say these inputs can be found only with negligible probability, this relies on an argument that is nowhere written up properly. It appears to be true, and I think I have a simple proof, but as far as I know we're the first to specify such an algorithm. Independently of this, we should add a proof sketch in a footnote (I can do this). But even with this argument written up, adding real randomness is defense in depth, and brings us back to "standard" batch verification, where the coefficients are random and which has been looked at often enough.
I don't think any of these points is a must-have. But given that adding real randomness is so cheap (and we can make it optional), I think we should do it. It's also easy to write in the BIP, because we already have some text for nonces and we can mostly refer to that.
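A sketch of what "derive the coefficients from the batch inputs plus fresh randomness" could look like; the tag string and the derivation shape are illustrative assumptions, not proposed BIP text:

```python
# Sketch of deriving 128-bit batch randomizers from both the batch inputs
# and a fresh random seed. Tag string and structure are assumptions for
# illustration only.
import hashlib, os

def tagged_hash(tag: str, msg: bytes) -> bytes:
    th = hashlib.sha256(tag.encode()).digest()
    return hashlib.sha256(th + th + msg).digest()

def batch_randomizers(batch_inputs: bytes, u: int, seed: bytes) -> list:
    # a_1 is fixed to 1; a_2..a_u are derived from inputs || fresh seed || i
    rands = [1]
    for i in range(2, u + 1):
        h = tagged_hash("BatchRand", batch_inputs + seed + i.to_bytes(4, "big"))
        rands.append(int.from_bytes(h[:16], "big"))  # 128-bit randomizers
    return rands

inputs = b"(pk_i || m_i || sig_i for the whole batch)"
a = batch_randomizers(inputs, 4, os.urandom(32))
assert a[0] == 1 and len(a) == 4 and all(0 <= x < 2**128 for x in a[1:])
```

Keeping the batch inputs in the derivation preserves the deterministic fallback (seed fixed to a constant), while a fresh seed makes the coefficients unpredictable to an attacker who chooses the signatures.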
From footnote 9:
Given a candidate x, the valid Y coordinates are the square roots of c = x^3 + 7 mod p, and they can be computed as y = ±c^((p+1)/4) mod p (see Quadratic residue) if they exist, which can be checked by squaring and comparing with c. Due to Euler's criterion it then holds that c^((p-1)/2) = 1 mod p. The same criterion applied to y results in y^((p-1)/2) mod p = ±c^(((p+1)/4)((p-1)/2)) mod p = ±1 mod p. Therefore y = +c^((p+1)/4) mod p is a quadratic residue and -y mod p is not.
This part of the equation, y^((p-1)/2) mod p = ±c^(((p+1)/4)((p-1)/2)) mod p, is not correct for the minus sign.
For the strict proof:
My higher education was not in English, so I kindly ask anyone with good academic English to make a PR.
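In the meantime, the footnote's conclusion can at least be checked numerically for the secp256k1 field prime (a sanity check, not the requested strict proof):

```python
# Numeric sanity check of the footnote's conclusion for the secp256k1 field
# prime p. Since p = 3 mod 4 and (p+1)/4 is even, y = c^((p+1)/4) is an even
# power of c and hence a quadratic residue, while -y is not.
p = 2**256 - 2**32 - 977

x = 1                               # an x for which x^3 + 7 is a residue
c = (pow(x, 3, p) + 7) % p
y = pow(c, (p + 1) // 4, p)
assert (y * y) % p == c                        # y is indeed a square root of c
assert pow(y, (p - 1) // 2, p) == 1            # +y is a quadratic residue
assert pow(p - y, (p - 1) // 2, p) == p - 1    # -y is not (p = 3 mod 4)
```

The checks use Euler's criterion directly via modular exponentiation, so they exercise exactly the identities the footnote states.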
It is known that, with a very small probability, public key tweaking may fail, because adding some tweaking factor multiplied by the generator to an existing public key may result in the special elliptic curve point at infinity. This is already covered by the possibility for libsecp256k1 to return a failure (non-1 value) from the secp256k1_ec_pubkey_tweak_add function: https://github.com/bitcoin-core/secp256k1/blob/e541a90ef6461007d9c6a74b9f9a7fb8aa34aaa8/src/secp256k1.c#L596. The function may also theoretically fail because the tweaking factor exceeds the curve order; however, according to https://github.com/bitcoin-core/secp256k1/blob/137d304a6bf419bbbafd70d1dbdb2162354d71b7/src/scalar_low_impl.h#L58, this case is always ignored.
Anyway, if the function may deterministically fail for a given tweak (i.e. the hash of some input data, added times G to the intermediate public key), the specification should account for the possible reaction to this case. If you're OK with it, I can work on a PR covering this.
Review comment by @gmaxwell.
Some of the text can be misinterpreted to mean that empty signatures cause the script to abort.
(On mobile currently; I'll edit later to show the exact section.)
Review comment by @gmaxwell.
In #116 (comment) @gmaxwell brought up a new argument against using is_positive for points with a square Y coordinate. Alternative suggestions are is_square and has_square_y. I'm opening an issue because this would affect a few different files:
Also, I noticed that bip-taproot.md hasn't been updated in a while and still refers to quadratic residues and an is_quad() function which isn't defined.
I've warmed up to the idea of using has_square_y for code and for the BIP. In general I still like the positive/negative terminology because it's way easier to say in English than "a point whose Y coordinate is a square" (which we use four times in bip-schnorr and once in bip-taproot). But perhaps that's just something that doesn't need to be defined in the BIP.
According to the scripts in the section https://github.com/sipa/bips/blob/bip-schnorr/bip-taproot.mediawiki#constructing-and-spending-taproot-outputs, a simple SHA256 hash is used over the concatenated internal public key and TapBranch merkle root, which (as I assume) prevents multiple script commitments. However, a simple hash of a concatenated public key and a tweaking factor (message) may be subject to a length extension attack. So I propose using, instead of a plain SHA256 digest, HMAC-SHA256, which has no risk of any known attacks on the hash value, as was proposed in your sidechains paper https://blockstream.com/sidechains.pdf on page 18.
We addressed a similar issue during our work on the RGB project and have developed a spec/standard for collision-resistant tagged key tweaking: https://github.com/LNP-BP/lnpbps/blob/master/lnpbp-0001.md. Maybe it is worth merging some parts of it into the proposed BIPs?
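For reference, the tagged-hash construction used in the newer BIP drafts prefixes the message with two copies of SHA256(tag); here it is next to the HMAC-SHA256 shape proposed above (reusing the "TapTweak" tag string as the HMAC key is purely illustrative):

```python
# The tagged-hash construction from the BIP drafts, next to the proposed
# HMAC-SHA256 alternative. Both bind the digest to a specific context.
import hashlib, hmac

def tagged_hash(tag: str, msg: bytes) -> bytes:
    # BIP draft style: a 64-byte prefix SHA256(tag) || SHA256(tag), then msg
    th = hashlib.sha256(tag.encode()).digest()
    return hashlib.sha256(th + th + msg).digest()

def hmac_tweak(tag: str, msg: bytes) -> bytes:
    # the proposal above: HMAC-SHA256 keyed by the tag/context (illustrative)
    return hmac.new(tag.encode(), msg, hashlib.sha256).digest()

pubkey, merkle_root = b"\x02" * 32, b"\x03" * 32
t1 = tagged_hash("TapTweak", pubkey + merkle_root)
t2 = hmac_tweak("TapTweak", pubkey + merkle_root)
assert len(t1) == len(t2) == 32
```

Note that the tweak output is used as a scalar added to the key rather than published as a MAC, which is part of why the drafts settled on a tagged hash rather than HMAC; the tag prefix still domain-separates the hash from other uses of SHA256.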
Traditional implementations of ECDSA do not guarantee that any signature accepted by one arguably correct implementation will be accepted by every other arguably correct implementation. This can be fixed by adding additional requirements, as Bitcoin does with LowS.
This property adds extra considerations to BIP-schnorr, since it also seeks to make sure batch verification returns results consistent with scalar verification.
It might be worth mentioning that this was an explicit design criterion (and if someone found an exception, it would be a serious flaw in the proposal).
Edit: to be clear, it should say something like "Every input string is consistently rejected or consistently accepted by all conforming implementations, regardless of how the input string was constructed."