Coder Social home page Coder Social logo

Comments (11)

bleichenbacher-daniel avatar bleichenbacher-daniel commented on June 2, 2024 1

I looked that this issue just briefly. It seems that public key recovery has its own set of potential implementation issues, so that it would make sense to have its own test and test vector format. For example it is possible to generate invalid signatures with the property that public key recovery leads to a public key with a point at infinity. Obviously an implementation should properly handle such situations.

Generating such a set of test vectors wouldn't be the main issue here. The first step would be to determine a test vector format that has sufficiently many use cases. There are a number of somewhat different protocols that all require public key recovery and hence it would be useful to have good coverage. Another question is finding a good place for a potentially new set of test vectors. I.e., it would be useful to have test vectors for specialized protocols all in the same place.

The JCE interface does indeed not expose underlying EC arithmetic. BouncyCastle has some direct support, but having to rely on specific providers is of course not great. Test vector generation for the current test vectors does indeed try to generate a lot of edge cases. Hence the test vectors contain cases where recover ID 2 or 3 would be necessary.

from wycheproof.

bleichenbacher-daniel avatar bleichenbacher-daniel commented on June 2, 2024 1

I had some time to look into public key recovery and have written some initial code to generate test vector. Since ECDSA verification through public key recovery use a different algorithm than ordinary ECDSA verification there are different kinds of edge cases that should be checked.

One thing that needs to be done here is to define ECDSA variants, i.e., determine the following properties / parameters:

  • the curve
  • the hash function for hashing the message: i.e. SHA-256, SHA3-256, Keccak256 etc. Some protocols have additional formattings, such as constant prefixes to the message that is being hashed. I would consider such things to be a property of the protocol and not a property of the ECDSA variant.
  • the formatting of the signature: I would assume that the 65-byte vrs format is quite common. Not sure, if there are other formats.
  • valid range for v. Here, I'm assuming that the network id is a parameter of the algorithm, and that a key used for one network is never used for another network.
  • the format of the public keys, i.e., compressed, uncompressed points or other formats.

Concrete specifications (i.e. something resembling a standard) for ECDSA variants would be really helpful. I'm finding a number of small contradictions what is valid and it is often not clear if this is because of distinct protocols or misunderstandings.
Also, since Wycheproof is in limbo it might make sense to find better ways to distribute and test new test vectors.

from wycheproof.

webmaster128 avatar webmaster128 commented on June 2, 2024

Having dedicated tests would be ideal indeed. In the meantime I tried generating the recover IDs but did not find an implementationation that is flexible enough which I can run with reasonable setup effort.

The use case I'm most familiar with is Ethereum (which I guess is mostly the same as Ethereum). There you only require recover ID 0 and 1 since other IDs are considered invalid. It would be great to not run everything through DER but have signature = (r, s), message_hash and recover_id as inputs.

Personally I'm interested to use them for ecdsa/secp256{k,r}1. But of course, the more use cases can be covered, the better.

from wycheproof.

webmaster128 avatar webmaster128 commented on June 2, 2024

I would assume that the 65-byte vrs format is quite common [...] valid range for v.

The v variable is Ethereum specific and combines a network ID and a recovery ID in a single value: v = chain_id * 2 + 35 + recovery_id. Using a recovery ID of range 0, 1, 2, 3 is a generic representation.

from wycheproof.

bleichenbacher-daniel avatar bleichenbacher-daniel commented on June 2, 2024

I'm putting some experimental files here: https://github.com/bleichenbacher-daniel/Rooterberg/tree/main/ecdsa
Please note that these files are just experimental. The main goal is to simply determine the format of the test vectors.
There are essentially two options: the first one is to have test vectors with raw values, r,s and recovery_id. The second one is to have have specific encodings (such as RLP encoded message and signatures) for specific protocols. I don't have an opinion which one is more useful.
The test vector generation is nowhere near complete. The current version focuses on arithmetic errors that could occur during public key reconstruction. The construction of these test vectors depends on how EC points are represented internally. I currently use either Jacobian or affine coordinates during the construction. If there are other commonly used representations then I'll add them too.
I test my general ECDSA implementation against third party libraries and against itself, but I don't have an implementation for the Ethereum or bitcoin specific modifications. Hence it is possible that everything is completely wrong.

from wycheproof.

webmaster128 avatar webmaster128 commented on June 2, 2024

That's great stuff, thanks a lot for the effort!

For my purpose the split r, s, revovery_id would be ideal. IMO the protocol specifics can be done by the user of the test vectors.

My experience working with wycheproof was a bit painful as I had to write a custom DER encoder with all sorts of edge case handling just to use the test vectors but do not need DER anywhere in my project.

The expected pubkey could be included in compressed and uncompressed form maybe? Not too hard to do this in the caller but if you include both that would make it more convenient. Ethereum users will need the uncompressed pubkey (65 bytes), my API needs the compressed representation (33 bytes).

from wycheproof.

bleichenbacher-daniel avatar bleichenbacher-daniel commented on June 2, 2024

I did push some new test vectors with a new format. These are still experimental.

I agree that tester should not have to parse DER encoding if their library is not using DER encoded signatures. Currently, there are test vectors that use P1363 encoding, which are easier to parse, but it would still make sense to add additional encodings. For standards that are well defined it would make sense to add additional encoding as well. Figuring out a convenient format is one of the goals here, I think.

from wycheproof.

chipshort avatar chipshort commented on June 2, 2024

I implemented a prototype of some tests using these in here.
One small annoyance is that I currently have to check the comment for "(s not in range 1 .. n//2)" because all of them are marked as invalid, but we actually count high-s signatures as valid. Would be nice to add a flag for that.

from wycheproof.

bleichenbacher-daniel avatar bleichenbacher-daniel commented on June 2, 2024

Thanks a lot for the feedback.

One of the open issues is indeed to collect information about different variant of ECDSA in use, so that it is possible to generate test vector files for given use cases. If there is a standard, RFC, BIP, etc. that describes which signatures are valid then it should be possible to generate test vector files for that given document or application.

The test vectors do have a header describing the assumptions for the ECDSA test case:
E.g., here is one.

"testType": "EcdsaVerify",
"algorithm": {
"type": "Ecdsa",
"curve": "secp256k1",
"sha": "KECCAK256",
"encoding": "RAW",
"normalize": true,
"signature_generation": "Generic"
},

First testType describes the type of the test, which is verifying signatures here. I'll add other test vectors for deterministic signing (e.g with RFC 6979).
The algorithm itself is described below. Besides primitive and curve, other parameters are fixed too. Here "KECCAK256" is the SHA3-256 predecessor without the domain separation bits added by NIST.
Encoding is the encoding of the signature. Currently I have code for raw signatures (just a tuple of integers), DER encoding, IEEE P1363 encoding and the SSH format.
normalize determines if the signatures need to be normalized for applications that are otherwise susceptible to signature malleability attacks.
And the signature generation determines if a specified method (e.g. RFC 6979) was used to generate the signatures or not.
I'm also collecting and organizing references, which I plan to add to the test vector files too, so that hopefully it becomes more clear which test vectors belong to which application.

If the header doesn't fit your application, then please ask for a new file. Generating the test vectors is far from being the biggest issue here. Finding out what variants exist is far more time consuming.

I'm not too familiar with the crypto coin world and I easily mix up different proposals.

from wycheproof.

chipshort avatar chipshort commented on June 2, 2024

I see, thanks for the extensive explanation. So for me the "normalize": false tests are the relevant ones. I have updated my PR.

from wycheproof.

bleichenbacher-daniel avatar bleichenbacher-daniel commented on June 2, 2024

I hope this works better now.

Eventually there should be some documentation listing protocols and corresponding parameters in order to simplify the selection of test vectors. Though I've just started collecting this kind of information.

from wycheproof.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.