toposware / cheetah Goto Github PK
View Code? Open in Web Editor NEWA STARK-friendly elliptic curve defined over a sextic extension of a small prime field.
License: Apache License 2.0
A STARK-friendly elliptic curve defined over a sextic extension of a small prime field.
License: Apache License 2.0
The Fp
struct currently implements double()
simply as self.add(self)
and square()
as self.mul(self)
.
The Scalar
struct also implements double()
as self.add(self)
.
A custom implementation could be beneficial, in particular as Fp6
operations are naturally slower than their FieldElement
counter-part of the stark-curve repo.
Mixed addition (AffinePoint
+ ProjectivePoint
) is about 88% faster than regular ProjectivePoint
addition.
This could lead to beneficial gains in scalar multiplication, by storing in the LookupTable
multiples of the base point in Affine coordinates, rather than Projective coordinates. This can be done efficiently by creating first the array of ProjectivePoint
multiples, and then doing a batch normalization to convert them all in Affine coordinates with 1 field inversion.
Unfortunately, we currently have a relatively expensive Fp6
inversion which makes this approach only beneficial for scalar multiplication with the hardcoded basepoint. But it may be worth investigating, and keep it under the hood in case inversion is improved.
We currently rely on generic algorithms for computing curve group operations, with the only benefit of ignoring multiplications by a4
as it is equal to 1 in our setting (Weierstrass form y^2 = x^3 + x + B
).
Having group algorithms more "tailored" for our specific curve could be beneficial.
The latest version of cheetah uses a direct sextic extension Fp -> Fp6
defined as Fp[X] / (X^6 - 7), where 7 is a quadratic and cubic non-residue.
The current cost of multiplication in Fp6
is:
For squaring in Fp6
, it is:
We may benefit from using a towered extension Fp -> Fp2 -> Fp6
internally. Inputs / outputs would still be given in canonical form of the direct sextic extension. For this, we have the correspondence between:
a in Fp6\Fp: a0 + a1.alpha + a2.alpha^2 + a3.alpha^3 + a4.alpha^4 + a5.alpha^5
and
b in Fp6\Fp2: (b00 + b01.beta) + (b10 + b11.beta).delta + (b20 + b21.beta).delta^2
In particular, if we take alpha^6 = delta^2 = beta^3 = 7 (i.e, Fp2 = Fp[X] / (X^2 - t) and Fp6\Fp2 = Fp2[X] / (X^3 - t)), then we can rewrite b as
b = b00 + t.b21.alpha + b10.alpha^2 + b01.alpha^3 + b20.alpha^4 + b11.alpha^5
Inversely, a can be rewritten as:
a = (a0 + a3.beta) + (a2 + a5.beta).delta + (a4 + 1/t.a1.beta).delta^2
What's important is that, given the 6 coordinates of an Fp6 element, conversion between the direct extension form and the towered extension form is almost costless. This may make towered extension approach interesting for:
We were originally using Montgomery form for representing base field and scalar field elements, which has then be removed in favour of a canonical encoding for the base prime field, to match the implementation on winterfell and miden.
However, if we stay away from the latter, we may want to go back to Montgomery form and use the same wrapping approach in winterfell for our own version of f64. While Fp6 multiplication and squaring may be slightly slower than it currently is, addition/subtraction/doubling should be faster. In addition, we wouldn't have to deal with canonicalization of elements before feeding them into a Hasher for instance, as all element representations are unique in Montgomery form.
Squaring in Fp
currently relies on the underlying Fp
multiplication. We may want to investigate custom squaring possibilities to see if they bring any significant speed-up.
In the continuation of #1, it may be interesting to include support for w-NAF scalar decomposition for faster scalar multiplication of group elements.
Within winterfell, we're currently relying on a wrapper of the cheetah Fp
type to implement the FieldElement
and related Traits.
Because of the trait declaration of some methods in the Air trait, we cannot directly benefit from the speed improvement of toposware/hash#4 in our state-transition AIR program in certificate-stark (except from outside the circuit, i.e. when building the trace but this is almost negligible with respect to the rest of the prover computations).
Having the Fp
struct implement directly FieldElement
, StarkField
, ExtensibleField
behind a condition flag like #cfg[feature("air")]
would reduce the overhead in type conversion, make the usage easier and allow for the kind of optimizations we benefit from the above mentioned PR for instance.
We currently have only Affine and Projective coordinates logic for the curve arithmetic.
Being able to support other point coordinate systems may bring some performance benefits, for instance in the schnorr-sig crate.
For instance Jacobian coordinates are known for yielding relatively fast point doubling methods (about 2/3 of the running time of a point doubling in Projective coordinates) for a slightly slower addition algorithm. However, the use of Straus-Shamir's trick in signature verification reduces the impact of such change (as we have now N doublings and 2N additions).
Jacobian coordinates variants (like Chudnovsky) could also be investigated, though at the cost of a heavier point representation (240B instead of 144B for Jacobian/Projective).
This paper introduces a faster way than Tonelli-Shanks to compute the square root of a field element if the field characteristic is highly 2-adic.
According to the table 1 page 12, the number of field operations in the case of Fp (with 2-adicity 32) would go from 311 to 134.5 when switching algorithm.
We currently use a constant-time version of Tonelli-Shanks for p^6 = 1 mod 16, which is quite heavy (including a whole Fp^6 exponentiation).
Implementing the Algorithm 10 from https://eprint.iacr.org/2012/685.pdf may yield significant improvement, and would be beneficial for point decompression.
The recent stable toolchain introduces noisy lint errors, mostly related to needless borrows.
The current generator point, while not random (as coming from the "Topos" encoded string), still suffers from a non-standard generation procedure. In addition, having a hash-to-curve algorithm would be a great add-on as it can serve multiple purposes.
Following the draft-cfrg-hash-to-curve (section 6.6 p26), it seems that SWU would be the one to go for. The generator point would then be regenerated, following the proper hash-to-curve implementation.
Currently the scalar multiplication is done through a simple double-and-add algorithm, skipping the first MSB that is always zero.
We would gain from having other, more efficient methods implemented (not necessarily limited to only 1), among which:
More here:
It could be nice to generalize the Shamir's trick for N points. It is currently only available for two points (and optimized if one is the base point).
Scalar reveals the limbs as the elements of Fixed size array implements Copy trait. Whenever they have to be moved, limbs are copied thus leaving the original array in memory until it is overwritten.
I have to change the scope of limbs to find this bug.
I have reproduced the issue here: https://gist.github.com/RajeshRk18/24b2f273ea689786c3bb5a19990c6166
For the scalar Scalar([12, 34, 56, 78])
,
when the given code is run,
Creation Address: 0x7ffe0a86a4b8
Value before dropping: [11246162368984329641, 15720856220356058543, 2654058533781523010, 95027734021429075]
Returned Address: 0x7ffe0a86a4b8 // same location means that limbs are not moved
Value at Returned Address: [0, 0, 0, 0]
Value at Creation Address: [0, 0, 0, 0]
Creation Address: 0x7ffe0a86a4b8
Value before dropping: [11246162368984329641, 15720856220356058543, 2654058533781523010, 95027734021429075]
Returned Address: 0x7ffe0a86a470 // different location means that limbs are moved
Value at Returned Address: [0, 0, 0, 0]
Value at Creation Address: [11246162368984329641, 15720856220356058543, 2654058533781523010, 95027734021429075]
Now that inline assembly is stable with Rust 1.59.0 (https://blog.rust-lang.org/2022/02/24/Rust-1.59.0.html), it may be interesting to include conditional assembly feature to speed-up some arithmetic operations.
A declarative, efficient, and flexible JavaScript library for building user interfaces.
๐ Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
An Open Source Machine Learning Framework for Everyone
The Web framework for perfectionists with deadlines.
A PHP framework for web artisans
Bring data to life with SVG, Canvas and HTML. ๐๐๐
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
Some thing interesting about web. New door for the world.
A server is a program made to process requests and deliver data to clients.
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
Some thing interesting about visualization, use data art
Some thing interesting about game, make everyone happy.
We are working to build community through open source technology. NB: members must have two-factor auth.
Open source projects and samples from Microsoft.
Google โค๏ธ Open Source for everyone.
Alibaba Open Source for everyone
Data-Driven Documents codes.
China tencent open source team.