Comments (6)
Before I ever get tempted to do this again, I am not sure if it is possible to do the AVX2 backend, without it being an extreme exercise in masochism. The reasoning is as follows:
- The intrinsic code has the benefit of the compiler handling register allocation, and spilling, while this will need to be done by hand. While this isn't usually a big deal, `square_and_negate_D` and `mul` in particular use a rather large number of variables.
- It may not be possible to port the code over in a way that is easy to maintain. The original code's nice split between the vector of field elements and the point operations is probably unable to be preserved without trashing performance. For example, having the overhead of a function call, moving data to/from memory just to do 5 VPBLENDDs is clearly extremely inefficient.
The AVX-512 IFMA backend looks more promising with respect to the former concern (since I get double the registers to work with, and the limbs are larger); the latter, while still an issue, might be tolerable.
That said, I currently do not have access to a CPU that supports AVX-512 IFMA, so as of now it is a rather moot point.
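To give a sense of how little work a single VPBLENDD does, here is a minimal pure-Go model of its lane selection on eight 32-bit lanes. The helper name `blend32x8` is hypothetical and is not part of curve25519-voi; it only illustrates why wrapping a handful of such instructions in a function call, with loads and stores on either side, costs more than the blends themselves.

```go
package main

import "fmt"

// blend32x8 models the lane selection performed by VPBLENDD on a 256-bit
// register: for each of the 8 32-bit lanes, bit i of mask selects b[i]
// when set and a[i] when clear.
//
// Hypothetical helper, for illustration only.
func blend32x8(a, b [8]uint32, mask uint8) [8]uint32 {
	var out [8]uint32
	for i := range out {
		if mask&(1<<uint(i)) != 0 {
			out[i] = b[i]
		} else {
			out[i] = a[i]
		}
	}
	return out
}

func main() {
	a := [8]uint32{0, 1, 2, 3, 4, 5, 6, 7}
	b := [8]uint32{10, 11, 12, 13, 14, 15, 16, 17}

	// 0xA5 = 0b10100101, so lanes 0, 2, 5 and 7 come from b.
	fmt.Println(blend32x8(a, b, 0xA5))
}
```

In assembly each blend is a single instruction operating on values already held in registers, which is why splitting it out behind a call boundary is so costly in relative terms.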
from curve25519-voi.
I got tempted to do this again, and the results are in #19. I'm not sure how much I'll hate maintaining this, but it does go really fast.
While the goal of this project is to produce something that is relatively easy to maintain, and I have lingering doubts over how much I will enjoy maintaining the AVX2 code, I went ahead and merged it because it provides a substantial improvement to verification performance.
I can't think of a nice way to make this also support AVX-512 without it being a total shitshow, so my tentative plan when I eventually get a system that supports it is to yank the AVX2 code out and replace it with AVX-512, though that is unlikely to happen for a while.
I do not see this happening in the short to medium term.
While it is currently possible to do the development with a consumer oriented Rocket Lake system, that involves someone actually buying a Rocket Lake processor. As Rocket Lake is better left as silicon in the form of the sand in a cat shit filled public park sandbox, I will not be doing so.
While Alder Lake is (hopefully) going to be an improvement, and actually worth using, Intel is reversing their recent trend of bringing AVX-512 to the mass market, with the consumer and desktop SKUs having the unit disabled by fuse, indicating their desire to keep the instruction set as a server-only (Sapphire Rapids) thing.
Closing till the AVX-512 availability situation changes.
Note: If someone wants to supply me with hardware with an AVX-512 unit, that will sit in my apartment, to get this done and to maintain it, I'm open to options.
Apparently AVX-512 is available on Alder Lake if the E-cores are disabled, so a consumer oriented system is still suitable for development. I also do have access to a Tiger Lake i7.
I'm vaguely tempted to reconsider this, especially since license-based downclocking appears to be a non-issue in Ice Lake/Rocket Lake (there is still a voltage transition performance penalty), but I'm not all that enthusiastic about maintaining two separate assembly implementations (since AVX-512 is not nearly as ubiquitous as AVX2 is).
See: https://travisdowns.github.io/blog/2020/08/19/icl-avx512-freq.html
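For illustration, keeping both assembly implementations would mean a runtime dispatch roughly like the sketch below. Everything here is an assumption made for the example: `cpuFeatures`, `backend`, and `selectBackend` are hypothetical names, not the library's API, and real feature detection would come from CPUID rather than a hand-filled struct.

```go
package main

import "fmt"

// cpuFeatures is a hypothetical stand-in for real CPUID-based detection.
type cpuFeatures struct {
	HasAVX2       bool
	HasAVX512IFMA bool
}

// backend names the vector implementation that would be dispatched to.
type backend string

const (
	backendGeneric backend = "generic"
	backendAVX2    backend = "avx2"
	backendIFMA    backend = "avx512-ifma"
)

// selectBackend prefers the widest supported instruction set. withIFMA
// models whether an AVX-512 IFMA implementation exists in the build at all.
func selectBackend(f cpuFeatures, withIFMA bool) backend {
	switch {
	case withIFMA && f.HasAVX512IFMA:
		return backendIFMA
	case f.HasAVX2:
		return backendAVX2
	default:
		return backendGeneric
	}
}

func main() {
	// A CPU with AVX2 only, which is the common case.
	fmt.Println(selectBackend(cpuFeatures{HasAVX2: true}, true))

	// A CPU with both; the IFMA path only wins if it has been written.
	both := cpuFeatures{HasAVX2: true, HasAVX512IFMA: true}
	fmt.Println(selectBackend(both, true))
	fmt.Println(selectBackend(both, false))
}
```

The dispatch itself is trivial; the cost is that every arm of the switch is a separate hand-written assembly implementation that has to be kept correct.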
Welp, never mind. Intel is killing AVX-512 on ADL with a ME update.