penzn / wasm-long-vectors
Slides for Wasm long vectors proposal
After a discussion with @penzn and @RossTate, we realized it would be helpful to catalog some resources for auto-vectorizable and hand-vectorized code. These should be useful both for benchmarking and as examples to study when judging the flexibility of the "long vector" proposal.
Probably the canonical benchmark suite for small, vectorizable kernels is MediaBench.
It's old and dusty, though, and the code is just a big bunch of undifferentiated C, so it's not all that helpful at revealing tight, vectorizable inner loops.
Something that might be a bit more helpful is VectorBench, from the MIT folks who have been working on the problem of "revitalizing" vectorized code hand-tuned for old SIMD ISAs so they can be recompiled for new SIMD ISAs.
In particular, a good place to start would be the code they adopted from this repository, which contains a large collection of hand-vectorized image processing kernels.
Unfortunately, even these simple kernels reveal how maddeningly complex hand-vectorized code can truly be. For example, binarizing an image (taking a grayscale image and rounding to a black-and-white image) should be super simple, right? Here's the core loop for that:
https://github.com/ermig1979/Simd/blob/ab23fb98f5bebfeeef170c8abbd1ab9d596b0246/src/Simd/SimdAvx512bwBinarization.cpp#L63-L74
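For contrast with the AVX-512 loop linked above, here is a minimal scalar sketch of what binarization boils down to. The function name `binarize_gt` and its parameters are made up for illustration and are not the Simd library's API; this version only handles a "greater than" comparison on a flat buffer of 8-bit pixels.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar binarization (not the Simd library's API).
 * Pixels strictly greater than `threshold` become `positive`,
 * all other pixels become `negative`. */
void binarize_gt(const uint8_t *restrict src, uint8_t *restrict dst,
                 size_t n, uint8_t threshold,
                 uint8_t positive, uint8_t negative)
{
    /* Purely elementwise: iteration i reads src[i] and writes dst[i],
     * with no dependence between iterations. */
    for (size_t i = 0; i < n; ++i) {
        dst[i] = (src[i] > threshold) ? positive : negative;
    }
}
```

Calling `binarize_gt(src, dst, n, 127, 255, 0)` would map every pixel above 127 to white and the rest to black. A loop of this shape is the kind of thing an auto-vectorizer can usually handle, and the `restrict` qualifiers are there to tell the compiler the two buffers don't alias.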
And of course it gets way more gnarly if you're doing something less perfectly elementwise, like a stencil/filter type of thing:
https://github.com/ermig1979/Simd/blob/master/src/Simd/SimdAvx2Sobel.cpp
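To see why a stencil is harder than the elementwise case, here is a rough scalar sketch of a horizontal Sobel derivative. The name `sobel_dx`, the stride parameters, and the choice to leave border pixels untouched are all assumptions for illustration; the linked AVX2 file handles borders and data layout in its own way.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar horizontal Sobel filter (not the Simd library's API).
 * Applies the 3x3 kernel
 *     -1 0 1
 *     -2 0 2
 *     -1 0 1
 * to interior pixels only; border entries of dst are left untouched.
 * Strides are measured in elements, not bytes. */
void sobel_dx(const uint8_t *src, size_t src_stride,
              size_t width, size_t height,
              int16_t *dst, size_t dst_stride)
{
    if (width < 3 || height < 3) {
        return; /* no interior pixels */
    }
    for (size_t y = 1; y + 1 < height; ++y) {
        const uint8_t *up   = src + (y - 1) * src_stride;
        const uint8_t *mid  = src + y * src_stride;
        const uint8_t *down = src + (y + 1) * src_stride;
        int16_t *out = dst + y * dst_stride;
        for (size_t x = 1; x + 1 < width; ++x) {
            /* Each output needs neighbours at x - 1 and x + 1 from three
             * rows, so adjacent outputs share inputs. */
            int right = up[x + 1] + 2 * mid[x + 1] + down[x + 1];
            int left  = up[x - 1] + 2 * mid[x - 1] + down[x - 1];
            out[x] = (int16_t)(right - left); /* range is [-1020, 1020] */
        }
    }
}
```

Unlike binarization, every output here reads from six neighbouring inputs at offsets of one element, so a vectorized version has to deal with overlapping loads, widening from 8-bit to 16-bit lanes, and special cases at the image edges.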
Perhaps a better way to start is to look at code that should be amenable to auto-vectorization but that is not yet hand-tuned with vector intrinsics. For example, NPB is an old and small benchmark suite with this characteristic.
VectorBench includes the OpenMP-annotated NPB suite. You can find such intriguing loop nests as this:
https://github.com/revec/VectorBench/blob/77073a06779a1dbd948bd5f5a37660946ae07750/scalar/NPB3.0-omp-C/BT/bt.c#L241-L255
Of course, it's up to the reader's imagination to divine the best way to vectorize these loops. To that end, I think the revec paper itself can be helpful.
For example, take a gander at Figure 1, which shows the original scalar code, a vectorized version for SSE2, and the desired vectorization for two other ISAs (AVX2 and AVX-512).