Comments (6)
Sure, can you pinpoint a routine where a generic GPU would outperform, or at least match, a generic CPU? Moreover, how would you go about tuning for vastly different architectures? What API would you use for the GPU?
from flint.
GPU parallelization comes from SIMD (many lanes executing the same instruction on different data), and the range of algorithms that can take advantage of such a paradigm is limited.
CPUs also make use of SIMD, but when one says "multiple cores", one usually thinks of MIMD: each core follows its own stream of instructions.
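To make the distinction concrete, here is a minimal sketch in C contrasting a loop that suits SIMD with one that does not. The function names `vec_add_mod` and `vec_gcd` are hypothetical and chosen for illustration only; they are not FLINT functions. The first loop assumes p < 2^31 and inputs already reduced modulo p.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Uniform work: every element goes through the same instruction sequence,
   so the loop maps naturally onto SIMD lanes.
   Assumes p < 2^31 and a[i], b[i] already in [0, p). */
void vec_add_mod(uint32_t *r, const uint32_t *a, const uint32_t *b,
                 size_t n, uint32_t p)
{
    assert(p > 0 && p < (UINT32_C(1) << 31));
    for (size_t i = 0; i < n; i++) {
        uint32_t s = a[i] + b[i];      /* s < 2p, so no 32-bit overflow */
        r[i] = (s >= p) ? s - p : s;   /* can compile to a select, not a jump */
    }
}

/* Data-dependent work: the number of Euclidean steps differs per element,
   so lanes running in lockstep would have to wait for the slowest one. */
void vec_gcd(uint32_t *r, const uint32_t *a, const uint32_t *b, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        uint32_t x = a[i], y = b[i];
        while (y != 0) {
            uint32_t t = x % y;
            x = y;
            y = t;
        }
        r[i] = x;
    }
}
```

On a multicore (MIMD) machine the second loop parallelizes fine, since each core can run its own elements at its own pace; it is the lockstep model that penalizes it.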
I think the most obvious place to take advantage of GPUs is linear algebra over small fields, as Magma does:
https://magma.maths.usyd.edu.au/magma/handbook/text/61#611
from flint.
It should already be possible to do GPU-accelerated linear algebra by linking FLINT to a GPU-backed BLAS. I don't recall anyone reporting trying this.
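For readers wondering how a floating-point BLAS can help with exact arithmetic, here is a rough sketch of the general idea for matrices over a small prime field: convert the entries to doubles, do one double-precision matrix product, and reduce modulo p afterwards. This is an illustration under stated assumptions, not FLINT's actual code path. `dgemm_naive` is a hypothetical stand-in for the BLAS `dgemm` call that a CPU or GPU library would provide, and `matmul_mod_p` is likewise a made-up name. The sketch only works when n(p-1)^2 < 2^53, so that every dot product is an integer a double represents exactly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for a BLAS dgemm call: C = A * B for n x n
   row-major matrices. In a real build this step is what the BLAS
   (possibly GPU-backed) would execute. */
static void dgemm_naive(double *C, const double *A, const double *B, size_t n)
{
    for (size_t i = 0; i < n; i++)
        for (size_t j = 0; j < n; j++) {
            double s = 0.0;
            for (size_t k = 0; k < n; k++)
                s += A[i * n + k] * B[k * n + j];
            C[i * n + j] = s;
        }
}

/* C = A * B mod p for n x n row-major matrices with entries in [0, p).
   Returns 0 on success, -1 if the exactness bound fails or on allocation
   failure. */
int matmul_mod_p(uint32_t *C, const uint32_t *A, const uint32_t *B,
                 size_t n, uint32_t p)
{
    if (n == 0 || p < 2)
        return -1;
    uint64_t q = (uint64_t)p - 1;
    /* Exactness bound: n * (p-1)^2 must be below 2^53. */
    if (q * q > ((UINT64_C(1) << 53) - 1) / n)
        return -1;

    size_t nn = n * n;
    double *buf = malloc(3 * nn * sizeof(double));
    if (buf == NULL)
        return -1;
    double *Ad = buf, *Bd = buf + nn, *Cd = buf + 2 * nn;

    for (size_t i = 0; i < nn; i++) {
        assert(A[i] < p && B[i] < p);   /* inputs must already be reduced */
        Ad[i] = (double)A[i];
        Bd[i] = (double)B[i];
    }

    dgemm_naive(Cd, Ad, Bd, n);         /* the part a BLAS would accelerate */

    for (size_t i = 0; i < nn; i++)     /* exact integers, so reduce mod p */
        C[i] = (uint32_t)((uint64_t)Cd[i] % p);

    free(buf);
    return 0;
}
```

Because all the arithmetic-heavy work sits in the single `dgemm`-shaped call, swapping the BLAS underneath changes where that work runs without touching the surrounding code.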
from flint.
I'm just not comfortable trying this from a build-system perspective. Perhaps we should bring this up at the workshop?
from flint.
In theory, there shouldn't be anything for us to do: the user just specifies --with-blas, with something like NVBLAS installed on the system.
from flint.
I've looked around on the internet and found ArrayFire, which does some interesting stuff.
I don't really know, though, whether it has things like integer multiplication.
I've heard that multiplication of very large integers can be broken down into a great many similar operations, so maybe it is also possible with SIMD.
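As a rough illustration of that decomposition, here is a minimal sketch in C using plain schoolbook multiplication on base-2^16 digits. It is chosen for clarity only; libraries use asymptotically faster methods (such as FFT-based multiplication) for very large operands. The names `column_sums` and `propagate_carries` are hypothetical. Each column sum is independent of the others, which is the data-parallel part; the carry pass afterwards is sequential.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Step 1 (parallel-friendly): c[k] = sum of a[i] * b[j] over i + j == k.
   Digits are base 2^16, little-endian. c must hold na + nb - 1 entries.
   Every c[k] can be computed independently of the others.
   Each product is below 2^32, so the 64-bit sums cannot overflow for
   any realistic operand length. */
void column_sums(uint64_t *c, const uint16_t *a, size_t na,
                 const uint16_t *b, size_t nb)
{
    assert(na > 0 && nb > 0);
    for (size_t k = 0; k + 1 < na + nb; k++) {
        size_t lo = (k >= nb) ? k - nb + 1 : 0;   /* keep k - i < nb */
        size_t hi = (k < na) ? k : na - 1;        /* keep i < na */
        uint64_t s = 0;
        for (size_t i = lo; i <= hi; i++)
            s += (uint64_t)a[i] * b[k - i];
        c[k] = s;
    }
}

/* Step 2 (sequential): turn the nc column sums into nc + 1 base-2^16
   digits by propagating carries from the low end upward. */
void propagate_carries(uint16_t *r, const uint64_t *c, size_t nc)
{
    uint64_t carry = 0;
    for (size_t k = 0; k < nc; k++) {
        uint64_t t = c[k] + carry;
        r[k] = (uint16_t)(t & 0xFFFF);
        carry = t >> 16;
    }
    r[nc] = (uint16_t)carry;   /* the final carry fits in one digit */
}
```

For example, multiplying 0xFFFFFFFF by itself (digits {0xFFFF, 0xFFFF} on both sides) gives three column sums and, after the carry pass, the four digits of 0xFFFFFFFE00000001.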
from flint.