This project forked from shamandevel/cumat

An expression template based linear algebra library running completely on the GPU using CUDA

License: MIT License

cuMat: Linear algebra library in CUDA

cuMat strives to be a port of Eigen to CUDA, so that Eigen-style code can benefit from the performance of computing on the GPU.

Overview:

  • Versatile:
    • cuMat supports all matrix and vector sizes, fixed at compile time or dynamically sized at runtime.
    • all matrices can be batched and all operations are parallelized over batches.
    • supports all standard float and integral types, complex types, as well as custom scalar types.
    • supports BLAS 1-3, many reductions, decompositions, and iterative solvers.
    • supports sparse matrices.
  • Fast (Benchmarks):
    • Kernel merging to minimize memory access.
    • Uses CUB for reductions, cuBLAS for matrix products and cuSOLVER for dense decompositions.
    • Custom implementations for all nullary, unary and binary operations.
    • Outperforms cuBLAS if kernel merging can be utilized.
  • Accessible:
    • Simple API influenced by Eigen.
    • Implementation details like context creation and work size specification are hidden from the user.
    • Thread-safe.
    • Header-only.
    • Cross-platform support: developed on Windows with Visual Studio 2017 and CUDA 9.2; tested in CI on Linux with gcc and CUDA 9.2.
    • Simple interop to Eigen.
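
The kernel merging mentioned above rests on expression templates: an operator such as + does no arithmetic itself but returns a lightweight object describing the operation, and the whole expression is evaluated in one pass when it is assigned. The following is a minimal CPU-only sketch of that idea, not cuMat's actual implementation. The names Vec and SumExpr are hypothetical and chosen for illustration; on the GPU the single loop would correspond to a single kernel launch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical types for illustration only; this is NOT cuMat's code.

// Lazy node describing the element-wise sum of two expressions.
// It stores references and computes an element only when asked.
template <typename L, typename R>
struct SumExpr {
    const L& lhs;
    const R& rhs;
    float operator[](std::size_t i) const { return lhs[i] + rhs[i]; }
    std::size_t size() const { return lhs.size(); }
};

// Minimal vector that evaluates any SumExpr in a single loop.
struct Vec {
    std::vector<float> data;

    explicit Vec(std::size_t n, float value = 0.0f) : data(n, value) {}

    // Evaluation happens here: one pass over the data, no temporary vectors.
    template <typename L, typename R>
    Vec(const SumExpr<L, R>& e) : data(e.size()) {
        for (std::size_t i = 0; i < data.size(); ++i) data[i] = e[i];
    }

    float operator[](std::size_t i) const { return data[i]; }
    std::size_t size() const { return data.size(); }
};

// operator+ only builds the expression tree.
inline SumExpr<Vec, Vec> operator+(const Vec& a, const Vec& b) {
    assert(a.size() == b.size());
    return {a, b};
}

// Chaining: (a + b) + c nests the first expression inside a second one.
template <typename L, typename R>
SumExpr<SumExpr<L, R>, Vec> operator+(const SumExpr<L, R>& a, const Vec& b) {
    assert(a.size() == b.size());
    return {a, b};
}
```

With these definitions, a statement like `Vec d = a + b + c;` runs exactly one loop that reads a, b and c once and writes d once, instead of materializing an intermediate vector for a + b. The expression objects hold references, so they must be evaluated within the statement that creates them.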

Motivating example

To demonstrate how cuMat can be used, we show what the code for summing two vectors a and b into a third vector c looks like when implemented with Eigen, cuBLAS and cuMat.

Eigen:

Eigen::VectorXf a = ..., b = ...; //some initializations
Eigen::VectorXf c = a + b; //CPU

cuBLAS:

int n = ...; //size of the vectors
float *a = ..., *b = ...; //some initializations (device pointers)
float* c = ...; //output memory
cublasHandle_t handle;
cublasCreate(&handle);
float alpha = 1; //optional scaling factor of b; axpy: c += alpha * b
cudaMemcpy(c, a, sizeof(float)*n, cudaMemcpyDeviceToDevice); //copy a into c, GPU
cublasSaxpy(handle, n, &alpha, b, 1, c, 1); //add b to c, GPU
cublasDestroy(handle);

Of course, the code above is a bit unfair because it includes the boilerplate for creating the cuBLAS handle. In practice, this has to be done only once, so the code reduces to two lines: the memcpy and the axpy.
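
The memcpy is needed because axpy works in place: it adds alpha * b onto its output vector rather than writing a fresh result. As a rough CPU reference for what the two GPU calls compute together, assuming plain std::vector storage and a hypothetical function name chosen for illustration:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// CPU reference for the two GPU calls: copy a into c, then c += alpha * b.
// The function name is hypothetical and not part of cuBLAS or cuMat.
std::vector<float> add_via_copy_and_axpy(const std::vector<float>& a,
                                         const std::vector<float>& b,
                                         float alpha = 1.0f) {
    assert(a.size() == b.size());
    std::vector<float> c = a;                  // mirrors the cudaMemcpy step
    for (std::size_t i = 0; i < c.size(); ++i) {
        c[i] += alpha * b[i];                  // mirrors the cublasSaxpy step
    }
    return c;
}
```

With alpha = 1 this yields c = a + b, at the cost of one extra pass over memory for the copy.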

cuMat:

cuMat::VectorXf a = ..., b = ...; //some initialization
cuMat::VectorXf c = a + b; //GPU

Documentation

The documentation can be found at https://shaman42.gitlab.io/cuMat/. All other open questions regarding this library are answered there.

Requirements

cuMat is header-only, but it builds on some third-party libraries:

  • cuBLAS, cuSOLVER: shipped with the CUDA SDK.
  • CUB: available inside Thrust as part of the CUDA SDK or in the third-party folder of cuMat; alternatively, you can provide your own version.
  • (Optional) Eigen for printing matrices and for the Eigen interop. A working version can be found in the third-party folder.

License

cuMat is shipped under the permissive MIT license.

Bug reports

If you find bugs in the library, feel free to open an issue. I will continue to use this library in future projects and therefore continue to improve and extend this library. Of course, pull requests are more than welcome.
