Vivek Kale's Projects
STREAM, for lots of devices written in many programming models - Vivek's version
Benchmarks for evaluating the performance of OpenMP features in the SOLLVE project.
Code for measuring the architectural capabilities of supercomputers, i.e., benchmarking supercomputers.
can - a simple dense matrix-matrix multiplication benchmark with MPI/OpenMP/OpenACC. The MPI version is based on Cannon's algorithm.
Copy-hiding array abstraction to automatically migrate data between memory spaces
The Charm++ parallel programming system. Visit https://charmplusplus.org/ for more information.
Cracking the Coding Interview 6th Ed. Solutions
CUDA STREAM benchmark, based on work by Massimiliano Fatica (NVIDIA)
Department of Energy Standard Utility Library
The Exascale Computing Project Software Technologies Capability Assessment Report - Public Version
Molecular dynamics proxy application based on Kokkos
Extra-P, automated performance modeling for HPC applications
A cut-down version of Grid (https://github.com/paboyle/Grid)
GitPitch In 60 Seconds - A Very Short Tutorial
This is a job talk repo.
This is a set of simple programs that can be used to explore the features of a parallel platform.
Kokkos C++ Performance Portability Programming EcoSystem: The Programming Model - Parallel Execution and Memory Abstraction
This is a repository for sharing OpenMP 5 target usage examples and reproducers
Kokkos C++ Performance Portability Programming EcoSystem: Profiling and Debugging Tools
Examples of Kokkos Tools from the DPolia repo
Tutorials for the Kokkos C++ Performance Portability Programming EcoSystem
The LLVM Project is a collection of modular and reusable compiler and toolchain technologies. Note: the repository does not accept GitHub pull requests at this moment. Please submit your patches at http://reviews.llvm.org.
Modification to the LLVM OpenMP library to incorporate user-defined schedules in OpenMP
A library and a set of example codes that combine the inter-node load-balancing capability of Charm++ with my intra-node loop scheduling strategies.