
ats-bench


Benchmarks for unified memory handling on GPU

Microbenchmarks

How fast can system data be accessed?
- System buffer fill
  - {aligned, unaligned} x {mapped/managed/system}
- System buffer copy
  - {aligned, unaligned} x {mapped/managed/system}
What is the granularity of access?
- Interleave modifications to strided regions with multiple GPUs
  - {mapped/managed/system}
Are system atomics supported?
- {mapped/managed/system}
How fast can a memory region be created?
- Allocation + {no touch / cpu / gpu / both}
Page Fault cost

Benchmarks

Triangle Counting
GEMM

Building

mkdir build
cd build
cmake ..
make

Running

Benchmarks are in src/benchmarks/*. Run src/benchmark/[class]/[the-benchmark] --help to see all of the options for a benchmark.

There are also some utilities in src/:

src/test-system-allocator: check to see if the system allocator is working

Running on HAL

New Job Queues
Real-time System Status
Interactive Job (1 GPU)

srun --partition=gpu-debug --pty --nodes=1 \
  --ntasks-per-node=12 --cores-per-socket=12 \
  --gres=gpu:v100:1 --mem-per-cpu=1500 \
  --time=2:00:00 --wait=0 --export=ALL /bin/bash

Acks

Uses lyra for CLI option parsing.
Uses hunter for package management.
Uses spdlog for logging.
Uses Atrox/github-actions-badge for the GitHub Actions status badge.

Related Work

2014

Landaverde, Raphael, et al. "An investigation of unified memory access performance in CUDA." 2014 IEEE High Performance Extreme Computing Conference (HPEC). IEEE, 2014.

2015

Li, Wenqiang, et al. "An evaluation of unified memory technology on NVIDIA GPUs." 2015 15th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing. IEEE, 2015.
