This project forked from nvidia/multi-gpu-programming-models

Examples demonstrating available options to program multiple GPUs in a single node or a cluster

License: BSD 3-Clause "New" or "Revised" License

Multi GPU Programming Models

This project implements the well-known multi GPU Jacobi solver with different multi GPU programming models:

  • single_threaded_copy: Single threaded, using cudaMemcpy for inter GPU communication
  • multi_threaded_copy: Multi threaded with OpenMP, using cudaMemcpy for inter GPU communication
  • multi_threaded_copy_overlapp: Multi threaded with OpenMP, using cudaMemcpy for inter GPU communication, with communication overlapped with computation
  • multi_threaded_p2p: Multi threaded with OpenMP, using GPUDirect P2P mappings for inter GPU communication
  • multi_threaded_p2p_opt: Multi threaded with OpenMP, using GPUDirect P2P mappings for inter GPU communication, with delayed norm execution
  • multi_threaded_um: Multi threaded with OpenMP, relying on transparent peer mappings with Unified Memory for inter GPU communication
  • mpi: Multi process with MPI, using CUDA-aware MPI for inter GPU communication
  • mpi_overlapp: Multi process with MPI, using CUDA-aware MPI for inter GPU communication, with communication overlapped with computation
  • nvshmem: Multi process with MPI and NVSHMEM, using NVSHMEM for inter GPU communication. The nvshmem_opt variant may give better portable performance.
  • nvshmem_opt: Multi process with MPI and NVSHMEM, using NVSHMEM for inter GPU communication through the NVSHMEM extension API

Each variant is a standalone Makefile project, and all variants are described in the GTC EU 2018 talk "Multi GPU Programming Models".
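All variants share the same basic structure: the domain is split into slabs of rows, each slab is relaxed independently, and the boundary rows are then exchanged between neighbours. The following is a minimal CPU-only sketch of that idea in plain C++, not code from this repository. The names (`jacobi_sweep`, `solve_decomposed`, `Chunk`), the initial condition, and the fixed boundaries are assumptions made for illustration; the `std::copy` calls stand in for whatever transport a variant uses (cudaMemcpy, P2P, MPI, NVSHMEM).

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

using Grid = std::vector<double>;

// One Jacobi sweep over rows [row_begin, row_end) of a grid that is nx wide.
// Returns the squared L2 norm of the update (usable as a residual).
double jacobi_sweep(const Grid& a, Grid& a_new, int nx, int row_begin, int row_end) {
    double norm2 = 0.0;
    for (int iy = row_begin; iy < row_end; ++iy) {
        for (int ix = 1; ix < nx - 1; ++ix) {
            const std::size_t i = static_cast<std::size_t>(iy) * nx + ix;
            const double v = 0.25 * (a[i + 1] + a[i - 1] + a[i + nx] + a[i - nx]);
            norm2 += (v - a[i]) * (v - a[i]);
            a_new[i] = v;
        }
    }
    return norm2;
}

// Assumed initial state: top row fixed to 1.0, everything else 0.0.
Grid make_grid(int nx, int ny) {
    Grid g(static_cast<std::size_t>(nx) * ny, 0.0);
    std::fill(g.begin(), g.begin() + nx, 1.0);
    return g;
}

// Reference solver: the whole domain in one piece.
Grid solve_single(int nx, int ny, int niter) {
    Grid a = make_grid(nx, ny), a_new = a;
    for (int it = 0; it < niter; ++it) {
        jacobi_sweep(a, a_new, nx, 1, ny - 1);
        std::swap(a, a_new);
    }
    return a;
}

// Decomposed solver: interior rows are split across num_chunks slabs, each
// stored with one halo row above and below. Assumes num_chunks <= ny - 2.
Grid solve_decomposed(int nx, int ny, int num_chunks, int niter) {
    struct Chunk { int first_row; int rows; Grid a, a_new; };
    const Grid init = make_grid(nx, ny);
    const int interior = ny - 2;
    std::vector<Chunk> chunks;
    int first = 1;
    for (int c = 0; c < num_chunks; ++c) {
        const int rows = interior / num_chunks + (c < interior % num_chunks ? 1 : 0);
        Chunk ch{first, rows, Grid(static_cast<std::size_t>(rows + 2) * nx), {}};
        // Local row 0 and local row rows+1 are halos; rows 1..rows are owned.
        std::copy(init.begin() + (first - 1) * nx, init.begin() + (first + rows + 1) * nx,
                  ch.a.begin());
        ch.a_new = ch.a;
        chunks.push_back(std::move(ch));
        first += rows;
    }
    for (int it = 0; it < niter; ++it) {
        for (auto& ch : chunks) jacobi_sweep(ch.a, ch.a_new, nx, 1, ch.rows + 1);
        // Halo exchange between neighbouring slabs (stand-in for inter GPU copies).
        for (int c = 0; c + 1 < num_chunks; ++c) {
            Chunk& up = chunks[c];
            Chunk& down = chunks[c + 1];
            std::copy(up.a_new.begin() + up.rows * nx, up.a_new.begin() + (up.rows + 1) * nx,
                      down.a_new.begin());
            std::copy(down.a_new.begin() + nx, down.a_new.begin() + 2 * nx,
                      up.a_new.begin() + (up.rows + 1) * nx);
        }
        for (auto& ch : chunks) std::swap(ch.a, ch.a_new);
    }
    // Gather the owned rows back into one grid for comparison.
    Grid out = init;
    for (const auto& ch : chunks)
        std::copy(ch.a.begin() + nx, ch.a.begin() + (ch.rows + 1) * nx,
                  out.begin() + ch.first_row * nx);
    return out;
}
```

Because every point is computed from the same four neighbour values in the same order, `solve_decomposed` should reproduce `solve_single` exactly; the variants in this project differ mainly in how the exchange step is carried out and whether it overlaps with the sweep.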

Requirements

  • CUDA: Version 9.2 or later is required by all variants.
  • OpenMP capable compiler: Required by the Multi Threaded variants. The examples have been developed and tested with gcc.
  • CUDA-aware MPI: Required by the MPI and NVSHMEM variants. The examples have been developed and tested with OpenMPI.
  • CUB: Optional for optimized residual reductions. Set CUB_HOME to your cub installation directory. The examples have been developed and tested with cub 1.8.0.
  • NVSHMEM: Required by the NVSHMEM variants. Please reach out to [email protected] for early access to NVSHMEM.

Building

Each variant comes with a Makefile and can be built by simply running make, e.g.

multi-gpu-programming-models$ cd multi_threaded_copy
multi_threaded_copy$ make CUB_HOME=../cub
nvcc -DHAVE_CUB -I../cub -Xcompiler -fopenmp -lineinfo -DUSE_NVTX -lnvToolsExt -gencode arch=compute_60,code=sm_60 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_70,code=compute_70  -std=c++11 jacobi.cu -o jacobi
multi_threaded_copy$ ls jacobi
jacobi

Run instructions

All variants accept the following command line options:

  • -niter: How many iterations to carry out (default 1000)
  • -nccheck: How often to check for convergence (default 1)
  • -nx: Size of the domain in x direction (default 7168)
  • -ny: Size of the domain in y direction (default 7168)
  • -csv: Print performance results in CSV format
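As an illustration of how these options might be read, here is a small C++ sketch. The helper names (`get_argval`, `get_arg`, `Options`, `parse_options`, `count_norm_checks`) are hypothetical and not necessarily what the variants use; only the option names and defaults come from the list above. The rule that the norm is checked on iterations whose index is a multiple of `-nccheck` is likewise an assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <sstream>
#include <string>

// Returns the value following `arg` on the command line, or default_val if absent.
template <typename T>
T get_argval(char** begin, char** end, const std::string& arg, const T default_val) {
    T argval = default_val;
    char** itr = std::find(begin, end, arg);
    if (itr != end && ++itr != end) {
        std::istringstream inbuf(*itr);
        inbuf >> argval;
    }
    return argval;
}

// Returns true if the flag `arg` is present.
bool get_arg(char** begin, char** end, const std::string& arg) {
    return std::find(begin, end, arg) != end;
}

// Hypothetical container for the documented options and their defaults.
struct Options {
    int niter;
    int nccheck;
    int nx;
    int ny;
    bool csv;
};

Options parse_options(int argc, char** argv) {
    Options o;
    o.niter = get_argval<int>(argv, argv + argc, "-niter", 1000);
    o.nccheck = get_argval<int>(argv, argv + argc, "-nccheck", 1);
    o.nx = get_argval<int>(argv, argv + argc, "-nx", 7168);
    o.ny = get_argval<int>(argv, argv + argc, "-ny", 7168);
    o.csv = get_arg(argv, argv + argc, "-csv");
    return o;
}

// Assumed semantics of -nccheck: the norm is evaluated on every iteration
// whose 1-based index is a multiple of nccheck (requires nccheck >= 1).
int count_norm_checks(const Options& o) {
    int checks = 0;
    for (int iter = 1; iter <= o.niter; ++iter) {
        if (iter % o.nccheck == 0) ++checks;
    }
    return checks;
}
```

Under these assumptions, an invocation such as `./jacobi -niter 200 -nx 1024 -csv` would run 200 iterations on a 1024 x 7168 domain, check the norm every iteration, and print CSV output. A larger `-nccheck` value reduces how often the residual reduction has to run.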

The provided script bench.sh contains some examples executing all the benchmarks presented in the GTC EU 2018 Talk Multi GPU Programming Models.

Developers guide

The code follows the style guide defined in the .clang-format file. Format the code with clang-format version 7 or later before submitting it, e.g.

multi-gpu-programming-models$ cd multi_threaded_copy
multi_threaded_copy$ clang-format -style=file -i jacobi.cu

Contributors

  • akhillanger
  • jirikraus