deal-and-ceed-on-gpu's Introduction

deal-on-gpu

This is the project repo of the deal-on-gpu team at EuroHack19 in Lugano, Switzerland.

The presentation of the final results can be found here.

Team members

The team deal-on-gpu consisted of (in alphabetical order):

We profited from the dedicated work by:

Installation on Piz Daint

Scripts for building on Piz Daint with gcc can be found in scripts/daint-gcc/.

To build dealii and step-64, just run

./scripts/daint-gcc/make_dealii.sh --download --build-p4est
./scripts/daint-gcc/make_step-64.sh

To resume a build of dealii, or to rebuild after changing the source in build/dealii/src, just run:

./scripts/daint-gcc/make_dealii.sh

If you have the dealii source in a different directory, use the --dealii-source-dir=<dealii source> option when running make_dealii.sh. Change the build root with the --build-root=<build root> option for both make_dealii.sh and make_step-64.sh.
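As a rough sketch of how the two options combine, the snippet below passes the same build root to both scripts and points make_dealii.sh at an existing source tree. The paths, the DRY_RUN switch, and the run helper are assumptions made for illustration and are not part of the repo; by default the snippet only prints the commands.

```shell
#!/bin/sh
# Hypothetical locations; adjust them to your own setup.
DEALII_SRC="${DEALII_SRC:-$HOME/src/dealii}"
BUILD_ROOT="${BUILD_ROOT:-$HOME/deal-on-gpu-build}"

# DRY_RUN=1 (the default here) prints each command instead of running it.
DRY_RUN="${DRY_RUN:-1}"

# run is an illustrative helper: echo the command in dry-run mode, else execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$@"
    else
        "$@"
    fi
}

# Build dealii from an existing source tree into the chosen build root.
run ./scripts/daint-gcc/make_dealii.sh \
    --dealii-source-dir="$DEALII_SRC" \
    --build-root="$BUILD_ROOT"

# Build step-64 against the same build root.
run ./scripts/daint-gcc/make_step-64.sh --build-root="$BUILD_ROOT"
```

Run it from the repo root with DRY_RUN=0 once the printed commands look right.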

Note: LAPACK on Piz Daint is missing a needed linker flag in its config. This problem will manifest in a failure to link the dealii shared library and programs. Add the option -DLAPACK_LINKER_FLAGS="${ATP_POST_LINK_OPTS}" to the dealii cmake command to fix it.
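A minimal sketch of assembling that option before calling cmake by hand is shown below. It assumes ATP_POST_LINK_OPTS is exported by the modules loaded on Piz Daint; the lapack_flag helper is hypothetical and exists only for this example.

```shell
#!/bin/sh
# lapack_flag is an illustrative helper: it prints the cmake option from the note,
# expanded with the current value of ATP_POST_LINK_OPTS.
lapack_flag() {
    printf '%s%s\n' '-DLAPACK_LINKER_FLAGS=' "${ATP_POST_LINK_OPTS:-}"
}

# Warn if the variable is empty, since the option would then change nothing.
if [ -z "${ATP_POST_LINK_OPTS:-}" ]; then
    echo "warning: ATP_POST_LINK_OPTS is empty; check the loaded modules" >&2
fi

# Show the option to append to the dealii cmake command.
echo "cmake option: $(lapack_flag)"
```

Quote the result when passing it on, for example "$(lapack_flag)", so that link options containing spaces stay in a single argument.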

Using Nvprof + NVVP:

We compile on Daint with cudatoolkit 9.1 due to some transitive dependencies from pre-installed modules. However, to profile P100 GPUs with nvprof, we need nvprof from cudatoolkit 9.2.

The following module setup should provide the required environment:

module load daint-gpu
module swap cudatoolkit/9.2.148_3.19-6.0.7.1_2.1__g3d9acc8
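After swapping modules, it may be worth confirming which nvprof the shell actually picks up. The check below is only a sketch: have_tool is a hypothetical helper, and the exact version string that nvprof prints is not reproduced here.

```shell
#!/bin/sh
# have_tool is an illustrative helper: it succeeds if the named command is on PATH.
have_tool() {
    command -v "$1" >/dev/null 2>&1
}

if have_tool nvprof; then
    # Print the resolved path and the version so you can confirm
    # that nvprof comes from cudatoolkit 9.2 rather than 9.1.
    command -v nvprof
    nvprof --version
else
    echo "nvprof not found on PATH; check the module setup" >&2
fi
```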

First, generate a timeline:

srun nvprof -f -o profile-timeline.nvvp ./step-64

Then generate metrics and analysis metrics for a kernel. For example, to analyze the apply_kernel_shmem kernel, we can run:

nvprof -f -o profile-metrics-apply_kernel_shmem.metrics --kernels ::apply_kernel_shmem: --analysis-metrics --metrics all ./step-64

The --kernels syntax is [context]:[nvtx range]:kernel_id:[invocation]. You can leave the optional values blank to match all instances.
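To make the four-field layout concrete, here is a small sketch that assembles a filter string from its parts. kernel_filter is a hypothetical helper written for this example and is not part of nvprof; it only joins the fields with colons.

```shell
#!/bin/sh
# kernel_filter is an illustrative helper, not an nvprof feature.
# $1 = context, $2 = NVTX range, $3 = kernel, $4 = invocation
# Pass "" for any optional field to match all instances.
kernel_filter() {
    printf '%s:%s:%s:%s\n' "$1" "$2" "$3" "$4"
}

# Match every invocation of apply_kernel_shmem in any context and NVTX range.
FILTER=$(kernel_filter "" "" apply_kernel_shmem "")
echo "$FILTER"    # prints ::apply_kernel_shmem:

# Restrict the match to the second invocation only.
kernel_filter "" "" apply_kernel_shmem 2    # prints ::apply_kernel_shmem:2
```

The result can then be passed to nvprof as --kernels "$FILTER".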

From there, you can open the profiles in NVVP. You need to "import...", and then choose the .nvvp file for the timeline, the .metrics file for the metrics, and include the kernel syntax in the kernels panel.

To generate source-level statistics showing stalls, memory accesses, branching, etc., add the -lineinfo flag to nvcc and the --source-level-analysis flag to nvprof, e.g.

nvprof -f -o profile-metrics-apply_kernel_shmem.metrics --kernels ::apply_kernel_shmem: --analysis-metrics --metrics all --source-level-analysis global_access,shared_access,branch,instruction_execution,pc_sampling ./step-64

Note that source-level analysis significantly slows down execution!

Displaying source-level info in NVVP requires nvdisasm to be installed; it should be available in the CUDA toolkit.
