rocm / amdmigraphx
AMD's graph optimization engine.

Home Page: https://rocm.docs.amd.com/projects/AMDMIGraphX/en/latest/

License: MIT License

CMake 1.10% C++ 91.65% Shell 0.05% Python 6.73% Dockerfile 0.07% PureBasic 0.03% C 0.35% Vim Script 0.01%

amdmigraphx's Introduction

AMD ROCm Software

ROCm is an open-source stack, composed primarily of open-source software, designed for graphics processing unit (GPU) computation. ROCm consists of a collection of drivers, development tools, and APIs that enable GPU programming from low-level kernel to end-user applications.

With ROCm, you can customize your GPU software to meet your specific needs. You can develop, collaborate, test, and deploy your applications in a free, open-source, integrated, and secure software ecosystem. ROCm is particularly well-suited to GPU-accelerated high-performance computing (HPC), artificial intelligence (AI), scientific computing, and computer-aided design (CAD).

ROCm is powered by AMD's Heterogeneous-computing Interface for Portability (HIP), an open-source C++ GPU programming environment and its corresponding runtime. HIP allows ROCm developers to write portable applications that can be deployed across a range of platforms, from dedicated gaming GPUs to exascale HPC clusters.

ROCm supports programming models such as OpenMP and OpenCL, and includes all necessary open-source compilers, debuggers, and libraries. ROCm is fully integrated into machine learning (ML) frameworks such as PyTorch and TensorFlow.

Getting the ROCm Source Code

AMD ROCm is built from open source software. It is, therefore, possible to modify the various components of ROCm by downloading the source code and rebuilding the components. The source code for ROCm components can be cloned from each of the GitHub repositories using git. For easy access to download the correct versions of each of these tools, the ROCm repository contains a repo manifest file called default.xml. You can use this manifest file to download the source code for ROCm software.

Installing the repo tool

The repo tool from Google allows you to manage multiple git repositories simultaneously. Run the following commands to install the repo tool:

mkdir -p ~/bin/
curl https://storage.googleapis.com/git-repo-downloads/repo > ~/bin/repo
chmod a+x ~/bin/repo

Note: The ~/bin/ folder is used as an example. You can specify a different folder to install the repo tool into if you desire.
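As a concrete variant of the note above, this sketch installs the repo launcher into a different directory and makes it visible on PATH for the current shell; `~/tools` is an arbitrary example location, not a requirement.

```shell
# Install the repo launcher into a custom directory (~/tools is an
# arbitrary choice) and put it on PATH for this shell session.
INSTALL_DIR="$HOME/tools"
mkdir -p "$INSTALL_DIR"
curl -fsSL https://storage.googleapis.com/git-repo-downloads/repo > "$INSTALL_DIR/repo"
chmod a+x "$INSTALL_DIR/repo"
export PATH="$INSTALL_DIR:$PATH"
```

With the export in place, later steps can invoke `repo` directly instead of spelling out the full path.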

Installing git-lfs

Some ROCm projects use Git Large File Storage (LFS), which may require you to install git-lfs. Refer to Git Large File Storage for more information. For example, to install git-lfs on Ubuntu, use the following command:

sudo apt-get install git-lfs

Downloading the ROCm source code

The following example shows how to use the repo tool to download the ROCm source code. If you installed the repo tool in a directory other than ~/bin/, substitute that directory in the commands below:

mkdir -p ~/ROCm/
cd ~/ROCm/
~/bin/repo init -u https://github.com/ROCm/ROCm.git -b roc-6.0.x
~/bin/repo sync

Note: Using this sample code causes the repo tool to download the open-source code associated with the specified ROCm release. Ensure that you have SSH keys configured for your GitHub account before downloading, as explained in Connecting to GitHub with SSH.
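If you have not yet set up SSH access, the sketch below generates a key pair non-interactively; the key file name and comment are illustrative choices, and the printed public key is what you add in your GitHub account settings.

```shell
# Generate an ed25519 key pair with no passphrase (-N "") into a
# dedicated file, then print the public half to paste into GitHub.
ssh-keygen -t ed25519 -N "" -f "$HOME/.ssh/id_ed25519_github" -C "you@example.com"
cat "$HOME/.ssh/id_ed25519_github.pub"
```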

Building the ROCm source code

Each ROCm component repository contains directions for building that component, such as the rocSPARSE documentation Installation and Building for Linux. Refer to the specific component documentation for instructions on building the repository.

Each release of the ROCm software supports specific hardware and software configurations. Refer to System requirements (Linux) for the current supported hardware and OS.

Build ROCm from source

The build uses as many processors as it can find to build in parallel. Some of the compiles can consume as much as 10 GB of RAM, so make sure you have plenty of swap space!

By default, the ROCm build compiles for all supported GPU architectures and takes approximately 500 CPU hours. The build time drops significantly if you limit the GPU architectures to build for, using the GPU_ARCHS environment variable as shown below.

# --------------------------------------
# Step 1: Clone the source code
# --------------------------------------

mkdir -p ~/WORKSPACE/      # or any other folder name
cd ~/WORKSPACE/
export ROCM_VERSION=6.1.0   # or 6.1.1, 6.1.2
~/bin/repo init -u https://github.com/ROCm/ROCm.git -b roc-6.1.x -m tools/rocm-build/rocm-${ROCM_VERSION}.xml
~/bin/repo sync

# --------------------------------------
# Step 2: Prepare build environment
# --------------------------------------

# Option 1: Start a docker container
# Pulling required base docker images:
# Ubuntu20.04 built from ROCm/tools/rocm-build/docker/ubuntu20/Dockerfile
docker pull rocm/rocm-build-ubuntu-20.04:6.1
# Ubuntu22.04 built from ROCm/tools/rocm-build/docker/ubuntu22/Dockerfile
docker pull rocm/rocm-build-ubuntu-22.04:6.1

# Start docker container and mount the source code folder:
docker run -ti \
    -e ROCM_VERSION=${ROCM_VERSION} \
    -e CCACHE_DIR=$HOME/.ccache \
    -e CCACHE_ENABLED=true \
    -e DOCK_WORK_FOLD=/src \
    -w /src \
    -v $PWD:/src \
    -v /etc/passwd:/etc/passwd \
    -v /etc/shadow:/etc/shadow \
    -v ${HOME}/.ccache:${HOME}/.ccache \
    -u $(id -u):$(id -g) \
    <replace_with_required_ubuntu_base_docker_image> bash

# Option 2: Install the required packages on the host machine
# For an Ubuntu 20.04 system
cd ROCm/tools/rocm-build/docker/ubuntu20
bash install-prerequisites.sh
# For an Ubuntu 22.04 system
cd ROCm/tools/rocm-build/docker/ubuntu22
bash install-prerequisites.sh

# --------------------------------------
# Step 3: Run build command line
# --------------------------------------

# Select GPU targets before building:
# When GPU_ARCHS is not set, the default GPU targets supported by ROCm 6.1 are used.
# To build for a subset of GFX architectures, set the environment variable as below.
# Supported MI300 targets: gfx940, gfx941, gfx942.
export GPU_ARCHS="gfx942"               # Example
export GPU_ARCHS="gfx940;gfx941;gfx942" # Example
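To make the semicolon-separated format concrete, here is a sketch of how a script might iterate over the GPU_ARCHS entries; the loop is illustrative only, not part of the build system.

```shell
# Split the semicolon-separated GPU_ARCHS value and visit each target
# (POSIX sh; tr converts the separators to spaces for the for-loop).
GPU_ARCHS="gfx940;gfx941;gfx942"
for arch in $(printf '%s' "$GPU_ARCHS" | tr ';' ' '); do
    echo "will build for $arch"
done
```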

# Pick and run build commands in the docker container:
# Build rocm-dev packages
make -f ROCm/tools/rocm-build/ROCm.mk -j ${NPROC:-$(nproc)} rocm-dev
# Build all ROCm packages
make -f ROCm/tools/rocm-build/ROCm.mk -j ${NPROC:-$(nproc)} all
# List all ROCm components to find the required components
make -f ROCm/tools/rocm-build/ROCm.mk list_components
# Build a single ROCm package
make -f ROCm/tools/rocm-build/ROCm.mk T_rocblas

# Find built packages in ubuntu20.04:
out/ubuntu-20.04/20.04/deb/
# Find built packages in ubuntu22.04:
out/ubuntu-22.04/22.04/deb/

# Find built logs in ubuntu20.04:
out/ubuntu-20.04/20.04/logs/
# Find built logs in ubuntu22.04:
out/ubuntu-22.04/22.04/logs/
# Logs for failed components end with the .errors extension.
out/ubuntu-22.04/22.04/logs/rocblas.errors      # Example
# Logs for components still building end with the .inprogress extension.
out/ubuntu-22.04/22.04/logs/rocblas.inprogress  # Example
# Logs for components that built successfully use the bare component name.
out/ubuntu-22.04/22.04/logs/rocblas             # Example
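Building on the log naming convention above, this sketch summarizes which components failed by scanning for *.errors files; the loop and LOG_DIR variable are illustrative, not part of the build system.

```shell
# Print one line per failed component, derived from *.errors log names.
LOG_DIR="out/ubuntu-22.04/22.04/logs"
for f in "$LOG_DIR"/*.errors; do
    [ -e "$f" ] || continue              # glob did not match: no failures
    echo "FAILED: $(basename "$f" .errors)"
done
```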

Note: See the Overview for ROCm.mk.

ROCm documentation

This repository contains the manifest file for ROCm releases, changelogs, and release information.

The default.xml file lists all repositories and the associated commits used to build the current ROCm release; default.xml follows the repo tool's Manifest Format.

Source code for our documentation is located in the /docs folder of most ROCm repositories. The develop branch of our repositories contains content for the next ROCm release.

The ROCm documentation homepage is rocm.docs.amd.com.

Building the documentation

For a quick-start build, use the following code. For more options and detail, refer to Building documentation.

cd docs
pip3 install -r sphinx/requirements.txt
python3 -m sphinx -T -E -b html -d _build/doctrees -D language=en . _build/html

Alternatively, a CMake build is supported:

cmake -B build
cmake --build build --target=doc
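Once either build finishes, the generated HTML can be previewed locally; the sketch below uses Python's built-in web server (the port is an arbitrary choice, and --directory requires Python 3.7+).

```shell
# Serve the Sphinx output for local preview; stop with Ctrl-C.
python3 -m http.server 8000 --directory _build/html
```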

Older ROCm releases

For release information for older ROCm releases, refer to the CHANGELOG.

amdmigraphx's People

Contributors

aaronenyeshi, ahsan-ca, apwojcik, attila-dusnoki-htec, bpickrel, cagery, causten, charliel7, dependabot[bot], github-actions[bot], gyulaz-htec, igormirosavljevichtec, jerryyin, kahmed10, krzysz00, lakhinderwalia, manupak, mei-ye, mirza-halilcevic, mvermeulen, nives-vukovic, pfultz2, ravil-mobile, scxiao, shivadbhavsar, tedthemistokleous, turneram, tvukovic-amd, umangyadav, wsttiger


amdmigraphx's Issues

Add a context object to compute

For backends like MIOpen, we need to pass in a MIOpen handle. Currently this is passed through the argument class, which requires shape to have an any_type; that is rather ambiguous.

Instead the compute method could take a context object that stores the miopen handle. Since the context object is target-dependent, this can be set up by the target during compile.

We may need a mechanism to set the context from the target for programs that are already compiled. Perhaps an overload of eval that takes a target.

Add attribute for aliased output

To keep better track of usage, it would be nice to have an attribute indicating when the output of an operator is an alias of an input argument.

For many operators in MIOpen, the output buffer is passed in as a parameter. It would be good to know that the result argument is an alias of this parameter.

Failures reading ONNX files

This issue is partly to document what I see trying model zoo models and partly to sync on specific versions. Here is what I see trying each of the five designated models with code as of 10/1:

resnet50 - using https://github.com/onnx/models/tree/master/resnet50, release 1.3

Produces list of nodes w/o complaint

inception_v2 - using https://github.com/onnx/models/tree/master/inception_v2, release 1.3

@481 = @literal{ ... } -> float_type, {64}, {1}
@482 = @literal{ ... } -> float_type, {64}, {1}
@483 = @literal{ ... } -> float_type, {64}, {1}
@484 = @literal{ ... } -> float_type, {64}, {1}
@485 = @literal{ ... } -> float_type, {64, 3, 7, 7}, {147, 49, 7, 1}
data_0 = @param:data_0 -> float_type, {1, 3, 224, 224}, {150528, 50176, 224, 1}
@487 = convolution[padding={3, 3}, stride={2, 2}, dilation={1, 1}] -> float_type, {1, 64, 112, 112}, {802816, 12544, 112, 1}
@488 = batch_norm_inference(@487,@484,@483,@482,@481) -> float_type, {1, 64, 112, 112}, {802816, 12544, 112, 1}
@489 = unknown:Unsqueeze(@480) -> float_type, {64}, {1}

terminate called after throwing an instance of 'migraph::exception'
what(): /home/mev/source/MIGraph/src/include/migraph/check_shapes.hpp:66: Dimensions do not match
Aborted (core dumped)

mobilenet - using https://github.com/onnx/models/tree/master/models/image_classification/mobilenet

Passes without complaint, but does give one unknown operator:

mev@cafayate:~/source/MIGraph/build$ ./src/onnx/read_onnx /home/mev/dockerx/models/mobilenetv2-1.0/mobilenetv2-1.0.onnx | grep unknown
@420 = unknown:GlobalAveragePool(@419) -> float_type, {1, 1280, 7, 7}, {62720, 49, 7, 1}

mnist - using https://github.com/onnx/models/tree/master/mnist, onnx version 1.2

@0 = @literal{-0.044856, 0.00779166, 0.0681008, 0.0299937, -0.12641, 0.140219, -0.0552849, -0.0493838, 0.0843221, -0.0545404} -> float_type, {1, 10}, {10, 1}
@1 = @literal{256, 10} -> int64_type, {2}, {1}
@2 = @literal{ ... } -> float_type, {16, 4, 4, 10}, {160, 40, 10, 1}
@3 = @literal{1, 256} -> int64_type, {2}, {1}
@4 = @literal{ ... } -> float_type, {16, 1, 1}, {1, 1, 1}
@5 = @literal{ ... } -> float_type, {16, 8, 5, 5}, {200, 25, 5, 1}
@6 = @literal{-0.16154, -0.433836, 0.0916414, -0.0168522, -0.0650264, -0.131738, 0.0204176, -0.12111} -> float_type, {8, 1, 1}, {1, 1, 1}
@7 = @literal{ ... } -> float_type, {8, 1, 5, 5}, {25, 25, 5, 1}
Input3 = @param:Input3 -> float_type, {1, 1, 28, 28}, {784, 784, 28, 1}
@9 = convolution[padding={0, 0}, stride={1, 1}, dilation={1, 1}] -> float_type, {1, 8, 24, 24}, {4608, 576, 24, 1}

terminate called after throwing an instance of 'migraph::exception'
what(): /home/mev/source/MIGraph/src/include/migraph/check_shapes.hpp:66: Dimensions do not match
Aborted (core dumped)

yolov3 - couldn't find v3 in the model zoo; tried https://github.com/onnx/models/tree/master/tiny_yolov2, version 1.2

Some unknown operators, but no crash:

mev@cafayate:~/source/MIGraph/build$ ./src/onnx/read_onnx /home/mev/dockerx/models/tiny_yolov2/model.onnx | grep unknown
[libprotobuf INFO google/protobuf/io/coded_stream.cc:610] Reading dangerously large protocol message. If the message turns out to be larger than 2147483647 bytes, parsing will be halted for security reasons. To increase the limit (or to disable these warnings), see CodedInputStream::SetTotalBytesLimit() in google/protobuf/io/coded_stream.h.
[libprotobuf WARNING google/protobuf/io/coded_stream.cc:81] The total number of bytes read was 63480670
@43 = unknown:ImageScaler(image) -> float_type, {0, 3, 416, 416}, {519168, 173056, 416, 1}
@46 = unknown:LeakyRelu(@45) -> float_type, {0, 16, 414, 414}, {2742336, 171396, 414, 1}
@50 = unknown:LeakyRelu(@49) -> float_type, {0, 32, 205, 205}, {1344800, 42025, 205, 1}
@54 = unknown:LeakyRelu(@53) -> float_type, {0, 64, 100, 100}, {640000, 10000, 100, 1}
@58 = unknown:LeakyRelu(@57) -> float_type, {0, 128, 48, 48}, {294912, 2304, 48, 1}
@62 = unknown:LeakyRelu(@61) -> float_type, {0, 256, 22, 22}, {123904, 484, 22, 1}
@66 = unknown:LeakyRelu(@65) -> float_type, {0, 512, 9, 9}, {41472, 81, 9, 1}
@70 = unknown:LeakyRelu(@69) -> float_type, {0, 1024, 6, 6}, {36864, 36, 6, 1}
@73 = unknown:LeakyRelu(@72) -> float_type, {0, 1024, 4, 4}, {16384, 16, 4, 1}

Add Gemm

There are two ONNX operators corresponding to matrix multiplication: MatMul and Gemm. We have implemented MatMul, but we are not reading in Gemm, which results in an unknown operator. Add Gemm to the frontend ONNX parser. Gemm also needs a few more parameters: transA, transB, alpha, beta.
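For reference, the ONNX Gemm operator computes the following (per the ONNX operator specification), where op applies the optional transpose controlled by the transA/transB attributes:

```latex
Y = \alpha \,\mathrm{op}(A)\,\mathrm{op}(B) + \beta\, C,
\qquad
\mathrm{op}(X) =
\begin{cases}
X^{T} & \text{if the corresponding trans attribute is } 1 \\
X & \text{otherwise}
\end{cases}
```

The defaults are alpha = 1.0, beta = 1.0, and transA = transB = 0, in which case Gemm with C = 0 reduces to a plain matrix product like MatMul.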

Add scheduling pass

When the instruction ordering is fixed, memory coloring can only do a limited job of reducing the memory footprint. Add a pass to reorder instructions, which also has the potential to interleave computation and memory copies to improve throughput.

Improve coverage of unit tests for memory coloring

Here are the latest coverage reports:

There are several areas where it seems important to have code coverage. Ideally we should have full coverage for memory_coloring_impl::allocate and memory_coloring_impl::build (although we can skip empty programs).

The check for invalid offsets in rewrite seems important to cover as well, unless we want to make that check an assert. We don't need coverage for the unify_literals check.

Also, it would be good to have coverage for the ordering, especially since the last else uses > instead of <, here. It's not at all clear why the operator changes, so a test demonstrating its usefulness would be good.

Add batch norm support

  • Add a dummy operator for batch norm, computing just the shapes.
  • Add batch norm for the CPU backend.
  • Test the added batch norm.
  • Add MIOpen batch norm support.

Add operand alias to operators

We need to find a way to annotate operand aliasing in operators/operations/instructions so that program analysis does not need specialized checks for each operator.

Tasks for July

This is a to-do list of tasks to finish in July. Instead of creating an issue for each task, let's keep one issue per release or month and discuss the tasks there.
@adityaatluri

  • Add batch norm cpu implementation for both inference and training along with tests.
  • Add GPU kernel for batch norm for both inference and training along with tests.
  • Add the AMD copyright and MIT license to all files.

@pfultz2

@wsttiger

Test failure observed with Ubuntu 18.04 + ROCm 1.8 and latest MIGraph (2018-09-19)

Unit test error:
The following tests FAILED:
17 - test_gpu_miopen (Failed)
Errors while running CTest

Configuration information:

  1. Built in a docker container from the "Dockerfile" definition in MIGraph sources.
    env CXX=/opt/rocm/hcc/bin/hcc cmake ..
    make
    make check

    *** This is not ROCm 1.9 (I misspoke in the meeting; I installed ROCm 1.9 on my system, but this was run in the container, which uses ROCm 1.8) ***

  2. /etc/issue
    Ubuntu 18.04.1 LTS \n \l

  3. uname -a
    Linux yarumal 4.15.0-34-generic #37-Ubuntu SMP Mon Aug 27 15:21:48 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

As a separate exercise I will try outside the docker container which does have rocm-dev 1.9.211 installed.

Release 0.1 definition

GOAL:

Initial demonstration for SC '18, available outside team

IS/IS NOT:

  • IS single GPU, IS NOT multiple GPU
  • IS inference, IS NOT training
  • IS ONNX file input IS NOT framework integration

Goals:

Dialect goal:
  • Enumerated list of models (model zoo): resnet50, inception, mobilenet, mnist, yolov3

Performance goals:
  • resnet50 - faster than TF
  • Measure performance on the enumerated models
  • Measure memory improvement - run with and without the pass

Delivery:
  • Timing soon enough for a great demo (from outside the team)
  • Repo is public and tagged

Task areas [can remove these as we have issues tagged]

Quality Assurance

  • Fix CPU verification of ONNX files
    • Better reduction of ONNX files

ONNX model zoo resnet50 uses operators not supported in MIGraph

To reproduce:

  1. Download the resnet50 tarfile from https://github.com/onnx/models/tree/master/resnet50
  2. Run src/read_onnx on the model.onnx file from the package above

MIGraph prints the graph as it is read in:
./src/onnx/read_onnx resnet50/model.onnx | grep unknown
@284 = unknown:Sum(@281,@283) -> float_type, {1, 256, 56, 56}, {802816, 3136, 56, 1}
@294 = unknown:Sum(@293,@285) -> float_type, {1, 256, 56, 56}, {802816, 3136, 56, 1}
@304 = unknown:Sum(@303,@295) -> float_type, {1, 256, 56, 56}, {802816, 3136, 56, 1}
@316 = unknown:Sum(@313,@315) -> float_type, {1, 512, 28, 28}, {401408, 784, 28, 1}
@326 = unknown:Sum(@325,@317) -> float_type, {1, 512, 28, 28}, {401408, 784, 28, 1}
@336 = unknown:Sum(@335,@327) -> float_type, {1, 512, 28, 28}, {401408, 784, 28, 1}
@346 = unknown:Sum(@345,@337) -> float_type, {1, 512, 28, 28}, {401408, 784, 28, 1}
@358 = unknown:Sum(@355,@357) -> float_type, {1, 1024, 14, 14}, {200704, 196, 14, 1}
@368 = unknown:Sum(@367,@359) -> float_type, {1, 1024, 14, 14}, {200704, 196, 14, 1}
@378 = unknown:Sum(@377,@369) -> float_type, {1, 1024, 14, 14}, {200704, 196, 14, 1}
@388 = unknown:Sum(@387,@379) -> float_type, {1, 1024, 14, 14}, {200704, 196, 14, 1}
@398 = unknown:Sum(@397,@389) -> float_type, {1, 1024, 14, 14}, {200704, 196, 14, 1}
@408 = unknown:Sum(@407,@399) -> float_type, {1, 1024, 14, 14}, {200704, 196, 14, 1}
@420 = unknown:Sum(@417,@419) -> float_type, {1, 2048, 7, 7}, {100352, 49, 7, 1}
@430 = unknown:Sum(@429,@421) -> float_type, {1, 2048, 7, 7}, {100352, 49, 7, 1}
@440 = unknown:Sum(@439,@431) -> float_type, {1, 2048, 7, 7}, {100352, 49, 7, 1}
@448 = unknown:Softmax(@447) -> float_type, {1, 1000}, {1000, 1}
