
This project forked from habanaai/habana_custom_kernel


Habana Custom Kernel

This repository provides the examples to write and build Habana custom kernels using the HabanaTools.


TPC Kernels Overview

The Tensor Processor Core™ (TPC) is a fully programmable VLIW4 processor designed to execute non-linear deep learning operators. It is embedded in Habana’s Gaudi deep learning accelerator. The Gaudi SoC contains numerous TPC cores that operate in parallel, each running a single thread. The TPC uses a very long instruction word (VLIW) architecture and has a wide single instruction, multiple data (SIMD) vector unit that supports 2048-bit SIMD operations on data types such as float, bfloat16, INT16, INT32 and INT8. In each cycle, the TPC’s arithmetic logic unit (ALU) can execute up to 64 float or INT32 operations, 128 INT16 operations, or 256 INT8 operations. The TPC is intended for workloads that do not map to the Matrix Multiplication Engine (MME); such workloads or operators can be implemented as TPC kernels.
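The per-cycle figures above follow from dividing the 2048-bit vector width by the element width. A minimal Python sketch of that arithmetic is shown here; the function name lanes_per_vector is only illustrative and is not part of HabanaTools.

```python
# Illustrative arithmetic only; not a HabanaTools API.
VECTOR_WIDTH_BITS = 2048  # SIMD vector width stated in the overview


def lanes_per_vector(element_bits: int) -> int:
    """Number of elements of the given bit width that fit in one vector."""
    if element_bits <= 0 or VECTOR_WIDTH_BITS % element_bits != 0:
        raise ValueError("element width must evenly divide the vector width")
    return VECTOR_WIDTH_BITS // element_bits


for name, bits in [("float/INT32", 32), ("INT16", 16), ("INT8", 8)]:
    print(f"{name}: {lanes_per_vector(bits)} elements per vector")
```

This prints 64, 128 and 256 for the three element widths, matching the operation counts quoted above.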

Install Habanatools For Ubuntu

To retrieve the package, visit Habana Vault, click Artifact, find habanatools, and download the latest release package for Ubuntu 20.04. Packages for other operating systems are available there as well. Install the downloaded package with dpkg:

  sudo dpkg -i ./habanatools_1.14.0-493_amd64.deb
  • Once installed, the following files will be added to your machine:

    #   Location                                            Purpose
    1   /usr/bin/tpc-clang                                  TPC-C compiler and assembler
    2   /usr/bin/tpc-llvm-objdump                           TPC disassembler
    3   /usr/lib/habanatools/libtpcsim_shared.so            TPC simulator
    4   /usr/lib/habanatools/libtpc_tests_core.so           Test core library
    5   /usr/lib/habanatools/include/gc_interface.h         Glue code interface header
    6   /usr/lib/habanatools/include/tpc_test_core_api.h    Test core APIs
    7   /usr/lib/habanatools/include/tpc_test_core_types.h  Test core type definitions
  • Compiler usage example: the compiler supports a single translation unit, hence the ‘-c’ argument should be defined.

 /usr/bin/tpc-clang batch_norm_fwd_f32.c -c -x c++ -o batch_norm_fwd_f32.o

The output of the compilation is an ELF file named ‘batch_norm_fwd_f32.o’. To extract the raw binary from the ELF file, use the following command:

 objcopy -O binary --only-section=.text batch_norm_fwd_f32.o batch_norm_fwd_f32.bin 
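The two commands above can be generated for any kernel source file. The following Python sketch only assembles and prints the argument lists, using the flags shown above; the helper name build_commands and the choice to write outputs next to the current directory are assumptions made for illustration.

```python
import shlex
from pathlib import Path

TPC_CLANG = "/usr/bin/tpc-clang"  # compiler location listed in the table above


def build_commands(source: str):
    """Return the compile and objcopy argument lists for one kernel source."""
    stem = Path(source).stem
    obj, raw = f"{stem}.o", f"{stem}.bin"
    # Same flags as the compiler usage example: single translation unit (-c).
    compile_cmd = [TPC_CLANG, source, "-c", "-x", "c++", "-o", obj]
    # Keep only the .text section of the ELF file as a raw binary.
    extract_cmd = ["objcopy", "-O", "binary", "--only-section=.text", obj, raw]
    return compile_cmd, extract_cmd


# Dry run: show the commands instead of executing them.
for cmd in build_commands("batch_norm_fwd_f32.c"):
    print(shlex.join(cmd))
```

On a machine with HabanaTools installed, each list could be passed to subprocess.run(cmd, check=True) to perform the actual build.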

The template examples below show how to drive the compiler with CMake.

For other operating systems, refer to the TPC Tools Installation Guide. If you get an error such as a missing libTpcElfReader.so, make sure the /usr/lib/habanatools path is included in the LD_LIBRARY_PATH environment variable.
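When the tools are launched from a script, the library path can be added to the child environment rather than to the shell profile. This is a minimal sketch under that assumption; the helper name with_habanatools is hypothetical.

```python
import os

HABANATOOLS_LIB = "/usr/lib/habanatools"  # library directory named above


def with_habanatools(env=None):
    """Return a copy of env whose LD_LIBRARY_PATH includes the HabanaTools libs."""
    env = dict(os.environ if env is None else env)
    parts = [p for p in env.get("LD_LIBRARY_PATH", "").split(":") if p]
    if HABANATOOLS_LIB not in parts:
        parts.insert(0, HABANATOOLS_LIB)  # prepend so it is searched first
    env["LD_LIBRARY_PATH"] = ":".join(parts)
    return env


print(with_habanatools({"LD_LIBRARY_PATH": "/opt/lib"})["LD_LIBRARY_PATH"])
```

The returned dictionary can be passed as the env argument of subprocess.run when starting a tool that needs these libraries.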

Template Examples

The template examples show how to create and build custom kernels, which can later be used in TensorFlow (TF) and PyTorch (PT) custom ops. The template is organized into TPC kernels (kernels/), glue code (src/) and unit tests (tests/).

  • TPC kernel code is the program, in the TPC ISA, that the TPC processor executes. It contains the kernel implementation.
  • Glue code is executed on the host machine serviced by the Habana DNN SoC. It holds the specifications for how the program inputs and outputs can be dynamically partitioned among the numerous TPC processors in the Habana device.
  • Unit tests verify the kernel's correctness using the built-in simulator provided in the HabanaTools. The test core can also run tests on a real device and measure performance.
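To give a feel for what partitioning work among TPC processors means, the sketch below splits a one-dimensional index range into contiguous chunks, one per worker. It is purely conceptual: real glue code describes the partitioning through the structures in gc_interface.h, which this example does not model, and the name split_index_space is hypothetical.

```python
def split_index_space(total: int, workers: int):
    """Split range(total) into at most `workers` contiguous (start, end) chunks."""
    if total < 0 or workers <= 0:
        raise ValueError("total must be >= 0 and workers must be > 0")
    base, extra = divmod(total, workers)
    chunks, start = [], 0
    for w in range(workers):
        # The first `extra` workers take one additional element.
        size = base + (1 if w < extra else 0)
        if size == 0:
            break  # fewer elements than workers: remaining workers stay idle
        chunks.append((start, start + size))
        start += size
    return chunks


print(split_index_space(10, 4))  # [(0, 3), (3, 6), (6, 8), (8, 10)]
```

Each chunk is a half-open interval, so adjacent chunks neither overlap nor leave gaps.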

Build the template examples

Make sure the Habana tools are installed (check that /usr/bin/tpc-clang exists) and that CMake is up to date. You can download the latest CMake from https://cmake.org/download/

Clone the repository

 git clone git@github.com:HabanaAI/Habana_Custom_Kernel.git

In the terminal, make sure you are in the project root directory, then create a directory called build:

mkdir build
cd build

Then run the following commands:

cmake ..
make

After the build, you can find libcustom_tpc_perf_lib.so, your custom kernel library, in the build/src directory, and tpc_kernel_tests, which contains all the unit tests, in build/tests. For more details about writing TPC kernels, refer to the TPC User Guide.

Kernel Performance Measurement

The test core can measure custom kernel performance when the environment variables TPC_RUNNER=1 (make sure the hardware and driver are installed correctly) and HABANA_PROFILE=1 are set. Running a test then creates an hltv file. Run only one kernel test at a time, for example tpc_kernel_tests -t FilterFwd2DBF16Test. You can load the hltv file at https://hltv.habana.ai/ to inspect the performance visually. See the Habana Profiler documentation for more details.
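A profiling run can also be prepared from a script. The sketch below only builds the command and environment described above and prints the equivalent shell line; the binary path build/tests/tpc_kernel_tests assumes the script is started from the project root after the build, and the helper name profiling_run is hypothetical.

```python
import os
import shlex

# Assumed location of the test binary relative to the project root.
TEST_BINARY = "build/tests/tpc_kernel_tests"


def profiling_run(test_name: str, base_env=None):
    """Return (command, environment) for profiling a single kernel test."""
    env = dict(os.environ if base_env is None else base_env)
    env["TPC_RUNNER"] = "1"      # run on the real device
    env["HABANA_PROFILE"] = "1"  # produce the hltv profile
    cmd = [TEST_BINARY, "-t", test_name]  # one test per run
    return cmd, env


cmd, env = profiling_run("FilterFwd2DBF16Test")
print("TPC_RUNNER=1 HABANA_PROFILE=1", shlex.join(cmd))
```

On a machine with the hardware and driver installed, subprocess.run(cmd, env=env, check=True) would start the run.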

Custom Ops for TensorFlow and PyTorch

Users can also develop their own TensorFlow and PyTorch custom ops using the TPC kernels they create. This project provides several custom kernel examples, such as custom_div (division), relu6_fwd/relu_fwd (relu6/relu forward path) and relu6_bwd/relu_bwd (relu6/relu backward path). See the TensorFlow Custom Ops Examples and PyTorch Custom Ops Examples for more details, and make sure to add your custom kernel path to the GC_KERNEL_PATH environment variable, for example export GC_KERNEL_PATH=/path/to/your_so/libcustom_tpc_perf_lib.so:/usr/lib/habanalabs/libtpc_kernels.so.
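When checking a custom op against expected values, a scalar host-side reference for the relu and relu6 examples can be useful. The sketch below uses the usual mathematical definitions; it is not the repository's kernel code, which operates on whole tensors and may treat boundary values and data types differently. In particular, the zero gradient at the clamp boundaries is an assumption.

```python
def relu_fwd(x: float) -> float:
    """relu(x) = max(x, 0)."""
    return max(x, 0.0)


def relu6_fwd(x: float) -> float:
    """relu6(x) = min(max(x, 0), 6)."""
    return min(max(x, 0.0), 6.0)


def relu_bwd(grad: float, x: float) -> float:
    """Pass the incoming gradient through where the input was positive."""
    return grad if x > 0.0 else 0.0


def relu6_bwd(grad: float, x: float) -> float:
    """Pass the gradient through only where the forward output is not clamped.

    Assumption: the gradient is zero exactly at x == 0 and x == 6.
    """
    return grad if 0.0 < x < 6.0 else 0.0


for x in (-1.0, 3.0, 7.0):
    print(x, relu6_fwd(x), relu6_bwd(1.0, x))
```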

Contributors

zzhang37, greg-serochi, tzachicohen, amittal10, ebagelman
