Comments (7)
@nv-kmcgill53 @mc-nv do you know if it is possible to compile the ARM version of Triton (i.e. for Raspberry Pi, Jetson, ...) on an x86 machine?
from server.
It should be possible to compile on x86, using emulation, to obtain arm64 binaries as long as all necessary dependencies are satisfied.
Tip
You can try using the Docker QEMU emulator for this purpose.
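The QEMU setup this tip refers to can be sketched as follows (a minimal sketch; the binfmt image and Ubuntu tag are assumptions, not from this thread, and a running Docker daemon is required):

```shell
# Register QEMU handlers so the x86 host can run arm64 containers.
docker run --privileged --rm tonistiigi/binfmt --install arm64

# Verify emulation works: under QEMU this should report "aarch64".
docker run --rm --platform linux/arm64 ubuntu:22.04 uname -m
```

Once binfmt is registered, any `docker run --platform linux/arm64 ...` container can serve as the emulated build environment.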
In the build.py script there is an option for platform architecture aarch64, but the script must run on the target platform.
You can change the script if some checks are blocking you from moving forward with the build.
compiling on the Jetson (natively) ends up saturating the RAM and rebooting the board
We usually build it on ARM machines. If you have an Apple Silicon machine, maybe you can try it there. Otherwise, the Docker QEMU emulator seems to be the way to go.
Thank you for your answers, @kthui and @mc-nv.
In the build.py script there is an option for platform architecture aarch64, but the script must run on the target platform.
However, compiling on the Jetson (natively) ends up saturating the RAM and rebooting the board. I believe this is still under development, which is why this section here about compiling for Jetson devices is empty.
If you have any suggestions simpler than Docker QEMU, which is not guaranteed to work, I would be happy to hear them.
Just to give you an update.
The Docker QEMU emulator was my only option, and it worked only for CPU after installing all the needed dependencies; it took a long time to compile even using 24 CPUs.
However, the backends failed. The problem is that the build.py script does not provide the arm64 backend, so it failed with the error collect2: error: ld returned 1 exit status,
which I believe is for x86 and not for the arm64/aarch64 architecture.
So now I am trying to find a way to compile the backends for the arm64/aarch64 architecture.
If you have any suggestions, I will be happy. : )
I cannot compile Triton with the --enable-gpu option in Docker QEMU, because there is no CUDA for arm64.
Below is my error:
CMake Error at /usr/share/cmake-3.27/Modules/FindCUDA.cmake:883 (message):
Specify CUDA_TOOLKIT_ROOT_DIR
Call Stack (most recent call first):
CMakeLists.txt:6 (find_package)
Below is my command to compile in Docker QEMU. (I am compiling without Docker, as Docker cannot run inside Docker with a different architecture, so I cannot forward the docker.sock.)
./build.py --target-platform linux --target-machine aarch64 -j 12 --enable-logging --enable-stats --enable-metrics --enable-gpu-metrics --enable-cpu-metrics --enable-tracing --enable-nvtx --enable-gpu --enable-mali-gpu --endpoint grpc -v --no-container-build --build-dir /home/qemu/build_triton_all/triton_onnxruntime/server/build --backend onnxruntime
Below is my Docker QEMU environment:
Linux 2c1b2a62f0a8 6.5.0-18-generic #18~22.04.1-Ubuntu SMP PREEMPT_DYNAMIC Wed Feb 7 11:40:03 UTC 2 aarch64 aarch64 aarch64 GNU/Linux
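For what it's worth, CMake's FindCUDA module can also discover the toolkit by locating nvcc on PATH, so if an aarch64 CUDA toolkit does exist inside the emulated environment, a sketch like this might clear the error before re-running build.py (the install path is an assumption, not from this thread):

```shell
# Assumption: an aarch64 CUDA toolkit is installed at /usr/local/cuda
# inside the emulated environment; adjust the path to your install.
export PATH=/usr/local/cuda/bin:$PATH

# FindCUDA derives CUDA_TOOLKIT_ROOT_DIR from the nvcc it finds on PATH,
# so this should resolve before re-running build.py:
which nvcc
```

Alternatively, passing -DCUDA_TOOLKIT_ROOT_DIR=/usr/local/cuda to the generated CMake invocation does the same thing explicitly.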
Another question: could you please confirm that the latest version of Triton Inference Server cannot run on the Nvidia Jetson Nano?
Thank you
Hello @kthui @mc-nv @dyastremsky
Actually, you cannot compile full Triton and its backends with Docker QEMU.
Nor can you compile it on target hardware like a Raspberry Pi or Jetson Nano device.
The problem is that compiling Triton alone needs a lot of resources (RAM, CPU cores/threads), and then for the backends you need Docker to compile them; see here: https://github.com/triton-inference-server/onnxruntime_backend/blob/0825c357a226c9e4657a24895302557a211b13d8/CMakeLists.txt#L320
So you need to forward docker.sock into Docker QEMU, but this will not work because docker.sock belongs to my x86 host machine. Therefore, you cannot fully compile Triton.
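A possible alternative to forwarding docker.sock: with QEMU binfmt registered on the host, docker buildx can produce arm64 images directly from the x86 host, so no docker-in-docker is needed. A generic sketch (the builder name, image tag, and Dockerfile are illustrative, not the actual one generated by the backend's CMake):

```shell
# Create and select a buildx builder on the x86 host.
docker buildx create --use --name arm64builder

# Cross-build an arm64 image on the host; QEMU handles the emulation
# transparently during the build. --load puts the result into the local
# image store.
docker buildx build --platform linux/arm64 -t mybackend:arm64 --load .
```

Because the build runs against the host's own Docker daemon, the docker.sock architecture mismatch described above does not arise.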
Could you please confirm that? @kthui @dyastremsky @mc-nv
Thank you : )