I have configured an ensemble model in Triton Inference Server, which includes DALI pr

Handling Unsupported Input and Ensuring GPU Processing in Triton Inference Server

Bycqg commented on August 16, 2024

Initially, I used dali.py and configured a dali_backend model inside the ensemble model to preprocess images. With this configuration, uploading a file in a format that DALI cannot decode (e.g., GIF) made decoding fail, and the Triton service was killed.
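To illustrate the kind of input that triggers the failure described above, here is a minimal sketch of a signature check that could run before the bytes reach the decoder. The function name sniff_image_format and the SUPPORTED_SIGNATURES table are hypothetical, and the set of accepted formats (JPEG and PNG only) is an assumption that would need to match whatever the DALI pipeline is actually configured to decode.

```python
# Magic-byte prefixes for the formats this sketch ASSUMES the decoder accepts.
# Adjust the table to match the formats the real pipeline can decode.
SUPPORTED_SIGNATURES = {
    "jpeg": b"\xff\xd8\xff",
    "png": b"\x89PNG\r\n\x1a\n",
}


def sniff_image_format(data: bytes):
    """Return the format name if data starts with a known signature, else None."""
    for name, signature in SUPPORTED_SIGNATURES.items():
        if data.startswith(signature):
            return name
    return None


def is_supported_image(data: bytes) -> bool:
    """True when the payload looks like one of the assumed supported formats."""
    return sniff_image_format(data) is not None
```

A GIF payload starts with the ASCII bytes GIF87a or GIF89a, so it matches neither signature and could be rejected with an error response instead of being handed to the decoder. The check only inspects the header; a truncated or corrupt file with a valid header can still fail during decoding.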

I have now switched to python_backend and moved the DALI preprocessing into model.py, wrapping it in try...except so that a decoding failure no longer kills the Triton service. However, I found that the pb_utils.Tensor() constructor only accepts arguments of type (str, numpy.ndarray), which forces me to copy the DALI GPU output back to the CPU. My intention was to pass the data directly from DALI on the GPU to TensorRT on the GPU, which I believe would be more efficient. Must DALI GPU data be copied back to the CPU before it is passed on? If not, how can I achieve this (preferably with code examples or documentation)?
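For reference, the Python backend also exposes a DLPack entry point, pb_utils.Tensor.from_dlpack, alongside the NumPy-based constructor. The sketch below shows how it might be wrapped. The helper name gpu_output_tensor is hypothetical, pb_utils is passed in as a parameter only so the helper can be imported outside a running Triton server, and gpu_array is assumed to be a device array exposing __dlpack__() (for example a CuPy or PyTorch CUDA array). How to obtain such an array from a DALI GPU output is not shown here and may depend on the DALI version in use.

```python
def gpu_output_tensor(pb_utils, name, gpu_array):
    """Wrap a device array as a Triton tensor through DLPack.

    pb_utils  -- the triton_python_backend_utils module (injected by the caller)
    name      -- output tensor name as declared in config.pbtxt
    gpu_array -- ASSUMED to expose __dlpack__(), e.g. a CuPy or PyTorch CUDA array
    """
    if not hasattr(gpu_array, "__dlpack__"):
        raise TypeError(
            f"{type(gpu_array).__name__} does not expose __dlpack__"
        )
    # from_dlpack takes the tensor name and a DLPack capsule; no NumPy
    # conversion is involved in this call.
    return pb_utils.Tensor.from_dlpack(name, gpu_array.__dlpack__())
```

Inside model.py this would be called as gpu_output_tensor(pb_utils, "OUTPUT0", array), where "OUTPUT0" is a placeholder name. Whether the tensor then stays on the GPU all the way to the TensorRT model depends on the server and model configuration, so this is a starting point to verify rather than a confirmed solution.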
