Viam Triton MLModel Service

Intro

A Viam MLModelService resource backed by NVIDIA's Triton Inference Server. You can read a tutorial on how to set it up on the Viam docs site.

(In)stability Notice

This module is still under active development.

Prerequisites

  • An NVIDIA Jetson Orin board with JetPack 5 installed. Using an NVMe SSD is strongly recommended over an SD card.

Install

  • Ensure that the NVIDIA Container Runtime is installed: sudo apt-get install nvidia-container. Note that nvidia-container is part of nvidia-jetpack, so if you have JetPack installed on the board you probably already have it, but it is worth running the command above to make sure. A quick sanity check is shown below.
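
The following commands are one way to confirm the runtime is present; the second check assumes Docker is the container engine in use:

# Confirm the NVIDIA container packages are installed
$ dpkg -l | grep nvidia-container

# Confirm the container engine has the nvidia runtime registered (assumes Docker)
$ sudo docker info | grep -i runtime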

Registering the Module with a Robot

Follow the instructions to add a modular service to your robot: search for "triton", then select the desired version of the module from the Registry.

Creating a Model Repository

This module currently requires manual setup of a model repository under the ~/.viam directory of the user who will run viam-server. Here, we place the model repository under ~/.viam/triton/repository, but the exact subpath under ~/.viam does not matter.

For instance, to add the EfficientDet-Lite4 object detection model, you should have a layout like this after unpacking the model files:

$ tree ~/.viam
~/.viam
├── cached_cloud_config_05536cf6-f8a6-464f-b05c-bf1e57e9d1d9.json
└── triton
    └── repository
        └── efficientdet-lite4-detection
            ├── 1
            │   └── model.savedmodel
            │       ├── saved_model.pb
            │       └── variables
            │           ├── variables.data-00000-of-00001
            │           └── variables.index
            └── config.pbtxt

The config.pbtxt file must exist, but at least for TensorFlow models it can be empty. The version here is 1, but it can be any positive integer; newer versions are preferred by default.
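
As an illustration, the following commands sketch one way to build that layout for the EfficientDet-Lite4 model. The TensorFlow Hub URL and model version shown here are assumptions; substitute the archive for whichever model you actually intend to serve.

# Create the repository layout (paths and model name are examples)
$ mkdir -p ~/.viam/triton/repository/efficientdet-lite4-detection/1/model.savedmodel

# Download and unpack a TensorFlow SavedModel into the version directory
# (the URL is an assumption; use the archive for your own model)
$ curl -L "https://tfhub.dev/tensorflow/efficientdet/lite4/detection/2?tf-hub-format=compressed" \
    | tar -xz -C ~/.viam/triton/repository/efficientdet-lite4-detection/1/model.savedmodel

# The config file must exist, though it may be left empty for TensorFlow models
$ touch ~/.viam/triton/repository/efficientdet-lite4-detection/config.pbtxt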

Instantiating the module for a deployed model

The next step is to create an instance of the resource this module serves. This goes in the services section of your robot's JSON configuration.

A minimal configuration looks like:

  ...
  "services": [
    ...
    {
      "type": "mlmodel",
      "attributes": {
        "model_name": "efficientdet-lite4-detection",
        "model_repository_path": "/path/to/.viam/triton/repository",
      },
      "model": "viam:mlmodelservice:triton",
      "name": "mlmodel-effdet-triton"
    },
    ...
  ],
  ...

A complete configuration, specifying many optional parameters, might look like:

  ...
  "services": [
    ...
    {
      "type": "mlmodel",
      "attributes": {
        "backend_directory": "/opt/tritonserver/backends",
        "model_name": "efficientdet-lite4-detection",
        "model_version": 1,
        "model_repository_path": "/path/to/.viam/triton/repository",
        "preferred_input_memory_type_id": 0,
        "preferred_input_memory_type": "gpu",
        "tensor_name_remappings": {
          "outputs": {
            "output_3": "n_detections",
            "output_0": "location",
            "output_1": "score",
            "output_2": "category"
          },
          "inputs": {
            "images": "image"
          }
        }
      },
      "model": "viam:mlmodelservice:triton",
      "name": "mlmodel-effdet-triton"
    },
    ...
  ],
  ...

The type field must be mlmodel and the model field must be viam:mlmodelservice:triton, but you may choose any value for the name field. The following attribute-level configuration options are available:

  • model_name [required]: The model to be loaded from the repository.

  • model_repository_path [required]: The (container-side) path to a model repository. Note that this must be a subdirectory of the $HOME/.viam directory of the user running viam-server.

  • backend_directory [optional, default determined at build time]: A container-side path to the Triton Server "backend" directory. You normally do not need to override this; the build sets it to the backend directory of the Triton Server installation in the container. Set it only if you wish to use a different set of backends.

  • model_version [optional, defaults to -1, meaning 'newest']: The version of the model to be loaded. If not specified, the module will use the newest version of the model named by model_name.

  • preferred_input_memory_type [optional, see below for default]: One of cpu, cpu-pinned, or gpu. This controls the type of memory the module allocates for input tensors. If not specified, it defaults to cpu if no CUDA-capable devices are detected at runtime, or to gpu if CUDA-capable devices are found.

  • preferred_input_memory_type_id [optional, defaults to 0]: The CUDA device ID on which to allocate gpu or cpu-pinned input tensors. The default of 0 selects the first device; you probably do not need to change this unless you have multiple GPUs.

  • tensor_name_remappings [optional, defaults to {}]: Provides two dictionaries under the inputs and outputs keys that rename the model's tensors. Higher-level services may expect tensors with particular names (e.g., the Viam vision service). Use these maps to rename the tensors from the loaded model as needed to meet those requirements.

Connecting the Viam Vision Service

If all has gone well, you can now create a Viam vision service with a configuration like the following:

  ...
  "services": [
    ...
    {
      "attributes": {
        "mlmodel_name": "mlmodel-effdet-triton"
      },
      "model": "mlmodel",
      "name": "vision-effdet-triton",
      "type": "vision"
    },
    ...
  ],
  ...

You can now connect this vision service to a transform camera, or get detections programmatically via any Viam SDK.

Verifying that the GPU is in use

We recommend using the jtop utility on Jetson boards to monitor GPU usage and confirm that Triton is accelerating inference on the GPU.
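
If jtop is not already present, it is provided by the jetson-stats package; a typical install (assuming pip3 is available on the board) looks like:

# Install jetson-stats, which provides jtop (a reboot may be required before first use)
$ sudo pip3 install -U jetson-stats

# Watch GPU utilization while the vision service runs inference
$ jtop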

Contributors

acmorrow, bhaney, abe-winter
