Comments (3)
You can add another output with the same name as the output state if you want to return it to the client. https://github.com/triton-inference-server/server/blob/main/docs/user_guide/architecture.md#implicit-state-management
For debugging purposes, the client can request the output state. In order to allow the client to request the output state, the output section of the model configuration must list the output state as one of the model outputs. Note that requesting the output state from the client can increase the request latency because of the additional tensors that have to be transferred.
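To make this concrete, here is a minimal, hypothetical `config.pbtxt` sketch. The state tensor name (`OUTPUT_STATE`), data type, and dims are illustrative placeholders, not taken from this thread; the key point is that the same name appears both in the `sequence_batching` `state` section and in the model's `output` section.

```
sequence_batching {
  state [
    {
      input_name: "INPUT_STATE"
      output_name: "OUTPUT_STATE"
      data_type: TYPE_INT32
      dims: [ -1 ]
    }
  ]
}
output [
  {
    # Listing the output state here allows the client to request it
    # (at the cost of transferring the extra tensor per request).
    name: "OUTPUT_STATE"
    data_type: TYPE_INT32
    dims: [ -1 ]
  }
]
```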
Is it the only way to extract states?
Currently, yes, that's the only way. Let us know if you have ideas for other ways to extract states as well.