Rust Bindings For Nvidia's TensorRT Deep Learning Library.
See tensorrt/README.md for information on the Rust library. See tensorrt-sys/README.md for information on the wrapper library for TensorRT.
Rust library for running TensorRT accelerated deep learning models
Issue tracking the completion of bindings to the nvinfer1::IRuntime
interface in the TensorRT library. All public functions available in the C++ interface should have corresponding Rust bindings.
Issue tracking the completion of bindings to the nvinfer1::ICudaEngine
interface from the TensorRT library. All public functions available from the C++ library should have corresponding Rust bindings.
Currently we can only successfully run tests for tensorrt-sys on CI. When attempting to run the tests for tensorrt-rs on CI we get some ugly runtime errors that I think are related to dynamic linking at runtime inside the Docker container we're using.
I am still working on a consistent reproduction, but I have a strong feeling that this is related to some issue in the Docker container.
The ndarray crate is a popular crate for creating and manipulating multi-dimensional arrays and matrices in Rust, similar to the popular Python library NumPy. ndarray is becoming a standard backbone data type for ML applications in Rust.
For good ergonomics and integration within the wider Rust ML community we should add support for using ndarray types directly with tensorrt-rs.
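To make the idea concrete, here is a minimal sketch of what the integration boils down to: TensorRT consumes contiguous row-major buffers, and ndarray can expose its data as a flat slice when the layout is contiguous. A tiny stand-in array type is used below so the example runs without the ndarray crate itself; the names and shapes are illustrative only.

```rust
// Stand-in for an ndarray::Array3<f32>: row-major (C-order) storage, which is
// ndarray's default layout and what TensorRT expects for input buffers.
struct Array3 {
    shape: (usize, usize, usize),
    data: Vec<f32>,
}

impl Array3 {
    fn zeros(shape: (usize, usize, usize)) -> Self {
        let len = shape.0 * shape.1 * shape.2;
        Array3 { shape, data: vec![0.0; len] }
    }

    // Mirrors ndarray's as_slice(): a flat contiguous view suitable for
    // copying into a device buffer before inference.
    fn as_slice(&self) -> &[f32] {
        &self.data
    }
}

fn main() {
    let input = Array3::zeros((3, 224, 224)); // CHW image tensor
    let flat = input.as_slice();
    assert_eq!(flat.len(), 3 * 224 * 224);
    assert_eq!(input.shape, (3, 224, 224));
}
```

With real ndarray types, tensorrt-rs could accept `ArrayView<f32, D>` directly and reject non-contiguous views (where `as_slice()` returns `None`) with a clear error.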
Hello, as I've seen here, Error is now anyhow::Error.
As the description of it at crates.io suggests:
This library provides anyhow::Error, a trait object based error type for easy idiomatic error handling in Rust applications.
Generally it is nice to use for applications, in order to merge all errors into one Error type and have a good way to attach context to errors, but it is not great for libraries.
As a solution, it's better to just use an enum or a struct(String) newtype. That will give the caller some context about the error. As an example from here:
use thiserror::Error;

/// Error which happened in the CUDA runtime
/// TODO: make this an enum
#[derive(Debug, Error, Clone)]
#[error("CUDA error: {0}")]
pub struct CudaError(String);

/// Error which occurred during inference
#[derive(Debug, Error, Clone)]
#[error("inference error: {0}")]
pub struct InferenceError(pub CudaError);

impl From<CudaError> for InferenceError {
    fn from(e: CudaError) -> Self {
        Self(e)
    }
}

impl Context {
    pub fn execute<D1: Dimension, D2: Dimension>(
        &self,
        input_data: ExecuteInput<D1>,
        mut output_data: Vec<ExecuteInput<D2>>,
    ) -> Result<(), InferenceError>;
}
To help new users get started we should have an example application that uses the library to run the test model to do object detection.
This application should cover the basics of parsing a .uff and serializing it to an .engine, passing data (in this case images) to the model, and displaying the model output.
The examples in the library should mirror the examples provided with the TensorRT library so people have a direct comparison of usage between the official samples and tensorrt-rs.
Issue tracking the completion of bindings to the nvinfer1::IHostMemory
interface in the TensorRT library. All public functions that are available in the C++ library should have corresponding Rust bindings.
Issue tracking the completion of bindings to the nvinfer1::IUffParser
interface in TensorRT. All public functions that are available in the C++ library should have corresponding Rust bindings.
Is your feature request related to a problem? Please describe.
Currently using the library requires a user to have TensorRT pre-installed on the system. This makes the experience of using the library via crates.io less than ideal for new users.
Describe the solution you'd like
I would like to include the static versions of the TensorRT libraries with the crate so that end users don't have to worry about having the correct version of TensorRT's shared libraries installed on the system.
Describe alternatives you've considered
Right now we use dynamic linking via .so's on Linux and .dll's on Windows. This works just fine but adds another step before someone can use the library. This is a very workable solution but not as pain free as static linking could be.
Additional context
It is possible that there could be issues with CUDA toolkit version incompatibility with TensorRT. I don't think it will be a major issue at this juncture but handling this situation or informing end users of any mismatch in a clear way is something that will need to be addressed when adding this feature.
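For illustration, static linking would mostly come down to the build script emitting the right linker directives. The sketch below is a hypothetical build.rs: the library names, the `TENSORRT_LIB_DIR` variable, and the `vendor/lib` fallback are all assumptions, not the crate's actual build configuration.

```rust
// Hypothetical build.rs sketch for bundled static TensorRT archives.
// Returning the directives from a function keeps the logic testable.
fn link_directives(lib_dir: &str) -> Vec<String> {
    vec![
        // Where to look for the bundled .a archives (assumed layout).
        format!("cargo:rustc-link-search=native={}", lib_dir),
        // Assumed static library names; the real archives shipped by NVIDIA
        // would need to be verified before using these.
        "cargo:rustc-link-lib=static=nvinfer_static".to_string(),
        "cargo:rustc-link-lib=static=nvparsers_static".to_string(),
    ]
}

fn main() {
    let lib_dir =
        std::env::var("TENSORRT_LIB_DIR").unwrap_or_else(|_| "vendor/lib".to_string());
    for directive in link_directives(&lib_dir) {
        println!("{}", directive);
    }
}
```

The CUDA-version-mismatch concern above could be handled in the same script by probing the installed CUDA toolkit version and failing the build with a clear message.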
TensorRT engines support two execution modes, sync and async. Sync is already supported and doesn't require any CUDA Streams to run.
For async to be supported we need to have support for creating and passing CUDA streams to the execution context along with the data to be executed.
We may be able to get this support via https://github.com/bheisler/RustaCUDA since it's already wrapping the CUDA API.
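As a rough mental model of the async flow (not the real rustacuda or TensorRT API; `Stream`, `enqueue`, and `synchronize` below are toy stand-ins for the CUDA stream calls), the key property is that work is queued without blocking and only `synchronize` waits for completion:

```rust
// Toy stand-in for a CUDA stream: queued work is just recorded, and
// synchronize() "completes" it, mirroring the enqueue/cudaStreamSynchronize
// ordering the async execution path would rely on.
struct Stream {
    pending: Vec<String>,
}

impl Stream {
    fn new() -> Self {
        Stream { pending: Vec::new() }
    }

    // Queue work without blocking, like IExecutionContext::enqueue.
    fn enqueue(&mut self, op: &str) {
        self.pending.push(op.to_string());
    }

    // Block until all queued work has completed; returns how many
    // operations were drained.
    fn synchronize(&mut self) -> usize {
        let done = self.pending.len();
        self.pending.clear();
        done
    }
}

fn main() {
    let mut stream = Stream::new();
    stream.enqueue("copy input host->device");
    stream.enqueue("execute engine");
    stream.enqueue("copy output device->host");
    assert_eq!(stream.synchronize(), 3);
}
```

A real binding would pass the stream handle into the execution context alongside the input/output buffers, as described above.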
Is your feature request related to a problem? Please describe.
When converting a model from UFF to an engine you have to write that engine to a file if you don't want to convert the UFF file every time the program starts up.
To get around this there need to be bindings to engine.serialize and IHostMemory.
Describe the solution you'd like
A function on the engine type that calls serialize to get HostMemory that can be written to disk.
Describe alternatives you've considered
None since the functionality described isn't possible without this binding.
Additional context
Here is an example of writing an engine to a file from C++.
bool writeEngineToFile(ICudaEngine &engine, const std::string &fileName) {
    IHostMemory *serializedModel = engine.serialize();
    if (serializedModel) {
        std::ofstream modelBinaryFile(fileName, std::ios::out | std::ios::binary);
        if (modelBinaryFile.is_open()) {
            modelBinaryFile.write((char *) serializedModel->data(), serializedModel->size());
            modelBinaryFile.close();
            gLogInfo << "Serialized TensorRT engine to " << fileName << std::endl;
        } else {
            gLogError << "Failed to write TensorRT engine to file: " << fileName << std::endl;
            serializedModel->destroy();
            return false;
        }
        serializedModel->destroy();
    } else {
        gLogError << "Failed to serialize TensorRT engine" << std::endl;
        return false;
    }
    return true;
}
Hi! Thanks for building this library. I am using TensorRT in my project written in Python. Now I am porting my project from Python to Rust and want to use this repo, but I am not able to import it using Cargo.toml. I suspect this could be a CUDA version mismatch or a device-not-supported issue. I'd like to seek your help in building it.
Here is my environment:
In cargo.toml, I'm using
[dependencies]
tensorrt-rs = {git="https://github.com/mstallmo/tensorrt-rs" , branch="develop"}
This is the error I get when building.
Thanks again for working on this repo!
Regards,
Dixant
The UFF parser and functions related to it return booleans in a few locations to indicate success or failure when parsing the file. We should convert these boolean values to the Result type in Rust. We should also make sure that the file ends in .uff, not only that the location exists.
This should fit nicely into the Result that is already returned when we check that the path to the .uff file passed in to be parsed exists.
TensorRT has built-in Dims types for communicating the shape of inputs to the library. To be able to properly use the UFF parser and the library as a whole we need to have bindings to these types.
There is some tricky class inheritance here that might have to be dealt with, but we can cross that bridge after getting one type implemented to get a better idea of how the interaction from Rust to C++ will work for this case.
Link to dims documentation here:
https://docs.nvidia.com/deeplearning/sdk/tensorrt-archived/tensorrt-515/tensorrt-api/c_api/classnvinfer1_1_1_dims_c_h_w.html
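One way the inheritance question could be sidestepped, sketched below with illustrative names (not the crate's actual API): since C++'s DimsCHW ultimately just fills the base Dims array, Rust can use flat structs plus `From` conversions instead of mirroring the class hierarchy.

```rust
// nvinfer1::Dims::MAX_DIMS is 8 in this TensorRT generation.
const MAX_DIMS: usize = 8;

// Rust analogue of the base nvinfer1::Dims struct.
#[derive(Debug, PartialEq)]
struct Dims {
    nb_dims: i32,
    d: [i32; MAX_DIMS],
}

// Flat struct standing in for the derived DimsCHW class.
#[derive(Debug, Clone, Copy)]
struct DimsCHW {
    c: i32,
    h: i32,
    w: i32,
}

// The conversion plays the role C++ inheritance plays: a DimsCHW can be
// used anywhere a generic Dims is expected.
impl From<DimsCHW> for Dims {
    fn from(chw: DimsCHW) -> Self {
        let mut d = [0; MAX_DIMS];
        d[0] = chw.c;
        d[1] = chw.h;
        d[2] = chw.w;
        Dims { nb_dims: 3, d }
    }
}

fn main() {
    let input = DimsCHW { c: 3, h: 224, w: 224 };
    let dims: Dims = input.into();
    assert_eq!(dims.nb_dims, 3);
    assert_eq!(&dims.d[..3], &[3, 224, 224]);
}
```

The FFI layer would then only ever pass the flat Dims representation across the boundary.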
Issue tracking the completion of the bindings to the nvinfer1::IBuilder
interface from the TensorRT library. The Rust implementation should have bindings to all the publicly available functions in the nvinfer1::IBuilder
interface.
The current API for creating a Builder, Network, and Parser is very similar to the C++ API provided by TensorRT. This isn't exactly a bad thing, but using it in practice feels like there are a lot of construction steps that could be slimmed down. We should take a little time to look at different possibilities for APIs the crate could provide that are a little better than 1-to-1 mappings of the C++ APIs.
I'm not against keeping the types around as is, since they provide flexibility when building networks in situations that aren't exactly the same as our use cases. Maybe the answer is that it's better to build the abstraction in the application itself out of the types that the crate provides, rather than trying to force a use-case-specific API into a more general context.
Add support for building networks from ONNX-based models as well as UFF-based models, utilizing the TensorRT ONNX parser.
Is your feature request related to a problem? Please describe.
There is currently no automated CI running on each pull request to highlight any potential issues with new code changes.
Describe the solution you'd like
CI enabled ideally through GitHub Actions to keep everything clear and simple and avoid having to use a third-party service. This may not be feasible, as we have GPU and native library dependencies that need to be linked against that could cause major headaches with GitHub Actions. This is something that will have to be investigated further.
Describe alternatives you've considered
There isn't an alternative to having CI but we can look into other options outside of Github Actions if it will make GPU building and testing less of a headache. Cost will have to be considered greatly here.
Since the correctness of this library depends a lot on how it interacts with engine files that are created from the C++ library it would be greatly helpful to create a TensorFlow model that we can use for testing and example code.
I'm thinking something along the lines of a common object detector trained on ImageNet or something of the like. No matter what model is chosen, it should be something that's widely known and accessible to others.
We should keep the .hd5 and .uff model files stored somewhere accessible (AWS?) and then work on a .onnx version when support for that comes online.
Issue tracking the completion of bindings to the nvinfer1::INetworkDefinition
interface in the TensorRT library. All public functions available from the C++ library should have corresponding Rust bindings.
Issue tracking the completion of bindings to the nvinfer1::IExecutionContext
interface from the TensorRT library. All public functions available in the C++ interface should have corresponding Rust bindings.
We're going to need some good documentation for library examples and usage that will be published on docs.rs once we publish the crate to crates.io. We will also need to fill out the README so that people coming to the project have some background info and an explanation of what the project does.
List of things to document:
TensorRT
Plugins are a large part of TensorRT engine creation and execution. These plugins can be supplied by the library or custom written by the user.
I think the first pass at plugins should only focus on loading plugins that are included with the library.
Custom user plugins will be supported down the road but I think that will end up being more complicated than we need to start off.