Comments (4)
It is OK to enable both of them. TensorRT will choose a higher-precision kernel if it results in overall lower runtime, or if no low-precision implementation exists.
Read this for more details.
from mmdeploy.
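The flag combination described above can be sketched with the TensorRT Python API. This is a minimal sketch, assuming the `tensorrt` package is installed; the network definition and the INT8 calibrator that a real INT8 build needs are omitted, and the import is guarded only so the snippet reads and runs without TensorRT present.

```python
# Minimal sketch: enabling both FP16 and INT8 builder flags.
# Assumes the `tensorrt` package is installed; network definition and the
# INT8 calibrator required by real INT8 builds are omitted.
try:
    import tensorrt as trt
    HAVE_TRT = True
except ImportError:  # lets the sketch run without TensorRT installed
    HAVE_TRT = False


def make_mixed_precision_config(builder):
    """Return a builder config with both low-precision flags set.

    The flags are independent; with both set, TensorRT chooses a kernel
    precision per layer, falling back to higher precision when that is
    faster overall or when no low-precision implementation exists.
    """
    config = builder.create_builder_config()
    config.set_flag(trt.BuilderFlag.FP16)
    config.set_flag(trt.BuilderFlag.INT8)
    return config


if HAVE_TRT:
    logger = trt.Logger(trt.Logger.WARNING)
    builder = trt.Builder(logger)
    config = make_mixed_precision_config(builder)
```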
@grimoire That makes sense; however, Nvidia states "..There are three precision flags: FP16, INT8, and TF32, and they may be enabled independently..", so are you really supposed to enable more than one at the same time? Well, if it works, I guess it is fine. In any case, I will test this further and see what happens on my Jetson AGX Xavier.
Slightly off-topic: the link you sent states that "..TensorRT will still choose a higher-precision kernel if it results in overall lower runtime..."
However, if I enable FP16 on a GPU architecture without native FP16 support (e.g. my Quadro P2000), a warning is given:
[TRT] [W] Half2 support requested on hardware without native FP16 support, performance will be negatively affected.
Naturally, the default FP32 would have been the fastest here, but FP16 is still used instead. So does TensorRT actually pick the fastest kernel in this case?
According to my experiment, enabling both flags is slightly faster than INT8 or FP16 alone, so I guess TensorRT does some precision-related optimization.
And I guess TensorRT falls back to FP16 when a layer does not support INT8, regardless of the device. That is why performance is poor on devices without native FP16 support.
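One way to avoid the Half2 warning on hardware like the Quadro P2000 is to query the builder's platform capabilities before setting the flags. Again a sketch: `platform_has_fast_fp16` and `platform_has_fast_int8` are TensorRT builder attributes, and the guarded import is only so the snippet degrades gracefully without the library.

```python
# Sketch: only request the low-precision modes the platform natively
# supports, so the builder never emits the "Half2 support requested on
# hardware without native FP16 support" warning.
try:
    import tensorrt as trt
except ImportError:  # lets the sketch run without TensorRT installed
    trt = None


def choose_precision_flags(builder):
    """Return the low-precision builder flags this platform supports.

    On a GPU without fast FP16 (e.g. a Pascal-class Quadro P2000),
    skipping BuilderFlag.FP16 avoids the warning and the associated
    slowdown from emulated half precision.
    """
    flags = []
    if builder.platform_has_fast_fp16:
        flags.append(trt.BuilderFlag.FP16)
    if builder.platform_has_fast_int8:
        flags.append(trt.BuilderFlag.INT8)
    return flags


if trt is not None:
    builder = trt.Builder(trt.Logger(trt.Logger.WARNING))
    config = builder.create_builder_config()
    for flag in choose_precision_flags(builder):
        config.set_flag(flag)
```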
@grimoire I see, that is good to know. In that case, I think we can close this issue.
Related Issues (20)
- How to remove the "mmdeploy" domain from the model's inputs while keeping ONNX inference working
- [Bug] How do I use multiprocess or multithreaded loading models
- [Bug] Dynamic Axes Not supported with this tensorRT version
- [Bug] [mmdeploy] [error] [module_adapter.h:31] unhandled exception: invalid argument (1)
- [Bug] Issue with tools/deploy.py on centerpoint model
- [Bug] Detection is not possible depending on the aspect ratio of the defect.
- [Docs] ONNX export Optimizer
- [Feature] DSVT in mmdeploy
- [Bug] Converting to ONNX: Dynamic batch size on input gives all dynamic axes on output.
- [Bug] The C++ environment has been configured. When I use the exported onnx file, it shows that the model cannot be loaded during C++ inference.
- Fatal error: mmdeploy:MMCVModulatedDeformConv2d(-1) is not a registered function/op
- RTMO model conversion to ONNX
- [Bug] RTMDet model to ONNXRuntime or Torchscript doesn't produce masks
- [Bug] Failed to convert fcos-resnet50 to onnx
- [Bug] Detection TensorRT has zero output binding shape
- [Feature] Torchscript Backend support for Pytorch 2.1.0+ versions
- [Bug] When I deploy segmentation, the SDK produces discontinuous results: sometimes the result is normal, other times it is not.
- [Bug] Error when attempting model conversion for Rockchip RK3588
- [Docs] RTMO inference with TensorRT
- [Bug] Wrong Onnx Execution Provider - Error when binding input: There's no data transfer registered for copying tensors from Device:[DeviceType:1 MemoryType:0 DeviceId:0] to Device:[DeviceType:0 MemoryType:0 DeviceId:0]