liu-mengyang / trt-fairmot


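This project ports FairMOT, with DLA34 as its backbone network, to TensorRT for accelerated inference. It is also an entry in the NVIDIA-Alibaba Cloud Hackathon 2021 competition.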

Python 34.91% C++ 34.26% C 4.11% Cuda 26.30% Makefile 0.26% Shell 0.16%

trt-fairmot's Introduction

Hi 👋

I am Mengyang.

Final-year M.S. student at SEU
I do mobile (edge) intelligence research
Now building an interesting ML system
TVM enthusiast
Mail to me: [email protected]


trt-fairmot's Issues

Are older CUDA versions supported?

Can this project support CUDA 10.2? It fails to compile for me on the NGC image nvcr.io/nvidia/tensorrt:20.03-py3.

The environment I set up on my own machine is:
Ubuntu 18.04
CUDA 11.1
PyTorch 1.8
TensorRT 7.2.2.3
The following errors occur at compile time:
make[1]: 进入目录“/home/j/桌面/trt-fairmot-main/TensorRT_ONNX_impl/build”
g++ -g -DEBUG -std=c++17 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ -fPIC -MD -MP -I../include -isystem /usr/local/cuda/include -isystem /usr/local/tensorrt7.2-cuda11.1/include -isystem /usr/local/lib/python3.8/dist-packages/torch/include -o obj/DCNv2Plugin.o -c ../plugins/DCNv2Plugin.cpp
g++ -g -DEBUG -std=c++17 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ -fPIC -MD -MP -I../include -isystem /usr/local/cuda/include -isystem /usr/local/tensorrt7.2-cuda11.1/include -isystem /usr/local/lib/python3.8/dist-packages/torch/include -o obj/DCNv2PluginDyn.o -c ../plugins/DCNv2PluginDyn.cpp
/usr/local/cuda/bin/nvcc -g -DEBUG -std=c++17 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ -M -MT obj/DeformConv.o -I../include -isystem /usr/local/cuda/include -isystem /usr/local/tensorrt7.2-cuda11.1/include -isystem /usr/local/lib/python3.8/dist-packages/torch/include -o obj/DeformConv.d ../plugins/DeformConv.cu
../plugins/DCNv2Plugin.cpp:2:10: fatal error: torch/script.h: 没有那个文件或目录
#include <torch/script.h>
^~~~~~~~~~~~~~~~
../plugins/DCNv2PluginDyn.cpp:2:10: fatal error: torch/script.h: 没有那个文件或目录
#include <torch/script.h>
^~~~~~~~~~~~~~~~
compilation terminated.
compilation terminated.
Makefile:39: recipe for target 'obj/DCNv2Plugin.o' failed
make[1]: *** [obj/DCNv2Plugin.o] Error 1
make[1]: *** 正在等待未完成的任务....
Makefile:39: recipe for target 'obj/DCNv2PluginDyn.o' failed
make[1]: *** [obj/DCNv2PluginDyn.o] Error 1
../plugins/DeformConv.cu:1:10: fatal error: ATen/ATen.h: 没有那个文件或目录
#include <ATen/ATen.h>
^~~~~~~~~~~~~
compilation terminated.
Makefile:34: recipe for target 'obj/DeformConv.o' failed
make[1]: *** [obj/DeformConv.o] Error 1
make[1]: 离开目录“/home/j/桌面/trt-fairmot-main/TensorRT_ONNX_impl/build”
Makefile:4: recipe for target 'all' failed
make: *** [all] Error 2

Has anyone run into the same error? I could not find a solution online.
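Both failing includes (`torch/script.h` and `ATen/ATen.h`) are looked up through the hard-coded `-isystem /usr/local/lib/python3.8/dist-packages/torch/include` flag visible in the log above, so one plausible cause is that PyTorch is installed in a different location on this machine. The following is a minimal diagnostic sketch, not part of the project: the helper names `torch_include_dir` and `missing_headers` are hypothetical, and it assumes the headers ship in an `include` directory inside the installed `torch` package.

```python
import importlib.util
from pathlib import Path


def torch_include_dir():
    """Return the include directory of the installed torch package, or None.

    Uses find_spec so torch itself is never imported.
    """
    spec = importlib.util.find_spec("torch")
    if spec is None or spec.origin is None:
        return None
    inc = Path(spec.origin).parent / "include"
    return inc if inc.is_dir() else None


def missing_headers(include_dir, headers=("torch/script.h", "ATen/ATen.h")):
    """List the headers named in the build log that are absent under include_dir."""
    root = Path(include_dir)
    return [h for h in headers if not (root / h).is_file()]


if __name__ == "__main__":
    # Path copied from the compiler command line in the log above.
    hard_coded = "/usr/local/lib/python3.8/dist-packages/torch/include"
    print("Missing under the Makefile path:", missing_headers(hard_coded))
    print("Include dir of the installed torch:", torch_include_dir())
```

If the second line prints a different directory than the one in the Makefile, pointing the Makefile's include path at that directory may be worth trying.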

About the speed result

First of all, thank you for the awesome contribution. I ran this project successfully and got the result below:

But when I tested the inference time without using this project, I got the results below:

model	gpu_memory_cost	ave_time	fps	model_inference	post_inference
fairmot_dla34	1295MiB	0.069	14	0.05	0.008
fairmot_dla34(MMCV)	1591MiB	0.062	14.57	0.038	0.008

As we can easily see, the acceleration is not as large as expected, and the GPU memory cost is higher than before.
Furthermore, I noticed that when I run python compare_onnx_fairmot.py, GPU memory usage reaches 6700 MB.
I think this needs to be checked more carefully. I would appreciate it if anyone could explain these results.
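When comparing numbers like these, it can help to time both pipelines with the same harness. Below is a minimal sketch of such a harness, assuming the inference call can be wrapped in a zero-argument function; the name `benchmark` and its parameters are hypothetical and do not come from this project.

```python
import time


def benchmark(fn, warmup=5, iters=20):
    """Time fn() after some warm-up calls; return average seconds per call and FPS."""
    for _ in range(warmup):
        fn()  # warm-up calls keep one-off costs (lazy init, caches) out of the average
    start = time.perf_counter()
    for _ in range(iters):
        fn()
    avg = (time.perf_counter() - start) / iters
    return {"avg_time_s": avg, "fps": 1.0 / avg if avg > 0 else float("inf")}


if __name__ == "__main__":
    # Stand-in CPU workload. For GPU inference, fn should block until the device
    # has finished (for example by calling torch.cuda.synchronize() inside fn,
    # assuming PyTorch is the framework in use), otherwise only the launch is timed.
    print(benchmark(lambda: sum(range(100_000))))
```

Because FPS is derived here as the reciprocal of the average time, the two columns stay consistent with each other by construction.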

Error when converting the ONNX model to a TensorRT Engine (sh build_trt.sh)

Hello, I ran into a problem while reproducing this project, specifically in the ONNX-to-TRT stage (Step 3 of building the TensorRT Engine: converting the ONNX model to a TensorRT Engine).

My environment is as follows:

NGC tensorrt:21.02-py3 Docker container
TensorRT 7.2.2
Python 3.8.5
PyTorch 1.8.1+cu11.1

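The environment above matches the versions required in your README instructions.
After cloning this project, installing the third-party libraries, and downloading the original weights, I moved on to the usage steps in the README: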

1. Compile the plugin. The build output is shown below (no errors), and the shared libraries DCNv2Plugin.so and DCNv2PluginDyn.so are produced in the build directory.

cd build; make
make[1]: Entering directory '/home/trt-fairmot/TensorRT_ONNX_impl/build'
g++ -g -DEBUG -std=c++17 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__  -fPIC -MD -MP -I../include -isystem /usr/local/cuda/include -isystem /usr/local/tensorrt7.2-cuda11.1/include -isystem /usr/local/lib/python3.8/dist-packages/torch/include -o obj/DCNv2Plugin.o -c ../plugins/DCNv2Plugin.cpp
/usr/local/cuda/bin/nvcc -g -DEBUG -std=c++17 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__  -M -MT obj/DeformConv.o -I../include -isystem /usr/local/cuda/include -isystem /usr/local/tensorrt7.2-cuda11.1/include -isystem /usr/local/lib/python3.8/dist-packages/torch/include -o obj/DeformConv.d ../plugins/DeformConv.cu
g++ -g -DEBUG -std=c++17 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__  -fPIC -MD -MP -I../include -isystem /usr/local/cuda/include -isystem /usr/local/tensorrt7.2-cuda11.1/include -isystem /usr/local/lib/python3.8/dist-packages/torch/include -o obj/DCNv2PluginDyn.o -c ../plugins/DCNv2PluginDyn.cpp
/usr/local/cuda/bin/nvcc -g -DEBUG -std=c++17 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__  -I../include -isystem /usr/local/cuda/include -isystem /usr/local/tensorrt7.2-cuda11.1/include -isystem /usr/local/lib/python3.8/dist-packages/torch/include -Xcompiler -fPIC -gencode arch=compute_60,code=sm_60 -gencode arch=compute_61,code=sm_61 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_75,code=sm_75 -gencode arch=compute_80,code=sm_80 -gencode arch=compute_86,code=sm_86 -gencode arch=compute_86,code=compute_86 -o obj/DeformConv.o -c ../plugins/DeformConv.cu
g++ -g -DEBUG -std=c++17 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__  -fPIC -shared -o DCNv2Plugin.so obj/DCNv2Plugin.o obj/DeformConv.o -L/usr/local/cuda/lib64 -L/usr/local/lib/python3.8/dist-packages/torch/lib/ -L/usr/local/tensorrt7.2-cuda11.1/lib -Wl,-rpath=/usr/local/cuda/lib64 -lcudart -lnvinfer -lnvonnxparser -ldl -lpthread -lcuda -ltorch -lc10 -ltorch_cuda -lc10_cuda -ltorch_cpu -ltorch_python
g++ -g -DEBUG -std=c++17 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__  -fPIC -shared -o DCNv2PluginDyn.so obj/DCNv2PluginDyn.o obj/DeformConv.o -L/usr/local/cuda/lib64 -L/usr/local/lib/python3.8/dist-packages/torch/lib/ -L/usr/local/tensorrt7.2-cuda11.1/lib -Wl,-rpath=/usr/local/cuda/lib64 -lcudart -lnvinfer -lnvonnxparser -ldl -lpthread -lcuda -ltorch -lc10 -ltorch_cuda -lc10_cuda -ltorch_cpu -ltorch_python
make[1]: Leaving directory '/home/trt-fairmot/TensorRT_ONNX_impl/build

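2. Export the PyTorch model to ONNX: fairmot.onnx and fairmot_plugin.onnx were produced successfully, but while running python build_onnx_engine.py a message reported that DCNv2 is an undefined operator (which I believe is expected).
3. Convert the ONNX model to a TensorRT Engine: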

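This is where it fails. Running sh build_trt.sh produces the following first error message:
[E] Could not load plugin library: ./build/DCNv2PluginDyn.so, due to: libc10.so: cannot open shared object file: No such file or directory
Searching online suggests that the cause may be that libc10.so is missing from torch/lib, but I have confirmed that this .so file does exist.
This then causes trtexec to fail, and the DCN layer cannot be converted. The subsequent errors are attached: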

[04/12/2022-15:43:01] [I] [TRT] No importer registered for op: DCNv2Plugin. Attempting to import as plugin.
[04/12/2022-15:43:01] [I] [TRT] Searching for plugin: DCNv2Plugin, plugin_version: 1, plugin_namespace: 
[04/12/2022-15:43:01] [E] [TRT] INVALID_ARGUMENT: getPluginCreator could not find plugin DCNv2Plugin version 1
[04/12/2022-15:43:01] [E] [TRT] /home/jenkins/workspace/OSS/L0_MergeRequest/oss/parsers/onnx/ModelImporter.cpp:705: While parsing node number 97 [DCNv2Plugin -> "534"]:
[04/12/2022-15:43:01] [E] [TRT] /home/jenkins/workspace/OSS/L0_MergeRequest/oss/parsers/onnx/ModelImporter.cpp:706: --- Begin node ---
[04/12/2022-15:43:01] [E] [TRT] /home/jenkins/workspace/OSS/L0_MergeRequest/oss/parsers/onnx/ModelImporter.cpp:707: input: "529"

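The above is the full reproduction of this error. I would greatly appreciate an answer or any thoughts you may have. Thank you very much!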
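The link commands in the build log above pass `-Wl,-rpath=/usr/local/cuda/lib64` but no rpath for the torch `lib` directory, so one possible explanation is that the dynamic loader cannot locate `libc10.so` when the plugin is loaded, even though the file exists on disk. The sketch below checks whether a directory that contains the library is listed in `LD_LIBRARY_PATH`. The helper name `dirs_missing_from_search_path` is hypothetical, and the check is deliberately partial: the loader also consults rpath entries and the ldconfig cache, which this sketch does not inspect.

```python
import os
from pathlib import Path


def dirs_missing_from_search_path(lib_name, candidate_dirs, env=None):
    """Return the candidate dirs that contain lib_name but are not in LD_LIBRARY_PATH."""
    env = os.environ if env is None else env
    raw = env.get("LD_LIBRARY_PATH", "")
    search = {Path(p).resolve() for p in raw.split(os.pathsep) if p}
    return [
        d for d in candidate_dirs
        if (Path(d) / lib_name).is_file() and Path(d).resolve() not in search
    ]


if __name__ == "__main__":
    # Directory taken from the -L flag in the link command of the build log.
    torch_lib = "/usr/local/lib/python3.8/dist-packages/torch/lib/"
    print(dirs_missing_from_search_path("libc10.so", [torch_lib]))
```

If the torch `lib` directory is printed, adding it to `LD_LIBRARY_PATH` before running sh build_trt.sh is one thing to try.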

Question about NGC image support

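Hello! First of all, thank you very much for your work!
The official image provided by the competition has now been upgraded to TensorRT 8. What would be a good way to deploy your project at this stage? Is the only option to set up the environment manually, as in Issue #1?
I hope to hear back from you!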
