dusty-nv / jetson-inference

Hello AI World guide to deploying deep-learning inference networks and deep vision primitives with TensorRT and NVIDIA Jetson.

Home Page: https://developer.nvidia.com/embedded/twodaystoademo

License: MIT License

Topics: deep-learning inference computer-vision embedded image-recognition object-detection segmentation jetson jetson-tx1 jetson-tx2

jetson-inference's Introduction

Deploying Deep Learning

Welcome to our instructional guide for inference and the realtime DNN vision library for NVIDIA Jetson devices. This project uses TensorRT to run optimized networks on the GPU from C++ or Python, and PyTorch for training models.

Supported DNN vision primitives include imageNet for image classification, detectNet for object detection, segNet for semantic segmentation, poseNet for pose estimation, and actionNet for action recognition. Examples are provided for streaming from live camera feeds and building webapps with WebRTC, and support is included for ROS/ROS2.
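For instance, with the Python bindings, classifying a single image takes only a few lines. Below is a minimal sketch along the lines of the imagenet.py sample (the "googlenet" model is one of the pre-trained networks listed below; the image filename is a placeholder):

# minimal classification sketch: model name and image path are placeholders
from jetson_inference import imageNet
from jetson_utils import loadImage

net = imageNet("googlenet")        # load and optimize the network with TensorRT
img = loadImage("my_image.jpg")    # load the image into shared CPU/GPU memory

class_id, confidence = net.Classify(img)
print("recognized as '{:s}' ({:.2f}% confidence)".format(net.GetClassDesc(class_id), confidence * 100))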

Follow the Hello AI World tutorial for running inference and transfer learning onboard your Jetson, including collecting your own datasets, training your own models with PyTorch, and deploying them with TensorRT.

Table of Contents

>   JetPack 6 is now supported on Orin devices (developer.nvidia.com/jetpack)
>   Check out the Generative AI and LLM tutorials on Jetson AI Lab!
>   See the Change Log for the latest updates and new features.

Hello AI World

Hello AI World can be run completely onboard your Jetson, including live inferencing with TensorRT and transfer learning with PyTorch. For installation instructions, see System Setup. It's then recommended to start with the Inference section to familiarize yourself with the concepts, before diving into Training your own models.

System Setup

Inference

Training

WebApp Frameworks

Appendix

Jetson AI Lab

The Jetson AI Lab has additional tutorials on LLMs, Vision Transformers (ViT), and Vision Language Models (VLM) that run on Orin (and in some cases Xavier). Check out some of these:

NanoOWL - Open Vocabulary Object Detection ViT (container: nanoowl)

Live Llava on Jetson AGX Orin (container: local_llm)

Live Llava 2.0 - VILA + Multimodal NanoDB on Jetson Orin (container: local_llm)

Realtime Multimodal VectorDB on NVIDIA Jetson (container: nanodb)

Video Walkthroughs

Below are screencasts of Hello AI World that were recorded for the Jetson AI Certification course:

  • Hello AI World Setup: Download and run the Hello AI World container on Jetson Nano, test your camera feed, and see how to stream it over the network via RTP.
  • Image Classification Inference: Code your own Python program for image classification using Jetson Nano and deep learning, then experiment with realtime classification on a live camera stream.
  • Training Image Classification Models: Learn how to train image classification models with PyTorch onboard Jetson Nano, and collect your own classification datasets to create custom models.
  • Object Detection Inference: Code your own Python program for object detection using Jetson Nano and deep learning, then experiment with realtime detection on a live camera stream.
  • Training Object Detection Models: Learn how to train object detection models with PyTorch onboard Jetson Nano, and collect your own detection datasets to create custom models.
  • Semantic Segmentation: Experiment with fully-convolutional semantic segmentation networks on Jetson Nano, and run realtime segmentation on a live camera stream.

API Reference

Below are links to reference documentation for the C++ and Python libraries from the repo:

jetson-inference

C++ Python
Image Recognition imageNet imageNet
Object Detection detectNet detectNet
Segmentation segNet segNet
Pose Estimation poseNet poseNet
Action Recognition actionNet actionNet
Background Removal backgroundNet backgroundNet
Monocular Depth depthNet depthNet

jetson-utils

These libraries can be used in external projects by linking to libjetson-inference and libjetson-utils.

Code Examples

Introductory code walkthroughs of using the library are covered in the corresponding steps of the Hello AI World tutorial.

Additional C++ and Python samples for running the networks on images and live camera streams can be found here:

C++ Python
   Image Recognition imagenet.cpp imagenet.py
   Object Detection detectnet.cpp detectnet.py
   Segmentation segnet.cpp segnet.py
   Pose Estimation posenet.cpp posenet.py
   Action Recognition actionnet.cpp actionnet.py
   Background Removal backgroundnet.cpp backgroundnet.py
   Monocular Depth depthnet.cpp depthnet.py

note: see the Array Interfaces section for using memory with other Python libraries (like NumPy, PyTorch, etc.)
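For example, a cudaImage can be viewed as a NumPy array without copying. A short sketch, assuming the jetson_utils Python bindings are installed (see the Array Interfaces docs for the authoritative usage):

from jetson_utils import cudaAllocMapped, cudaToNumpy

img = cudaAllocMapped(width=640, height=480, format='rgb8')   # shared CPU/GPU memory
array = cudaToNumpy(img)           # NumPy view onto the same memory (no copy)
array[:] = 128                     # changes are visible to CUDA kernels as well
print(array.shape, array.dtype)    # (480, 640, 3) uint8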

These examples are automatically compiled while Building the Project from Source, and can run the pre-trained models listed below in addition to custom models provided by the user. Launch each example with --help for usage info.
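As a rough illustration, the core of the detectnet sample boils down to a capture/infer/render loop like this. A hedged sketch, assuming a V4L2 camera at /dev/video0 (see detectnet.py for the real implementation):

from jetson_inference import detectNet
from jetson_utils import videoSource, videoOutput

net = detectNet("ssd-mobilenet-v2", threshold=0.5)
camera = videoSource("/dev/video0")     # or "csi://0" for a MIPI CSI camera
display = videoOutput("display://0")    # render to an OpenGL window

while display.IsStreaming():
    img = camera.Capture()
    if img is None:                     # capture timeout
        continue
    detections = net.Detect(img)        # runs inference and overlays the boxes
    display.Render(img)
    display.SetStatus("detectNet | {:.0f} FPS".format(net.GetNetworkFPS()))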

Pre-Trained Models

The project comes with a number of pre-trained models that are available to use and will be automatically downloaded:

Image Recognition

Network CLI argument NetworkType enum
AlexNet alexnet ALEXNET
GoogleNet googlenet GOOGLENET
GoogleNet-12 googlenet-12 GOOGLENET_12
ResNet-18 resnet-18 RESNET_18
ResNet-50 resnet-50 RESNET_50
ResNet-101 resnet-101 RESNET_101
ResNet-152 resnet-152 RESNET_152
VGG-16 vgg-16 VGG_16
VGG-19 vgg-19 VGG_19
Inception-v4 inception-v4 INCEPTION_V4

Object Detection

Model CLI argument NetworkType enum Object classes
SSD-Mobilenet-v1 ssd-mobilenet-v1 SSD_MOBILENET_V1 91 (COCO classes)
SSD-Mobilenet-v2 ssd-mobilenet-v2 SSD_MOBILENET_V2 91 (COCO classes)
SSD-Inception-v2 ssd-inception-v2 SSD_INCEPTION_V2 91 (COCO classes)
TAO PeopleNet peoplenet PEOPLENET person, bag, face
TAO PeopleNet (pruned) peoplenet-pruned PEOPLENET_PRUNED person, bag, face
TAO DashCamNet dashcamnet DASHCAMNET person, car, bike, sign
TAO TrafficCamNet trafficcamnet TRAFFICCAMNET person, car, bike, sign
TAO FaceDetect facedetect FACEDETECT face
Legacy Detection Models
Model CLI argument NetworkType enum Object classes
DetectNet-COCO-Dog coco-dog COCO_DOG dogs
DetectNet-COCO-Bottle coco-bottle COCO_BOTTLE bottles
DetectNet-COCO-Chair coco-chair COCO_CHAIR chairs
DetectNet-COCO-Airplane coco-airplane COCO_AIRPLANE airplanes
ped-100 pednet PEDNET pedestrians
multiped-500 multiped PEDNET_MULTI pedestrians, luggage
facenet-120 facenet FACENET faces

Semantic Segmentation

Dataset Resolution CLI Argument Accuracy Jetson Nano Jetson Xavier
Cityscapes 512x256 fcn-resnet18-cityscapes-512x256 83.3% 48 FPS 480 FPS
Cityscapes 1024x512 fcn-resnet18-cityscapes-1024x512 87.3% 12 FPS 175 FPS
Cityscapes 2048x1024 fcn-resnet18-cityscapes-2048x1024 89.6% 3 FPS 47 FPS
DeepScene 576x320 fcn-resnet18-deepscene-576x320 96.4% 26 FPS 360 FPS
DeepScene 864x480 fcn-resnet18-deepscene-864x480 96.9% 14 FPS 190 FPS
Multi-Human 512x320 fcn-resnet18-mhp-512x320 86.5% 34 FPS 370 FPS
Multi-Human 640x360 fcn-resnet18-mhp-640x360 87.1% 23 FPS 325 FPS
Pascal VOC 320x320 fcn-resnet18-voc-320x320 85.9% 45 FPS 508 FPS
Pascal VOC 512x320 fcn-resnet18-voc-512x320 88.5% 34 FPS 375 FPS
SUN RGB-D 512x400 fcn-resnet18-sun-512x400 64.3% 28 FPS 340 FPS
SUN RGB-D 640x512 fcn-resnet18-sun-640x512 65.1% 17 FPS 224 FPS
  • If the resolution is omitted from the CLI argument, the lowest resolution model is loaded
  • Accuracy indicates the pixel classification accuracy across the model's validation dataset
  • Performance is measured for GPU FP16 mode with JetPack 4.2.1, nvpmodel 0 (MAX-N)
Legacy Segmentation Models
Network CLI Argument NetworkType enum Classes
Cityscapes (2048x2048) fcn-alexnet-cityscapes-hd FCN_ALEXNET_CITYSCAPES_HD 21
Cityscapes (1024x1024) fcn-alexnet-cityscapes-sd FCN_ALEXNET_CITYSCAPES_SD 21
Pascal VOC (500x356) fcn-alexnet-pascal-voc FCN_ALEXNET_PASCAL_VOC 21
Synthia (CVPR16) fcn-alexnet-synthia-cvpr FCN_ALEXNET_SYNTHIA_CVPR 14
Synthia (Summer-HD) fcn-alexnet-synthia-summer-hd FCN_ALEXNET_SYNTHIA_SUMMER_HD 14
Synthia (Summer-SD) fcn-alexnet-synthia-summer-sd FCN_ALEXNET_SYNTHIA_SUMMER_SD 14
Aerial-FPV (1280x720) fcn-alexnet-aerial-fpv-720p FCN_ALEXNET_AERIAL_FPV_720p 2
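To illustrate loading one of the segmentation models above from Python, here is a minimal sketch (the network name comes from the CLI Argument column; the input/output filenames are placeholders):

from jetson_inference import segNet
from jetson_utils import loadImage, cudaAllocMapped, saveImage

net = segNet("fcn-resnet18-cityscapes-512x256")
img = loadImage("street.jpg")

# allocate an output image of the same shape for the class-color overlay
overlay = cudaAllocMapped(width=img.width, height=img.height, format=img.format)

net.Process(img)
net.Overlay(overlay, filter_mode="linear")
saveImage("street_overlay.jpg", overlay)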

Pose Estimation

Model CLI argument NetworkType enum Keypoints
Pose-ResNet18-Body resnet18-body RESNET18_BODY 18
Pose-ResNet18-Hand resnet18-hand RESNET18_HAND 21
Pose-DenseNet121-Body densenet121-body DENSENET121_BODY 18
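A corresponding Python sketch for pose estimation might look like the following (the input filename is a placeholder; see posenet.py for the full sample):

from jetson_inference import poseNet
from jetson_utils import loadImage

net = poseNet("resnet18-body")
img = loadImage("people.jpg")

poses = net.Process(img, overlay="links,keypoints")
print("detected {:d} objects".format(len(poses)))
for pose in poses:
    print(pose.Keypoints)    # each keypoint carries an ID and (x,y) coordinates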

Action Recognition

Model CLI argument Classes
Action-ResNet18-Kinetics resnet18 1040
Action-ResNet34-Kinetics resnet34 1040

Recommended System Requirements

  • Jetson Nano Developer Kit with JetPack 4.2 or newer (Ubuntu 18.04 aarch64).
  • Jetson Nano 2GB Developer Kit with JetPack 4.4.1 or newer (Ubuntu 18.04 aarch64).
  • Jetson Orin Nano Developer Kit with JetPack 5.0 or newer (Ubuntu 20.04 aarch64).
  • Jetson Xavier NX Developer Kit with JetPack 4.4 or newer (Ubuntu 18.04 aarch64).
  • Jetson AGX Xavier Developer Kit with JetPack 4.0 or newer (Ubuntu 18.04 aarch64).
  • Jetson AGX Orin Developer Kit with JetPack 5.0 or newer (Ubuntu 20.04 aarch64).
  • Jetson TX2 Developer Kit with JetPack 3.0 or newer (Ubuntu 16.04 aarch64).
  • Jetson TX1 Developer Kit with JetPack 2.3 or newer (Ubuntu 16.04 aarch64).

The Transfer Learning with PyTorch section of the tutorial is written from the perspective of running PyTorch onboard Jetson for training DNNs; however, the same PyTorch code can be used on a PC, server, or cloud instance with an NVIDIA discrete GPU for faster training.
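In sketch form, that transfer-learning step amounts to fine-tuning a pre-trained backbone on an ImageFolder-style dataset. The paths and hyperparameters below are placeholders, and the tutorial's train.py script handles all of this for you:

import torch
from torchvision import datasets, models, transforms

device = "cuda" if torch.cuda.is_available() else "cpu"

transform = transforms.Compose([transforms.Resize((224, 224)), transforms.ToTensor()])
dataset = datasets.ImageFolder("data/my_dataset/train", transform=transform)
loader = torch.utils.data.DataLoader(dataset, batch_size=8, shuffle=True)

model = models.resnet18(pretrained=True)                                # pre-trained backbone
model.fc = torch.nn.Linear(model.fc.in_features, len(dataset.classes))  # new classifier head
model = model.to(device)

optimizer = torch.optim.SGD(model.parameters(), lr=0.001, momentum=0.9)
criterion = torch.nn.CrossEntropyLoss()

for epoch in range(5):                  # a few epochs of standard fine-tuning
    for images, labels in loader:
        images, labels = images.to(device), labels.to(device)
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()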

Extra Resources

Below are additional links and resources for deep learning:

Two Days to a Demo (DIGITS)

note: the DIGITS/Caffe tutorial below is deprecated. It's recommended to follow the Transfer Learning with PyTorch tutorial from Hello AI World.

The DIGITS tutorial covers training DNNs in the cloud or on a PC and running inference on the Jetson with TensorRT, and can take roughly two days or more depending on system setup, downloading the datasets, and the training speed of your GPU.

© 2016-2019 NVIDIA | Table of Contents

jetson-inference's People

Contributors

abady1000, andrejlevkovitch, asierarranz, bojle, datlife, devonsuper, dusty-nv, fivefishstudios, leonpano2006, ligaz, magnus-eigenvision, marco-cheung, maximofn, mo-xiaoming, pfremm, rdarbha, sanidhya-30, thomasperraudin, tkislan, tokk-nv, xmba15


jetson-inference's Issues

Cuda error in file src/implicit_gemm.cu at line 406: invalid device function

I have successfully compiled the code. However, when I run the detectnet-console demo, I get the following error:

detectnet-console
  args (2):  0 [./detectnet-console]  1 [peds-004.p]  

[GIE]  attempting to open cache file multiped-500/snapshot_iter_178000.caffemodel.tensorcache
[GIE]  cache file not found, profiling network model
[GIE]  platform does not have FP16 support.
[GIE]  loading multiped-500/deploy.prototxt multiped-500/snapshot_iter_178000.caffemodel
[GIE]  configuring CUDA engine
[GIE]  building CUDA engine
Cuda error in file src/implicit_gemm.cu at line 406: invalid device function
detectnet-console: src/implicit_gemm.cu:769: virtual int dit::ImplicitGemm::run(void*, const void*, const void*, int, cudaStream_t): Assertion `mParamsSizeInBytes<=sizeof(ImplicitGemmLargeKernelParams)' failed.
Aborted (core dumped)

Can anyone help me with this? Thanks so much!

lenet - Segmentation fault

Hi, has anyone tried LeNet?

I added the LeNet caffe model and prototxt:

https://github.com/itlab-vision/DLLibs-comparison/blob/master/caffe/mnist/caffe_mnist_classification/conf%20and%20trained%20model/lenet.caffemodel?raw=true

(renamed to lenet.caffemodel)

https://raw.githubusercontent.com/BVLC/caffe/master/examples/mnist/lenet.prototxt

and added the following lines:

enum NetworkType
{
        ALEXNET,
        GOOGLENET,
        LENET
};

/**
 * Load a new network instance
 */
static imageNet* Create( NetworkType networkType=LENET );

When the program is executed, I receive:
Segmentation fault

tensorNet::LoadNetwork Seg Faults

When running the imagenet-camera example, the program seg faults. It appears to be in the call:

imageNet.cpp
imageNet::init -
tensorNet::LoadNetwork

The LoadNetwork function is called with NULL; replacing the NULL with an empty string ("") allows the program to run.

Configuration:
L4T 24.2
CUDA 8.0
gcc version 5.4.0 20160609 (Ubuntu/Linaro 5.4.0-6ubuntu1~16.04.2)

DetectNet mAP on the trained models ...

Hello,

I see you have trained DetectNet for several classes (3 models). May I ask what the mAP of your trained models is, and also the detection speed in FPS?

Make error occurred

/jetson-inference/build# make
[ 2%] Building NVCC (Device) object CMakeFiles/jetson-inference.dir/cuda/jetson-inference_generated_cudaRGB.cu.o
:0:2: warning: ISO C99 requires whitespace after the macro name
:0:7: warning: ISO C99 requires whitespace after the macro name
:0:2: warning: ISO C99 requires whitespace after the macro name
:0:7: warning: ISO C99 requires whitespace after the macro name
:0:2: warning: ISO C99 requires whitespace after the macro name
:0:7: warning: ISO C99 requires whitespace after the macro name
:0:2: warning: ISO C99 requires whitespace after the macro name
:0:7: warning: ISO C99 requires whitespace after the macro name
/usr/include/string.h: In function ‘void* __mempcpy_inline(void*, const void*, size_t)’:
/usr/include/string.h:652:42: error: ‘memcpy’ was not declared in this scope
return (char *) memcpy (__dest, __src, __n) + __n;
^
CMake Error at jetson-inference_generated_cudaRGB.cu.o.cmake:266 (message):
Error generating file
/home/molys/Download_file/jetson-inference/build/CMakeFiles/jetson-inference.dir/cuda/./jetson-inference_generated_cudaRGB.cu.o

CMakeFiles/jetson-inference.dir/build.make:249: recipe for target 'CMakeFiles/jetson-inference.dir/cuda/jetson-inference_generated_cudaRGB.cu.o' failed
make[2]: *** [CMakeFiles/jetson-inference.dir/cuda/jetson-inference_generated_cudaRGB.cu.o] Error 1
CMakeFiles/Makefile2:67: recipe for target 'CMakeFiles/jetson-inference.dir/all' failed
make[1]: *** [CMakeFiles/jetson-inference.dir/all] Error 2
Makefile:83: recipe for target 'all' failed
make: *** [all] Error 2

My machine is a GTX 1080 + CUDA 8.0 + cuDNN 7.5.
Thank you!

differences between inference results

Dear all,

I have trained a network for object detection. In DIGITS, the results are promising; however, when I use exactly the same model outside of DIGITS with the Jetson code, the results are different. One possible discrepancy may come from the mean value of the network, which I modified manually in the code. I also took care that the image sizes inside and outside of DIGITS are exactly the same. Other causes might be the "mCoverageThreshold" and "threshold" parameters in detectNet.cpp, which seem to be hard-coded to 0.5 and might differ from the values used in DIGITS.
In particular, some explanation of the "mCoverageThreshold" and "threshold" parameters used in detectNet.cpp would be helpful.
Any suggestions on the possible causes for this and how to fix them are more than welcome.

CUDNN Problem installing NVCaffe

@dusty-nv
I would like to use TensorRT and accompanying software that was recently made public by NVIDIA. I am working on installing your repo on my TX1, but first I must install the fp16 branch of NVCaffe.

I seem to be getting a CUDNN error with the version that comes with the most recent JetPack install. I can run make all, make test, and make pycaffe all with no problems. Then I try to run make runtest CUDA_DEVICES_VISIBLE=0 and get the following error about CUDNN.

F1003 02:38:10.825604 1032 cudnn_conv_layer.cpp:157] Check failed: status == CUDNN_STATUS_SUCCESS (9 vs. 0) CUDNN_STATUS_NOT_SUPPORTED
*** Check failure stack trace: ***
@ 0x7f8bfb1718 google::LogMessage::Fail()
@ 0x7f8bfb3614 google::LogMessage::SendToLog()
@ 0x7f8bfb1290 google::LogMessage::Flush()
@ 0x7f8bfb3eb4 google::LogMessageFatal::~LogMessageFatal()
@ 0x7f8ade7424 caffe::CuDNNConvolutionLayer<>::Reshape()
@ 0x7f8ae8a8a8 caffe::Net<>::Init()
@ 0x7f8ae8c0f8 caffe::Net<>::Net()
@ 0x662e7c caffe::NetTest<>::InitNetFromProtoString()
@ 0x607dc0 caffe::NetTest<>::InitReshapableNet()
@ 0x66b12c caffe::NetTest_TestReshape_Test<>::TestBody()
@ 0xa71e6c testing::internal::HandleExceptionsInMethodIfSupported<>()
@ 0xa6b174 testing::Test::Run()
@ 0xa6b2b0 testing::TestInfo::Run()
@ 0xa6b370 testing::TestCase::Run()
@ 0xa6c4d0 testing::internal::UnitTestImpl::RunAllTests()
@ 0xa6c7e4 testing::UnitTest::Run()
@ 0x56c0e8 main
@ 0x7f8a9728a0 __libc_start_main
Makefile:552: recipe for target 'runtest' failed
make: *** [runtest] Aborted

Do I need to use a different version of CUDNN than is present with the JetPack?

Build fails

Steps to reproduce

CMake Error: The following variables are used in this project, but they are set to NOTFOUND.
Please set them or make sure they are set and tested correctly in the CMake files:
CUDA_CUDART_LIBRARY (ADVANCED)
    linked by target "jetson-inference" in directory /home/daniil/Projects/Dividiti/tensort/jetson-inference
    linked by target "imagenet-console" in directory /home/daniil/Projects/Dividiti/tensort/jetson-inference/imagenet-console
    linked by target "imagenet-camera" in directory /home/daniil/Projects/Dividiti/tensort/jetson-inference/imagenet-camera
    linked by target "detectnet-console" in directory /home/daniil/Projects/Dividiti/tensort/jetson-inference/detectnet-console
    linked by target "detectnet-camera" in directory /home/daniil/Projects/Dividiti/tensort/jetson-inference/detectnet-camera
    linked by target "segnet-console" in directory /home/daniil/Projects/Dividiti/tensort/jetson-inference/segnet-console
CUDA_TOOLKIT_INCLUDE (ADVANCED)
   used as include directory in directory /home/daniil/Projects/Dividiti/tensort/jetson-inference
...
-- Configuring incomplete, errors occurred!
See also "/home/daniil/Projects/Dividiti/tensort/jetson-inference/build/CMakeFiles/CMakeOutput.log".
See also "/home/daniil/Projects/Dividiti/tensort/jetson-inference/build/CMakeFiles/CMakeError.log".
make: *** No targets specified and no makefile found.  Stop.

Detectnet fine tuning

Hello Sir,

First, I appreciate the information you have provided.
My question is basically about DetectNet.

  1. You have provided 3 trained models: ped-100, multiped-500, and facenet-120. May I ask where I can download them?

  2. Do you have any more trained models besides the ones mentioned above?

  3. I use DIGITS 5.1, and I've decided to fine-tune a model instead of training from scratch. May I ask which .prototxt should be used along with the caffemodel? I mean deploy.prototxt, original.prototxt, solver.prototxt, or train_val.prototxt?

Mean Value in detectNet

Hello,

I have a question regarding the mean value used in detectNet.cpp: the mean value seems to be set to "104.0069879317889f, 116.66876761696767f, 122.6789143406786f" (line 166 in detectNet.cpp) regardless of the trained model. Shouldn't this value be a variable and read from the mean.binaryproto data file?

many thanks,
Shervin

Build problems on Ubuntu 14.04 Host

Hello,

I can successfully build and run the example apps on my TX1. As an exercise I've been trying to compile the code on my host machine. Is this supposed to work out of the box? I get many errors such as:

/usr/lib/gcc/x86_64-linux-gnu/4.8/include/stddef.h(432): error: identifier "nullptr" is undefined
/usr/lib/gcc/x86_64-linux-gnu/4.8/include/stddef.h(432): error: expected a ";"
/usr/lib/gcc/x86_64-linux-gnu/4.8/include/stddef.h(432): error: identifier "nullptr" is undefined
/usr/lib/gcc/x86_64-linux-gnu/4.8/include/stddef.h(432): error: expected a ";"
/usr/include/x86_64-linux-gnu/c++/4.8/bits/c++config.h(190): error: expected a ";"

I haven't modified the CMake configuration file, and just followed the instructions as is. I've attached a log of my make command output, as well as the Makefile generated by CMake.

makefile_output.txt
Makefile.txt

Any clues as to why this is happening, or whether it is expected?

Best Regards,
Piyush3dB.

Followed the instructions... cmake build overwhelms storage

Hi, following the instructions in this repo, the cmake build produces output that spans several gigabytes. I was not able to run the "make" command afterwards since there was no Makefile. I believe I ran out of space on the Jetson TX1.

Make Error: Building NVCC (Device) object CMakeFiles/jetson-inference.dir/cuda/jetson-inference_generated_cudaYUV-YV12.cu.o

Can anyone help me solve the following error when I type 'make'?

[ 2%] Building NVCC (Device) object CMakeFiles/jetson-inference.dir/cuda/jetson-inference_generated_cudaYUV-YV12.cu.o
nvcc fatal : Cannot compile in the 32-bit mode when the host compiler targets aarch64.
CMake Error at jetson-inference_generated_cudaYUV-YV12.cu.o.cmake:207 (message):
Error generating
/media/share33/User/henrylee/tx1/jetson-inference/build/CMakeFiles/jetson-inference.dir/cuda/./jetson-inference_generated_cudaYUV-YV12.cu.o

CMakeFiles/jetson-inference.dir/build.make:119: recipe for target 'CMakeFiles/jetson-inference.dir/cuda/jetson-inference_generated_cudaYUV-YV12.cu.o' failed
make[2]: *** [CMakeFiles/jetson-inference.dir/cuda/jetson-inference_generated_cudaYUV-YV12.cu.o] Error 1
CMakeFiles/Makefile2:67: recipe for target 'CMakeFiles/jetson-inference.dir/all' failed
make[1]: *** [CMakeFiles/jetson-inference.dir/all] Error 2
Makefile:127: recipe for target 'all' failed
make: *** [all] Error 2

Thank you very much.

Using custom Digits trained model in TX1 inference.

Hello Dusty.

I managed to successfully run your example with no issues (JetPack 2.3). I would like to use my own DetectNet, which I trained using DIGITS 4 as shown in this example (https://github.com/NVIDIA/DIGITS/tree/master/examples/object-detection), for inference on the TX1. I believe that I have changed all the corresponding fields of your repository in order to load my own .prototxt, .caffemodel, and .binaryproto files. After running the detectnet-console example I get the following error:

./detectnet-console /home/ubuntu/Downloads/car2.jpg
detectnet-console
args (2): 0 [./detectnet-console] 1 [/home/ubuntu/Downloads/car2.jpg]

[GIE] platform has FP16 support.
[GIE] loading /home/ubuntu/code/digits/gie_dusty1/jetson-inference/detectnet_car_model/deploy.prototxt /home/ubuntu/code/digits/gie_dusty1/jetson-inference/detectnet_car_model/snapshot_iter_191190.caffemodel
could not parse layer type Python
[GIE] failed to parse caffe network
failed to load /home/ubuntu/code/digits/gie_dusty1/jetson-inference/detectnet_car_model/snapshot_iter_191190.caffemodel
detectNet -- failed to initialize.
detectnet-console: failed to initialize detectNet

Tracing this back leads me to the detectNet.cpp file, where I found that the error occurs during the net->LoadNetwork(...) call. Any idea about the origin of this problem?

Thank you!

Jetson TK1 inference

Hi,

Not really sure if this is the best place to ask this, but is there any chance of using this repository on a TK1?
More generally, is there any chance of deploying detectnet on a TK1?

Thanks in advance

Drawing Rectangle

I want to draw rectangles from the results of detectNet.

I think I should use OpenGL.

Right? Do you have plans to implement a rectangle-drawing function?

How to Install on ubuntu 14.04 with custom host

Hi,
I am trying to install on my Ubuntu host: Ubuntu 14.04 with CUDA 8.0.
There have been many issues. I referenced the other issue "Build problems on Ubuntu 14.04 Host" and its CMakeLists.txt, but I still got an error about
uint32_t being undefined; after adding the missing #include at the top of cudaOverlay.cu and cudaOverlay.h,
it seemed OK to run.
But after that, another question came up.
It says NvInfer.h: no such file or directory.
How do I solve it?
Thank you everyone!

Feature Request - Saving Image + Label file within DetectNet-Camera.cpp

I think it would be really useful for certain use cases to be able to save an image plus the label file on command.

That way we can build a dataset that can then be adjusted/fixed before adding it back to the original dataset used to create the model, in order to retrain the model with more data.

In my case I'm actually using the object detection bounding boxes to feed a separate classification network. For example, if I had hundreds of toy cars, I would want to first find the cars using detectnet, and then run a separate classification on the cars to determine the exact type, like a Porsche 911 or a Ford F150.

I'm trying to do this myself, but of course I immediately ran into an issue due to my lack of knowledge of CUDA (I've never used it before), where I'm confused about the GPU memory space versus the CPU memory space. I'm using the saveImageRGBA function that's used within detectnet-console.cpp, but it looks like it's meant for data within GPU memory. For detectnet-camera, it looks like the frame gets captured to shared image space and then converted to RGBA within GPU space. I get a segmentation fault when I try using the saveImageRGBA function.

Inferencing on desktop PASCAL GPU with TensorRT 1.0 RC on CUDA 8.0

I have recently obtained and installed TensorRT 1.0 RC through an early access program. This jetson-inference repo should work on a Pascal-based GTX 1080 running CUDA 8.0 with the GIE provided by NVIDIA for the demo. Running the 'make' command generates this error:

Error limit reached. 100 errors detected in the compilation of "/tmp/tmpxft_000009d1_00000000-7_cudaOverlay.cpp1.ii". Compilation terminated. CMake Error at jetson-inference_generated_cudaOverlay.cu.o.cmake:264 (message): Error generating file /home/xhuv/jetson-inference/build/CMakeFiles/jetson-inference.dir/cuda/./jetson-inference_generated_cudaOverlay.cu.o

What changes should be made to run this code on desktop GPUs other than the TX1?

jetson-inference detectnet-camera cannot use trained model?

DIGITS model training succeeded: https://github.com/NVIDIA/DIGITS
using CUDA 8.0, Caffe, DIGITS 5;
GoogleNet model.
jetson-inference detectnet-camera cannot use this model; the error follows:

[GIE] attempting to open cache file vehicle/snapshot_iter_21120.caffemodel.tensorcache
[GIE] cache file not found, profiling network model
[GIE] platform has FP16 support.
[GIE] loading vehicle/deploy.prototxt vehicle/snapshot_iter_21120.caffemodel
[GIE] failed to retrieve tensor for output 'coverage'
[GIE] failed to retrieve tensor for output 'bboxes'

[GIE] configuring CUDA engine
[GIE] building CUDA engine

Does TensorRT support Batch Normalization?

Hi, I have re-trained DetectNet with batch normalization layers, but I failed to run TensorRT on this caffe model.
When loading the model file, it shows this error:
Message type "diccaffe.BatchNormParameter" has no filed name "scale_filler", could not parse deploy file.
After I deleted the batch_norm_param field in the deploy file, it shows another error:
caffeParser.cpp:613: bool bnConvert(const nvinfer1::Weights&, const nvinfer1::Weights&, const nvinfer1::Weights&, float, nvinfer1::Weights&, nvinfer1::Weights&, std::vector<void*>&) [with T = float]: Assertion 'mean.count == variance.count && movingAverage.count == 1' failed.

I would appreciate any advice you can give me.

TX1 Running the Live Camera Detection Demo FPS

Hello!
I successfully built from source on the TX1 (previously on Ubuntu 14.04 + GTX 980).
I tried running the Live Camera Detection Demo.
I have some questions about the FPS.

./detectnet-camera multiped-500 runs at about 6.7 FPS
./detectnet-camera ped-100 runs at about 6.7 FPS
./detectnet-camera facenet-120 runs at about 12.3 FPS

Are these FPS numbers correct? They look a little slow...
How can I accelerate it?

Another question: I created my own model with https://github.com/NVIDIA/DIGITS/tree/digits-4.0/examples/object-detection
and swapped the "multiped-500" files for it. It successfully detects cars.
But the FPS is still 6~7.
How can I accelerate it?

IDs of classes in the multiped-500 model

@dusty-nv in the original.prototxt for the multiped-500 model it looks like there are 5 classes. In your example you reference pedestrians and luggage - what are the other 3 classes?
Also, is this multi-class dataset available if we wish to fine-tune the model?
Thanks for the useful examples.

jetson-inference detectnet-camera cannot use trained model

DIGITS model training succeeded.
https://github.com/dusty-nv/jetson-inference/tree/master/detectnet-camera
jetson-inference detectnet-camera cannot use this model; the error follows:

[GIE] attempting to open cache file vehicle/snapshot_iter_21120.caffemodel.tensorcache
[GIE] cache file not found, profiling network model
[GIE] platform has FP16 support.
[GIE] loading vehicle/deploy.prototxt vehicle/snapshot_iter_21120.caffemodel
[GIE] failed to retrieve tensor for output 'coverage'
[GIE] failed to retrieve tensor for output 'bboxes'

[GIE] configuring CUDA engine
[GIE] building CUDA engine

DetectNet Demo: caffemodel file missing

Hi Dustin!
I tried to run your demo for the detection network.
You mention that "three example detection network models are automatically downloaded during the repo source configuration",
but it seems that they are missing.
Thanks
Alex

Question: Can you load up two Networks at once?

Currently I use detectnet-camera to detect the position of objects within the live camera frame, and then I write out the image crops for the objects found.

Now I want to run the image crops through a trained imagenet classification network to identify exactly what the object is (like a Ford Pinto instead of simply being a vehicle).

Do I have to stop the DetectNet network in order to do this? Or can I run an imagenet based classification while the detectnet is still loaded?

I want to be able to go between the two with as little reduction on the frame rate as possible.
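For reference, with the current Python API both networks can simply be loaded side by side; a hedged sketch (the model names and camera URI are assumptions, not the poster's setup):

from jetson_inference import detectNet, imageNet
from jetson_utils import videoSource, cudaAllocMapped, cudaCrop

detector = detectNet("ssd-mobilenet-v2", threshold=0.5)   # both engines stay resident
classifier = imageNet("googlenet")
camera = videoSource("/dev/video0")

img = camera.Capture()
for det in detector.Detect(img):
    # crop the detection ROI into its own image and classify it
    roi = (int(det.Left), int(det.Top), int(det.Right), int(det.Bottom))
    crop = cudaAllocMapped(width=roi[2] - roi[0], height=roi[3] - roi[1], format=img.format)
    cudaCrop(img, crop, roi)
    class_id, confidence = classifier.Classify(crop)
    print("{:s} ({:.2f})".format(classifier.GetClassDesc(class_id), confidence))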

I want to add my customized layer

I want to add my customized layer to the jetson-inference demo.

Currently, GIE does not support customized layers, nor the latest Caffe versions.

I want to use my own Caffe distribution (libcaffe.so) in this camera demo by removing GIE.

Is it possible?

Segnet can't load custom network

I followed the semantic segmentation example for DIGITS 5 to train my own model. I tried to load it with segnet but I get this:

[GIE]  attempting to open cache file seg-voc/snapshot.caffemodel.tensorcache
[GIE]  cache file not found, profiling network model
[GIE]  platform has FP16 support.
[GIE]  loading seg-voc/deploy.prototxt seg-voc/snapshot.caffemodel
[libprotobuf FATAL ../../../externals/protobuf/aarch64/10.0/include/google/protobuf/repeated_field.h:1378] CHECK failed: (index) < (current_size_): 
terminate called after throwing an instance of 'google::protobuf::FatalException'
  what():  CHECK failed: (index) < (current_size_): 
Aborted

I think that this has something to do with the deploy file because I guess that is what the protobuf lib is for... However, my deploy.prototxt is well formed and almost identical to the ones in the segnet examples, except for input size and initial padding.

webcam usage

Dear all,
Thanks for the code and comments provided.
I have been able to compile and run the console examples (detectnet-console and imagenet-console) on a "PC" running Ubuntu 14.04, but unfortunately the webcam cannot be used (the detectnet-camera and imagenet-camera apps do not run). So far I have tested two different webcams and neither one could be used for detection/recognition purposes. Running the imagenet-camera/detectnet-camera apps, I get run-time errors such as 'cannot convert rgbtorgba' / 'could not capture frame'...
The webcams can be initialized, but I get a black screen with some noise rather than a proper image of the scene.
Any comments or help are very much appreciated.

cudaGetLastError

Hi @dusty-nv, when I run imagenet-console I hit this problem, but I can't fix it. Can you do me a favor?

[cuda]   cudaGetLastError()
[cuda]      invalid device function (error 8) (hex 0x08)
[cuda]      /home/ddk/ddk-repo/jetson-inference/imageNet.cu:50
[cuda]   cudaPreImageNet((float4*)rgba, width, height, mInputCUDA, mWidth, mHeight, make_float3(104.0069879317889f, 116.66876761696767f, 122.6789143406786f))
[cuda]      invalid device function (error 8) (hex 0x08)
[cuda]      /home/ddk/ddk-repo/jetson-inference/imageNet.cpp:146
imageNet::Classify() -- cudaPreImageNet failed
imagenet-console:  failed to classify 'orange_0.jpg'  (result=-1)

Thanks.

How to deploy the trained model in jetson-inference code

Hi there...
Thanks for the code.
I'm new to training and inference. I trained a model with my own dataset (nearly 300 images) in DIGITS and am now trying to deploy it using jetson-inference. I have a Jetson TX1 Developer Kit and successfully installed jetson-inference on it. But when I swapped in my caffemodel file, I'm getting the following errors:
[GIE] attempting to open cache file snapshot_iter_180.caffemodel.tensorcache
[GIE] cache file not found, profiling network model
[GIE] platform has FP16 support.
[GIE] loading deploy.prototxt snapshot_iter_180.caffemodel
[GIE] failed to retrieve tensor for output 'prob'
[GIE] configuring CUDA engine
[GIE] building CUDA engine

I would appreciate any advice you can give me.

Compiling with -std=c++11 gives errors

I am using 14.04 with TensorRT etc. installed (custom Jetson carrier boards).

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2015 NVIDIA Corporation
Built on Thu_May__5_22:52:38_CDT_2016
Cuda compilation tools, release 7.0, V7.0.74

Any suggestions to fix it?

/usr/lib/gcc/aarch64-linux-gnu/4.8/include/stddef.h(432): error: identifier "nullptr" is undefined

/usr/lib/gcc/aarch64-linux-gnu/4.8/include/stddef.h(432): error: expected a ";"

/usr/include/aarch64-linux-gnu/c++/4.8/bits/c++config.h(190): error: expected a ";"

/usr/include/c++/4.8/exception(63): error: expected a ";"

/usr/include/c++/4.8/exception(68): error: expected a ";"

Trouble building NVCaffe on TX1

When building NVCaffe on the TX1 on L4T 24.2, I ran into a couple of issues using the instructions listed at: https://github.com/dusty-nv/jetson-inference/blob/master/docs/building-nvcaffe.md

  1. While loading dependencies, libboost-thread1.55-dev is not found. The current library on L4T 24.2 appears to be libboost-thread1.58-dev. The regular caffe branch uses libboost-all-dev, which I believe includes libboost-thread1
  2. When compiling, there is an error with the hdf5 include and library files. Error:
    src/caffe/net.cpp:8:18: fatal error: hdf5.h: No such file or directory
    This appears to be an Ubuntu 16.04 issue. One way to fix it is to add hdf5 to the INCLUDE_DIRS environment variable when configuring the Makefile.config. On regular caffe, this is something like:
    echo "INCLUDE_DIRS += /usr/include/hdf5/serial/" >> Makefile.config
    when building the Makefile. There is a similar issue with finding the actual libraries themselves. Discussion:
    BVLC/caffe#2347
    One solution is to modify LIBRARY_DIRS in Makefile.config to include the hdf5 libraries as needed.

Configuration:
L4T 24.2
CUDA 8.0
gcc version 5.4.0 20160609 (Ubuntu/Linaro 5.4.0-6ubuntu1~16.04.2)

[GIE] Error parsing layer type Deconvolution index 379

Can anyone help me solve the following error when I try to use a deconvolution layer?

The website below shows that GIE supports deconvolution layers, but when I add a deconvolution layer to my prototxt and build a new caffe model (whose format is not like googlenet, alexnet, or detectnet), I get this error.

https://devblogs.nvidia.com/parallelforall/production-deep-learning-nvidia-gpu-inference-engine/

[GIE] loading pvaNetClassifier.prototxt pvanet_frcnn_iter_100000.caffemodel
Caffe Parser: groups are not supported for deconvolutions
error parsing layer type Deconvolution index 379
[GIE] failed to parse caffe network

Thank you very much.

Profiling leads to an FPS drop

Thank you for the example code.

When I set profiling to "true" in the tensorNet class, the profiling works like a charm. However, the frame rate drops from 30 to 23 when using the GoogleNet example.
Is there a way to speed this up? Maybe by writing the profiling output to a file instead of printing it, or even by running a second thread that prints the profiling?

I am using Nvidia Jetson TX1 Board with L4T 24.2

Greetings

Caffe Parser: could not parse binary model file

Hi,
I was using a TX1 dev board. I flashed the OS on it with JetPack 2.3.1,
then downloaded the code and built it successfully on the dev board.
But when I go to aarch64/bin/ to run ./imagenet-console orange_0.jpg output_0.jpg,
I get the error "Caffe Parser: could not parse binary model file",

like this:

$ ./imagenet-console ./orange_0.jpg output_0.jpg
imagenet-console
args (3): 0 [./imagenet-console] 1 [./orange_0.jpg] 2 [output_0.jpg]

[GIE] attempting to open cache file bvlc_googlenet.caffemodel.tensorcache
[GIE] cache file not found, profiling network model
[GIE] platform has FP16 support.
[GIE] loading googlenet.prototxt bvlc_googlenet.caffemodel
Caffe Parser: could not parse binary model file
Could not parse model file
[GIE] failed to parse caffe network
failed to load bvlc_googlenet.caffemodel
failed to load bvlc_googlenet.caffemodel
imageNet -- failed to initialize.
imagenet-console: failed to initialize imageNet

Could the problem be with my download of the bvlc_googlenet.caffemodel file?

Different prediction results with ImageNet Inference than Digits on Custom GoogleNet Model

I trained a GoogleNet image classification network on a custom dataset using the default GoogleNet network within DIGITS. My dataset consists of lots of 256x256 squashed images of playing cards, basically crops of the playing cards that DIGITS squashed to 256x256. I had 11 different playing cards in total.

The training results were close to 100%, and DIGITS "Classify One" reports accurate results for the test images I tested.

On the TX1 I simply copied over the GoogleNet caffemodel, modified the prototxt file for 11 classes, created the label file, and deleted the cached file. I then used imagenet-console (with googlenet set) on the test images (after I ran them through DIGITS to squash them to 256x256).

On most of the test images the results were good and consistent with what DIGITS reported, but on a few of them they were way off.

Did I make some obvious mistake somewhere?

I also got the same kind of result with AlexNet, where most of the test results are consistent with DIGITS except for a few cases that are way off.
