dusty-nv / jetson-inference

Hello AI World guide to deploying deep-learning inference networks and deep vision primitives with TensorRT and NVIDIA Jetson.

Home Page: https://developer.nvidia.com/embedded/twodaystoademo

License: MIT License

Topics: deep-learning inference computer-vision embedded image-recognition object-detection segmentation jetson jetson-tx1 jetson-tx2

jetson-inference's Introduction

Deploying Deep Learning

Welcome to our instructional guide for inference and the realtime DNN vision library for NVIDIA Jetson devices. This project uses TensorRT to run optimized networks on the GPU from C++ or Python, and PyTorch for training models.

Supported DNN vision primitives include imageNet for image classification, detectNet for object detection, segNet for semantic segmentation, poseNet for pose estimation, and actionNet for action recognition. Examples are provided for streaming from live camera feeds and building webapps with WebRTC, and support is included for ROS/ROS2.
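For instance, with the Python bindings, classifying a single image takes only a few lines. Below is a minimal sketch along the lines of the imagenet.py sample (the "googlenet" model is one of the pre-trained networks listed below; the image filename is a placeholder):

# minimal classification sketch: model name and image path are placeholders
from jetson_inference import imageNet
from jetson_utils import loadImage

net = imageNet("googlenet")        # load and optimize the network with TensorRT
img = loadImage("my_image.jpg")    # load the image into shared CPU/GPU memory

class_id, confidence = net.Classify(img)
print("recognized as '{:s}' ({:.2f}% confidence)".format(net.GetClassDesc(class_id), confidence * 100))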

Follow the Hello AI World tutorial for running inference and transfer learning onboard your Jetson, including collecting your own datasets, training your own models with PyTorch, and deploying them with TensorRT.

Table of Contents

>   JetPack 6 is now supported on Orin devices (developer.nvidia.com/jetpack)
>   Check out the Generative AI and LLM tutorials on Jetson AI Lab!
>   See the Change Log for the latest updates and new features.

Hello AI World

Hello AI World can be run completely onboard your Jetson, including live inferencing with TensorRT and transfer learning with PyTorch. For installation instructions, see System Setup. It's then recommended to start with the Inference section to familiarize yourself with the concepts, before diving into Training your own models.

System Setup

Inference

Training

WebApp Frameworks

Appendix

Jetson AI Lab

The Jetson AI Lab has additional tutorials on LLMs, Vision Transformers (ViT), and Vision Language Models (VLM) that run on Orin (and in some cases Xavier). Check out some of these:

NanoOWL - Open Vocabulary Object Detection ViT (container: nanoowl)

Live Llava on Jetson AGX Orin (container: local_llm)

Live Llava 2.0 - VILA + Multimodal NanoDB on Jetson Orin (container: local_llm)

Realtime Multimodal VectorDB on NVIDIA Jetson (container: nanodb)

Video Walkthroughs

Below are screencasts of Hello AI World that were recorded for the Jetson AI Certification course:

  • Hello AI World Setup: Download and run the Hello AI World container on Jetson Nano, test your camera feed, and see how to stream it over the network via RTP.
  • Image Classification Inference: Code your own Python program for image classification using Jetson Nano and deep learning, then experiment with realtime classification on a live camera stream.
  • Training Image Classification Models: Learn how to train image classification models with PyTorch onboard Jetson Nano, and collect your own classification datasets to create custom models.
  • Object Detection Inference: Code your own Python program for object detection using Jetson Nano and deep learning, then experiment with realtime detection on a live camera stream.
  • Training Object Detection Models: Learn how to train object detection models with PyTorch onboard Jetson Nano, and collect your own detection datasets to create custom models.
  • Semantic Segmentation: Experiment with fully-convolutional semantic segmentation networks on Jetson Nano, and run realtime segmentation on a live camera stream.

API Reference

Below are links to reference documentation for the C++ and Python libraries from the repo:

jetson-inference

C++ Python
Image Recognition imageNet imageNet
Object Detection detectNet detectNet
Segmentation segNet segNet
Pose Estimation poseNet poseNet
Action Recognition actionNet actionNet
Background Removal backgroundNet backgroundNet
Monocular Depth depthNet depthNet

jetson-utils

These libraries can be used in external projects by linking to libjetson-inference and libjetson-utils.

Code Examples

Introductory code walkthroughs of using the library are covered in the corresponding steps of the Hello AI World tutorial.

Additional C++ and Python samples for running the networks on images and live camera streams can be found here:

C++ Python
   Image Recognition imagenet.cpp imagenet.py
   Object Detection detectnet.cpp detectnet.py
   Segmentation segnet.cpp segnet.py
   Pose Estimation posenet.cpp posenet.py
   Action Recognition actionnet.cpp actionnet.py
   Background Removal backgroundnet.cpp backgroundnet.py
   Monocular Depth depthnet.cpp depthnet.py

note: see the Array Interfaces section for using memory with other Python libraries (like NumPy, PyTorch, etc.)
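For example, a cudaImage can be viewed as a NumPy array without copying. A short sketch, assuming the jetson_utils Python bindings are installed (see the Array Interfaces docs for the authoritative usage):

from jetson_utils import cudaAllocMapped, cudaToNumpy

img = cudaAllocMapped(width=640, height=480, format='rgb8')   # shared CPU/GPU memory
array = cudaToNumpy(img)           # NumPy view onto the same memory (no copy)
array[:] = 128                     # changes are visible to CUDA kernels as well
print(array.shape, array.dtype)    # (480, 640, 3) uint8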

These examples are automatically compiled while Building the Project from Source, and can run the pre-trained models listed below in addition to custom models provided by the user. Launch each example with --help for usage info.
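As a rough illustration, the core of the detectnet sample boils down to a capture/infer/render loop like this. A hedged sketch, assuming a V4L2 camera at /dev/video0 (see detectnet.py for the real implementation):

from jetson_inference import detectNet
from jetson_utils import videoSource, videoOutput

net = detectNet("ssd-mobilenet-v2", threshold=0.5)
camera = videoSource("/dev/video0")     # or "csi://0" for a MIPI CSI camera
display = videoOutput("display://0")    # render to an OpenGL window

while display.IsStreaming():
    img = camera.Capture()
    if img is None:                     # capture timeout
        continue
    detections = net.Detect(img)        # runs inference and overlays the boxes
    display.Render(img)
    display.SetStatus("detectNet | {:.0f} FPS".format(net.GetNetworkFPS()))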

Pre-Trained Models

The project comes with a number of pre-trained models that are available to use and will be automatically downloaded:

Image Recognition

Network CLI argument NetworkType enum
AlexNet alexnet ALEXNET
GoogleNet googlenet GOOGLENET
GoogleNet-12 googlenet-12 GOOGLENET_12
ResNet-18 resnet-18 RESNET_18
ResNet-50 resnet-50 RESNET_50
ResNet-101 resnet-101 RESNET_101
ResNet-152 resnet-152 RESNET_152
VGG-16 vgg-16 VGG_16
VGG-19 vgg-19 VGG_19
Inception-v4 inception-v4 INCEPTION_V4

Object Detection

Model CLI argument NetworkType enum Object classes
SSD-Mobilenet-v1 ssd-mobilenet-v1 SSD_MOBILENET_V1 91 (COCO classes)
SSD-Mobilenet-v2 ssd-mobilenet-v2 SSD_MOBILENET_V2 91 (COCO classes)
SSD-Inception-v2 ssd-inception-v2 SSD_INCEPTION_V2 91 (COCO classes)
TAO PeopleNet peoplenet PEOPLENET person, bag, face
TAO PeopleNet (pruned) peoplenet-pruned PEOPLENET_PRUNED person, bag, face
TAO DashCamNet dashcamnet DASHCAMNET person, car, bike, sign
TAO TrafficCamNet trafficcamnet TRAFFICCAMNET person, car, bike, sign
TAO FaceDetect facedetect FACEDETECT face
Legacy Detection Models
Model CLI argument NetworkType enum Object classes
DetectNet-COCO-Dog coco-dog COCO_DOG dogs
DetectNet-COCO-Bottle coco-bottle COCO_BOTTLE bottles
DetectNet-COCO-Chair coco-chair COCO_CHAIR chairs
DetectNet-COCO-Airplane coco-airplane COCO_AIRPLANE airplanes
ped-100 pednet PEDNET pedestrians
multiped-500 multiped PEDNET_MULTI pedestrians, luggage
facenet-120 facenet FACENET faces

Semantic Segmentation

Dataset Resolution CLI Argument Accuracy Jetson Nano Jetson Xavier
Cityscapes 512x256 fcn-resnet18-cityscapes-512x256 83.3% 48 FPS 480 FPS
Cityscapes 1024x512 fcn-resnet18-cityscapes-1024x512 87.3% 12 FPS 175 FPS
Cityscapes 2048x1024 fcn-resnet18-cityscapes-2048x1024 89.6% 3 FPS 47 FPS
DeepScene 576x320 fcn-resnet18-deepscene-576x320 96.4% 26 FPS 360 FPS
DeepScene 864x480 fcn-resnet18-deepscene-864x480 96.9% 14 FPS 190 FPS
Multi-Human 512x320 fcn-resnet18-mhp-512x320 86.5% 34 FPS 370 FPS
Multi-Human 640x360 fcn-resnet18-mhp-640x360 87.1% 23 FPS 325 FPS
Pascal VOC 320x320 fcn-resnet18-voc-320x320 85.9% 45 FPS 508 FPS
Pascal VOC 512x320 fcn-resnet18-voc-512x320 88.5% 34 FPS 375 FPS
SUN RGB-D 512x400 fcn-resnet18-sun-512x400 64.3% 28 FPS 340 FPS
SUN RGB-D 640x512 fcn-resnet18-sun-640x512 65.1% 17 FPS 224 FPS
  • If the resolution is omitted from the CLI argument, the lowest resolution model is loaded
  • Accuracy indicates the pixel classification accuracy across the model's validation dataset
  • Performance is measured for GPU FP16 mode with JetPack 4.2.1, nvpmodel 0 (MAX-N)
Legacy Segmentation Models
Network CLI Argument NetworkType enum Classes
Cityscapes (2048x2048) fcn-alexnet-cityscapes-hd FCN_ALEXNET_CITYSCAPES_HD 21
Cityscapes (1024x1024) fcn-alexnet-cityscapes-sd FCN_ALEXNET_CITYSCAPES_SD 21
Pascal VOC (500x356) fcn-alexnet-pascal-voc FCN_ALEXNET_PASCAL_VOC 21
Synthia (CVPR16) fcn-alexnet-synthia-cvpr FCN_ALEXNET_SYNTHIA_CVPR 14
Synthia (Summer-HD) fcn-alexnet-synthia-summer-hd FCN_ALEXNET_SYNTHIA_SUMMER_HD 14
Synthia (Summer-SD) fcn-alexnet-synthia-summer-sd FCN_ALEXNET_SYNTHIA_SUMMER_SD 14
Aerial-FPV (1280x720) fcn-alexnet-aerial-fpv-720p FCN_ALEXNET_AERIAL_FPV_720p 2
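To illustrate loading one of the segmentation models above from Python, here is a minimal sketch (the network name comes from the CLI Argument column; the input/output filenames are placeholders):

from jetson_inference import segNet
from jetson_utils import loadImage, cudaAllocMapped, saveImage

net = segNet("fcn-resnet18-cityscapes-512x256")
img = loadImage("street.jpg")

# allocate an output image of the same shape for the class-color overlay
overlay = cudaAllocMapped(width=img.width, height=img.height, format=img.format)

net.Process(img)
net.Overlay(overlay, filter_mode="linear")
saveImage("street_overlay.jpg", overlay)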

Pose Estimation

Model CLI argument NetworkType enum Keypoints
Pose-ResNet18-Body resnet18-body RESNET18_BODY 18
Pose-ResNet18-Hand resnet18-hand RESNET18_HAND 21
Pose-DenseNet121-Body densenet121-body DENSENET121_BODY 18
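A corresponding Python sketch for pose estimation might look like the following (the input filename is a placeholder; see posenet.py for the full sample):

from jetson_inference import poseNet
from jetson_utils import loadImage

net = poseNet("resnet18-body")
img = loadImage("people.jpg")

poses = net.Process(img, overlay="links,keypoints")
print("detected {:d} objects".format(len(poses)))
for pose in poses:
    print(pose.Keypoints)    # each keypoint carries an ID and (x,y) coordinates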

Action Recognition

Model CLI argument Classes
Action-ResNet18-Kinetics resnet18 1040
Action-ResNet34-Kinetics resnet34 1040

Recommended System Requirements

  • Jetson Nano Developer Kit with JetPack 4.2 or newer (Ubuntu 18.04 aarch64).
  • Jetson Nano 2GB Developer Kit with JetPack 4.4.1 or newer (Ubuntu 18.04 aarch64).
  • Jetson Orin Nano Developer Kit with JetPack 5.0 or newer (Ubuntu 20.04 aarch64).
  • Jetson Xavier NX Developer Kit with JetPack 4.4 or newer (Ubuntu 18.04 aarch64).
  • Jetson AGX Xavier Developer Kit with JetPack 4.0 or newer (Ubuntu 18.04 aarch64).
  • Jetson AGX Orin Developer Kit with JetPack 5.0 or newer (Ubuntu 20.04 aarch64).
  • Jetson TX2 Developer Kit with JetPack 3.0 or newer (Ubuntu 16.04 aarch64).
  • Jetson TX1 Developer Kit with JetPack 2.3 or newer (Ubuntu 16.04 aarch64).

The Transfer Learning with PyTorch section of the tutorial is written from the perspective of running PyTorch onboard Jetson for training DNNs; however, the same PyTorch code can be used on a PC, server, or cloud instance with an NVIDIA discrete GPU for faster training.
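In sketch form, that transfer-learning step amounts to fine-tuning a pre-trained backbone on an ImageFolder-style dataset. The paths and hyperparameters below are placeholders, and the tutorial's train.py script handles all of this for you:

import torch
from torchvision import datasets, models, transforms

device = "cuda" if torch.cuda.is_available() else "cpu"

transform = transforms.Compose([transforms.Resize((224, 224)), transforms.ToTensor()])
dataset = datasets.ImageFolder("data/my_dataset/train", transform=transform)
loader = torch.utils.data.DataLoader(dataset, batch_size=8, shuffle=True)

model = models.resnet18(pretrained=True)                                # pre-trained backbone
model.fc = torch.nn.Linear(model.fc.in_features, len(dataset.classes))  # new classifier head
model = model.to(device)

optimizer = torch.optim.SGD(model.parameters(), lr=0.001, momentum=0.9)
criterion = torch.nn.CrossEntropyLoss()

for epoch in range(5):                  # a few epochs of standard fine-tuning
    for images, labels in loader:
        images, labels = images.to(device), labels.to(device)
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()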

Extra Resources

Below are additional links and resources for deep learning:

Two Days to a Demo (DIGITS)

note: the DIGITS/Caffe tutorial below is deprecated. It's recommended to follow the Transfer Learning with PyTorch tutorial from Hello AI World.

The DIGITS tutorial covers training DNNs in the cloud or on a PC and running inference on the Jetson with TensorRT, and can take roughly two days or more depending on system setup, downloading the datasets, and the training speed of your GPU.

© 2016-2019 NVIDIA | Table of Contents

jetson-inference's People

Contributors

abady1000, andrejlevkovitch, asierarranz, bojle, datlife, devonsuper, dusty-nv, fivefishstudios, leonpano2006, ligaz, magnus-eigenvision, marco-cheung, maximofn, mo-xiaoming, pfremm, rdarbha, sanidhya-30, thomasperraudin, tkislan, tokk-nv, xmba15


jetson-inference's Issues

Cuda error in file src/implicit_gemm.cu at line 406: invalid device function

I have successfully compiled the code. However, when I run the detectnet-console demo, I get the following error:

detectnet-console
  args (2):  0 [./detectnet-console]  1 [peds-004.p]  

[GIE]  attempting to open cache file multiped-500/snapshot_iter_178000.caffemodel.tensorcache
[GIE]  cache file not found, profiling network model
[GIE]  platform does not have FP16 support.
[GIE]  loading multiped-500/deploy.prototxt multiped-500/snapshot_iter_178000.caffemodel
[GIE]  configuring CUDA engine
[GIE]  building CUDA engine
Cuda error in file src/implicit_gemm.cu at line 406: invalid device function
detectnet-console: src/implicit_gemm.cu:769: virtual int dit::ImplicitGemm::run(void*, const void*, const void*, int, cudaStream_t): Assertion `mParamsSizeInBytes<=sizeof(ImplicitGemmLargeKernelParams)' failed.
Aborted (core dumped)

Can anyone help me with this? Thanks so much!

lenet - Segmentation fault

Hi, has anyone tried LeNet?

I added the LeNet caffe model and prototxt:

https://github.com/itlab-vision/DLLibs-comparison/blob/master/caffe/mnist/caffe_mnist_classification/conf%20and%20trained%20model/lenet.caffemodel?raw=true

(renamed to lenet.caffemodel)

https://raw.githubusercontent.com/BVLC/caffe/master/examples/mnist/lenet.prototxt

and added the following lines:

enum NetworkType
{
        ALEXNET,
        GOOGLENET,
        LENET
};

/**
 * Load a new network instance
 */
static imageNet* Create( NetworkType networkType=LENET );

When the program is executed, I receive:
Segmentation fault

tensorNet::LoadNetwork Seg Faults

When running the imagenet-camera example, the program seg faults. It appears to be in the call:

imageNet.cpp
imageNet::init -
tensorNet::LoadNetwork

The LoadNetwork function is called with NULL; replacing the NULL with an empty string ("") allows the program to run.

Configuration:
L4T 24.2
CUDA 8.0
gcc version 5.4.0 20160609 (Ubuntu/Linaro 5.4.0-6ubuntu1~16.04.2)

DetectNet mAP on the trained models ...

Hello,

I see you have trained DetectNet for several classes (3 models). May I ask what the mAP of your trained models is, and also the detection speed in FPS?

Make error occurred

/jetson-inference/build# make
[ 2%] Building NVCC (Device) object CMakeFiles/jetson-inference.dir/cuda/jetson-inference_generated_cudaRGB.cu.o
:0:2: warning: ISO C99 requires whitespace after the macro name
:0:7: warning: ISO C99 requires whitespace after the macro name
:0:2: warning: ISO C99 requires whitespace after the macro name
:0:7: warning: ISO C99 requires whitespace after the macro name
:0:2: warning: ISO C99 requires whitespace after the macro name
:0:7: warning: ISO C99 requires whitespace after the macro name
:0:2: warning: ISO C99 requires whitespace after the macro name
:0:7: warning: ISO C99 requires whitespace after the macro name
/usr/include/string.h: In function ‘void* __mempcpy_inline(void*, const void*, size_t)’:
/usr/include/string.h:652:42: error: ‘memcpy’ was not declared in this scope
return (char *) memcpy (__dest, __src, __n) + __n;
^
CMake Error at jetson-inference_generated_cudaRGB.cu.o.cmake:266 (message):
Error generating file
/home/molys/Download_file/jetson-inference/build/CMakeFiles/jetson-inference.dir/cuda/./jetson-inference_generated_cudaRGB.cu.o

CMakeFiles/jetson-inference.dir/build.make:249: recipe for target 'CMakeFiles/jetson-inference.dir/cuda/jetson-inference_generated_cudaRGB.cu.o' failed
make[2]: *** [CMakeFiles/jetson-inference.dir/cuda/jetson-inference_generated_cudaRGB.cu.o] Error 1
CMakeFiles/Makefile2:67: recipe for target 'CMakeFiles/jetson-inference.dir/all' failed
make[1]: *** [CMakeFiles/jetson-inference.dir/all] Error 2
Makefile:83: recipe for target 'all' failed
make: *** [all] Error 2

My machine is a GTX 1080 + CUDA 8.0 + cuDNN 7.5.
Thank you!

differences between inference results

Dear all,

I have trained a network for object detection. In DIGITS, the results are promising; however, when I use exactly the same model outside of DIGITS with the Jetson code, the results are different. One possible discrepancy may come from the mean value of the network, which I modified manually in the code. I also took care that the image sizes inside and outside of DIGITS are exactly the same. Other causes might be the "mCoverageThreshold" and "threshold" parameters in detectNet.cpp, which seem to be hard-coded to 0.5 and might differ from the values used in DIGITS.
In particular, some explanation of the "mCoverageThreshold" and "threshold" parameters used in detectNet.cpp would be helpful.
Any suggestions on the possible causes for this and how to fix them are more than welcome.

CUDNN Problem installing NVCaffe

@dusty-nv
I would like to use TensorRT and accompanying software that was recently made public by NVIDIA. I am working on installing your repo on my TX1, but first I must install the fp16 branch of NVCaffe.

I seem to be getting a CUDNN error with the version that comes with the most recent JetPack install. I can run make all, make test, and make pycaffe all with no problems. Then I try to run make runtest CUDA_DEVICES_VISIBLE=0 and get the following error about CUDNN.

F1003 02:38:10.825604 1032 cudnn_conv_layer.cpp:157] Check failed: status == CUDNN_STATUS_SUCCESS (9 vs. 0) CUDNN_STATUS_NOT_SUPPORTED
*** Check failure stack trace: ***
@ 0x7f8bfb1718 google::LogMessage::Fail()
@ 0x7f8bfb3614 google::LogMessage::SendToLog()
@ 0x7f8bfb1290 google::LogMessage::Flush()
@ 0x7f8bfb3eb4 google::LogMessageFatal::~LogMessageFatal()
@ 0x7f8ade7424 caffe::CuDNNConvolutionLayer<>::Reshape()
@ 0x7f8ae8a8a8 caffe::Net<>::Init()
@ 0x7f8ae8c0f8 caffe::Net<>::Net()
@ 0x662e7c caffe::NetTest<>::InitNetFromProtoString()
@ 0x607dc0 caffe::NetTest<>::InitReshapableNet()
@ 0x66b12c caffe::NetTest_TestReshape_Test<>::TestBody()
@ 0xa71e6c testing::internal::HandleExceptionsInMethodIfSupported<>()
@ 0xa6b174 testing::Test::Run()
@ 0xa6b2b0 testing::TestInfo::Run()
@ 0xa6b370 testing::TestCase::Run()
@ 0xa6c4d0 testing::internal::UnitTestImpl::RunAllTests()
@ 0xa6c7e4 testing::UnitTest::Run()
@ 0x56c0e8 main
@ 0x7f8a9728a0 __libc_start_main
Makefile:552: recipe for target 'runtest' failed
make: *** [runtest] Aborted

Do I need to use a different version of CUDNN than is present with the JetPack?

Build fails

Steps to reproduce

CMake Error: The following variables are used in this project, but they are set to NOTFOUND.
Please set them or make sure they are set and tested correctly in the CMake files:
CUDA_CUDART_LIBRARY (ADVANCED)
    linked by target "jetson-inference" in directory /home/daniil/Projects/Dividiti/tensort/jetson-inference
    linked by target "imagenet-console" in directory /home/daniil/Projects/Dividiti/tensort/jetson-inference/imagenet-console
    linked by target "imagenet-camera" in directory /home/daniil/Projects/Dividiti/tensort/jetson-inference/imagenet-camera
    linked by target "detectnet-console" in directory /home/daniil/Projects/Dividiti/tensort/jetson-inference/detectnet-console
    linked by target "detectnet-camera" in directory /home/daniil/Projects/Dividiti/tensort/jetson-inference/detectnet-camera
    linked by target "segnet-console" in directory /home/daniil/Projects/Dividiti/tensort/jetson-inference/segnet-console
CUDA_TOOLKIT_INCLUDE (ADVANCED)
   used as include directory in directory /home/daniil/Projects/Dividiti/tensort/jetson-inference
...
-- Configuring incomplete, errors occurred!
See also "/home/daniil/Projects/Dividiti/tensort/jetson-inference/build/CMakeFiles/CMakeOutput.log".
See also "/home/daniil/Projects/Dividiti/tensort/jetson-inference/build/CMakeFiles/CMakeError.log".
make: *** No targets specified and no makefile found.  Stop.

Detectnet fine tuning

Hello Sir,

First, I appreciate the information you have provided.
My question is basically about DetectNet.

  1. You have provided 3 trained models: ped-100, multiped-500, and facenet-120. May I ask where I can download them?

  2. Do you have any more trained models besides the ones mentioned above?

  3. I use DIGITS 5.1, and I've decided to fine-tune a model instead of training from scratch. May I ask which .prototxt should be used along with the caffemodel? I mean deploy.prototxt, original.prototxt, solver.prototxt, or train_val.prototxt?

Mean Value in detectNet

Hello,

I have a question regarding the mean value used in detectNet.cpp: the mean value seems to be set to "104.0069879317889f, 116.66876761696767f, 122.6789143406786f" (line 166 in detectNet.cpp) regardless of the trained model. Shouldn't this value be a variable and read from the mean.binaryproto data file?

many thanks,
Shervin

Build problems on Ubuntu 14.04 Host

Hello,

I can successfully build and run the example apps on my TX1. As an exercise I've been trying to compile the code on my host machine. Is this supposed to work out of the box? I get many errors such as:

/usr/lib/gcc/x86_64-linux-gnu/4.8/include/stddef.h(432): error: identifier "nullptr" is undefined
/usr/lib/gcc/x86_64-linux-gnu/4.8/include/stddef.h(432): error: expected a ";"
/usr/lib/gcc/x86_64-linux-gnu/4.8/include/stddef.h(432): error: identifier "nullptr" is undefined
/usr/lib/gcc/x86_64-linux-gnu/4.8/include/stddef.h(432): error: expected a ";"
/usr/include/x86_64-linux-gnu/c++/4.8/bits/c++config.h(190): error: expected a ";"

I haven't modified the CMake configuration file, and just followed the instructions as is. I've attached a log of my make command output, as well as the Makefile generated by CMake.

makefile_output.txt
Makefile.txt

Any clues as to why this is happening, or whether it is expected?

Best Regards,
Piyush3dB.

Followed the instructions... cmake build overwhelms storage

Hi, following the instructions in this repo, the cmake build produces output that spans several gigabytes. I was not able to run the "make" command afterwards since there was no Makefile. I believe I ran out of space on the Jetson TX1.

Make Error: Building NVCC (Device) object CMakeFiles/jetson-inference.dir/cuda/jetson-inference_generated_cudaYUV-YV12.cu.o

Can anyone help me solve the following error when I type 'make'?

[ 2%] Building NVCC (Device) object CMakeFiles/jetson-inference.dir/cuda/jetson-inference_generated_cudaYUV-YV12.cu.o
nvcc fatal : Cannot compile in the 32-bit mode when the host compiler targets aarch64.
CMake Error at jetson-inference_generated_cudaYUV-YV12.cu.o.cmake:207 (message):
Error generating
/media/share33/User/henrylee/tx1/jetson-inference/build/CMakeFiles/jetson-inference.dir/cuda/./jetson-inference_generated_cudaYUV-YV12.cu.o

CMakeFiles/jetson-inference.dir/build.make:119: recipe for target 'CMakeFiles/jetson-inference.dir/cuda/jetson-inference_generated_cudaYUV-YV12.cu.o' failed
make[2]: *** [CMakeFiles/jetson-inference.dir/cuda/jetson-inference_generated_cudaYUV-YV12.cu.o] Error 1
CMakeFiles/Makefile2:67: recipe for target 'CMakeFiles/jetson-inference.dir/all' failed
make[1]: *** [CMakeFiles/jetson-inference.dir/all] Error 2
Makefile:127: recipe for target 'all' failed
make: *** [all] Error 2

Thank you very much.

Using custom Digits trained model in TX1 inference.

Hello Dusty.

I managed to successfully run your example with no issues (JetPack 2.3). I would like to use my own DetectNet, which I trained using DIGITS 4 as shown in this example (https://github.com/NVIDIA/DIGITS/tree/master/examples/object-detection), for inference on the TX1. I believe that I have changed all the corresponding fields of your repository in order to load my own .prototxt, .caffemodel, and .binaryproto files. After running the detectnet-console example I get the following error:

./detectnet-console /home/ubuntu/Downloads/car2.jpg
detectnet-console
args (2): 0 [./detectnet-console] 1 [/home/ubuntu/Downloads/car2.jpg]

[GIE] platform has FP16 support.
[GIE] loading /home/ubuntu/code/digits/gie_dusty1/jetson-inference/detectnet_car_model/deploy.prototxt /home/ubuntu/code/digits/gie_dusty1/jetson-inference/detectnet_car_model/snapshot_iter_191190.caffemodel
could not parse layer type Python
[GIE] failed to parse caffe network
failed to load /home/ubuntu/code/digits/gie_dusty1/jetson-inference/detectnet_car_model/snapshot_iter_191190.caffemodel
detectNet -- failed to initialize.
detectnet-console: failed to initialize detectNet

Tracing this back leads me to the detectNet.cpp file, where I found that the error occurs during the net->LoadNetwork(...) call. Any idea about the origin of this problem?

Thank you!

Jetson TK1 inference

Hi,

Not really sure if this is the best place to ask this, but is there any chance of using this repository on a TK1?
More generally, is there any chance of deploying detectnet on a TK1?

Thanks in advance

Drawing Rectangle

I want to draw rectangles from the results of detectNet.

I think I should use OpenGL.

Right? Do you have plans to implement a rectangle-drawing function?

How to Install on ubuntu 14.04 with custom host

Hi,
I am trying to install on my Ubuntu host: Ubuntu 14.04 with CUDA 8.0.
There have been many issues. I referenced the other issue "Build problems on Ubuntu 14.04 Host" and its CMakeLists.txt, but I still got an error about
uint32_t being undefined; after adding the missing #include at the top of cudaOverlay.cu and cudaOverlay.h,
it seemed OK to run.
But after that, another question came up.
It says NvInfer.h: no such file or directory.
How do I solve it?
Thank you everyone!

Feature Request - Saving Image + Label file within DetectNet-Camera.cpp

I think it would be really useful for certain use cases to be able to save an image plus the label file on command.

That way we can build a dataset that can then be adjusted/fixed before adding it back to the original dataset used to create the model, in order to retrain the model with more data.

In my case I'm actually using the object detection bounding boxes to feed a separate classification network. For example, if I had hundreds of toy cars, I would want to first find the cars using detectnet, and then run a separate classification on the cars to determine the exact type, like a Porsche 911 or a Ford F150.

I'm trying to do this myself, but of course I immediately ran into an issue due to my lack of knowledge of CUDA (I've never used it before), where I'm confused about the GPU memory space versus the CPU memory space. I'm using the saveImageRGBA function that's used within detectnet-console.cpp, but it looks like it's meant for data within GPU memory. For detectnet-camera, it looks like the frame gets captured to shared image space and then converted to RGBA within GPU space. I get a segmentation fault when I try using the saveImageRGBA function.

Inferencing on desktop PASCAL GPU with TensorRT 1.0 RC on CUDA 8.0

I have recently obtained and installed TensorRT 1.0 RC through an early access program. This jetson-inference repo should work on a Pascal-based GTX 1080 running CUDA 8.0 with the GIE provided by NVIDIA for the demo. Running the 'make' command generates this error:

Error limit reached. 100 errors detected in the compilation of "/tmp/tmpxft_000009d1_00000000-7_cudaOverlay.cpp1.ii". Compilation terminated. CMake Error at jetson-inference_generated_cudaOverlay.cu.o.cmake:264 (message): Error generating file /home/xhuv/jetson-inference/build/CMakeFiles/jetson-inference.dir/cuda/./jetson-inference_generated_cudaOverlay.cu.o

What changes should be made to run this code on desktop GPUs other than the TX1?

jetson-inference detectnet-camera cannot use trained model?

DIGITS model training succeeded: https://github.com/NVIDIA/DIGITS
using CUDA 8.0, Caffe, DIGITS 5;
GoogleNet model.
jetson-inference detectnet-camera cannot use this model; the error follows:

[GIE] attempting to open cache file vehicle/snapshot_iter_21120.caffemodel.tensorcache
[GIE] cache file not found, profiling network model
[GIE] platform has FP16 support.
[GIE] loading vehicle/deploy.prototxt vehicle/snapshot_iter_21120.caffemodel
[GIE] failed to retrieve tensor for output 'coverage'
[GIE] failed to retrieve tensor for output 'bboxes'

[GIE] configuring CUDA engine
[GIE] building CUDA engine

Does TensorRT support Batch Normalization?

Hi, I have re-trained DetectNet with batch normalization layers, but I failed to run TensorRT on this caffe model.
When loading the model file, it shows this error:
Message type "diccaffe.BatchNormParameter" has no filed name "scale_filler", could not parse deploy file.
After I deleted the batch_norm_param field in the deploy file, it shows another error:
caffeParser.cpp:613: bool bnConvert(const nvinfer1::Weights&, const nvinfer1::Weights&, const nvinfer1::Weights&, float, nvinfer1::Weights&, nvinfer1::Weights&, std::vector<void*>&) [with T = float]: Assertion 'mean.count == variance.count && movingAverage.count == 1' failed.

I would appreciate any advice you can give me.

TX1 Running the Live Camera Detection Demo FPS

Hello!
I successfully built from source on the TX1 (previously on Ubuntu 14.04 + GTX 980).
I tried running the Live Camera Detection Demo.
I have some questions about the FPS.

./detectnet-camera multiped-500 runs at about 6.7 FPS
./detectnet-camera ped-100 runs at about 6.7 FPS
./detectnet-camera facenet-120 runs at about 12.3 FPS

Are these FPS numbers correct? They look a little slow...
How can I accelerate it?

Another question: I created my own model with https://github.com/NVIDIA/DIGITS/tree/digits-4.0/examples/object-detection
and swapped the "multiped-500" files for it. It successfully detects cars.
But the FPS is still 6~7.
How can I accelerate it?

IDs of classes in the multiped-500 model

@dusty-nv in the original.prototxt for the multiped-500 model it looks like there are 5 classes. In your example you reference pedestrians and luggage - what are the other 3 classes?
Also, is this multi-class dataset available if we wish to fine-tune the model?
Thanks for the useful examples.

jetson-inference detectnet-camera cannot use trained model

DIGITS model training succeeded.
https://github.com/dusty-nv/jetson-inference/tree/master/detectnet-camera
jetson-inference detectnet-camera cannot use this model; the error follows:

[GIE] attempting to open cache file vehicle/snapshot_iter_21120.caffemodel.tensorcache
[GIE] cache file not found, profiling network model
[GIE] platform has FP16 support.
[GIE] loading vehicle/deploy.prototxt vehicle/snapshot_iter_21120.caffemodel
[GIE] failed to retrieve tensor for output 'coverage'
[GIE] failed to retrieve tensor for output 'bboxes'

[GIE] configuring CUDA engine
[GIE] building CUDA engine

DetectNet Demo: caffemodel file missing

Hi Dustin!
I tried to run your demo for the detection network.
You mention that "three example detection network models are automatically downloaded during the repo source configuration",
but it seems that they are missing.
Thanks
Alex

Question: Can you load up two Networks at once?

Currently I use detectnet-camera to detect the position of objects within the live camera frame, and then I write out the image crops for the objects found.

Now I want to run the image crops through a trained imagenet classification network to identify exactly what the object is (like a Ford Pinto instead of simply being a vehicle).

Do I have to stop the DetectNet network in order to do this? Or can I run an imagenet based classification while the detectnet is still loaded?

I want to be able to go between the two with as little reduction on the frame rate as possible.
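For reference, with the current Python API both networks can simply be loaded side by side; a hedged sketch (the model names and camera URI are assumptions, not the poster's setup):

from jetson_inference import detectNet, imageNet
from jetson_utils import videoSource, cudaAllocMapped, cudaCrop

detector = detectNet("ssd-mobilenet-v2", threshold=0.5)   # both engines stay resident
classifier = imageNet("googlenet")
camera = videoSource("/dev/video0")

img = camera.Capture()
for det in detector.Detect(img):
    # crop the detection ROI into its own image and classify it
    roi = (int(det.Left), int(det.Top), int(det.Right), int(det.Bottom))
    crop = cudaAllocMapped(width=roi[2] - roi[0], height=roi[3] - roi[1], format=img.format)
    cudaCrop(img, crop, roi)
    class_id, confidence = classifier.Classify(crop)
    print("{:s} ({:.2f})".format(classifier.GetClassDesc(class_id), confidence))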

I want to add my customized layer

I want to add my customized layer to the jetson-inference demo.

Currently, GIE does not support customized layers, nor the latest Caffe versions.

I want to use my own Caffe distribution (libcaffe.so) in this camera demo by removing GIE.

Is it possible?

Segnet can't load custom network

I followed the semantic segmentation example for DIGITS 5 to train my own model. I tried to load it with segnet but I get this:

[GIE]  attempting to open cache file seg-voc/snapshot.caffemodel.tensorcache
[GIE]  cache file not found, profiling network model
[GIE]  platform has FP16 support.
[GIE]  loading seg-voc/deploy.prototxt seg-voc/snapshot.caffemodel
[libprotobuf FATAL ../../../externals/protobuf/aarch64/10.0/include/google/protobuf/repeated_field.h:1378] CHECK failed: (index) < (current_size_): 
terminate called after throwing an instance of 'google::protobuf::FatalException'
  what():  CHECK failed: (index) < (current_size_): 
Aborted

I think that this has something to do with the deploy file because I guess that is what the protobuf lib is for... However, my deploy.prototxt is well formed and almost identical to the ones in the segnet examples, except for input size and initial padding.

webcam usage

Dear all,
Thanks for the code and comments provided.
I have been able to compile and run the console examples (detectnet-console and imagenet-console) on a "PC" running Ubuntu 14.04, but unfortunately the webcam cannot be used (the detectnet-camera and imagenet-camera apps do not run). So far I have tested two different webcams and neither one could be used for detection/recognition purposes. Running the imagenet-camera/detectnet-camera apps, I get run-time errors such as 'cannot convert rgbtorgba' / 'could not capture frame'...
The webcams can be initialized, but I get a black screen with some noise rather than a proper image of the scene.
Any comments or help are very much appreciated.

cudaGetLastError

Hi @dusty-nv, when I run imagenet-console I hit this problem, but I can't fix it. Can you do me a favor?

[cuda]   cudaGetLastError()
[cuda]      invalid device function (error 8) (hex 0x08)
[cuda]      /home/ddk/ddk-repo/jetson-inference/imageNet.cu:50
[cuda]   cudaPreImageNet((float4*)rgba, width, height, mInputCUDA, mWidth, mHeight, make_float3(104.0069879317889f, 116.66876761696767f, 122.6789143406786f))
[cuda]      invalid device function (error 8) (hex 0x08)
[cuda]      /home/ddk/ddk-repo/jetson-inference/imageNet.cpp:146
imageNet::Classify() -- cudaPreImageNet failed
imagenet-console:  failed to classify 'orange_0.jpg'  (result=-1)

Thanks.

How to deploy the trained model in jetson-inference code

Hi there...
Thanks for the code.
I'm new to training and inference. I trained a model with my own dataset (nearly 300 images) in DIGITS and am now trying to deploy it using jetson-inference. I have a Jetson TX1 Developer Kit and successfully installed jetson-inference on it. But when I swapped in my caffemodel file, I'm getting the following errors:
[GIE] attempting to open cache file snapshot_iter_180.caffemodel.tensorcache
[GIE] cache file not found, profiling network model
[GIE] platform has FP16 support.
[GIE] loading deploy.prototxt snapshot_iter_180.caffemodel
[GIE] failed to retrieve tensor for output 'prob'
[GIE] configuring CUDA engine
[GIE] building CUDA engine

I would appreciate any advice you can give me.

Compiling with -std=c++11 gives errors

I am using 14.04 with TensorRT etc. installed (custom Jetson carrier boards).

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2015 NVIDIA Corporation
Built on Thu_May__5_22:52:38_CDT_2016
Cuda compilation tools, release 7.0, V7.0.74

Any suggestions to fix it?

/usr/lib/gcc/aarch64-linux-gnu/4.8/include/stddef.h(432): error: identifier "nullptr" is undefined

/usr/lib/gcc/aarch64-linux-gnu/4.8/include/stddef.h(432): error: expected a ";"

/usr/include/aarch64-linux-gnu/c++/4.8/bits/c++config.h(190): error: expected a ";"

/usr/include/c++/4.8/exception(63): error: expected a ";"

/usr/include/c++/4.8/exception(68): error: expected a ";"

Trouble building NVCaffe on TX1

When building NVCaffe on the TX1 on L4T 24.2, I ran into a couple of issues using the instructions listed at: https://github.com/dusty-nv/jetson-inference/blob/master/docs/building-nvcaffe.md

  1. While loading dependencies, libboost-thread1.55-dev is not found. The current library on L4T 24.2 appears to be libboost-thread1.58-dev. The regular caffe branch uses libboost-all-dev, which I believe includes libboost-thread1
  2. When compiling, there is an error with the hdf5 include and library files. Error:
    src/caffe/net.cpp:8:18: fatal error: hdf5.h: No such file or directory
    This appears to be an Ubuntu 16.04 issue. One way to fix it is to add hdf5 to the INCLUDE_DIRS environment variable when configuring the Makefile.config. On regular caffe, this is something like:
    echo "INCLUDE_DIRS += /usr/include/hdf5/serial/" >> Makefile.config
    when building the Makefile. There is a similar issue with finding the actual libraries themselves. Discussion:
    BVLC/caffe#2347
    One solution is to modify LIBRARY_DIRS in Makefile.config to include the hdf5 libraries as needed.

Configuration:
L4T 24.2
CUDA 8.0
gcc version 5.4.0 20160609 (Ubuntu/Linaro 5.4.0-6ubuntu1~16.04.2)

[GIE] Error parsing layer type Deconvolution index 379

Can anyone help me solve the following error when I try to use a deconvolution layer?

The website below shows that GIE supports deconvolution layers, but when I add a deconvolution layer to my prototxt and build a new caffe model (whose format is not like googlenet, alexnet, or detectnet), I get this error.

https://devblogs.nvidia.com/parallelforall/production-deep-learning-nvidia-gpu-inference-engine/

[GIE] loading pvaNetClassifier.prototxt pvanet_frcnn_iter_100000.caffemodel
Caffe Parser: groups are not supported for deconvolutions
error parsing layer type Deconvolution index 379
[GIE] failed to parse caffe network

Thank you very much.

Profiling leads to an FPS drop

Thank you for the example code.

When I set profiling to "true" in the tensorNet class, the profiling works like a charm. However, the frame rate drops from 30 to 23 when using the GoogleNet example.
Is there a way to speed this up? Maybe by writing the profiling output to a file instead of printing it, or even by running a second thread that prints the profiling?

I am using Nvidia Jetson TX1 Board with L4T 24.2

Greetings

Caffe Parser: could not parse binary model file

Hi,
I was using a TX1 dev board. I flashed the OS on it with JetPack 2.3.1,
then downloaded the code and built it successfully on the dev board.
But when I go to aarch64/bin/ to run ./imagenet-console orange_0.jpg output_0.jpg,
I get the error "Caffe Parser: could not parse binary model file",

like this:

$ ./imagenet-console ./orange_0.jpg output_0.jpg
imagenet-console
args (3): 0 [./imagenet-console] 1 [./orange_0.jpg] 2 [output_0.jpg]

[GIE] attempting to open cache file bvlc_googlenet.caffemodel.tensorcache
[GIE] cache file not found, profiling network model
[GIE] platform has FP16 support.
[GIE] loading googlenet.prototxt bvlc_googlenet.caffemodel
Caffe Parser: could not parse binary model file
Could not parse model file
[GIE] failed to parse caffe network
failed to load bvlc_googlenet.caffemodel
failed to load bvlc_googlenet.caffemodel
imageNet -- failed to initialize.
imagenet-console: failed to initialize imageNet

Could the problem be with my download of the bvlc_googlenet.caffemodel file?

Different prediction results with ImageNet Inference than Digits on Custom GoogleNet Model

I trained a GoogleNet image classification network on a custom dataset using the default GoogleNet network within DIGITS. My dataset consists of lots of 256x256 squashed images of playing cards, basically crops of the playing cards that DIGITS squashed to 256x256. I had 11 different playing cards in total.

The training results were close to 100%, and DIGITS "Classify One" reports accurate results for the test images I tested.

On the TX1 I simply copied over the GoogleNet caffemodel, modified the prototxt file for 11 classes, created the label file, and deleted the cached file. I then used imagenet-console (with googlenet set) on the test images (after I ran them through DIGITS to squash them to 256x256).

On most of the test images the results were good and consistent with what DIGITS reported, but on a few of them they were way off.

Did I make some obvious mistake somewhere?

I also got the same kind of result with AlexNet, where most of the test results are consistent with DIGITS except for a few cases that are way off.
