
nvcaffe's People

Contributors

borisfom, borisgin, cypof, dgolden1, drnikolaev, ducha-aiki, eelstork, erictzeng, flx42, jamt9000, jeffdonahue, kkhoot, kloudkl, longjon, lukeyeager, mavenlin, mohomran, nv-slayton, philkr, qipeng, rbgirshick, ronghanghu, sergeyk, sguada, shelhamer, slayton58, thatguymike, tnarihi, yangqing, yosinski


nvcaffe's Issues

Documentation on major differences from BVLC/caffe

Hi,

I recently migrated from BVLC/caffe to nvcaffe, and I see a significant speedup during inference.
The layer-specific mixed-precision options are awesome.
I would also like to use nvcaffe for training, and I have a few questions:

  1. There is an enum Packing in caffe.proto which can be set to NCHW (default) or NHWC. If we set Packing to NHWC in the data layer, will the packing information be propagated automatically to all layers in the network? Intuitively, NHWC should be more efficient for both convolution and batch-norm layers, particularly with cuDNN (see the layout sketch right after this list).
  2. The data augmentations are a bit confusing. Are the transformations in data_transformer.cu carried out synchronously on the GPU, or are they done asynchronously in the data prefetch threads?
  3. There is also another layer, detectnet_transform_layer.[hpp/cpp/cu], which has its own set of transformations. When and how exactly is this layer used? In particular, I am interested in image-to-image translation problems where both the input and the label are images of the same size and the transformations (random crops, flips, scale, mean subtraction, etc.) are mirrored on both the input and the label image. I was wondering whether this layer could be used for that.
  4. Is multi-GPU inference with a single large image possible?
  5. Is there some way to further reduce memory during inference on large images? For fully convolutional architectures, a lot of memory could be saved, for instance, by not storing most of the intermediate activations and reusing a single buffer across most convolution layers (a rough sketch of this buffer-reuse idea follows further below).
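
To make question 1 concrete, the two packings differ only in how a 4-D activation tensor is laid out in memory. The sketch below is purely illustrative (generic index and dimension names, not NVCaffe's own types or fields):

    #include <cstddef>

    // Flat-offset computation for the two layouts named by the Packing enum.
    // N = batch, C = channels, H = height, W = width (illustrative names only).
    inline size_t offset_nchw(size_t n, size_t c, size_t h, size_t w,
                              size_t C, size_t H, size_t W) {
      return ((n * C + c) * H + h) * W + w;  // each channel plane is contiguous
    }

    inline size_t offset_nhwc(size_t n, size_t c, size_t h, size_t w,
                              size_t C, size_t H, size_t W) {
      return ((n * H + h) * W + w) * C + c;  // the channels of one pixel are contiguous
    }

In NHWC the C values of a single pixel sit next to each other, which is the layout cuDNN's mixed-precision convolution paths generally prefer; whether NVCaffe propagates the data layer's packing choice to every downstream layer is exactly what question 1 asks.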

Finally, if there is some documentation specific to nvcaffe, a pointer to it would also be awesome.
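
Regarding question 5, the buffer reuse mentioned there can be sketched as a ping-pong between two scratch buffers for a purely feed-forward network in inference mode. This is only an illustration of the idea, not NVCaffe code (Layer, forward, and run_inference are placeholder names):

    #include <utility>
    #include <vector>

    // Placeholder layer interface: reads one activation buffer, writes another.
    struct Layer {
      virtual void forward(const float* in, float* out) const = 0;
      virtual ~Layer() = default;
    };

    // Runs a feed-forward chain with only two buffers, each sized for the largest
    // intermediate activation; buf_a initially holds the input image.
    const float* run_inference(const std::vector<const Layer*>& layers,
                               std::vector<float>& buf_a,
                               std::vector<float>& buf_b) {
      float* src = buf_a.data();
      float* dst = buf_b.data();
      for (const Layer* layer : layers) {
        layer->forward(src, dst);
        std::swap(src, dst);  // the old input becomes free scratch for the next layer
      }
      return src;  // after the final swap, src points at the last layer's output
    }

Skip connections, in-place layers, and anything that needs stored activations for a backward pass complicate this scheme, which is why it only applies cleanly to inference.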

A mismatch found between your code and your paper

In your paper, I find:

(1) we get the local learning rate for each learnable parameter by α = l × ||w||_2 / (||∇w||_2 + β · ||∇w||_2);

But in your code,

 rate = gw_ratio * w_norm / (wgrad_norm + weight_decay * w_norm);

The code and the equation don't match. Is it a typo in your paper?
I think it should be α = l × ||w||_2 / (||∇w||_2 + β · ||w||_2).
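
For reference, here is a minimal sketch of what the quoted line computes; only the four variable names in that line come from the code, while the wrapping function and the way the norms are obtained are illustrative:

    // Layer-wise local learning rate, matching the quoted line and the corrected
    // formula above: rate = l * ||w||_2 / (||∇w||_2 + β * ||w||_2), where gw_ratio
    // plays the role of l, weight_decay is β, and the norms are per-layer L2 norms.
    float local_rate(float gw_ratio, float weight_decay,
                     float w_norm, float wgrad_norm) {
      return gw_ratio * w_norm / (wgrad_norm + weight_decay * w_norm);
    }

Note that weight_decay (β) multiplies w_norm here, which matches the corrected formula rather than the one printed in the paper.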

make: *** [.build_release/lib/libcaffe-nv.so.0.16.4] Error 1

LD -o .build_release/lib/libcaffe-nv.so.0.16.4
/usr/bin/ld: /usr/lib/gcc/x86_64-linux-gnu/5/../../../x86_64-linux-gnu/libturbojpeg.a(libturbojpeg_la-turbojpeg.o): relocation R_X86_64_32 against `.data' can not be used when making a shared object; recompile with -fPIC
/usr/lib/gcc/x86_64-linux-gnu/5/../../../x86_64-linux-gnu/libturbojpeg.a: error adding symbols: Bad value
collect2: error: ld returned 1 exit status
Makefile:605: recipe for target '.build_release/lib/libcaffe-nv.so.0.16.4' failed
make: *** [.build_release/lib/libcaffe-nv.so.0.16.4] Error 1

Our workstation:
Ubuntu: 16.04
CUDA: 8.0
cuDNN: 7.0.5
Hi, did I do something wrong? I am looking for an answer to this. Thank you.

[dead silence] NVCaffe DIGITS object detection with TensorRT GoogLeNet

Conda 2.7.9 + DIGITS 6.0.0 + NVCaffe 0.16.4 + 4 × P40
Everything goes well; the last lines in caffe_output.log are:
#####################################################################
I1117 11:02:59.804045 15357 caffe.cpp:226] Starting Optimization
I1117 11:02:59.804064 15357 solver.cpp:386] Solving
I1117 11:02:59.804067 15357 solver.cpp:387] Learning Rate Policy: exp
I1117 11:02:59.822211 15357 net.cpp:1358] [3] Reserving 23918336 bytes of shared learnable space
I1117 11:02:59.823530 15357 solver.cpp:457] Iteration 0, Testing net (#0)
I1117 11:02:59.823545 15357 net.cpp:1004] Ignoring source layer train_data
I1117 11:02:59.823549 15357 net.cpp:1004] Ignoring source layer train_label
I1117 11:02:59.823552 15357 net.cpp:1004] Ignoring source layer train_transform
I1117 11:02:59.824621 15370 device_alternate.hpp:116] NVML initialized on thread 139987149186816
I1117 11:02:59.953352 15370 common.cpp:585] NVML succeeded to set CPU affinity on device 3
#####################################################################
But Caffe and DIGITS freeze at
#####################################################################
Train Caffe Model Running
0%
#####################################################################
and stay there for several hours.

I switched to nvcaffe 0.15.13, and it is the same.
Meanwhile, all classification jobs go well.

I followed this demo provided by NVIDIA:
https://github.com/dusty-nv/jetson-inference#locating-object-coordinates-using-detectnet
with our own dataset preprocessed by DIGITS.

Can anyone help me out?
