ijkguo / mx-rcnn

Parallel Faster R-CNN implementation with MXNet.

License: Other

Python 100.00%

mx-rcnn's Issues

AttributeError: 'module' object has no attribute 'Proposal'

I just ran the demo, and it shows this error:
Traceback (most recent call last):
File "demo.py", line 139, in
main()
File "demo.py", line 117, in main
symbol = get_vgg_test()
File "/home/mx-rcnn/rcnn/symbol/symbol_vgg.py", line 276, in get_vgg_test
rois = mx.symbol.Proposal(
AttributeError: 'module' object has no attribute 'Proposal'
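
A likely cause (an assumption, not confirmed in this thread) is that the installed MXNet build does not register a `Proposal` operator under `mx.symbol`, for example because it was built without the extra operator or exposes it under a different namespace. The sketch below shows how one could probe several namespaces for an operator before building the symbol. The helper `find_operator` and the stand-in namespaces are hypothetical; with a real install you would pass the actual modules instead.

```python
import types


def find_operator(namespaces, op_name):
    """Return the label of the first namespace exposing op_name, else None.

    namespaces is a list of (label, module_like_object) pairs.
    """
    for label, namespace in namespaces:
        if namespace is not None and hasattr(namespace, op_name):
            return label
    return None


# Stand-in objects for illustration only (not real MXNet modules).
core_symbols = types.SimpleNamespace(Convolution=object())
contrib_symbols = types.SimpleNamespace(Proposal=object())

found = find_operator(
    [("mx.symbol", core_symbols), ("mx.contrib.symbol", contrib_symbols)],
    "Proposal",
)
print(found)
```

If no namespace reports the operator, the build itself is the thing to fix; changing demo.py will not help.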

Error of mxnet/example/rcnn/demo.py

I ran demo.py (the one in mxnet/example/rcnn, not mx-rcnn) and got the error below. What should I do to solve it? It would also help if you could write more documentation about the rcnn example, because it was difficult for me to understand.

Keiku@ubuntu:~/opt/mxnet/example/rcnn$ python demo.py --gpu 0 --prefix ../../tools/caffe_converter/vgg16 --epoch 1
Traceback (most recent call last):
  File "demo.py", line 32, in <module>
    demo_net(detector, os.path.join(os.getcwd(), 'data', 'demo', '000004'))
  File "/home/Keiku/opt/mxnet/example/rcnn/tools/demo_net.py", line 33, in demo_net
    scores, boxes = detector.im_detect(im_array, roi_array)
  File "/home/Keiku/opt/mxnet/example/rcnn/rcnn/detector.py", line 46, in im_detect
    grad_req='null', aux_states=self.aux_params)
  File "/home/Keiku/.pyenv/versions/anaconda-4.0.0/lib/python2.7/site-packages/mxnet-0.5.0-py2.7.egg/mxnet/symbol.py", line 783, in bind
    args_handle, args = self._get_ndarray_inputs('args', args, listed_arguments, False)
  File "/home/Keiku/.pyenv/versions/anaconda-4.0.0/lib/python2.7/site-packages/mxnet-0.5.0-py2.7.egg/mxnet/symbol.py", line 625, in _get_ndarray_inputs
    raise ValueError('Must specify all the arguments in %s' % arg_key)
ValueError: Must specify all the arguments in args
[11:10:50] /home/Keiku/opt/mxnet/dmlc-core/include/dmlc/logging.h:235: [11:10:50] /home/Keiku/opt/mxnet/mshadow/mshadow/./stream_gpu-inl.h:98: Check failed: (err) == (CUBLAS_STATUS_SUCCESS) Create cublas handle failed
[11:10:50] /home/Keiku/opt/mxnet/dmlc-core/include/dmlc/logging.h:235: [11:10:50] /home/Keiku/opt/mxnet/mshadow/mshadow/./stream_gpu-inl.h:98: Check failed: (err) == (CUBLAS_STATUS_SUCCESS) Create cublas handle failed
terminate called recursively
terminate called after throwing an instance of 'dmlc::Error'
  what():  [11:10:50] /home/Keiku/opt/mxnet/mshadow/mshadow/./stream_gpu-inl.h:98: Check failed: (err) == (CUBLAS_STATUS_SUCCESS) Create cublas handle failed
Aborted (core dumped)
Keiku@ubuntu:~/opt/mxnet/example/rcnn$

I set up data and data/demo as shown below.

Keiku@ubuntu:~/opt/mxnet/example/rcnn/data$ ll
total 468544
drwxrwxr-x 5 Keiku Keiku      4096 May 25 11:35 ./
drwxrwxr-x 8 Keiku Keiku      4096 May 27 11:00 ../
drwxrwxr-x 2 Keiku Keiku      4096 May 25 10:51 demo/
drwxr-xr-x 2 Keiku Keiku      4096 Apr 30  2015 selective_search_data/
-rw-rw-r-- 1 Keiku Keiku 479764750 Apr 30  2015 selective_search_data.tgz
drwxrwxr-x 2 Keiku Keiku      4096 May 25 10:32 VOCdevkit/

Keiku@ubuntu:~/opt/mxnet/example/rcnn/data/demo$ ll
total 224
drwxrwxr-x 2 Keiku Keiku   4096 May 25 10:51 ./
drwxrwxr-x 5 Keiku Keiku   4096 May 25 11:35 ../
-rw-rw-r-- 1 Keiku Keiku  23296 May 25 10:51 000004_boxes.mat
-rw-rw-r-- 1 Keiku Keiku 102770 May 25 10:51 000004.jpg
-rw-rw-r-- 1 Keiku Keiku  16648 May 25 10:51 001551_boxes.mat
-rw-rw-r-- 1 Keiku Keiku  69440 May 25 10:51 001551.jpg
Keiku@ubuntu:~/opt/mxnet/example/rcnn/data/demo$

Memory error

Traceback (most recent call last):
File "_ctypes/callbacks.c", line 314, in 'calling callback function'
File "D:\Anaconda2\lib\site-packages\mxnet-0.7.0-py2.7.egg\mxnet\operator.py", line 600, in creator
op_prop = prop_cls(**kwargs)
TypeError: init() got an unexpected keyword argument 'ratios'
Traceback (most recent call last):
File "D:\Anaconda2\lib\runpy.py", line 162, in _run_module_as_main
"main", fname, loader, pkg_name)
File "D:\Anaconda2\lib\runpy.py", line 72, in _run_code
exec code in run_globals
File "F:\mx-rcnn\tools\train_alternate.py", line 209, in
args.kv_store, args.work_load_list)
File "F:\mx-rcnn\tools\train_alternate.py", line 143, in alternate_train
test_rpn(image_set, year, root_path, devkit_path, 'model/rpn1', rpn_epoch, ctx)
File "F:\mx-rcnn\tools\train_alternate.py", line 68, in test_rpn
sym = get_vgg_rpn_test()
File "rcnn\symbol.py", line 216, in get_vgg_rpn_test
op_type='proposal', feat_stride=16, scales=(8, 16, 32), ratios=(0.5, 1, 2), output_score=True)
File "D:\Anaconda2\lib\site-packages\mxnet-0.7.0-py2.7.egg\mxnet\symbol.py", line 1062, in creator
ctypes.byref(sym_handle)))
WindowsError: exception: access violation writing 0x0000000100000000

Is it because the memory of my PC is insufficient?

Is the result of demo right?

I use the following arguments:
python demo.py --prefix final --epoch 0 --image data/VOCdevkit2007/VOC2007/JPEGImages/000001.jpg --gpu 2

and the result is:
person [ 69.18438721 5.09894276 599. 822.22625732] 0.998187

However, 000001.jpg is only 353 x 500 (width x height), so the box above extends beyond the image border. Is this expected?
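
One possible explanation, assuming the image was resized so that its short side is 600 pixels before detection: the box is reported in resized-image coordinates (the value 599 would then be the right edge of a 600-pixel-wide image). The sketch below maps such a box back to the original image by dividing by the scale factor and clipping. The function name `rescale_box` and the target size of 600 are assumptions for illustration.

```python
def clip(value, upper):
    """Clamp value to the range [0, upper]."""
    return max(0.0, min(value, upper))


def rescale_box(box, im_scale, width, height):
    """Map [x1, y1, x2, y2] from resized-image to original-image coordinates."""
    x1, y1, x2, y2 = [v / im_scale for v in box]
    return [
        clip(x1, width - 1),
        clip(y1, height - 1),
        clip(x2, width - 1),
        clip(y2, height - 1),
    ]


width, height = 353, 500                 # original image size from the report
im_scale = 600.0 / min(width, height)    # assumed short-side target of 600
box = [69.18438721, 5.09894276, 599.0, 822.22625732]
restored = rescale_box(box, im_scale, width, height)
print(restored)
```

Under this assumption the restored box lies inside the 353 x 500 image.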

multi-gpu training

Hi @precedenceguo

After porting to MXNet, do you think we can use multiple GPUs for training?

Another question: do you plan to port py-faster-rcnn?

Thank you!

Check failed: (param_.workspace) >= (required_size)

Hello, I ran mx-rcnn, but it raised the following error:
/home/ubuntu/mxnet/dmlc-core/include/dmlc/logging.h:235: [10:48:17] src/operator/./convolution-inl.h:296: Check failed: (param_.workspace) >= (required_size)
Minimum workspace size: 1228800000 Bytes
Given: 1073741824 Bytes

I decreased the "workspace" parameter in vgg16-symbol.json and "max_data_shape" in "train_rpn.py", and also installed cuDNN, but nothing seems to change. Could you please help me?
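
The numbers in the log can be read as follows, assuming the convolution `workspace` parameter is given in MB: the value in use is 1073741824 bytes, which is exactly 1024 MB, while the layer asks for 1228800000 bytes. The small sketch below does the conversion. The helper name `min_workspace_mb` is hypothetical.

```python
import math

BYTES_PER_MB = 1024 * 1024


def min_workspace_mb(required_bytes):
    """Smallest whole number of MB that covers required_bytes."""
    return int(math.ceil(required_bytes / float(BYTES_PER_MB)))


required = 1228800000   # "Minimum workspace size" from the log
given = 1073741824      # "Given" from the log

given_mb = given // BYTES_PER_MB
needed_mb = min_workspace_mb(required)
print(given_mb, needed_mb)
```

Under that assumption the workspace would need to be raised to at least 1172, not decreased. The log still reports 1024 MB after the edit, which suggests the edited value was not the one actually used by the failing layer.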

how to use the demo.py

May I ask how to use demo.py? There is currently no data in the data folder. Previously, there were two images along with their bounding boxes in .mat format. However, when I run the command "python demo.py", an error occurs saying "Unknown mat file type".

Can anyone help? Thanks.

the errors when installing mx-rcnn

I used the versions of mxnet and mshadow that you forked in your repository. However, the following errors occurred when I was building mxnet.

/home/dl/mxnet-RCNN/mxnet-master-precedenceguo-forked/mshadow/mshadow/tensor.h:668:13: note: template void mshadow::SoftmaxGrad(mshadow::Tensor<mshadow::gpu, 2, DType>, const mshadow::Tensor<mshadow::gpu, 2, DType>&, const mshadow::Tensor<mshadow::gpu, 1, DType>&)
inline void SoftmaxGrad(Tensor<gpu, 2, DType> dst,
^
/home/dl/mxnet-RCNN/mxnet-master-precedenceguo-forked/mshadow/mshadow/tensor.h:668:13: note: template argument deduction/substitution failed:
In file included from src/operator/softmax_output.cc:7:0:
src/operator/./softmax_output-inl.h:116:78: note: mismatched types ‘mshadow::gpu’ and ‘mshadow::cpu’
SoftmaxGrad(grad, out, label, static_cast(param_.ignore_label));
^
src/operator/./softmax_output-inl.h:116:78: note: ‘mshadow::Tensor<mshadow::cpu, 2, mshadow::half::half_t>’ is not derived from ‘mshadow::Tensor<mshadow::gpu, 2, DType>’
In file included from /home/dl/mxnet-RCNN/mxnet-master-precedenceguo-forked/mshadow/mshadow/tensor.h:816:0,
from include/mxnet/./base.h:13,
from include/mxnet/operator.h:18,
from src/operator/./softmax_output-inl.h:12,
from src/operator/softmax_output.cc:7:
/home/dl/mxnet-RCNN/mxnet-master-precedenceguo-forked/mshadow/mshadow/./tensor_cpu-inl.h:291:13: note: template void mshadow::SoftmaxGrad(mshadow::Tensor<mshadow::cpu, 3, Dtype>, const mshadow::Tensor<mshadow::cpu, 3, Dtype>&, const mshadow::Tensor<mshadow::cpu, 2, DType>&)
inline void SoftmaxGrad(Tensor<cpu, 3, DType> dst,
^
/home/dl/mxnet-RCNN/mxnet-master-precedenceguo-forked/mshadow/mshadow/./tensor_cpu-inl.h:291:13: note: template argument deduction/substitution failed:
In file included from src/operator/softmax_output.cc:7:0:
src/operator/./softmax_output-inl.h:116:78: note: template argument ‘2’ does not match ‘#‘integer_cst’ not supported by dump_decl#’
SoftmaxGrad(grad, out, label, static_cast(param_.ignore_label));
^
src/operator/./softmax_output-inl.h:116:78: note: ‘mshadow::Tensor<mshadow::cpu, 2, mshadow::half::half_t>’ is not derived from ‘mshadow::Tensor<mshadow::cpu, 3, Dtype>’
In file included from /home/dl/mxnet-RCNN/mxnet-master-precedenceguo-forked/mshadow/mshadow/tensor.h:816:0,
from include/mxnet/./base.h:13,
from include/mxnet/operator.h:18,
from src/operator/./softmax_output-inl.h:12,
from src/operator/softmax_output.cc:7:
/home/dl/mxnet-RCNN/mxnet-master-precedenceguo-forked/mshadow/mshadow/./tensor_cpu-inl.h:309:13: note: template void mshadow::SoftmaxGrad(mshadow::Tensor<mshadow::cpu, 3, Dtype>, const mshadow::Tensor<mshadow::cpu, 3, Dtype>&, const mshadow::Tensor<mshadow::cpu, 2, DType>&, const DType&)
inline void SoftmaxGrad(Tensor<cpu, 3, DType> dst,
^
/home/dl/mxnet-RCNN/mxnet-master-precedenceguo-forked/mshadow/mshadow/./tensor_cpu-inl.h:309:13: note: template argument deduction/substitution failed:
In file included from src/operator/softmax_output.cc:7:0:
src/operator/./softmax_output-inl.h:116:78: note: template argument ‘2’ does not match ‘#‘integer_cst’ not supported by dump_decl#’
SoftmaxGrad(grad, out, label, static_cast(param_.ignore_label));
^
src/operator/./softmax_output-inl.h:116:78: note: ‘mshadow::Tensor<mshadow::cpu, 2, mshadow::half::half_t>’ is not derived from ‘mshadow::Tensor<mshadow::cpu, 3, Dtype>’
src/operator/./softmax_output-inl.h: In instantiation of ‘void mxnet::op::SoftmaxOutputOp<xpu, DType>::Backward(const mxnet::OpContext&, const std::vectormshadow::TBlob&, const std::vectormshadow::TBlob&, const std::vectormshadow::TBlob&, const std::vectormxnet::OpReqType&, const std::vectormshadow::TBlob&, const std::vectormshadow::TBlob&) [with xpu = mshadow::cpu; DType = double]’:
src/operator/softmax_output.cc:44:1: required from here
src/operator/./softmax_output-inl.h:116:78: error: no matching function for call to ‘SoftmaxGrad(mshadow::Tensor<mshadow::cpu, 2, double>&, mshadow::Tensor<mshadow::cpu, 2, double>&, mshadow::Tensor<mshadow::cpu, 1, double>&, double)’
src/operator/./softmax_output-inl.h:116:78: note: candidates are:
In file included from /home/dl/mxnet-RCNN/mxnet-master-precedenceguo-forked/mshadow/mshadow/tensor.h:816:0,
from include/mxnet/./base.h:13,
from include/mxnet/operator.h:18,
from src/operator/./softmax_output-inl.h:12,
from src/operator/softmax_output.cc:7:
/home/dl/mxnet-RCNN/mxnet-master-precedenceguo-forked/mshadow/mshadow/./tensor_cpu-inl.h:275:13: note: template void mshadow::SoftmaxGrad(mshadow::Tensor<mshadow::cpu, 2, DType>, const mshadow::Tensor<mshadow::cpu, 2, DType>&, const mshadow::Tensor<mshadow::cpu, 1, DType>&)
inline void SoftmaxGrad(Tensor<cpu, 2, DType> dst,
^
/home/dl/mxnet-RCNN/mxnet-master-precedenceguo-forked/mshadow/mshadow/./tensor_cpu-inl.h:275:13: note: template argument deduction/substitution failed:
In file included from src/operator/softmax_output.cc:7:0:
src/operator/./softmax_output-inl.h:116:78: note: candidate expects 3 arguments, 4 provided
SoftmaxGrad(grad, out, label, static_cast(param_.ignore_label));
^
In file included from include/mxnet/./base.h:13:0,
from include/mxnet/operator.h:18,
from src/operator/./softmax_output-inl.h:12,
from src/operator/softmax_output.cc:7:
/home/dl/mxnet-RCNN/mxnet-master-precedenceguo-forked/mshadow/mshadow/tensor.h:668:13: note: template void mshadow::SoftmaxGrad(mshadow::Tensor<mshadow::gpu, 2, DType>, const mshadow::Tensor<mshadow::gpu, 2, DType>&, const mshadow::Tensor<mshadow::gpu, 1, DType>&)
inline void SoftmaxGrad(Tensor<gpu, 2, DType> dst,
^
/home/dl/mxnet-RCNN/mxnet-master-precedenceguo-forked/mshadow/mshadow/tensor.h:668:13: note: template argument deduction/substitution failed:
In file included from src/operator/softmax_output.cc:7:0:
src/operator/./softmax_output-inl.h:116:78: note: mismatched types ‘mshadow::gpu’ and ‘mshadow::cpu’
SoftmaxGrad(grad, out, label, static_cast(param_.ignore_label));
^
src/operator/./softmax_output-inl.h:116:78: note: ‘mshadow::Tensor<mshadow::cpu, 2, double>’ is not derived from ‘mshadow::Tensor<mshadow::gpu, 2, DType>’
In file included from /home/dl/mxnet-RCNN/mxnet-master-precedenceguo-forked/mshadow/mshadow/tensor.h:816:0,
from include/mxnet/./base.h:13,
from include/mxnet/operator.h:18,
from src/operator/./softmax_output-inl.h:12,
from src/operator/softmax_output.cc:7:
/home/dl/mxnet-RCNN/mxnet-master-precedenceguo-forked/mshadow/mshadow/./tensor_cpu-inl.h:291:13: note: template void mshadow::SoftmaxGrad(mshadow::Tensor<mshadow::cpu, 3, Dtype>, const mshadow::Tensor<mshadow::cpu, 3, Dtype>&, const mshadow::Tensor<mshadow::cpu, 2, DType>&)
inline void SoftmaxGrad(Tensor<cpu, 3, DType> dst,
^
/home/dl/mxnet-RCNN/mxnet-master-precedenceguo-forked/mshadow/mshadow/./tensor_cpu-inl.h:291:13: note: template argument deduction/substitution failed:
In file included from src/operator/softmax_output.cc:7:0:
src/operator/./softmax_output-inl.h:116:78: note: template argument ‘2’ does not match ‘#‘integer_cst’ not supported by dump_decl#’
SoftmaxGrad(grad, out, label, static_cast(param_.ignore_label));
^
src/operator/./softmax_output-inl.h:116:78: note: ‘mshadow::Tensor<mshadow::cpu, 2, double>’ is not derived from ‘mshadow::Tensor<mshadow::cpu, 3, Dtype>’
In file included from /home/dl/mxnet-RCNN/mxnet-master-precedenceguo-forked/mshadow/mshadow/tensor.h:816:0,
from include/mxnet/./base.h:13,
from include/mxnet/operator.h:18,
from src/operator/./softmax_output-inl.h:12,
from src/operator/softmax_output.cc:7:
/home/dl/mxnet-RCNN/mxnet-master-precedenceguo-forked/mshadow/mshadow/./tensor_cpu-inl.h:309:13: note: template void mshadow::SoftmaxGrad(mshadow::Tensor<mshadow::cpu, 3, Dtype>, const mshadow::Tensor<mshadow::cpu, 3, Dtype>&, const mshadow::Tensor<mshadow::cpu, 2, DType>&, const DType&)
inline void SoftmaxGrad(Tensor<cpu, 3, DType> dst,
^
/home/dl/mxnet-RCNN/mxnet-master-precedenceguo-forked/mshadow/mshadow/./tensor_cpu-inl.h:309:13: note: template argument deduction/substitution failed:
In file included from src/operator/softmax_output.cc:7:0:
src/operator/./softmax_output-inl.h:116:78: note: template argument ‘2’ does not match ‘#‘integer_cst’ not supported by dump_decl#’
SoftmaxGrad(grad, out, label, static_cast(param_.ignore_label));
^
src/operator/./softmax_output-inl.h:116:78: note: ‘mshadow::Tensor<mshadow::cpu, 2, double>’ is not derived from ‘mshadow::Tensor<mshadow::cpu, 3, Dtype>’
src/operator/./softmax_output-inl.h: In instantiation of ‘void mxnet::op::SoftmaxOutputOp<xpu, DType>::Backward(const mxnet::OpContext&, const std::vectormshadow::TBlob&, const std::vectormshadow::TBlob&, const std::vectormshadow::TBlob&, const std::vectormxnet::OpReqType&, const std::vectormshadow::TBlob&, const std::vectormshadow::TBlob&) [with xpu = mshadow::cpu; DType = float]’:
src/operator/softmax_output.cc:44:1: required from here
src/operator/./softmax_output-inl.h:116:78: error: no matching function for call to ‘SoftmaxGrad(mshadow::Tensor<mshadow::cpu, 2, float>&, mshadow::Tensor<mshadow::cpu, 2, float>&, mshadow::Tensor<mshadow::cpu, 1, float>&, float)’
src/operator/./softmax_output-inl.h:116:78: note: candidates are:
In file included from /home/dl/mxnet-RCNN/mxnet-master-precedenceguo-forked/mshadow/mshadow/tensor.h:816:0,
from include/mxnet/./base.h:13,
from include/mxnet/operator.h:18,
from src/operator/./softmax_output-inl.h:12,
from src/operator/softmax_output.cc:7:
/home/dl/mxnet-RCNN/mxnet-master-precedenceguo-forked/mshadow/mshadow/./tensor_cpu-inl.h:275:13: note: template void mshadow::SoftmaxGrad(mshadow::Tensor<mshadow::cpu, 2, DType>, const mshadow::Tensor<mshadow::cpu, 2, DType>&, const mshadow::Tensor<mshadow::cpu, 1, DType>&)
inline void SoftmaxGrad(Tensor<cpu, 2, DType> dst,
^
/home/dl/mxnet-RCNN/mxnet-master-precedenceguo-forked/mshadow/mshadow/./tensor_cpu-inl.h:275:13: note: template argument deduction/substitution failed:
In file included from src/operator/softmax_output.cc:7:0:
src/operator/./softmax_output-inl.h:116:78: note: candidate expects 3 arguments, 4 provided
SoftmaxGrad(grad, out, label, static_cast(param_.ignore_label));
^
In file included from include/mxnet/./base.h:13:0,
from include/mxnet/operator.h:18,
from src/operator/./softmax_output-inl.h:12,
from src/operator/softmax_output.cc:7:
/home/dl/mxnet-RCNN/mxnet-master-precedenceguo-forked/mshadow/mshadow/tensor.h:668:13: note: template void mshadow::SoftmaxGrad(mshadow::Tensor<mshadow::gpu, 2, DType>, const mshadow::Tensor<mshadow::gpu, 2, DType>&, const mshadow::Tensor<mshadow::gpu, 1, DType>&)
inline void SoftmaxGrad(Tensor<gpu, 2, DType> dst,
^
/home/dl/mxnet-RCNN/mxnet-master-precedenceguo-forked/mshadow/mshadow/tensor.h:668:13: note: template argument deduction/substitution failed:
In file included from src/operator/softmax_output.cc:7:0:
src/operator/./softmax_output-inl.h:116:78: note: mismatched types ‘mshadow::gpu’ and ‘mshadow::cpu’
SoftmaxGrad(grad, out, label, static_cast(param_.ignore_label));
^
src/operator/./softmax_output-inl.h:116:78: note: ‘mshadow::Tensor<mshadow::cpu, 2, float>’ is not derived from ‘mshadow::Tensor<mshadow::gpu, 2, DType>’
In file included from /home/dl/mxnet-RCNN/mxnet-master-precedenceguo-forked/mshadow/mshadow/tensor.h:816:0,
from include/mxnet/./base.h:13,
from include/mxnet/operator.h:18,
from src/operator/./softmax_output-inl.h:12,
from src/operator/softmax_output.cc:7:
/home/dl/mxnet-RCNN/mxnet-master-precedenceguo-forked/mshadow/mshadow/./tensor_cpu-inl.h:291:13: note: template void mshadow::SoftmaxGrad(mshadow::Tensor<mshadow::cpu, 3, Dtype>, const mshadow::Tensor<mshadow::cpu, 3, Dtype>&, const mshadow::Tensor<mshadow::cpu, 2, DType>&)
inline void SoftmaxGrad(Tensor<cpu, 3, DType> dst,
^
/home/dl/mxnet-RCNN/mxnet-master-precedenceguo-forked/mshadow/mshadow/./tensor_cpu-inl.h:291:13: note: template argument deduction/substitution failed:
In file included from src/operator/softmax_output.cc:7:0:
src/operator/./softmax_output-inl.h:116:78: note: template argument ‘2’ does not match ‘#‘integer_cst’ not supported by dump_decl#’
SoftmaxGrad(grad, out, label, static_cast(param_.ignore_label));
^
src/operator/./softmax_output-inl.h:116:78: note: ‘mshadow::Tensor<mshadow::cpu, 2, float>’ is not derived from ‘mshadow::Tensor<mshadow::cpu, 3, Dtype>’
In file included from /home/dl/mxnet-RCNN/mxnet-master-precedenceguo-forked/mshadow/mshadow/tensor.h:816:0,
from include/mxnet/./base.h:13,
from include/mxnet/operator.h:18,
from src/operator/./softmax_output-inl.h:12,
from src/operator/softmax_output.cc:7:
/home/dl/mxnet-RCNN/mxnet-master-precedenceguo-forked/mshadow/mshadow/./tensor_cpu-inl.h:309:13: note: template void mshadow::SoftmaxGrad(mshadow::Tensor<mshadow::cpu, 3, Dtype>, const mshadow::Tensor<mshadow::cpu, 3, Dtype>&, const mshadow::Tensor<mshadow::cpu, 2, DType>&, const DType&)
inline void SoftmaxGrad(Tensor<cpu, 3, DType> dst,
^
/home/dl/mxnet-RCNN/mxnet-master-precedenceguo-forked/mshadow/mshadow/./tensor_cpu-inl.h:309:13: note: template argument deduction/substitution failed:
In file included from src/operator/softmax_output.cc:7:0:
src/operator/./softmax_output-inl.h:116:78: note: template argument ‘2’ does not match ‘#‘integer_cst’ not supported by dump_decl#’
SoftmaxGrad(grad, out, label, static_cast(param_.ignore_label));
^
src/operator/./softmax_output-inl.h:116:78: note: ‘mshadow::Tensor<mshadow::cpu, 2, float>’ is not derived from ‘mshadow::Tensor<mshadow::cpu, 3, Dtype>’
In file included from src/operator/./softmax_output-inl.h:11:0,
from src/operator/softmax_output.cc:7:
/home/dl/mxnet-RCNN/mxnet-master-precedenceguo-forked/dmlc-core/include/dmlc/parameter.h:254:43: warning: ‘mxnet::op::make__SoftmaxOutputParamParamManager’ defined but not used [-Wunused-variable]
static ::dmlc::parameter::ParamManager &make ## PType ## ParamManager__ =
^
src/operator/softmax_output.cc:30:1: note: in expansion of macro ‘DMLC_REGISTER_PARAMETER’
DMLC_REGISTER_PARAMETER(SoftmaxOutputParam);
^
In file included from include/mxnet/operator.h:13:0,
from src/operator/./softmax_output-inl.h:12,
from src/operator/softmax_output.cc:7:
/home/dl/mxnet-RCNN/mxnet-master-precedenceguo-forked/dmlc-core/include/dmlc/registry.h:218:22: warning: ‘mxnet::op::make_OperatorPropertyReg_SoftmaxOutput’ defined but not used [-Wunused-variable]
static EntryType & make_ ## EntryTypeName ## _ ## Name ## __ =
^
include/mxnet/operator.h:538:3: note: in expansion of macro ‘DMLC_REGISTRY_REGISTER’
DMLC_REGISTRY_REGISTER(::mxnet::OperatorPropertyReg, OperatorPropertyReg, name)
^
src/operator/softmax_output.cc:32:1: note: in expansion of macro ‘MXNET_REGISTER_OP_PROPERTY’
MXNET_REGISTER_OP_PROPERTY(SoftmaxOutput, SoftmaxOutputProp)
^
/home/dl/mxnet-RCNN/mxnet-master-precedenceguo-forked/dmlc-core/include/dmlc/registry.h:218:22: warning: ‘mxnet::op::__make_OperatorPropertyReg_Softmax’ defined but not used [-Wunused-variable]
static EntryType & _make ## EntryTypeName ## _ ## Name ## __ =
^
include/mxnet/operator.h:538:3: note: in expansion of macro ‘DMLC_REGISTRY_REGISTER’
DMLC_REGISTRY_REGISTER(::mxnet::OperatorPropertyReg, OperatorPropertyReg, name)
^
src/operator/softmax_output.cc:38:1: note: in expansion of macro ‘MXNET_REGISTER_OP_PROPERTY’
MXNET_REGISTER_OP_PROPERTY(Softmax, DeprecatedSoftmaxProp)
^
make: *** [build/src/operator/softmax_output.o] Error 1

Are there any ideas on how to fix it? Thanks.

data label empty error

test_data = TestLoader(roidb, batch_size=1, shuffle=shuffle, has_rpn=True)
Hi

I have just tried your latest repo with your version of MXNet (2 commits behind master).
Have you changed the data loader completely, or could there be a bug here?
test_data.provide_label always returns an empty list.
I did some digging and got it working by replacing provide_label with provide_data,
and the program runs (I am not sure about the results yet; still testing).

Also, it seems that processing has slowed down with the newest MXNet (0.9).
With this repo's code I see a speed drop of around 20%.

Maybe my settings are wrong.

Thanks!

almost complete but..

testing 4950/4952
Writing aeroplane VOC results file
Writing bicycle VOC results file
Writing bird VOC results file
Writing boat VOC results file
Writing bottle VOC results file
Writing bus VOC results file
Writing car VOC results file
Writing cat VOC results file
Writing chair VOC results file
Writing cow VOC results file
Writing diningtable VOC results file
Writing dog VOC results file
Writing horse VOC results file
Writing motorbike VOC results file
Writing person VOC results file
Writing pottedplant VOC results file
Writing sheep VOC results file
Writing sofa VOC results file
Writing train VOC results file
Writing tvmonitor VOC results file
VOC07 metric? Y
Traceback (most recent call last):
File "D:\Anaconda2\lib\runpy.py", line 162, in _run_module_as_main
"main", fname, loader, pkg_name)
File "D:\Anaconda2\lib\runpy.py", line 72, in _run_code
exec code in run_globals
File "F:\mx-rcnn\tools\test_final.py", line 61, in
test_net(args.image_set, args.year, args.root_path, args.devkit_path, args.prefix, args.epoch, ctx, args.vis)
File "F:\mx-rcnn\tools\test_final.py", line 36, in test_net
pred_eval(detector=detector, test_data=test_data, imdb=voc, vis=vis)
File "rcnn\tester.py", line 84, in pred_eval
imdb.evaluate_detections(all_boxes)
File "helper\dataset\pascal_voc.py", line 235, in evaluate_detections
self.do_python_eval()
File "helper\dataset\pascal_voc.py", line 288, in do_python_eval
ovthresh=0.5, use_07_metric=use_07_metric)
File "helper\dataset\voc_eval.py", line 89, in voc_eval
recs[image_filename] = parse_voc_rec(annopath.format(image_filename))
File "helper\dataset\voc_eval.py", line 17, in parse_voc_rec
tree = ET.parse(filename)
File "D:\Anaconda2\lib\xml\etree\ElementTree.py", line 1182, in parse
tree.parse(source, parser)
File "D:\Anaconda2\lib\xml\etree\ElementTree.py", line 647, in parse
source = open(source, "rb")
IOError: [Errno 2] No such file or directory: '000001.xml'
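
The error shows that the annotation file was looked up as a bare '000001.xml' with no directory in front of it. A minimal sketch for checking annotation paths before evaluation is given below, assuming the standard VOC devkit layout (VOC<year>/Annotations/<index>.xml under the devkit directory). The helper names `annotation_path` and `missing_annotations` are hypothetical and not part of mx-rcnn.

```python
import os
import tempfile


def annotation_path(devkit_path, year, index):
    """Build the full path to one annotation file in the VOC layout."""
    return os.path.join(devkit_path, "VOC" + year, "Annotations", index + ".xml")


def missing_annotations(devkit_path, year, indices):
    """Return the image indices whose annotation file does not exist."""
    return [
        index
        for index in indices
        if not os.path.isfile(annotation_path(devkit_path, year, index))
    ]


# Self-contained demonstration with a temporary devkit directory.
root = tempfile.mkdtemp()
anno_dir = os.path.join(root, "VOC2007", "Annotations")
os.makedirs(anno_dir)
open(os.path.join(anno_dir, "000001.xml"), "w").close()

missing = missing_annotations(root, "2007", ["000001", "000002"])
print(missing)
```

Running such a check on the real devkit path would show whether the files are absent or whether the path template used by the evaluation code is dropping the directory part.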

problem python train_alternate.py --gpus 0.

Hi, when I ran python train_alternate.py --gpus 0,

it reported the error below:

libdc1394 error: Failed to initialize libdc1394
INFO:root:########## TRAIN RPN WITH IMAGENET INIT
voc_2007_trainval gt roidb loaded from /mxnet/example/rcnn/data/cache/voc_2007_trainval_gt_roidb.pkl
append flipped images to roidb
prepare roidb
providing maximum shape [('data', (1, 3, 1000, 1000))] [('label', (1, 34596)), ('bbox_target', (1, 36, 62, 62)), ('bbox_inside_weight', (1, 36, 62, 62)), ('bbox_outside_weight', (1, 36, 62, 62))]
Traceback (most recent call last):
File "train_alternate.py", line 104, in
args.frequent, args.kv_store, args.work_load_list)
File "train_alternate.py", line 26, in alternate_train
'model/rpn1', ctx, begin_epoch, rpn_epoch, frequent, kv_store, work_load_list)
File "/mxnet/example/rcnn/tools/train_rpn.py", line 54, in train_rpn
args, auxs = load_param(pretrained, epoch, convert=True)
ValueError: too many values to unpack

problems on converting the pretrained fast-rcnn models

When I used the caffe_converter tool provided in mxnet to convert the pretrained Fast R-CNN models from https://github.com/rbgirshick/fast-rcnn/blob/master/data/scripts/fetch_fast_rcnn_models.sh, an error occurred saying "google.protobuf.text_format.ParseError: 391:3 : Message type "caffe.LayerParameter" has no field named "roi_pooling_param"."

Is there any way to fix this problem?

Custom Training: RuntimeWarning: invalid value encountered in greater_equal

I am trying to modify some parameters in training. In some cases, NaN values appear in ws and hs in /rcnn/rpn/proposal.py and cause the following warnings:

/home/will/mx-rcnn/rcnn/rpn/proposal.py:167: RuntimeWarning: invalid value encountered in greater_equal
  keep = np.where((ws >= min_size) & (hs >= min_size))[0]
/home/will/mx-rcnn/helper/processing/bbox_transform.py:65: RuntimeWarning: invalid value encountered in subtract
  pred_boxes[:, 0::4] = pred_ctr_x - 0.5 * (pred_w - 1.0)
/home/will/mx-rcnn/helper/processing/bbox_transform.py:67: RuntimeWarning: invalid value encountered in subtract
  pred_boxes[:, 1::4] = pred_ctr_y - 0.5 * (pred_h - 1.0)
/home/will/mx-rcnn/helper/processing/bbox_transform.py:69: RuntimeWarning: invalid value encountered in add
  pred_boxes[:, 2::4] = pred_ctr_x + 0.5 * (pred_w - 1.0)
/home/will/mx-rcnn/helper/processing/bbox_transform.py:71: RuntimeWarning: invalid value encountered in add
  pred_boxes[:, 3::4] = pred_ctr_y + 0.5 * (pred_h - 1.0)

To make it robust, I am trying to remove the NaN values. Do you have any ideas?
@precedenceguo

    @staticmethod
    def _filter_boxes(boxes, min_size):
        """ Remove all boxes with any side smaller than min_size """
        ws = boxes[:, 2] - boxes[:, 0] + 1
        hs = boxes[:, 3] - boxes[:, 1] + 1
        np.set_printoptions(threshold='nan')
        # if np.isnan(ws):
            # print("np.isnan(ws) = ")
            # print(np.isnan(ws))
        # print("ws = ")
        # print(ws)
        # print("hs = ")
        # print(hs)
        # print("min_size = ")
        # print(min_size)
        keep = np.where((ws >= min_size) & (hs >= min_size))[0]
        return keep
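
A minimal sketch of a NaN-tolerant variant of the filter above is shown here in plain Python so that it runs without extra dependencies. It is an illustration under assumptions, not the repository's implementation: boxes are taken to be [x1, y1, x2, y2] lists, and any box with a non-finite side is dropped.

```python
import math


def filter_boxes(boxes, min_size):
    """Return indices of boxes whose width and height are finite and >= min_size."""
    keep = []
    for i, (x1, y1, x2, y2) in enumerate(boxes):
        w = x2 - x1 + 1
        h = y2 - y1 + 1
        # NaN or inf sides fail the isfinite test and are skipped.
        if math.isfinite(w) and math.isfinite(h) and w >= min_size and h >= min_size:
            keep.append(i)
    return keep


boxes = [
    [0.0, 0.0, 31.0, 31.0],            # valid 32 x 32 box
    [0.0, 0.0, float("nan"), 31.0],    # NaN width
    [5.0, 5.0, 6.0, 6.0],              # too small
]
kept = filter_boxes(boxes, min_size=16)
print(kept)
```

With NumPy arrays the same condition can be written as `np.isfinite(ws) & np.isfinite(hs) & (ws >= min_size) & (hs >= min_size)`. Note that NaN in predicted boxes often points to diverging training, so filtering may only hide the symptom.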

ResNet: Problem when using Resnet as base net

Hi everyone, I am trying to use ResNet-50 as the base network of Faster R-CNN via mx-rcnn. The ResNet-50 model (resnet-50-0000.params, resnet-50-symbol.json) was downloaded from the mxnet dmlc site. The original VGG16 model can be trained and tested, but when I run the following command, I get 'RuntimeError: conv2_1_bias is not presented'. Could anyone kindly help with this?

yf@CPDP-Titan:~/mx-rcnn$ python train_alternate.py --gpus 0 --pretrained model/resnet-50 --epoch 0 --rpn_epoch 2 --rcnn_epoch 2
model/resnet-50
INFO:root:########## TRAIN RPN WITH IMAGENET INIT
voc_2007_trainval gt roidb loaded from /home/yf/mx-rcnn/data/cache/voc_2007_trainval_gt_roidb.pkl
append flipped images to roidb
prepare roidb
providing maximum shape [('data', (1, 3, 1000, 1000))] [('label', (1, 34596)), ('bbox_target', (1, 36, 62, 62)), ('bbox_inside_weight', (1, 36, 62, 62)), ('bbox_outside_weight', (1, 36, 62, 62))]
fitting training data...
Traceback (most recent call last):
File "train_alternate.py", line 105, in
args.frequent, args.kv_store, args.work_load_list)
File "train_alternate.py", line 27, in alternate_train
'model/rpn1', ctx, begin_epoch, rpn_epoch, frequent, kv_store, work_load_list)
File "/home/yf/mx-rcnn/tools/train_rpn.py", line 102, in train_rpn
arg_params=args, aux_params=auxs, begin_epoch=begin_epoch, num_epoch=end_epoch)
File "/usr/local/lib/python2.7/dist-packages/mxnet-0.7.0-py2.7.egg/mxnet/module/base_module.py", line 337, in fit
allow_missing=allow_missing, force_init=force_init)
File "/home/yf/mx-rcnn/rcnn/module.py", line 95, in init_params
force_init=force_init)
File "/usr/local/lib/python2.7/dist-packages/mxnet-0.7.0-py2.7.egg/mxnet/module/module.py", line 190, in init_params
_impl(name, arr, arg_params)
File "/usr/local/lib/python2.7/dist-packages/mxnet-0.7.0-py2.7.egg/mxnet/module/module.py", line 183, in _impl
raise RuntimeError("%s is not presented" % name)
RuntimeError: conv2_1_bias is not presented
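
The missing name conv2_1_bias looks like a VGG-style layer name, which suggests (as an assumption) that the training script still builds the VGG symbol while loading ResNet weights. A quick way to see such mismatches is to compare the argument names the symbol expects with the names stored in the checkpoint. The sketch below does this with plain lists and dicts; the helper `missing_params` and all layer names in the example are illustrative. In MXNet the expected names would come from `symbol.list_arguments()`.

```python
def missing_params(arg_names, arg_params, input_names):
    """List arguments the symbol needs that the checkpoint does not provide.

    input_names are data/label inputs, which are fed at run time and
    therefore never stored in the checkpoint.
    """
    return [
        name
        for name in arg_names
        if name not in input_names and name not in arg_params
    ]


# Illustrative names only: a VGG-like symbol and a ResNet-like checkpoint.
vgg_args = ["data", "conv1_1_weight", "conv1_1_bias", "conv2_1_weight", "conv2_1_bias"]
resnet_params = {"conv0_weight": None, "bn0_gamma": None, "bn0_beta": None}

missing = missing_params(vgg_args, resnet_params, input_names=["data"])
print(missing)
```

If nearly every name is reported as missing, the symbol and the checkpoint belong to different architectures rather than differing in a single layer.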

What's next?

Now that we have support for ResNet and COCO, it is time to think about what's next in this project.

Full parallelization support that scales to 8 GPUs.
Standard detection I/O. Check the SSD plan.

reshape in symbol of get_vgg_rcnn()

Hi @precedenceguo,
why is there a reshape at the end in https://github.com/precedenceguo/mx-rcnn/blob/master/rcnn/symbol.py#L114-L115? Is it related to the data iterator or something else? Thanks.

training_alternate.py TypeError: _train_multi_device() got an unexpected keyword argument 'max_label_shape'

I got a TypeError while running the training script with default parameters. I built mxnet using your fork. I tested on both Ubuntu and OS X, with and without CUDA.

The running output is:

INFO:root:########## TRAIN RPN WITH IMAGENET INIT
wrote gt roidb to mx-rcnn/data/cache/voc_2007_trainval_gt_roidb.pkl
append flipped images to roidb
prepare roidb
providing maximum shape [('data', (1, 3, 1000, 1000))] [('label', (1, 35721)), ('bbox_target', (1, 36, 63, 63)), ('bbox_inside_weight', (1, 36, 63, 63)), ('bbox_outside_weight', (1, 36, 63, 63))]
Traceback (most recent call last):
File "/usr/local/Cellar/python/2.7.11/Frameworks/Python.framework/Versions/2.7/lib/python2.7/runpy.py", line 162, in _run_module_as_main
"main", fname, loader, pkg_name)
File "/usr/local/Cellar/python/2.7.11/Frameworks/Python.framework/Versions/2.7/lib/python2.7/runpy.py", line 72, in _run_code
exec code in run_globals
File "mx-rcnn/tools/train_alternate.py", line 209, in
args.kv_store, args.work_load_list)
File "mx-rcnn/tools/train_alternate.py", line 137, in alternate_train
'model/rpn1', ctx, begin_epoch, rpn_epoch, frequent, kv_store, work_load_list)
File "/mx-rcnn/tools/train_alternate.py", line 61, in train_rpn
solver.fit(train_data, frequent=frequent)
File "rcnn/solver.py", line 105, in fit
max_label_shape=max_label_shape)
TypeError: _train_multi_device() got an unexpected keyword argument 'max_label_shape'

demo.py error: decide_slices assert len(data_shapes) > 0

Hello, when I run the command "python demo.py --image ../tj_tj_test/ --prefix model/finalnew --epoch 0", I get the following error:
Traceback (most recent call last):
File "demo.py", line 143, in
main()
File "demo.py", line 122, in main
predictor = get_net(symbol, args.prefix, args.epoch, ctx)
File "demo.py", line 42, in get_net
arg_params=arg_params, aux_params=aux_params)
File "/media/D/mx-rcnn-master/rcnn/core/tester.py", line 21, in init
self._mod.bind(provide_data, provide_label, for_training=False)
File "/media/D/mx-rcnn-master/rcnn/core/module.py", line 137, in bind
force_rebind=False, shared_module=None)
File "/usr/local/lib/python2.7/dist-packages/mxnet-0.7.0-py2.7.egg/mxnet/module/module.py", line 284, in bind
grad_req=grad_req, input_types=input_types)
File "/usr/local/lib/python2.7/dist-packages/mxnet-0.7.0-py2.7.egg/mxnet/module/executor_group.py", line 187, in init
self.label_layouts = self.decide_slices(label_shapes)
File "/usr/local/lib/python2.7/dist-packages/mxnet-0.7.0-py2.7.egg/mxnet/module/executor_group.py", line 201, in decide_slices
assert len(data_shapes) > 0
AssertionError

RuntimeError: rpn_bbox_pred_bias is not presented

Hi,

I downloaded the vgg16 pretrained model using the script get_pretrained_model.sh . I wanted to test the repo on a dummy image, so I used this command -
python demo.py --prefix vgg16 --epoch 0 --image dog.jpg --gpu 0

I get the following error -

Traceback (most recent call last):
File "demo.py", line 139, in
main()
File "demo.py", line 118, in main
predictor = get_net(symbol, args.prefix, args.epoch, ctx)
File "demo.py", line 38, in get_net
arg_params=arg_params, aux_params=aux_params)
File "/home/tarun/daksh/mx-rcnn-master/rcnn/core/tester.py", line 22, in init
self._mod.init_params(arg_params=arg_params, aux_params=aux_params)
File "/home/tarun/daksh/mx-rcnn-master/rcnn/core/module.py", line 89, in init_params
force_init=force_init)
File "/usr/local/lib/python2.7/dist-packages/mxnet-0.9.1-py2.7-linux-x86_64.egg/mxnet/module/module.py", line 261, in init_params
_impl(name, arr, arg_params)
File "/usr/local/lib/python2.7/dist-packages/mxnet-0.9.1-py2.7-linux-x86_64.egg/mxnet/module/module.py", line 254, in _impl
raise RuntimeError("%s is not presented" % name)
RuntimeError: rpn_bbox_pred_bias is not presented

Any suggestions on what I am doing wrong?

Custom Training: Infer_shape error when training rcnn

I'm trying to train mx-rcnn on 4 GPUs using train_alternate.py. When it reaches the rcnn training step, it gives me the following error message. Could anyone help me with this? Thanks!

num_images 77
voc_2007_trainval gt roidb loaded from data/cache/voc_2007_trainval_gt_roidb.pkl
loading data/rpn_data/voc_2007_trainval_rpn.pkl
append flipped images to roidb
add bounding box regression targets
/home/mxnet_workspace/mx-rcnn/rcnn/processing/bbox_regression.py:94: RuntimeWarning: invalid value encountered in divide
roidb[im_i]['bbox_targets'][cls_indexes, 1:] /= stds[cls, :]
infer_shape error. Arguments:
rois: (4L, 64L, 5L)
label: (4L, 64L)
data: (4L, 3L, 600L, 674L)
bbox_target: (4L, 64L, 72L)
bbox_weight: (4L, 64L, 72L)
Traceback (most recent call last):
File "train_alternate.py", line 95, in
main()
File "train_alternate.py", line 92, in main
alternate_train(args, ctx, args.pretrained, args.epoch, args.rpn_epoch, args.rcnn_epoch)
File "train_alternate.py", line 31, in alternate_train
train_rcnn(args, ctx, pretrained, epoch, 'model/rcnn1', begin_epoch, rcnn_epoch)
File "/home/mxnet_workspace/mx-rcnn/rcnn/tools/train_rcnn.py", line 61, in train_rcnn
arg_shape, out_shape, aux_shape = sym.infer_shape(**data_shape_dict)
File "/usr/local/lib/python2.7/dist-packages/mxnet-0.8.0-py2.7.egg/mxnet/symbol.py", line 459, in infer_shape
return self._infer_shape_impl(False, *args, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/mxnet-0.8.0-py2.7.egg/mxnet/symbol.py", line 526, in _infer_shape_impl
ctypes.byref(complete)))
File "/usr/local/lib/python2.7/dist-packages/mxnet-0.8.0-py2.7.egg/mxnet/base.py", line 77, in check_call
raise MXNetError(py_str(_LIB.MXGetLastError()))
mxnet.base.MXNetError: InferShape Error in _minus2's rhs argument
Shape inconsistent, Provided=(1536,12), inferred shape=(256,12)
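
One way to read the two shapes in this message, offered as an assumption rather than a confirmed diagnosis: if bbox_target is flattened into rows of 4 * num_classes values, then the 72-wide target in the log corresponds to 18 classes, while the 12-wide shape expected by the symbol corresponds to 3. The arithmetic below reproduces both numbers from the log; `flattened_rows` is a hypothetical helper written only for this illustration.

```python
def flattened_rows(shape, row_width):
    """Number of rows when a tensor of `shape` is reshaped to (-1, row_width)."""
    total = 1
    for dim in shape:
        total *= dim
    assert total % row_width == 0, "shape is not divisible by row width"
    return total // row_width


# Shapes copied from the infer_shape log above.
bbox_target_shape = (4, 64, 72)
rois_shape = (4, 64, 5)
# Second dimension of both shapes in the error message.
symbol_row_width = 12

provided_rows = flattened_rows(bbox_target_shape, symbol_row_width)
inferred_rows = rois_shape[0] * rois_shape[1]

# Assumption: each class contributes 4 regression targets.
classes_in_data = bbox_target_shape[-1] // 4
classes_in_symbol = symbol_row_width // 4

print(provided_rows, inferred_rows)        # 1536 256
print(classes_in_data, classes_in_symbol)  # 18 3
```

If this reading holds, the class count used to build the symbol and the class count used to build the roidb targets disagree, so the number of classes configured for the custom dataset would be the first thing to compare.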

max_data_shapes/max_label_shape in MutableModule

Hello,
I found that MutableModule requires the max_data_shapes/max_label_shape parameters. However, in rcnn.module.forward(), if shape_changed is True the graph is bound again, so max_data_shapes/max_label_shape seem to play no part in MutableModule. Can we just use the first iteration's data shape instead of the maximum shape? Thanks.

MEM/HANG: rcnn demo.py error

When I run python demo.py --prefix final --epoch 0 --image myimage.jpg --gpu 0, an error occurs. Can you help me? Thank you!
I have installed cuDNN v5.0 for CUDA 7.5, on an NVIDIA 980 with Ubuntu 14.04.

/home/yx/mxnet/dmlc-core/include/dmlc/logging.h:235: [23:33:31] src/operator/./convolution-inl.h:299: Check failed: (param_.workspace) >= (required_size)
Minimum workspace size: 1228800000 Bytes
Given: 1073741824 Bytes
[23:33:31] /home/yx/mxnet/dmlc-core/include/dmlc/logging.h:235: [23:33:31] src/engine/./threaded_engine.h:306: [23:33:31] src/operator/./convolution-inl.h:299: Check failed: (param_.workspace) >= (required_size)
Minimum workspace size: 1228800000 Bytes
Given: 1073741824 Bytes
An fatal error occurred in asynchronous engine operation. If you do not know what caused this error, you can try set environment variable MXNET_ENGINE_TYPE to NaiveEngine and run with debugger (i.e. gdb). This will force all operations to be synchronous and backtrace will give you the series of calls that lead to this error. Remember to set MXNET_ENGINE_TYPE back to empty after debugging.
terminate called after throwing an instance of 'dmlc::Error'
what(): [23:33:31] src/engine/./threaded_engine.h:306: [23:33:31] src/operator/./convolution-inl.h:299: Check failed: (param_.workspace) >= (required_size)
Minimum workspace size: 1228800000 Bytes
Given: 1073741824 Bytes
An fatal error occurred in asynchronous engine operation. If you do not know what caused this error, you can try set environment variable MXNET_ENGINE_TYPE to NaiveEngine and run with debugger (i.e. gdb). This will force all operations to be synchronous and backtrace will give you the series of calls that lead to this error. Remember to set MXNET_ENGINE_TYPE back to empty after debugging.
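
The two byte counts in the failed check can be converted to megabytes to see how far apart they are. The sketch below assumes the convolution operator's `workspace` argument is expressed in MB, which would make 1073741824 bytes correspond to workspace=1024; `min_workspace_mb` is a hypothetical helper, not an MXNet function.

```python
import math

MB = 1 << 20  # bytes per megabyte (2**20)


def min_workspace_mb(required_bytes):
    """Smallest whole number of MB that covers `required_bytes`."""
    return int(math.ceil(required_bytes / float(MB)))


required = 1228800000  # "Minimum workspace size" from the log
given = 1073741824     # "Given" from the log

given_mb = given // MB
needed_mb = min_workspace_mb(required)
print(given_mb)   # 1024
print(needed_mb)  # 1172
```

Under that assumption, the convolution layers in the symbol would need a workspace of at least 1172 (for example workspace=2048), or the input image would need to be smaller so that less workspace is required.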

KeyError: 'fc8_bias' while training

Hi @precedenceguo

I got the error KeyError: 'fc8_bias' while trying your example. I checked train.prototxt and did not find an fc8 layer, so I would like to confirm which train.prototxt file you use. Can I remove the following two lines?

     del args['fc8_bias']
     del args['fc8_weight']

haria@haria-All-Series:~/mxnet/example/rcnn$ python train.py 
voc_2007_train gt roidb loaded from /home/haria/mxnet/example/rcnn/data/cache/voc_2007_train_gt_roidb.pkl
voc_2007_train ss roidb loaded from /home/haria/mxnet/example/rcnn/data/cache/voc_2007_train_ss_roidb.pkl
append flipped images to roidb
Premature end of JPEG file
Premature end of JPEG file
Premature end of JPEG file
Premature end of JPEG file
add bounding box regression targets
Traceback (most recent call last):
  File "train.py", line 38, in <module>
    args.prefix, ctx, args.begin_epoch, args.end_epoch, args.frequent)
  File "/home/haria/mxnet/example/rcnn/tools/train_net.py", line 39, in train_net
    del args['fc8_bias']
KeyError: 'fc8_bias'
terminate called recursively
terminate called after throwing an instance of 'dmlc::Error'
  what():  driver shutting down

thank you!
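
As an alternative to deleting the two lines, the removal can be made tolerant of checkpoints that have no fc8 layer. This is a minimal sketch on a plain dict; `drop_params` and the sample parameter values are hypothetical and stand in for the loaded argument dict.

```python
def drop_params(args, names):
    """Delete each name from `args` if present and return the names removed."""
    removed = []
    for name in names:
        if name in args:
            del args[name]
            removed.append(name)
    return removed


# Stand-in for a loaded parameter dict that has no fc8 layer.
args = {'conv1_1_weight': 'w0', 'fc7_bias': 'b7'}
removed = drop_params(args, ['fc8_weight', 'fc8_bias'])
print(removed)       # []
print(sorted(args))  # ['conv1_1_weight', 'fc7_bias']
```

With a guard like this, the same script works whether or not the pretrained model contains fc8_weight and fc8_bias.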

ValueError: Must specify all the arguments in args

F:\mx-rcnn>python -m tools.test_final --prefix model/rcnn2

voc_2007_test gt roidb loaded from F:\mx-rcnn\data\cache\voc_2007_test_gt_roidb.pkl
prepare roidb
testing 0/4952
Traceback (most recent call last):
File "D:\Anaconda2\lib\runpy.py", line 162, in _run_module_as_main
"main", fname, loader, pkg_name)
File "D:\Anaconda2\lib\runpy.py", line 72, in _run_code
exec code in run_globals
File "F:\mx-rcnn\tools\test_final.py", line 61, in
test_net(args.image_set, args.year, args.root_path, args.devkit_path, args.prefix, args.epoch, ctx, args.vis)
File "F:\mx-rcnn\tools\test_final.py", line 36, in test_net
pred_eval(detector=detector, test_data=test_data, imdb=voc, vis=vis)
File "rcnn\tester.py", line 39, in pred_eval
scores, boxes = detector.im_detect(databatch.data['data'], im_info=databatch.data['im_info'])
File "rcnn\detector.py", line 57, in im_detect
grad_req='null', aux_states=self.aux_params)
File "D:\Anaconda2\lib\site-packages\mxnet-0.7.0-py2.7.egg\mxnet\symbol.py", line 783, in bind
args_handle, args = self._get_ndarray_inputs('args', args, listed_arguments, False)
File "D:\Anaconda2\lib\site-packages\mxnet-0.7.0-py2.7.egg\mxnet\symbol.py", line 625, in _get_ndarray_inputs
raise ValueError('Must specify all the arguments in %s' % arg_key)
ValueError: Must specify all the arguments in args

missing 'ratios' ops in proposal layer

Hi Jian Guo, thanks for your great work! I tried your code and hit a problem in the rpn_test stage. I checked the code and found that the 'ratios' parameter is missing from the proposal layer. Deleting "ratios=(0.5, 1, 2)" in symbol.py line 216 fixes it.

Provide detailed results

It would be helpful if this repo can provide detailed results. E.g. faster_rcnn/README.md

Error when train end2end

When I run a command like python train_end2end.py --gpu 0,1, an error like this occurs:
('Error in proposal.infer_shape: ', 'Only single item batches are supported')
I think it should support multi-GPU. What should I do?

memory leak?

Hi, I tried to use this great work to train on ImageNet, but in the rcnn step memory usage grows continuously, and after ~14k iterations (batch size 2) the program is killed by the system because all memory (128 GB) is used up.

I noticed this phenomenon in the rpn step too, but the rcnn step loads way more data to memory than rpn step so rcnn is killed and rpn is not.

Has anyone met this problem?

I also found that memory usage does not grow if I keep training on the same single image.

final-0000.params

demo.py uses final-0000.params, but how was it produced? Which dataset and training details were used?

Check failed: param_.op_type != "" ( vs. ) Custom operator type missing

Hi, Guo!
I ran the script vgg_voc07.sh and encountered the following error:

INFO:root:########## GENERATE RPN DETECTION
{'ANCHOR_RATIOS': [0.5, 1, 2],
'ANCHOR_SCALES': [8, 16, 32],
'FIXED_PARAMS': ['conv1', 'conv2'],
'FIXED_PARAMS_SHARED': ['conv1', 'conv2', 'conv3', 'conv4', 'conv5'],
'IMAGE_STRIDE': 0,
'NUM_ANCHORS': 9,
'NUM_CLASSES': 21,
'PIXEL_MEANS': array([ 103.939, 116.779, 123.68 ]),
'RCNN_FEAT_STRIDE': 16,
'RPN_FEAT_STRIDE': 16,
'SCALES': [(600, 1000)],
'TEST': {'BATCH_IMAGES': 1,
'CXX_PROPOSAL': True,
'HAS_RPN': True,
'NMS': 0.3,
'PROPOSAL_MIN_SIZE': 16,
'PROPOSAL_NMS_THRESH': 0.7,
'PROPOSAL_POST_NMS_TOP_N': 2000,
'PROPOSAL_PRE_NMS_TOP_N': 20000,
'RPN_MIN_SIZE': 16,
'RPN_NMS_THRESH': 0.7,
'RPN_POST_NMS_TOP_N': 300,
'RPN_PRE_NMS_TOP_N': 6000},
'TRAIN': {'ASPECR_GROUPING': True,
'BATCH_IMAGES': 1,
'BATCH_ROIS': 128,
'BBOX_MEANS': [0.0, 0.0, 0.0, 0.0],
'BBOX_NORMALIZATION_PRECOMPUTED': False,
'BBOX_REGRESSION_THRESH': 0.5,
'BBOX_STDS': [0.1, 0.1, 0.2, 0.2],
'BBOX_WEIGHTS': array([ 1., 1., 1., 1.]),
'BG_THRESH_HI': 0.5,
'BG_THRESH_LO': 0.0,
'CXX_PROPOSAL': True,
'END2END': False,
'FG_FRACTION': 0.25,
'FG_THRESH': 0.5,
'RPN_BATCH_SIZE': 256,
'RPN_BBOX_WEIGHTS': [1.0, 1.0, 1.0, 1.0],
'RPN_CLOBBER_POSITIVES': False,
'RPN_FG_FRACTION': 0.5,
'RPN_MIN_SIZE': 16,
'RPN_NEGATIVE_OVERLAP': 0.3,
'RPN_NMS_THRESH': 0.7,
'RPN_POSITIVE_OVERLAP': 0.7,
'RPN_POSITIVE_WEIGHT': -1.0,
'RPN_POST_NMS_TOP_N': 2000,
'RPN_PRE_NMS_TOP_N': 12000}}
[22:44:04] /home/weiliu/mxnet/dmlc-core/include/dmlc/./logging.h:300: [22:44:04] src/operator/./custom-inl.h:126: Check failed: param_.op_type != "" ( vs. ) Custom operator type missing

Stack trace returned 25 entries:
[bt] (0) /home/weiliu/mxnet/python/mxnet/../../lib/libmxnet.so(_ZN4dmlc15LogMessageFatalD1Ev+0x3c) [0x7fd8720a492c]
[bt] (1) /home/weiliu/mxnet/python/mxnet/../../lib/libmxnet.so(_ZN5mxnet2op12CustomOpProp4InitERKSt6vectorISt4pairISsSsESaIS4_EE+0xaaf) [0x7fd872a4473f]
[bt] (2) /home/weiliu/mxnet/python/mxnet/../../lib/libmxnet.so(_ZN5mxnet2op12ParsedOpProp4InitERKN4nnvm9NodeAttrsE+0xd3) [0x7fd8729aede3]
[bt] (3) /home/weiliu/mxnet/python/mxnet/../../lib/libmxnet.so(+0xf61f0a) [0x7fd8729a4f0a]
[bt] (4) /home/weiliu/mxnet/python/mxnet/../../lib/libmxnet.so(_ZN4nnvm6Symbol13CreateFunctorEPKNS_2OpESt13unordered_mapISsSsSt4hashISsESt8equal_toISsESaISt4pairIKSsSsEEE+0x98) [0x7fd87366f208]
[bt] (5) /home/weiliu/mxnet/python/mxnet/../../lib/libmxnet.so(MXSymbolCreateAtomicSymbol+0x6a9) [0x7fd8728a91f9]
[bt] (6) /usr/lib/x86_64-linux-gnu/libffi.so.6(ffi_call_unix64+0x4c) [0x7fd8b37c3adc]
[bt] (7) /usr/lib/x86_64-linux-gnu/libffi.so.6(ffi_call+0x1fc) [0x7fd8b37c340c]
[bt] (8) /usr/lib/python2.7/lib-dynload/_ctypes.x86_64-linux-gnu.so(_ctypes_callproc+0x48e) [0x7fd8b39da5fe]
[bt] (9) /usr/lib/python2.7/lib-dynload/_ctypes.x86_64-linux-gnu.so(+0x15f9e) [0x7fd8b39dbf9e]
[bt] (10) python(PyEval_EvalFrameEx+0x98d) [0x5244dd]
[bt] (11) python(PyEval_EvalCodeEx+0x2b1) [0x555551]
[bt] (12) python(PyEval_EvalFrameEx+0x1a10) [0x525560]
[bt] (13) python(PyEval_EvalCodeEx+0x2b1) [0x555551]
[bt] (14) python(PyEval_EvalFrameEx+0x7e8) [0x524338]
[bt] (15) python(PyEval_EvalCodeEx+0x2b1) [0x555551]
[bt] (16) python(PyEval_EvalFrameEx+0x1a10) [0x525560]
[bt] (17) python(PyEval_EvalFrameEx+0xc9a) [0x5247ea]
[bt] (18) python(PyEval_EvalFrameEx+0xc9a) [0x5247ea]
[bt] (19) python() [0x567d14]
[bt] (20) python(PyRun_FileExFlags+0x92) [0x465bf4]
[bt] (21) python(PyRun_SimpleFileExFlags+0x2ee) [0x46612d]
[bt] (22) python(Py_Main+0xb5e) [0x466d92]
[bt] (23) /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf5) [0x7fd8b4d91f45]
[bt] (24) python() [0x577c2e]

Traceback (most recent call last):
File "train_alternate.py", line 110, in
main()
File "train_alternate.py", line 107, in main
args.rcnn_epoch, args.rcnn_lr, args.rcnn_lr_step)
File "train_alternate.py", line 36, in alternate_train
vis=False, shuffle=False, thresh=0)
File "/home/weiliu/projects/traffSgn/mx-rcnn/rcnn/tools/test_rpn.py", line 23, in test_rpn
sym = eval('get_' + network + '_rpn_test')(num_anchors=config.NUM_ANCHORS)
File "/home/weiliu/projects/traffSgn/mx-rcnn/rcnn/symbol/symbol_vgg.py", line 233, in get_vgg_rpn_test
threshold=config.TEST.PROPOSAL_NMS_THRESH, rpn_min_size=config.TEST.PROPOSAL_MIN_SIZE)
File "/home/weiliu/mxnet/python/mxnet/_ctypes/symbol.py", line 181, in creator
ctypes.byref(sym_handle)))
File "/home/weiliu/mxnet/python/mxnet/base.py", line 75, in check_call
raise MXNetError(py_str(_LIB.MXGetLastError()))
mxnet.base.MXNetError: [22:44:04] src/operator/./custom-inl.h:126: Check failed: param_.op_type != "" ( vs. ) Custom operator type missing

Could you help me, please? Thank you in advance!

Different scalar values for smooth_l1 loss

@precedenceguo Can you explain why we need different scalar values for the smooth_l1 loss, e.g. scalar=1.0 and scalar=3.0?
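
For reference when reading this question, here is a plain-Python sketch of the piecewise function that MXNet documents for smooth_l1 with scalar sigma: 0.5 * (sigma * x)^2 when |x| < 1 / sigma^2, and |x| - 0.5 / sigma^2 otherwise. The function below is an illustration, not the operator itself, and it does not by itself explain why the repository picks each value.

```python
def smooth_l1(x, sigma):
    """Smooth L1 with scalar sigma, following the documented piecewise form."""
    threshold = 1.0 / (sigma * sigma)
    if abs(x) < threshold:
        return 0.5 * (sigma * x) ** 2
    return abs(x) - 0.5 * threshold


# The same residuals are penalized differently under the two scalars.
for x in (0.05, 0.5):
    print(x, smooth_l1(x, 1.0), smooth_l1(x, 3.0))
```

A larger scalar narrows the quadratic region (1/9 for sigma=3 versus 1 for sigma=1), so the loss behaves like plain L1 for all but very small residuals.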

ResNet: overflow problem in bbox

Recently, I replaced the VGG16 pretrained model with VGG19, VGG_CNN_S, and VGG_CNN_M_1024, and it worked. However, when I replace the VGG16 pretrained model with ResNet, it gives me an overflow warning as follows:

/home/sp/mx-rcnn/helper/processing/bbox_transform.py:61: RuntimeWarning: overflow encountered in exp pred_w = np.exp(dw) * widths[:, np.newaxis]

Also, in the RPN training stage, the smoothL1Loss barely decreases.

The symbol is defined like this:

def get_resnet_conv(data):
    """
    shared convolutional layers
    :param data: Symbol
    :return: Symbol
    """
    filter_list = [64, 64, 128, 256, 512]

    data = mx.sym.BatchNorm(data=data, fix_gamma=True, eps=2e-5, momentum=0.9, name='bn_data')

    body = mx.sym.Convolution(data=data, num_filter=filter_list[0], kernel=(7, 7), stride=(2,2), pad=(3, 3),
                              no_bias=True, name="conv0", workspace=512)
    body = mx.sym.BatchNorm(data=body, fix_gamma=False, eps=2e-5, momentum=0.9, name='bn0')
    body = mx.sym.Activation(data=body, act_type='relu', name='relu0')
    body = mx.symbol.Pooling(data=body, kernel=(3, 3), stride=(2,2), pad=(1,1), pool_type='max',name='pooling0')

    name1_1 = 'stage%d_unit%d' % (1, 1)

    bn1 = mx.sym.BatchNorm(data=body, fix_gamma=False, momentum=0.9, eps=2e-5, name=name1_1 + '_bn1')
    act1 = mx.sym.Activation(data=bn1, act_type='relu', name=name1_1 + '_relu1')
    conv1 = mx.sym.Convolution(data=act1, num_filter=filter_list[1], kernel=(3, 3), stride=(1, 1), pad=(1, 1),
                               no_bias=True, workspace=512, name=name1_1 + '_conv1')
    bn2 = mx.sym.BatchNorm(data=conv1, fix_gamma=False, momentum=0.9, eps=2e-5, name=name1_1 + '_bn2')
    act2 = mx.sym.Activation(data=bn2, act_type='relu', name=name1_1 + '_relu2')
    conv2 = mx.sym.Convolution(data=act2, num_filter=filter_list[1], kernel=(3, 3), stride=(1, 1), pad=(1, 1),
                               no_bias=True, workspace=512, name=name1_1 + '_conv2')
    shortcut1_1 = mx.sym.Convolution(data=act1, num_filter=filter_list[1], kernel=(1, 1), stride=(1, 1), no_bias=True,
                                     workspace=512, name=name1_1 + '_sc')
    body1_1 = conv2 + shortcut1_1

    name1_2 = 'stage%d_unit%d' % (1, 2)

    bn1 = mx.sym.BatchNorm(data=body1_1, fix_gamma=False, momentum=0.9, eps=2e-5, name=name1_2 + '_bn1')
    act1 = mx.sym.Activation(data=bn1, act_type='relu', name=name1_2 + '_relu1')
    conv1 = mx.sym.Convolution(data=act1, num_filter=filter_list[1], kernel=(3, 3), stride=(1,1), pad=(1, 1),
                               no_bias=True, workspace=512, name=name1_2 + '_conv1')
    bn2 = mx.sym.BatchNorm(data=conv1, fix_gamma=False, momentum=0.9, eps=2e-5, name=name1_2 + '_bn2')
    act2 = mx.sym.Activation(data=bn2, act_type='relu', name=name1_2 + '_relu2')
    conv2 = mx.sym.Convolution(data=act2, num_filter=filter_list[1], kernel=(3, 3), stride=(1, 1), pad=(1, 1),
                               no_bias=True, workspace=512, name=name1_2 + '_conv2')
    shortcut1_2 = body1_1
    body1_2 = conv2 + shortcut1_2

    name2_1='stage%d_unit%d' % (2, 1)

    bn1 = mx.sym.BatchNorm(data=body1_2, fix_gamma=False, momentum=0.9, eps=2e-5, name=name2_1 + '_bn1')
    act1 = mx.sym.Activation(data=bn1, act_type='relu', name=name2_1 + '_relu1')
    conv1 = mx.sym.Convolution(data=act1, num_filter=filter_list[2], kernel=(3, 3), stride=(2, 2), pad=(1, 1),
                               no_bias=True, workspace=512, name=name2_1 + '_conv1')
    bn2 = mx.sym.BatchNorm(data=conv1, fix_gamma=False, momentum=0.9, eps=2e-5, name=name2_1 + '_bn2')
    act2 = mx.sym.Activation(data=bn2, act_type='relu', name=name2_1 + '_relu2')
    conv2 = mx.sym.Convolution(data=act2, num_filter=filter_list[2], kernel=(3, 3), stride=(1, 1), pad=(1, 1),
                               no_bias=True, workspace=512, name=name2_1 + '_conv2')
    shortcut2_1 = mx.sym.Convolution(data=act1, num_filter=filter_list[2], kernel=(1, 1), stride=(2, 2), no_bias=True,
                                  workspace=512, name=name2_1 + '_sc')
    body2_1 = conv2 + shortcut2_1

    name2_2 = 'stage%d_unit%d' % (2, 2)

    bn1 = mx.sym.BatchNorm(data=body2_1, fix_gamma=False, momentum=0.9, eps=2e-5, name=name2_2 + '_bn1')
    act1 = mx.sym.Activation(data=bn1, act_type='relu', name=name2_2 + '_relu1')
    conv1 = mx.sym.Convolution(data=act1, num_filter=filter_list[2], kernel=(3, 3), stride=(1, 1), pad=(1, 1),
                               no_bias=True, workspace=512, name=name2_2 + '_conv1')
    bn2 = mx.sym.BatchNorm(data=conv1, fix_gamma=False, momentum=0.9, eps=2e-5, name=name2_2 + '_bn2')
    act2 = mx.sym.Activation(data=bn2, act_type='relu', name=name2_2 + '_relu2')
    conv2 = mx.sym.Convolution(data=act2, num_filter=filter_list[2], kernel=(3, 3), stride=(1, 1), pad=(1, 1),
                               no_bias=True, workspace=512, name=name2_2 + '_conv2')
    shortcut2_2 = body2_1
    body2_2 = conv2 + shortcut2_2

    name3_1='stage%d_unit%d' % (3, 1)

    bn1 = mx.sym.BatchNorm(data=body2_2, fix_gamma=False, momentum=0.9, eps=2e-5, name=name3_1 + '_bn1')
    act1 = mx.sym.Activation(data=bn1, act_type='relu', name=name3_1 + '_relu1')
    conv1 = mx.sym.Convolution(data=act1, num_filter=filter_list[3], kernel=(3, 3), stride=(2, 2), pad=(1, 1),
                               no_bias=True, workspace=512, name=name3_1 + '_conv1')
    bn2 = mx.sym.BatchNorm(data=conv1, fix_gamma=False, momentum=0.9, eps=2e-5, name=name3_1 + '_bn2')
    act2 = mx.sym.Activation(data=bn2, act_type='relu', name=name3_1 + '_relu2')
    conv2 = mx.sym.Convolution(data=act2, num_filter=filter_list[3], kernel=(3, 3), stride=(1, 1), pad=(1, 1),
                               no_bias=True, workspace=512, name=name3_1 + '_conv2')
    shortcut3_1 = mx.sym.Convolution(data=act1, num_filter=filter_list[3], kernel=(1, 1), stride=(2, 2), no_bias=True,
                                  workspace=512, name=name3_1 + '_sc')
    body3_1 = conv2 + shortcut3_1

    name3_2 = 'stage%d_unit%d' % (3, 2)

    bn1 = mx.sym.BatchNorm(data=body3_1, fix_gamma=False, momentum=0.9, eps=2e-5, name=name3_2 + '_bn1')
    act1 = mx.sym.Activation(data=bn1, act_type='relu', name=name3_2 + '_relu1')
    conv1 = mx.sym.Convolution(data=act1, num_filter=filter_list[3], kernel=(3, 3), stride=(1, 1), pad=(1, 1),
                               no_bias=True, workspace=512, name=name3_2 + '_conv1')
    bn2 = mx.sym.BatchNorm(data=conv1, fix_gamma=False, momentum=0.9, eps=2e-5, name=name3_2 + '_bn2')
    act2 = mx.sym.Activation(data=bn2, act_type='relu', name=name3_2 + '_relu2')
    conv2 = mx.sym.Convolution(data=act2, num_filter=filter_list[3], kernel=(3, 3), stride=(1, 1), pad=(1, 1),
                               no_bias=True, workspace=512, name=name3_2 + '_conv2')
    shortcut3_2 = body3_1
    body3_2 = conv2 + shortcut3_2

    name4_1 = 'stage%d_unit%d' % (4, 1)
    body = residual_unit(body3_2, filter_list[4], (2, 2), False,
                         name=name4_1, bottle_neck=False, workspace=512)
    name4_2 = 'stage%d_unit%d' % (4, 2)
    body = residual_unit(body, filter_list[4], (1,1), True, name=name4_2,
                         bottle_neck=False, workspace=512)

    bn1 = mx.sym.BatchNorm(data=body, fix_gamma=False, eps=2e-5, momentum=0.9, name='bn1')
    resnet_conv4 = mx.sym.Activation(data=bn1, act_type='relu', name='relu1')

    return resnet_conv4
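
The overflow warning quoted in this issue is raised when very large regression outputs reach np.exp(dw). A mitigation used in some other detection codebases, and not something this repository is confirmed to do, is to clamp the deltas before exponentiating. The sketch below uses scalars and the standard library only; `MAX_DELTA` and `decode_size` are illustrative names, and the cap value is an assumption.

```python
import math

# Assumed cap: a delta above this would scale a box by more than 1000/16.
MAX_DELTA = math.log(1000.0 / 16.0)


def decode_size(delta, size, max_delta=MAX_DELTA):
    """Return exp(delta) * size, with delta clamped so exp cannot overflow."""
    return math.exp(min(delta, max_delta)) * size


print(decode_size(0.0, 16.0))     # 16.0
print(decode_size(1000.0, 16.0))  # clamped instead of overflowing
```

Clamping only hides the symptom. Deltas this large usually mean the regression branch is diverging, which would be consistent with the loss barely decreasing during RPN training.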

__init__() got an unexpected keyword argument 'fixed_param_names'

Hi, when trying to train I encountered this error

File "train_alternate.py", line 26, in alternate_train
'model/rpn1', ctx, begin_epoch, rpn_epoch, frequent, kv_store, work_load_list)
File "/some/path/mx-rcnn/tools/train_rpn.py", line 101, in train_rpn
arg_params=args, aux_params=auxs, begin_epoch=begin_epoch, num_epoch=end_epoch)
File "/some/path/mxnet/python/mxnet/module/base_module.py", line 333, in fit
for_training=True, force_rebind=force_rebind)
File "/some/path/mx-rcnn/rcnn/module.py", line 135, in bind
fixed_param_names=self._fixed_param_names)
TypeError: __init__() got an unexpected keyword argument 'fixed_param_names'

When I checked the code, mx-rcnn/rcnn/module.py line 135 is
module = Module(self._symbol, self._data_names, self._label_names, logger=self.logger, context=self._context, work_load_list=self._work_load_list, fixed_param_names=self._fixed_param_names)

and "Module" is imported with
from mxnet.module.module import Module

There is no "fixed_param_names" keyword in the __init__ of this "Module", but there is a "fixed_param_prefix" keyword in the __init__ of "MutableModule". So maybe line 135 should be changed like this?
module = MutableModule(..., fixed_param_prefix=self._fixed_param_names)
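
One possible explanation, stated as an assumption, is that the installed MXNet predates the keyword that the repository passes to Module. A quick way to check what a constructor accepts is to inspect its signature. The sketch below uses Python 3's inspect.signature, and the two classes are stand-ins written for this example, not MXNet classes.

```python
import inspect


def accepts_keyword(cls, keyword):
    """True if cls.__init__ declares `keyword` or takes **kwargs."""
    params = inspect.signature(cls.__init__).parameters
    if keyword in params:
        return True
    return any(p.kind is inspect.Parameter.VAR_KEYWORD
               for p in params.values())


class OldModule(object):
    """Stand-in for a Module whose constructor lacks the keyword."""
    def __init__(self, symbol, data_names=('data',)):
        self.symbol = symbol


class NewModule(object):
    """Stand-in for a Module whose constructor has the keyword."""
    def __init__(self, symbol, data_names=('data',), fixed_param_names=None):
        self.symbol = symbol


print(accepts_keyword(OldModule, 'fixed_param_names'))  # False
print(accepts_keyword(NewModule, 'fixed_param_names'))  # True
```

Running the same check against the Module class imported from the installed MXNet would show whether the keyword is available in that build.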

a question about rcnn training with batch_size > 1

Hello @precedenceguo,
mx-rcnn is great work!

I just found that mx-rcnn supports training with batch_size > 1 in train_rcnn(). To handle instances of different sizes, it takes the maximum size of each dimension within the mini-batch, and any instance shorter than that maximum along a dimension is padded with 0 at the end:

tensor_list[ind] = np.lib.pad(tensor, pad_shape, 'constant', constant_values=pad)

Will this influence the training accuracy? During training, it may change the input data of ROI pooling: if the padded area is too large, the last region of the ROI pooling output will be meaningless. During testing, no padding is needed before ROI pooling, so the behavior may differ.
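
The padding scheme described in this question can be sketched with NumPy as follows. This is an illustration of the idea and not the repository's tensor_vstack; the name `pad_and_vstack` and the example shapes are assumptions.

```python
import numpy as np


def pad_and_vstack(tensors, pad=0):
    """Pad each (1, ...) tensor at the end of every axis to the batch
    maximum, then concatenate along axis 0."""
    ndim = tensors[0].ndim
    max_shape = [max(t.shape[d] for t in tensors) for d in range(ndim)]
    padded = []
    for t in tensors:
        # Axis 0 is the batch axis and is never padded.
        pad_width = [(0, 0)] + [(0, max_shape[d] - t.shape[d])
                                for d in range(1, ndim)]
        padded.append(np.pad(t, pad_width, 'constant', constant_values=pad))
    return np.vstack(padded)


a = np.ones((1, 3, 2, 4))  # a wide image
b = np.ones((1, 3, 3, 2))  # a tall image
batch = pad_and_vstack([a, b])
print(batch.shape)  # (2, 3, 3, 4)
```

The padded region sits at the bottom and right of each image, so ROI coordinates computed on the original image remain valid; only feature values near the padded border can be affected by the zeros.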

training with multiple gpu?

Does MXNet support training with multiple GPUs?
(Note that the original py-faster-rcnn based on Caffe cannot; see rbgirshick/py-faster-rcnn#107.)

Hi, when I run demo.py on CPU, there is an error: mxnet.base.MXNetError: [00:42:42] src/operator/./convolution-inl.h:299: Check failed: (param_.workspace) >= (required_size) Minimum workspace size: 1228800000 Bytes Given: 1073741824 Bytes

Is it possible to run on CPU?

training error with latest version (1/20/17)

After checking out the latest version (which removed the dependency on a customized version of MXNet), I got the following error at the beginning of training:

providing maximum shape [('data', (1, 3, 600, 1000)), ('gt_boxes', (1, 100, 5))] [('label', (1, 20646)), ('bbox_target', (1, 36, 37, 62)), ('bbox_weight', (1, 36, 37, 62))]
[20:55:24] /home/dsu/mxnet/dmlc-core/include/dmlc/logging.h:235: [20:55:24] src/symbol/symbol.cc:155: Symbol.InferShapeKeyword argument name bbox_target not found.

Training works fine with the previous version. I saw the following changes in train_end2end.py that caused the error:

max_data_shape, max_label_shape = train_data.infer_shape(max_data_shape)
max_data_shape.append(('gt_boxes', (input_batch_size, 100, 5)))
print 'providing maximum shape', max_data_shape, max_label_shape

# infer shape
data_shape_dict = dict(train_data.provide_data + train_data.provide_label)
arg_shape, out_shape, aux_shape = sym.infer_shape(**data_shape_dict)  <-- error here

final-0000.params

Where can I download final-0000.params?

Thanks

width key in roidb and tensor_vstack arguments

Hi, when I tried to train for the first time, I came across a few small issues.

There is no 'width' or 'height' key in roidb, which raises a KeyError in
mx-rcnn/rcnn/loader.py", line 205, in reset
widths = np.array([r['width'] for r in self.roidb])
tensor_vstack takes only one argument in image_processing.py, so passing a second argument raises another error in
mx-rcnn/rcnn/loader.py", line 289, in get_batch
all_label['label'] = tensor_vstack([batch['label'] for batch in new_label_list], pad=-1)

train faster rcnn

Hello, when I run "python -m tools.train_alternate", the following error occurs:

INFO:root:########## train rpn with imagenet init
voc_2007_trainval gt roidb loaded from
.........
File "/home/mxnet_0724/example/rcnn/tools/train_alternate.py", line 52, in train_rpn
args['rpn_conv_3x3_weight'] = mx.random.normal(mean=0, stdvar=0.01, shape=arg_shape_dict['rpn_conv_3x3_weight'])
TypeError: normal() got an unexpected keyword argument 'mean'

I found that mxnet.random.normal is defined in python/mxnet/random.py, but its argument is named "loc", not "mean". How can you run it without changing the name of this argument?

Failed to test demo.py

Hi, there are some errors when I try to test demo.py of rcnn in mxnet/example by python demo.py --prefix final --epoch 0 --image myimage.jpg --gpu 0. The errors are as follows,

Traceback (most recent call last):
  File "_ctypes/callbacks.c", line 234, in 'calling callback function'
  File "/net/wanggu/anaconda3/lib/python3.5/site-packages/mxnet-0.7.0-py3.5.egg/mxnet/operator.py", line 600, in creator
    op_prop = prop_cls(**kwargs)
TypeError: __init__() keywords must be strings
Segmentation fault (core dumped)

How can I solve it? Thanks.
This may be a problem related to python3 compatibility.

solver.fit error in rcnn\solver.py

I followed the steps to train the model as you suggested.
I only used this command:
python -m tools.train_alternate
with default command-line parameters and got an error:

Traceback (most recent call last):
File "D:\Anaconda2\lib\runpy.py", line 162, in _run_module_as_main
"main", fname, loader, pkg_name)
File "D:\Anaconda2\lib\runpy.py", line 72, in _run_code
exec code in run_globals
File "D:\mx-rcnn-master\tools\train_alternate.py", line 210, in
args.kv_store, args.work_load_list)
File "D:\mx-rcnn-master\tools\train_alternate.py", line 138, in alternate_train
'model/rpn1', ctx, begin_epoch, rpn_epoch, frequent, kv_store, work_load_list)
File "D:\mx-rcnn-master\tools\train_alternate.py", line 62, in train_rpn
solver.fit(train_data, frequent=frequent)
File "rcnn\solver.py", line 106, in fit
max_label_shape=max_label_shape
TypeError: _train_multi_device() got an unexpected keyword argument 'mutable_data_shape'

Then I found that the last 3 parameters passed to _train_multi_device are unexpected keywords, as shown above. But when I commented out these 3 parameters, I hit another error about data.shape.

Is this a bug, or is it due to something I did wrong?

How much memory is required for mx-rcnn?

Hi, I was running on CPU, but 70 GB of memory is not enough.
Modified code:
train_alternate.py
#ctx = [mx.gpu(int(i)) for i in args.gpu_ids.split(',')]
ctx = [mx.cpu()]
Do we need to modify anything else?
Thank you very much.

Error in downloading pre-trained model

Hi,

I am trying to download the basic pre-trained model given in the README from this link - https://www.dropbox.com/s/jrr83q0ai2ckltq/final-0000.params.tar.gz?dl=0
I tried multiple times but the download fails after downloading around 320 MB of the file. Tried switching to different networks also, but no luck. Can someone confirm if the file is not corrupted on the link?

For the other link posted - http://pan.baidu.com/share/init?shareid=2372813683&uk=3979418934 , the page seems to be in Chinese and it asks for a password. Can I get access to that?

Thanks

MEM/HANG: cuda out of memory

Hello, I used MXNet Faster R-CNN to train on my own data (about 800 images) with alternate training, but it raised "out of memory" after RPN training. I tried both a GTX 1080 and a TITAN X, and both failed.

approximate joint end-2-end training.

Hello @precedenceguo,
Do you have a plan to add the approximate joint end-to-end training code? I think we should first add a Python op for proposal_target.py, which is something like ROIIter, and then make small changes to AnchorLoader.

Custom Training: Training on custom dataset

I have a question about training mx-rcnn on custom datasets: are the preparation steps for training the same as for py-faster-rcnn? I went through the I/O interface of pascal_voc and it doesn't contain any rpn_roidb loading functions. Did I miss anything?

Install MXNet with a version that has operators ROIPooling and smooth_l1 appeared

Thanks for sharing. I installed the latest version of MXNet, but it has no ROIPooling or smooth_l1. Did you implement ROIPooling and smooth_l1 yourself, or did you install a version of MXNet that includes these two operators?

KeyError: fc6_weight

Traceback (most recent call last):
File "D:\Anaconda2\lib\runpy.py", line 162, in _run_module_as_main
"main", fname, loader, pkg_name)
File "D:\Anaconda2\lib\runpy.py", line 72, in _run_code
exec code in run_globals
File "F:\mx-rcnn\tools\train_alternate.py", line 209, in
args.kv_store, args.work_load_list)
File "F:\mx-rcnn\tools\train_alternate.py", line 168, in alternate_train
'model/rcnn2', ctx, begin_epoch, rcnn_epoch, frequent, kv_store, work_load_list)
File "F:\mx-rcnn\tools\train_alternate.py", line 115, in train_rcnn
solver.fit(train_data, frequent=frequent)
File "rcnn\solver.py", line 105, in fit
max_label_shape=max_label_shape)
File "D:\Anaconda2\lib\site-packages\mxnet-0.7.0-py2.7.egg\mxnet\model.py", line 288, in _train_multi_device
executor_manager.copy_to(arg_params, aux_params)
File "D:\Anaconda2\lib\site-packages\mxnet-0.7.0-py2.7.egg\mxnet\executor_manager.py", line 400, in copy_to
weight.astype(arg_params[name].dtype).copyto(arg_params[name])
KeyError: 'fc6_weight'

The training process ended at the stage '########## TRAIN RCNN WITH RPN INIT AND DETECTION',
after epoch[0] of that stage.

question in fix_param

fixed_param_names = list()
for name in self._symbol.list_arguments():
    for prefix in self._fixed_param_prefix:
        if prefix in name:
            fixed_param_names.append(name)
Parameter names in MXNet look like conv1_weight or conv2_x, so does this code actually fix those parameters?
Would changing "if prefix in name:" to "if name.startswith(prefix):" be correct?
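
The difference between the two tests can be seen on a small list of names. The names below are illustrative: the VGG-style and ResNet-style entries follow the naming patterns that appear elsewhere on this page, but the list itself is an assumption, and `select_fixed` is a hypothetical helper.

```python
def select_fixed(names, prefixes, match):
    """Return the names for which `match(name, prefix)` holds for any prefix."""
    return [n for n in names if any(match(n, p) for p in prefixes)]


names = [
    'conv1_1_weight',
    'conv2_1_bias',
    'conv3_1_weight',
    'stage3_unit1_conv1_weight',  # ResNet-style name that contains "conv1"
    'rpn_conv_3x3_weight',
]
prefixes = ['conv1', 'conv2']

by_substring = select_fixed(names, prefixes, lambda n, p: p in n)
by_prefix = select_fixed(names, prefixes, lambda n, p: n.startswith(p))

print(by_substring)
# ['conv1_1_weight', 'conv2_1_bias', 'stage3_unit1_conv1_weight']
print(by_prefix)
# ['conv1_1_weight', 'conv2_1_bias']
```

Both tests fix conv1_* and conv2_*, so the existing code does freeze those layers. The substring test additionally freezes any deeper layer whose name merely contains the prefix, which matters for networks with names like stageN_unitM_conv1.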