microsoft / maskflownet
[CVPR 2020, Oral] MaskFlownet: Asymmetric Feature Matching with Learnable Occlusion Mask
Home Page: https://arxiv.org/abs/2003.10955
License: MIT License
AssertionError: Parameter 'maskflownet_s0_hybridsequential0_conv1aweight' is missing in file './MaskFlownet/weights/dbbSep30-1206_1000000.params', which contains parameters: 'maskflownet0_hybridsequential0_conv1aweight', 'maskflownet0_hybridsequential0_conv1abias', 'maskflownet0_hybridsequential1_conv1bweight', ..., 'maskflownet0_hybridsequential55_conv3fweight', 'maskflownet0_hybridsequential55_conv3fbias', 'maskflownet0_hybridsequential56_conv2fweight', 'maskflownet0_hybridsequential56_conv2fbias'. Please make sure source and target networks have the same prefix.
I get the error above. Also, how can I obtain occlusion masks like those shown in the paper?
This is the command I used:
python predict_new_data.py ./test.png MaskFlownet_S.yaml --image_1 ./image_1.png --image_2 ./image_2.png -g 0 -c 5adNov03
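A minimal sketch for diagnosing this kind of prefix mismatch, using only the two parameter names quoted in the error message. The helper `leading_prefix` and its `marker` argument are hypothetical names introduced for illustration; they are not part of the repository or of MXNet.

```python
def leading_prefix(param_name, marker="hybridsequential"):
    """Return the part of a parameter name that precedes the first layer marker.

    `marker` is an assumption based on the names quoted in the error message.
    """
    idx = param_name.find(marker)
    return param_name[:idx] if idx >= 0 else ""


# Names copied from the AssertionError above.
expected = "maskflownet_s0_hybridsequential0_conv1aweight"  # requested by the network
found = "maskflownet0_hybridsequential0_conv1aweight"       # stored in the .params file

print(leading_prefix(expected))  # maskflownet_s0_
print(leading_prefix(found))     # maskflownet0_
```

The network asks for a `maskflownet_s0_` prefix while the file stores `maskflownet0_`. One plausible reading is that the checkpoint was saved from a different network class than the one the chosen .yaml builds, so trying the configuration that matches the checkpoint may be a sensible first check.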
Hi @simon1727, I noticed you mentioned in #5 that the ground truth is forced to be dense in the training stage.
How should the results then be post-processed in the prediction stage before submitting to the KITTI evaluation server? In other words, how do we obtain sparse results from the dense flow predictions so that the evaluation is reliable, given that the ground truth in the test dataset is also sparse?
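For context, a sketch of how an error metric can be restricted to labeled pixels when the ground truth is sparse. It assumes a per-pixel validity flag is available; the function name `masked_epe` and the flat list layout are illustrative only, and this is not the KITTI server's actual evaluation code.

```python
import math


def masked_epe(pred, gt, valid):
    """Mean end-point error over the pixels whose validity flag is set.

    pred, gt: sequences of (u, v) flow vectors, one per pixel.
    valid:    sequence of 0/1 flags; 0 marks pixels without ground truth.
    """
    errors = [
        math.hypot(pu - gu, pv - gv)
        for (pu, pv), (gu, gv), flag in zip(pred, gt, valid)
        if flag
    ]
    return sum(errors) / len(errors) if errors else float("nan")


pred = [(1.0, 0.0), (5.0, 5.0), (0.0, 2.0)]  # dense prediction
gt = [(1.0, 0.0), (0.0, 0.0), (3.0, 6.0)]    # middle pixel is unlabeled
valid = [1, 0, 1]

print(masked_epe(pred, gt, valid))  # 2.5
```

Under this view the prediction itself can stay dense; only the comparison is limited to the valid pixels.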
Hello, thanks for your work. I could not find the .yaml config file in your project, which I think is needed for inference. Could you please upload it? I want to run inference on my own data with your pretrained model. Thank you.
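Hello, I also work in this area (occlusion + optical flow). Last year I completed similar work that predicts an occlusion mask and feeds it into PWC-Net as auxiliary information for a decoder branch network. Unfortunately, I did not pursue that work further.
I have not yet read your paper in depth, but the idea seems nearly the same as mine at the time. If it is convenient, could you leave your contact details so we can discuss?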
Hi,
When I run main.py with "MaskFlownet_S_sintel.yaml --dataset_cfg sintel_kitti2015_hd1k.yaml -g 0 -c dbbSep30-1206 --clear_steps", I get the error below:
Traceback (most recent call last):
File "/MaskFlownet-master/main.py", line 537, in
train_log = pipe.train_batch(img1, img2, flow, geo_aug, color_aug, mask = mask)
File "/MaskFlownet-master/network/pipeline.py", line 101, in train_batch
img1s, img2s, labels, masks = geo_aug(img1s, img2s, labels, masks)
File "/anaconda3/envs/maskflownet2/lib/python3.6/site-packages/mxnet/gluon/block.py", line 548, in call
out = self.forward(*args)
File "/anaconda3/envs/maskflownet2/lib/python3.6/site-packages/mxnet/gluon/block.py", line 915, in forward
return self._call_cached_op(x, *args)
File "/anaconda3/envs/maskflownet2/lib/python3.6/site-packages/mxnet/gluon/block.py", line 821, in _call_cached_op
out = self._cached_op(*cargs)
File "/anaconda3/envs/maskflownet2/lib/python3.6/site-packages/mxnet/_ctypes/ndarray.py", line 150, in call
ctypes.byref(out_stypes)))
File "/anaconda3/envs/maskflownet2/lib/python3.6/site-packages/mxnet/base.py", line 253, in check_call
raise MXNetError(py_str(_LIB.MXGetLastError()))
mxnet.base.MXNetError: Error in operator geometryaugmentation0_bilinearsampler0: [14:00:56] src/operator/./bilinear_sampler-inl.h:158: Check failed: dshape[0] == lshape[0] (1 vs. 4) :
Could you help me with this? Thank you very much in advance.
Hi Simon,
Some bug fixes follow.
1. samples = None -> samples = -1; samples = 32 if args.debug else -1
2. In main.py, line 194: num_files = (len(os.listdir(path_testing)) - 1) // 2 -> num_files = len(os.listdir(path_testing)) // 2
3. img0 = cv2.resize(img0, resize); img1 = cv2.resize(img1, resize) -> adj_resize = (resize[1], resize[0]); img0 = cv2.resize(img0, adj_resize); img1 = cv2.resize(img1, adj_resize)
The size argument of cv2.resize() should be (width, height), i.e. (cols, rows).
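A small sketch of the ordering issue behind the last fix. `cv2.resize` takes its `dsize` argument as (width, height), whereas array shapes are (height, width). The helper name `to_cv2_dsize` is hypothetical, and the assumption that `resize` holds (height, width) comes from the fix above rather than from a check of the repository.

```python
def to_cv2_dsize(shape_hw):
    """Convert a (height, width) pair into the (width, height) order
    that cv2.resize expects for its dsize argument."""
    height, width = shape_hw
    return (width, height)


resize = (576, 1024)         # (height, width), as in array shapes
print(to_cv2_dsize(resize))  # (1024, 576)
```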
Hi,
Many thanks for your nice work.
I have a question about the weights folder: what is the difference between '771Sep25-0735_500000' and 'abbSep15-1037_500000'? The README lists 771Sep25-0735_500000 under Pre-trained Models, but it does not appear in the evaluation table.
Could you explain the difference between them?
Best regards,
Meow
Hi!
Thanks for your great work! I am about to port your work from MXNet to PyTorch, but I ran into a problem: unlike PWC-Net, FeaturePyramidExtractor has no activation function.
Is that a bug?
Dear all,
I have recently been trying to use the code to run inference on my own data. I basically use the code from the master branch, combining code from main.py and predict.py. However, I get strange errors during the forward pass of the network. Please tell me if I have done anything wrong. Thank you very much!
My code is as follows:
import os
import sys
import argparse
import json
import yaml
import numpy as np
import mxnet as mx
import cv2
import network.config
from network import get_pipeline
# NOTE: Use Default Value when running
parser = argparse.ArgumentParser()
parser.add_argument('--config', default='MaskFlownet.yaml', type=str)
parser.add_argument('--gpu_device', default='0', type=str)
parser.add_argument('--network', default='MaskFlownet', type=str)
parser.add_argument('--data_folder', type=str, default='./data/')
parser.add_argument('--img_folder', type=str, default='images/')
parser.add_argument('--checkpoint', type=str, default='5adNov03-0005_1000000.params')
args = parser.parse_args()
args.img_folder = os.path.join(args.data_folder, args.img_folder)
def main():
    ctx = [mx.gpu(gpu_id) for gpu_id in map(int, args.gpu_device.split(','))]
    prefix = os.path.dirname(__file__)
    config_file = 'MaskFlownet.yaml'
    config_path = os.path.join(prefix, 'network/config', config_file)
    with open(config_path) as f:
        config = network.config.Reader(yaml.load(f))
    pipe = get_pipeline('MaskFlownet', ctx=ctx, config=config)
    checkpoint_path = os.path.join(prefix, 'weights', args.checkpoint)
    pipe.load(checkpoint_path)
    pipe.fix_head()
    pre_img_name = '1.jpg'
    cur_img_name = '2.jpg'
    pre_img_path = os.path.join(args.img_folder, pre_img_name)
    cur_img_path = os.path.join(args.img_folder, cur_img_name)
    # read the image and resize
    h, w = 576, 1024
    pre_img = cv2.imread(pre_img_path)
    pre_img = cv2.resize(pre_img, (w, h))
    cur_img = cv2.imread(cur_img_path)
    cur_img = cv2.resize(cur_img, (w, h))
    # Output the shape of images, making sure the size is correct
    # The shapes are both
    print('Image Shapes {:} {:}'.format(pre_img.shape, cur_img.shape))
    flow = list(pipe.predict([pre_img], [cur_img], batch_size=1))[0][0]
    print(flow)
    return

if __name__ == '__main__':
    main()
And my error log is
Default FLAGS..network.flow_multiplier to 1.0
Default FLAGS..network.deform_bias to True
Default FLAGS..network.upfeat_ch to [16, 16, 16, 16]
Default FLAGS..network.flow_multiplier to 1.0
Default FLAGS..network.deform_bias to True
Default FLAGS..network.upfeat_ch to [16, 16, 16, 16]
Default FLAGS..network.mw to [0.005, 0.01, 0.02, 0.08, 0.32]
Default FLAGS..optimizer.q to None
Image Shapes (576, 1024, 3) (576, 1024, 3)
Traceback (most recent call last):
File "/mnt/truenas/scratch/ziqi.pang/MaskFlowNet/infer.py", line 67, in <module>
main()
File "/mnt/truenas/scratch/ziqi.pang/MaskFlowNet/infer.py", line 61, in main
flow = list(pipe.predict([pre_img], [cur_img], batch_size=1))[0][0]
File "/mnt/truenas/scratch/ziqi.pang/MaskFlowNet/network/pipeline.py", line 209, in predict
flow, occ_mask, warped, _ = self.do_batch(img1s, img2s, resize = resize)
File "/mnt/truenas/scratch/ziqi.pang/MaskFlowNet/network/pipeline.py", line 136, in do_batch
flows, occ_masks, _ = self.do_batch_mx(img1, img2, resize = resize)
File "/mnt/truenas/scratch/ziqi.pang/MaskFlowNet/network/pipeline.py", line 131, in do_batch_mx
pred, flows, warpeds = self.network(img1, img2)
File "/root/.tspkg/lib/python3/mxnet/gluon/block.py", line 471, in __call__
return self.forward(*args)
File "/root/.tspkg/lib/python3/mxnet/gluon/block.py", line 705, in forward
return self._call_cached_op(x, *args)
File "/root/.tspkg/lib/python3/mxnet/gluon/block.py", line 612, in _call_cached_op
out = self._cached_op(*cargs)
File "/root/.tspkg/lib/python3/mxnet/_ctypes/ndarray.py", line 149, in __call__
ctypes.byref(out_stypes)))
File "/root/.tspkg/lib/python3/mxnet/base.py", line 149, in check_call
raise MXNetError(py_str(_LIB.MXGetLastError()))
mxnet.base.MXNetError: Error in operator maskflownet0_maskflownet_s0_upsample0_reshape_like0: [15:34:33] src/operator/tensor/elemwise_unary_op_basic.cc:348: Check failed: (*in_attrs)[0].Size() == (*in_attrs)[1].Size() (1152 vs. 288) Cannot reshape lhs with shape [2,1,18,32]to rhs with shape [1,2,9,16] because they have different size.
Stack trace returned 10 entries:
[bt] (0) /root/.tspkg/lib/libmxnet.so(dmlc::StackTrace[abi:cxx11]()+0x5b) [0x7f2f8ed8507b]
[bt] (1) /root/.tspkg/lib/libmxnet.so(dmlc::LogMessageFatal::~LogMessageFatal()+0x28) [0x7f2f8ed85be8]
[bt] (2) /root/.tspkg/lib/libmxnet.so(+0x15e128a) [0x7f2f8f7fe28a]
[bt] (3) /root/.tspkg/lib/libmxnet.so(+0x2f8b571) [0x7f2f911a8571]
[bt] (4) /root/.tspkg/lib/libmxnet.so(mxnet::exec::InferShape(nnvm::Graph&&, std::vector<nnvm::TShape, std::allocator<nnvm::TShape> >&&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)+0x1ada) [0x7f2f911aa72a]
[bt] (5) /root/.tspkg/lib/libmxnet.so(mxnet::imperative::CheckAndInferShape(nnvm::Graph*, std::vector<nnvm::TShape, std::allocator<nnvm::TShape> >&&, bool, std::pair<unsigned int, unsigned int>, std::pair<unsigned int, unsigned int>)+0x13c) [0x7f2f912abdfc]
[bt] (6) /root/.tspkg/lib/libmxnet.so(mxnet::Imperative::CachedOp::GetForwardGraph(bool, std::vector<mxnet::NDArray*, std::allocator<mxnet::NDArray*> > const&)+0x548) [0x7f2f9129a5a8]
[bt] (7) /root/.tspkg/lib/libmxnet.so(mxnet::Imperative::CachedOp::Forward(std::shared_ptr<mxnet::Imperative::CachedOp> const&, std::vector<mxnet::NDArray*, std::allocator<mxnet::NDArray*> > const&, std::vector<mxnet::NDArray*, std::allocator<mxnet::NDArray*> > const&)+0xb5) [0x7f2f912a23f5]
[bt] (8) /root/.tspkg/lib/libmxnet.so(MXInvokeCachedOp+0xc39) [0x7f2f917c8569]
[bt] (9) /root/.tspkg/lib/libmxnet.so(MXInvokeCachedOpEx+0x3ee) [0x7f2f917c975e]
Line 64 in 2f796ec
Hi Simon, why is 512 added in flow(:,:,1)+512? What does the 512 stand for?
Can you tell me the number of training iterations used for fine-tuning on FlyingThings3D after pre-training on FlyingChairs? Do you train with s-fine for 0.5M iterations after the 1.2M iterations on FlyingChairs, or do you restart with long + s-fine for a total of 1.7M iterations?
In this line: https://github.com/microsoft/MaskFlownet/blob/master/network/MaskFlownet.py#L306
Should it be c2s = [c21, c22, c23, c24, c25, c26] instead of c2s = [c21, c12, c13, c24, c25, c26]? The latter version doesn't make much sense to me.
I have noticed that your training loop leaks small amounts of RAM. Any idea what may have caused this?
time taken= 9.865329265594482 | steps= 1 | cpu= 51.8 | ram= 34.50078675328186 | gpu= [3101]
[5613]
time taken= 0.934636116027832 | steps= 2 | cpu= 27.0 | ram= 29.34866251942084 | gpu= [5613]
[3045]
time taken= 0.8695635795593262 | steps= 3 | cpu= 29.4 | ram= 29.217970957706278 | gpu= [3045]
[3021]
time taken= 0.8483304977416992 | steps= 4 | cpu= 29.8 | ram= 29.033316428574086 | gpu= [3021]
[2997]
time taken= 0.8630681037902832 | steps= 5 | cpu= 30.2 | ram= 28.87988403913803 | gpu= [2997]
[2997]
time taken= 0.8645083904266357 | steps= 6 | cpu= 29.4 | ram= 28.714746447210654 | gpu= [2997]
[2997]
time taken= 0.864253044128418 | steps= 7 | cpu= 29.3 | ram= 28.573093657739385 | gpu= [2997]
[2997]
time taken= 0.8693573474884033 | steps= 8 | cpu= 29.3 | ram= 28.389703885656044 | gpu= [2997]
[2997]
time taken= 0.8704898357391357 | steps= 9 | cpu= 29.4 | ram= 28.298690976454438 | gpu= [2997]
[2997]
time taken= 0.8670341968536377 | steps= 10 | cpu= 29.5 | ram= 28.13385097442091 | gpu= [2997]
[2997]
time taken= 0.8750414848327637 | steps= 11 | cpu= 29.5 | ram= 27.959884882309396 | gpu= [2997]
[2997]
time taken= 0.8624210357666016 | steps= 12 | cpu= 29.9 | ram= 27.784356443255188 | gpu= [2997]
[2997]
time taken= 0.8561670780181885 | steps= 13 | cpu= 29.8 | ram= 27.644241201568796 | gpu= [2997]
[2997]
time taken= 0.8609695434570312 | steps= 14 | cpu= 29.7 | ram= 27.51883186047002 | gpu= [2997]
[2997]
time taken= 0.8462607860565186 | steps= 15 | cpu= 29.7 | ram= 27.36641623650461 | gpu= [2997]
[2997]
time taken= 0.8624782562255859 | steps= 16 | cpu= 29.2 | ram= 27.23760941078441 | gpu= [2997]
[2997]
time taken= 0.8649694919586182 | steps= 17 | cpu= 29.4 | ram= 27.113514425050127 | gpu= [2997]
[2997]
time taken= 0.8661544322967529 | steps= 18 | cpu= 29.3 | ram= 27.004993310427178 | gpu= [2997]
[2997]
time taken= 0.8687705993652344 | steps= 19 | cpu= 29.8 | ram= 26.82090916192486 | gpu= [2997]
[2997]
time taken= 0.8823645114898682 | steps= 20 | cpu= 29.6 | ram= 26.688630454109777 | gpu= [2997]
[2997]
time taken= 0.8795809745788574 | steps= 21 | cpu= 29.4 | ram= 26.517987449146226 | gpu= [2997]
[2997]
time taken= 0.8857841491699219 | steps= 22 | cpu= 29.1 | ram= 26.40289455770082 | gpu= [2997]
[2997]
time taken= 0.8605339527130127 | steps= 23 | cpu= 29.5 | ram= 26.274509317663572 | gpu= [2997]
[2997]
time taken= 0.8524265289306641 | steps= 24 | cpu= 29.8 | ram= 26.16445065525575 | gpu= [2997]
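One way to narrow down a suspected leak on the Python side is the standard-library `tracemalloc` module. The sketch below is generic and not taken from the repository: the `leak` list is a stand-in for any state that accumulates across steps, and `top_growth` is a hypothetical helper. Note that `tracemalloc` only sees allocations made through Python's allocator, so memory held by native MXNet code would not show up here.

```python
import tracemalloc


def top_growth(before, after, limit=3):
    """Return the source lines whose allocated size grew most between two snapshots."""
    stats = after.compare_to(before, "lineno")
    return [stat for stat in stats if stat.size_diff > 0][:limit]


tracemalloc.start()
before = tracemalloc.take_snapshot()

leak = []  # stand-in for state that keeps growing across training steps
for step in range(1000):
    leak.append(bytearray(1024))

after = tracemalloc.take_snapshot()
growth = top_growth(before, after)
for stat in growth:
    print(stat)  # file, line number, and size difference of each growing allocation
tracemalloc.stop()
```

In a real training loop the two snapshots would be taken a few hundred steps apart, and the lines reported at the top are the first candidates to inspect.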
Thanks for your great job!
When I run python main.py MaskFlownet.yaml -g 0 -c 000Mar17 --predict, I get outputs in /flows.
But they don't seem to be the right size.
The only thing I changed in the code was to replace imread in skimage.io with imread in cv2.
Line 3 in 2f796ec
My environment:
python 3.6.8
mxnetcu90-1.5.1
CUDA 9.0.176
cudnn 7.6.5
In addition, I would like to ask how to correctly visualize occlusion in binary form?
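A minimal sketch of one way to turn a soft mask into a binary image by thresholding. It assumes the mask values lie in [0, 1]; the threshold of 0.5 is an arbitrary illustrative choice, `binarize_mask` is a hypothetical helper, and which side of the threshold means "occluded" is not confirmed here.

```python
def binarize_mask(mask, threshold=0.5):
    """Map a soft mask (rows of floats in [0, 1]) to 0/255 pixel values."""
    return [[255 if value >= threshold else 0 for value in row] for row in mask]


soft = [[0.1, 0.7], [0.5, 0.49]]
print(binarize_mask(soft))  # [[0, 255], [255, 0]]
```

With array libraries the same step is usually a single comparison followed by a cast to an 8-bit integer type.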
Hi, thanks for sharing.
I am trying to test your model on a pair of images but could not get it working.
I installed Python 3.6.10 and MXNet 1.5 using Anaconda, along with all the necessary modules.
For now it crashes when reading the model; something is missing. Here is my command:
python main.py MaskFlownet_S.yaml -c 8caNov12 --predict --clear_steps --debug
and result is:
[('C:\Users\cvestri\Work\Dev\RDVision\Code\MaskFlownet\logs\8caNov12-1532.log', '8caNov12-1532', '-1532')]
Default FLAGS..network.flow_multiplier to 1.0
Default FLAGS..network.deform_bias to True
Default FLAGS..network.upfeat_ch to [16, 16, 16, 16]
Default FLAGS..network.mw to [0.005, 0.01, 0.02, 0.08, 0.32]
Default FLAGS..optimizer.q to None
Default FLAGS..optimizer.learning_rate to None
Load Checkpoint C:\Users\cvestri\Work\Dev\RDVision\Code\MaskFlownet\weights\8caNov12-1532_300000.params
load the weight for the network
Traceback (most recent call last):
File "main.py", line 136, in
pipe.load(checkpoint)
File "C:\Users\cvestri\Work\Dev\RDVision\Code\MaskFlownet\network\pipeline.py", line 57, in load
self.network.load_parameters(checkpoint, ctx=self.ctx)
File "C:\Users\cvestri\AppData\Local\conda\conda\envs\py36_mxnet\lib\site-packages\mxnet\gluon\block.py", line 394, in load_parameters
cast_dtype=cast_dtype, dtype_source=dtype_source)
File "C:\Users\cvestri\AppData\Local\conda\conda\envs\py36_mxnet\lib\site-packages\mxnet\gluon\parameter.py", line 968, in load
name[lprefix:], filename, _brief_print_list(arg_dict.keys()))
AssertionError: Parameter 'hybridsequential0_conv1aweight' is missing in file 'C:\Users\cvestri\Work\Dev\RDVision\Code\MaskFlownet\weights\8caNov12-1532_300000.params', which contains parameters: 'maskflownet_s0_maskflownet_s0_hybridsequential0_conv1aweight', 'maskflownet_s0_maskflownet_s0_hybridsequential0_conv1abias', 'maskflownet_s0_maskflownet_s0_hybridsequential1_conv1bweight', ..., 'maskflownet_s0_deform3weight', 'maskflownet_s0_deform3bias', 'maskflownet_s0_deform2weight', 'maskflownet_s0_deform2bias'. Please make sure source and target networks have the same prefix.
It is the same with MXNet 1.6.
Thanks
Hi, authors,
Thanks for sharing the code, it is a great work!
When fine-tuning on the KITTI dataset, only sparse ground truth is available. In this case, applying geometric transformations such as scaling and rotation with bilinear sampling causes problems, because the unlabeled pixels hold many zeros. Besides, the binary mask is no longer binary after interpolation.
In your paper, you state: "For sparse ground-truth flow in KITTI, the augmented flow is weighted averaged based on the interpolated valid mask."
But I cannot find in the code how you handle this in detail. Could you please explain how to apply geometric transformations to sparse ground-truth datasets (e.g., using the interpolated valid mask)?
Thanks for your attention!
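One possible reading of the quoted sentence, sketched in 1-D with linear interpolation: interpolate flow times mask and the mask itself with the same weights, then divide the former by the latter. This is an interpretation, not the authors' confirmed implementation; `interp_sparse` and `eps` are hypothetical names.

```python
import math


def interp_sparse(flow, valid, x, eps=1e-6):
    """Linearly interpolate a sparse 1-D flow signal at fractional position x.

    Invalid samples get zero weight, and the result is normalized by the
    interpolated validity mask. Returns (flow_value, interpolated_mask).
    x is assumed to lie in [0, len(flow) - 1].
    """
    x0 = int(math.floor(x))
    x1 = min(x0 + 1, len(flow) - 1)
    w1 = x - x0
    w0 = 1.0 - w1
    weighted_sum = w0 * flow[x0] * valid[x0] + w1 * flow[x1] * valid[x1]
    mask = w0 * valid[x0] + w1 * valid[x1]
    if mask < eps:  # no valid neighbour: the output pixel stays unlabeled
        return 0.0, 0.0
    return weighted_sum / mask, mask


flow = [2.0, 0.0, 4.0]
valid = [1, 0, 1]                      # the middle sample has no ground truth
print(interp_sparse(flow, valid, 0.5)) # (2.0, 0.5)
print(interp_sparse(flow, valid, 1.0)) # (0.0, 0.0)
```

Plain interpolation at x = 0.5 would give 1.0, because the unlabeled zero is averaged in; the normalized version keeps the value of the only valid neighbour. The interpolated mask can afterwards be thresholded to obtain a binary mask again.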
I think that in kitti.py, line 41, there is no need to subtract 1: there are 200 optical flow images.
Also, was the program developed on Windows?
Do you provide some checkpoint files or trained models to play with?
Could you please explain what F should be in this line of code:
x2_warp = self.deform5(x2, F.repeat(F.expand_dims(flow*self.scale/self.strides[1], axis=1), 9, axis=1).reshape((0, -3, -2)))
thank you
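In Gluon, `F` is the first argument that `HybridBlock.hybrid_forward` receives: it is `mxnet.ndarray` when the block runs imperatively and `mxnet.symbol` after `hybridize()`. The sketch below only traces shapes through the quoted line in plain Python. The helper `mx_style_reshape` is hypothetical and emulates just the special reshape codes 0 (copy this dimension), -3 (merge two consecutive dimensions) and -2 (copy the remaining dimensions); the input flow shape is an assumed example.

```python
def mx_style_reshape(in_shape, spec):
    """Emulate the output shape of an MXNet-style reshape for codes 0, -2 and -3 only."""
    out, i = [], 0
    for code in spec:
        if code == 0:      # copy the current input dimension
            out.append(in_shape[i])
            i += 1
        elif code == -3:   # merge two consecutive input dimensions
            out.append(in_shape[i] * in_shape[i + 1])
            i += 2
        elif code == -2:   # copy all remaining input dimensions
            out.extend(in_shape[i:])
            i = len(in_shape)
        else:
            raise ValueError("code not covered by this sketch: %r" % (code,))
    return tuple(out)


flow_shape = (4, 2, 16, 32)                        # assumed (batch, 2, H, W)
expanded = flow_shape[:1] + (1,) + flow_shape[1:]  # expand_dims(axis=1)
repeated = expanded[:1] + (9,) + expanded[2:]      # repeat(9, axis=1)
print(mx_style_reshape(repeated, (0, -3, -2)))     # (4, 18, 16, 32)
```

The 18 output channels would match the 2 x 3 x 3 offsets of a 3x3 deformable convolution, if that is what `deform5` is.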
Hi, I really appreciate you releasing the code! May I ask a few questions?
Does the image size influence the performance? In the training stage, image patches (896x320 for KITTI) are used. In testing, do you still use image patches, or the entire image (around 1240x370 for KITTI)?
I know cropping is used for augmentation, but if you train on small patches and test on larger images, does this strategy affect the performance?
Thank you very much!
Hi,
How can I use the network for inference or training if there is no network configuration file MaskFlownet.yaml?
I trained MaskFlownet on Sintel train + KITTI 2015 + HD1K without changing your code, but my model does not perform as well as your pretrained model "8caNov12". Is there any difference between your setup and the uploaded MaskFlownet_sintel.yaml and sintel_kitti2015_hd1k.yaml? Or do I need to adjust any other parameters? Thank you very much in advance for your help.
I want to fine tune pretrained maskflownet on ChairsSDHom.
I have written the dataset scripts for it but am getting this error while training.
Traceback (most recent call last):
File "main.py", line 143, in <module>
pipe.trainer.load_states(checkpoint.replace('params', 'states'))
File "/home/mask/miniconda3/envs/mask/lib/python3.6/site-packages/mxnet/gluon/trainer.py", line 515, in load_states
with open(fname, 'rb') as f:
FileNotFoundError: [Errno 2] No such file or directory: '/home/mask/maskflownet/MaskFlownet/weights/5adNov03-0005_1000000.states'
Can you please help me on how to proceed?
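The traceback shows `trainer.load_states` being called on a `.states` file that does not exist next to the `.params` file. A hedged sketch of a guard follows; `states_path` and `load_states_if_present` are hypothetical helpers, and skipping the load means the optimizer starts from fresh state, which may or may not be acceptable for fine-tuning.

```python
import os


def states_path(checkpoint):
    """Derive the optimizer-states path by swapping only the file extension."""
    root, _ = os.path.splitext(checkpoint)
    return root + ".states"


def load_states_if_present(trainer, checkpoint):
    """Call trainer.load_states only if the .states file exists; return True if loaded."""
    path = states_path(checkpoint)
    if os.path.exists(path):
        trainer.load_states(path)
        return True
    return False


print(states_path("weights/5adNov03-0005_1000000.params"))
# weights/5adNov03-0005_1000000.states
```

Swapping the extension with `os.path.splitext` also avoids touching a directory name that happens to contain "params", which a plain string replace would rewrite.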
Hi,
Thank you very much for providing the source code, it's really awesome.
I have several questions about the provided checkpoints:
(1) For the provided dbbSep30-1206_1000000 checkpoint, the actual validation result seems to differ from the score given in the README (i.e., 2.07 / 4.07 for Sintel). I ran it on the validation set and got 1.47 / 1.90 for Sintel Val. There is also an inconsistency between the log file and the provided checkpoint, as the last line of dbbSep30-1206.log is correct. I am guessing that this checkpoint was trained on the whole Sintel dataset; am I correct?
(2) Following the guess in (1), I uploaded the test results to see whether the dbbSep30-1206_1000000 checkpoint reproduces the results reported in the paper and on the website. However, there is a gap: I got 4.877 / 3.182 on FINAL and CLEAN, respectively, whereas the paper and website report 4.38 / 2.77.
I would appreciate it if you could help to clarify these questions and provide the checkpoints which can reproduce the results.
Thank you for your time and consideration again!
Hi!
I am really interested in your paper and this repo.
I have trained MaskFlownet-S on the FlyingChairs dataset with the code provided in this repo. However, the EPE on MPI-Sintel clean and final is 2.999 and 4.399, which leaves quite a gap to the reported EPE of 2.88 and 4.25.
So here are my questions: Is the code in this repo the code you used for the experiments in the paper? Can I use it to reproduce the performance reported in the paper without further modification? Is the reported performance from the best checkpoint or the final checkpoint?
Thanks!
I found an inconsistency in the running time of VCN in Table 1: the arXiv paper gives 0.18, while the official paper gives 0.03.
Could you let me know the exact runtime?
Thanks!
The issue is that during training the validation loss rises very quickly; see the logs below.
I have added a chairsSDHom data loading script as follows.
Changes:
1. main.py
...
...
elif dataset_cfg.dataset.value == "chairsSDHom":
    batch_size = 3
    orig_shape = [384, 512]
    # training
    chairsSDHom_dataset = chairsSDHom.list_data()
    print(chairsSDHom_dataset['flow'][0])
    from pympler.asizeof import asizeof
    trainImg1 = [file for file in chairsSDHom_dataset['image_0']]
    trainImg2 = [file for file in chairsSDHom_dataset['image_1']]
    trainFlow = [file for file in chairsSDHom_dataset['flow']]
    trainMask = [file for file in chairsSDHom_dataset['mask']]
    trainSize = len(trainFlow)
    training_datasets = [(trainImg1, trainImg2, trainFlow, trainMask)] * batch_size
    # validation - sintel
    sintel_dataset = sintel.list_data()
    divs = ('training',) if not getattr(config.network, 'class').get() == 'MaskFlownet' else ('training2',)
    for div in divs:
        for k, dataset in sintel_dataset[div].items():
            dataset = dataset[:samples]
            img1, img2, flow, mask = [[sintel.load(p) for p in data] for data in zip(*dataset)]
            validationSize = len(flow)
            validation_datasets['sintel.' + k] = (img1, img2, flow, mask)
...
...
def iterate_data(iq, dataset):
    if dataset_cfg.dataset.value == 'chairsSDHom' or dataset_cfg.dataset.value == "things3d":
        gen = index_generator(len(dataset[0]))
        while True:
            i = next(gen)
            data = [item[i] for item in dataset]
            if dataset_cfg.dataset.value == "chairsSDHom":
                data = [skimage.io.imread(data[0]), skimage.io.imread(data[1]), chairsSDHom.load(data[2]), skimage.io.imread(data[3])]
            elif dataset_cfg.dataset.value == "things3d":
                data = [cv2.imread(data[0]).astype('uint8'), skimage.io.imread(data[1]).astype('uint8'), things3d.load(data[2]).astype('float16')]
            space_x, space_y = data[0].shape[0] - orig_shape[0], data[0].shape[1] - orig_shape[1]
            crop_x, crop_y = space_x and np.random.randint(space_x), space_y and np.random.randint(space_y)
            data = [np.transpose(arr[crop_x: crop_x + orig_shape[0], crop_y: crop_y + orig_shape[1]], (2, 0, 1)) for arr in data]
            # vertical flip
            if np.random.randint(2):
                data = [arr[:, :, ::-1] for arr in data]
                data[2] = np.stack([-data[2][0, :, :], data[2][1, :, :]], axis = 0)
            iq.put(data)
    else:
        gen = index_generator(len(dataset[0]))
        while True:
            i = next(gen)
            data = [item[i] for item in dataset]
            space_x, space_y = data[0].shape[0] - orig_shape[0], data[0].shape[1] - orig_shape[1]
            crop_x, crop_y = space_x and np.random.randint(space_x), space_y and np.random.randint(space_y)
            data = [np.transpose(arr[crop_x: crop_x + orig_shape[0], crop_y: crop_y + orig_shape[1]], (2, 0, 1)) for arr in data]
            # vertical flip
            if np.random.randint(2):
                data = [arr[:, :, ::-1] for arr in data]
                data[2] = np.stack([-data[2][0, :, :], data[2][1, :, :]], axis = 0)
            iq.put(data)
...
Everything else is unchanged, yet training behaves as follows.
Logs:
[2020/12/22 21:36:48] start=0, train=21670, val=224, host=ludwig, batch=3
[2020/12/22 21:36:48] batch=8, config='MaskFlownet_ft.yaml', dataset_cfg='chairsSDHom.yaml', shard=1, gpu_device='1', checkpoint='5adNov03', clear_steps=True, network='MaskFlownet', debug=False, valid=False, predict=False, resize=''
[2020/12/22 21:36:54] steps=1, epe=81.23613661839343, total_time=0.00
[2020/12/22 21:37:20] steps=1, sintel.clean=1.4036083221435547, sintel.final=**1.7385120391845703**
[2020/12/22 21:37:20] steps=2, epe=82.52426050579368, total_time=31.65
[2020/12/22 21:37:21] steps=3, epe=70.33922181313649, total_time=15.62
[2020/12/22 21:37:21] steps=4, epe=64.53729546698513, total_time=10.30
[2020/12/22 21:37:21] steps=5, epe=73.13790790314701, total_time=7.64
[2020/12/22 21:37:22] steps=6, epe=69.97008332644914, total_time=6.04
[2020/12/22 21:37:22] steps=7, epe=63.190831684866595, total_time=4.98
[2020/12/22 21:37:23] steps=8, epe=69.54386270096657, total_time=4.23
[2020/12/22 21:37:23] steps=9, epe=71.65906570549198, total_time=3.66
[2020/12/22 21:37:24] steps=10, epe=70.68287622669239, total_time=3.22
[2020/12/22 21:37:24] steps=11, epe=68.10887379487774, total_time=2.88
[2020/12/22 21:37:24] steps=12, epe=65.31357897717663, total_time=2.59
[2020/12/22 21:37:25] steps=13, epe=67.39865911195284, total_time=2.36
[2020/12/22 21:37:25] steps=14, epe=66.05316386284305, total_time=2.16
[2020/12/22 21:37:26] steps=15, epe=62.74090359794587, total_time=1.99
[2020/12/22 21:37:26] steps=16, epe=65.24516708995266, total_time=1.85
[2020/12/22 21:37:27] steps=17, epe=61.783343363284466, total_time=1.72
[2020/12/22 21:37:27] steps=18, epe=66.12157773880946, total_time=1.61
[2020/12/22 21:37:27] steps=19, epe=65.41601491031372, total_time=1.51
[2020/12/22 21:37:28] steps=20, epe=67.27401184191667, total_time=1.42
[2020/12/22 21:37:41] steps=50, epe=64.05605013410363, total_time=0.57
[2020/12/22 21:38:03] steps=100, epe=60.72789733634401, total_time=0.45
[2020/12/22 21:38:30] steps=100, sintel.clean=3.107024669647217, sintel.final=**3.6572041511535645**
[2020/12/22 21:38:51] steps=150, epe=58.168171286698964, total_time=0.55
[2020/12/22 21:39:14] steps=200, epe=55.366796654848244, total_time=0.45
[2020/12/22 21:39:41] steps=200, sintel.clean=4.636238098144531, sintel.final=**5.08129358291626**
[2020/12/22 21:40:03] steps=250, epe=52.92103477169547, total_time=0.56
[2020/12/22 21:40:25] steps=300, epe=50.651504112365515, total_time=0.45
[2020/12/22 21:40:52] steps=300, sintel.clean=5.46751070022583, sintel.final=**5.855245113372803**
[2020/12/22 21:41:13] steps=350, epe=48.90560261388807, total_time=0.55
[2020/12/22 21:41:36] steps=400, epe=47.090479957163055, total_time=0.45
[2020/12/22 21:42:02] steps=400, sintel.clean=6.850785255432129, sintel.final=**7.147568702697754**
[2020/12/22 21:42:24] steps=450, epe=45.47630244939083, total_time=0.55
[2020/12/22 21:42:47] steps=500, epe=43.721847967473224, total_time=0.45
[2020/12/22 21:43:14] steps=500, sintel.clean=7.392406940460205, sintel.final=**7.563663005828857**
[2020/12/22 21:43:36] steps=550, epe=41.861068025751216, total_time=0.56
[2020/12/22 21:43:59] steps=600, epe=40.728338542736246, total_time=0.45
[2020/12/22 21:44:25] steps=600, sintel.clean=8.37342643737793, sintel.final=**8.398472785949707**
[2020/12/22 21:44:47] steps=650, epe=39.22414651439415, total_time=0.55
[2020/12/22 21:45:09] steps=700, epe=38.01273616706755, total_time=0.45
[2020/12/22 21:45:36] steps=700, sintel.clean=8.904271125793457, sintel.final=**8.86906623840332**
[2020/12/22 21:45:57] steps=750, epe=36.68394209224638, total_time=0.55
[2020/12/22 21:46:20] steps=800, epe=35.51223404091925, total_time=0.45
[2020/12/22 21:46:46] steps=800, sintel.clean=9.723841667175293, sintel.final=**9.715934753417969**
[2020/12/22 21:47:08] steps=850, epe=34.441762749200876, total_time=0.55
[2020/12/22 21:47:30] steps=900, epe=33.21928807435762, total_time=0.45
[2020/12/22 21:47:56] steps=900, sintel.clean=10.129880905151367, sintel.final=**10.09166431427002**
Question 1) Any idea why the network output behaves like this, and how I might fix it?
Question 2) Is there anything clearly wrong in the edits I have made?
Thank you so much. I highly appreciate your work. <3 :D
I tried to visualize the mask image with the code below:
output = 1-(occ_mask - occ_mask.min()) / (occ_mask.max() - occ_mask.min())
io.imsave(os.path.join(seq_output_folder, fname), output)
The result does not match what is shown in your paper.
Is there a problem with this code?
Hi authors,
I am trying to evaluate the performance of optical flow networks against attacks. I was wondering whether a MaskFlownet model pretrained on the KITTI dataset is available?
Best Regards
Hi,
I have a question regarding the Occlusion-Aware Pyramid.
In the paper, it writes
in the code, it is
mask0 = Upsample(4)(mask2)
mask0 = F.sigmoid(mask0) - 0.5
c30 = c10
c40 = self.warp(c20, Upsample(4)(flow2)*self.scale)
# concat image 1 with zero mask
c30 = F.concat(c30, F.zeros_like(mask0), dim=1)
# concat warped image 2 with occlusion mask
c40 = F.concat(c40, mask0, dim=1)
From my understanding, the occlusion mask is a probability map (where 1 stands for occlusion and 0 stands for non-occlusion), and after subtraction by 0.5, the range would be [-0.5, 0.5], and value 0, in this case, would mean "don't know whether there is occlusion or not".
Then the question is why image 1 I1 is concatenated with a zero mask instead of a -0.5 mask, or the same occlusion map as image 2 I2? Since the follow-up conv layers are shared for variables c30 and c40, shouldn't the concatenated occlusion mask have the same meaning for both I1 and I2 ?
Thanks a lot!
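A small numeric check of the range argument in the question, in plain Python. It only illustrates that `sigmoid(x) - 0.5` lies in (-0.5, 0.5) and equals 0 at a logit of 0; it does not settle which value image 1 should be concatenated with. The function names are illustrative.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def centered_mask(logit):
    """Mirror of `F.sigmoid(mask0) - 0.5` for a single scalar logit."""
    return sigmoid(logit) - 0.5


for logit in (-6.0, 0.0, 6.0):
    print(logit, round(centered_mask(logit), 4))
```

Under this centering, a zero mask channel corresponds to a mask probability of exactly 0.5.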
The inference time reported for VCN in your paper is wrong: it should be 0.18 s, not 0.03 s.
Thanks for your great work!
I understand that you stress the unsupervised learning of the mask, and I read your code to confirm that the mask is indeed learned in an unsupervised manner. But I wonder whether ground-truth occlusion maps could be used as supervision to train MaskFlownet-S. Simply adding an EPE or a cross-entropy loss may guide MaskFlownet-S to learn a better attention mask. I understand that generating all of the mask maps for these datasets would take a long time, which is indeed a problem.
Here are some questions about the demonstration of intermediate results:
Thanks for your attention.
Hi,
Your results are very impressive. Unfortunately, I'm getting the following error:
*** Error in `python3': double free or corruption (!prev): 0x000055c2f1f46400 ***
Aborted
Did you ever encounter this type of error? or have any idea how to fix it?
Thanks
Hi!
Thanks for sharing!
When reading the code, I noticed that all the training data seems to be preloaded into RAM. Also, when training on the FlyingThings3D dataset, you split it into several parts and load only one part during training. I could not find the code that reloads the remaining parts. Could you please point it out to me? Thanks again!
Will a TensorFlow or Pytorch port be available any time soon?
Hi,
I have noticed that your work mainly uses 'cv2.imread' as image IO, which reads an image as BGR format. But in Sintel.py, I found a mixed use of 'skimage.io.imread' that reads RGB format. Is this expected?
Though I found both RGB/BGR works fine, could you clarify what is the expected input format for the network? What is the input format you used for benchmarking?
Thanks
Min