microsoft / maskflownet Goto Github PK



[CVPR 2020, Oral] MaskFlownet: Asymmetric Feature Matching with Learnable Occlusion Mask

Home Page: https://arxiv.org/abs/2003.10955

License: MIT License

Python 100.00%

optical-flow occlusion cvpr2020 sintel kitti feature-matching feature-warping

maskflownet's Issues

Question about provided models

Hi,

Many thanks for your nice work.

I have a question about the weights folder: what is the difference between '771Sep25-0735_500000' and 'abbSep15-1037_500000'? The README lists 771Sep25-0735_500000 under Pre-trained Models, but it does not appear in the evaluation table.

Could you explain the difference between them?

Best regards,
Meow

Mask visualization problem

I tried to visualize the mask image with the code below:

output = 1-(occ_mask - occ_mask.min()) / (occ_mask.max() - occ_mask.min())
io.imsave(os.path.join(seq_output_folder, fname), output)

The result does not match what is shown in your paper.
Is there a problem with this code?
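
One thing that may be worth checking, offered as a minimal sketch rather than the authors' visualization code: the snippet above hands a float array to io.imsave, and converting explicitly to 8-bit avoids depending on how the image library handles floats. The helper name mask_to_uint8 is hypothetical, and the mask is assumed to be a 2-D NumPy array.

```python
import numpy as np

def mask_to_uint8(occ_mask, invert=True):
    """Min-max normalize a 2-D mask to [0, 255] as uint8 (hypothetical helper)."""
    occ_mask = np.asarray(occ_mask, dtype=np.float64)
    span = occ_mask.max() - occ_mask.min()
    if span == 0:
        # Constant mask: avoid dividing by zero.
        norm = np.zeros_like(occ_mask)
    else:
        norm = (occ_mask - occ_mask.min()) / span
    if invert:
        # Same inversion as "1 - ..." in the snippet above.
        norm = 1.0 - norm
    return np.round(norm * 255.0).astype(np.uint8)

# Made-up values, only to show the mapping of the extremes.
demo = mask_to_uint8(np.array([[0.0, 0.5], [1.0, 0.25]]))
```

The resulting array can then be passed to io.imsave in place of output. Whether the inversion is wanted depends on which polarity the paper's figures use, which the issue does not state.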

Error in operator maskflownet0_maskflownet_s0_upsample0_reshape_like0

Dear all,

I have recently been trying to run inference with this code. I basically use the master branch and combine the code in main.py and predict.py, but I get a strange error during the forward pass of the network. Please tell me if I have done anything wrong. Thank you very much!

My code is as follows:

import os
import sys
import argparse
import json
import yaml
import numpy as np
import mxnet as mx
import cv2

import network.config
from network import get_pipeline

# NOTE: Use Default Value when running
parser = argparse.ArgumentParser()
parser.add_argument('--config', default='MaskFlownet.yaml', type=str)
parser.add_argument('--gpu_device', default='0', type=str)
parser.add_argument('--network', default='MaskFlownet', type=str)
parser.add_argument('--data_folder', type=str, default='./data/')
parser.add_argument('--img_folder', type=str, default='images/')
parser.add_argument('--checkpoint', type=str, default='5adNov03-0005_1000000.params')
args = parser.parse_args()
args.img_folder = os.path.join(args.data_folder, args.img_folder)


def main():
    ctx = [mx.gpu(gpu_id) for gpu_id in map(int, args.gpu_device.split(','))]
    prefix = os.path.dirname(__file__)
    config_file = 'MaskFlownet.yaml'
    config_path = os.path.join(prefix, 'network/config', config_file)
    with open(config_path) as f:
        config = network.config.Reader(yaml.load(f))
    
    pipe = get_pipeline('MaskFlownet', ctx=ctx, config=config)
    checkpoint_path = os.path.join(prefix, 'weights', args.checkpoint)
    pipe.load(checkpoint_path)
    pipe.fix_head()

    pre_img_name = '1.jpg'
    cur_img_name = '2.jpg'
    pre_img_path = os.path.join(args.img_folder, pre_img_name)
    cur_img_path = os.path.join(args.img_folder, cur_img_name)
    # read the image and resize
    h, w = 576, 1024
    pre_img = cv2.imread(pre_img_path)
    pre_img = cv2.resize(pre_img, (w, h))
    cur_img = cv2.imread(cur_img_path)
    cur_img = cv2.resize(cur_img, (w, h))
    
    # Output the shape of images, making sure the size is correct
    # The shapes are both
    print('Image Shapes {:} {:}'.format(pre_img.shape, cur_img.shape))
    flow = list(pipe.predict([pre_img], [cur_img], batch_size=1))[0][0]
    print(flow)
    return


if __name__ == '__main__':
    main()

And my error log is

Default FLAGS..network.flow_multiplier to 1.0
Default FLAGS..network.deform_bias to True
Default FLAGS..network.upfeat_ch to [16, 16, 16, 16]
Default FLAGS..network.flow_multiplier to 1.0
Default FLAGS..network.deform_bias to True
Default FLAGS..network.upfeat_ch to [16, 16, 16, 16]
Default FLAGS..network.mw to [0.005, 0.01, 0.02, 0.08, 0.32]
Default FLAGS..optimizer.q to None
Image Shapes (576, 1024, 3) (576, 1024, 3)
Traceback (most recent call last):
  File "/mnt/truenas/scratch/ziqi.pang/MaskFlowNet/infer.py", line 67, in <module>
    main()
  File "/mnt/truenas/scratch/ziqi.pang/MaskFlowNet/infer.py", line 61, in main
    flow = list(pipe.predict([pre_img], [cur_img], batch_size=1))[0][0]
  File "/mnt/truenas/scratch/ziqi.pang/MaskFlowNet/network/pipeline.py", line 209, in predict
    flow, occ_mask, warped, _ = self.do_batch(img1s, img2s, resize = resize)
  File "/mnt/truenas/scratch/ziqi.pang/MaskFlowNet/network/pipeline.py", line 136, in do_batch
    flows, occ_masks, _ = self.do_batch_mx(img1, img2, resize = resize)
  File "/mnt/truenas/scratch/ziqi.pang/MaskFlowNet/network/pipeline.py", line 131, in do_batch_mx
    pred, flows, warpeds = self.network(img1, img2)
  File "/root/.tspkg/lib/python3/mxnet/gluon/block.py", line 471, in __call__
    return self.forward(*args)
  File "/root/.tspkg/lib/python3/mxnet/gluon/block.py", line 705, in forward
    return self._call_cached_op(x, *args)
  File "/root/.tspkg/lib/python3/mxnet/gluon/block.py", line 612, in _call_cached_op
    out = self._cached_op(*cargs)
  File "/root/.tspkg/lib/python3/mxnet/_ctypes/ndarray.py", line 149, in __call__
    ctypes.byref(out_stypes)))
  File "/root/.tspkg/lib/python3/mxnet/base.py", line 149, in check_call
    raise MXNetError(py_str(_LIB.MXGetLastError()))
mxnet.base.MXNetError: Error in operator maskflownet0_maskflownet_s0_upsample0_reshape_like0: [15:34:33] src/operator/tensor/elemwise_unary_op_basic.cc:348: Check failed: (*in_attrs)[0].Size() == (*in_attrs)[1].Size() (1152 vs. 288) Cannot reshape lhs with shape [2,1,18,32]to rhs with shape [1,2,9,16] because they have different size.

Stack trace returned 10 entries:
[bt] (0) /root/.tspkg/lib/libmxnet.so(dmlc::StackTrace[abi:cxx11]()+0x5b) [0x7f2f8ed8507b]
[bt] (1) /root/.tspkg/lib/libmxnet.so(dmlc::LogMessageFatal::~LogMessageFatal()+0x28) [0x7f2f8ed85be8]
[bt] (2) /root/.tspkg/lib/libmxnet.so(+0x15e128a) [0x7f2f8f7fe28a]
[bt] (3) /root/.tspkg/lib/libmxnet.so(+0x2f8b571) [0x7f2f911a8571]
[bt] (4) /root/.tspkg/lib/libmxnet.so(mxnet::exec::InferShape(nnvm::Graph&&, std::vector<nnvm::TShape, std::allocator<nnvm::TShape> >&&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)+0x1ada) [0x7f2f911aa72a]
[bt] (5) /root/.tspkg/lib/libmxnet.so(mxnet::imperative::CheckAndInferShape(nnvm::Graph*, std::vector<nnvm::TShape, std::allocator<nnvm::TShape> >&&, bool, std::pair<unsigned int, unsigned int>, std::pair<unsigned int, unsigned int>)+0x13c) [0x7f2f912abdfc]
[bt] (6) /root/.tspkg/lib/libmxnet.so(mxnet::Imperative::CachedOp::GetForwardGraph(bool, std::vector<mxnet::NDArray*, std::allocator<mxnet::NDArray*> > const&)+0x548) [0x7f2f9129a5a8]
[bt] (7) /root/.tspkg/lib/libmxnet.so(mxnet::Imperative::CachedOp::Forward(std::shared_ptr<mxnet::Imperative::CachedOp> const&, std::vector<mxnet::NDArray*, std::allocator<mxnet::NDArray*> > const&, std::vector<mxnet::NDArray*, std::allocator<mxnet::NDArray*> > const&)+0xb5) [0x7f2f912a23f5]
[bt] (8) /root/.tspkg/lib/libmxnet.so(MXInvokeCachedOp+0xc39) [0x7f2f917c8569]
[bt] (9) /root/.tspkg/lib/libmxnet.so(MXInvokeCachedOpEx+0x3ee) [0x7f2f917c975e]
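
The numbers in the error message can be checked by hand. The sketch below only reproduces the arithmetic; the strides of 32 and 64 are an assumption inferred from the shapes in the message, and the check does not by itself identify the cause of the mismatch.

```python
def element_count(shape):
    """Number of elements in a tensor of the given shape."""
    total = 1
    for dim in shape:
        total *= dim
    return total

height, width = 576, 1024            # input size used in the script above

# Feature-map sizes at two assumed pyramid strides.
size_stride32 = (height // 32, width // 32)   # (18, 32)
size_stride64 = (height // 64, width // 64)   # (9, 16)

lhs_shape = (2, 1, 18, 32)           # copied from the error message
rhs_shape = (1, 2, 9, 16)            # copied from the error message

lhs_count = element_count(lhs_shape)  # 1152
rhs_count = element_count(rhs_shape)  # 288
```

So the two tensors passed to reshape_like have spatial sizes that correspond to different pyramid levels (18x32 versus 9x16), which is why their element counts differ by a factor of four.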

Fine-tuning on ChairsSDHom: EPE doesn't go down

The problem: during training, the validation loss rises sharply almost immediately. See the logs below.

I have added a ChairsSDHom data-loading script as follows.
Changes:

Data is now loaded in iterate_data instead of reading all images into a list in main.py.
Added chairsSDHom.py and chairsSDHom.yaml.
All the code I have changed is attached below.

1. main.py

...
...
elif dataset_cfg.dataset.value == "chairsSDHom":
        batch_size=3
        orig_shape= [384,512]
        # training
        chairsSDHom_dataset = chairsSDHom.list_data()
        print(chairsSDHom_dataset['flow'][0])
        from pympler.asizeof import asizeof
        trainImg1 = [file for file in chairsSDHom_dataset['image_0']]
        trainImg2 = [file for file in chairsSDHom_dataset['image_1']]
        trainFlow = [file for file in chairsSDHom_dataset['flow']]
        trainMask = [file for file in chairsSDHom_dataset['mask']]
        trainSize = len(trainFlow)
        training_datasets = [(trainImg1, trainImg2, trainFlow,trainMask)] * batch_size

        # validation - sintel
        sintel_dataset = sintel.list_data()
        divs = ('training',) if not getattr(config.network, 'class').get() == 'MaskFlownet' else ('training2',)
        for div in divs:
                for k, dataset in sintel_dataset[div].items():
                        dataset = dataset[:samples]
                        img1, img2, flow, mask = [[sintel.load(p) for p in data] for data in zip(*dataset)]
                        validationSize = len(flow)
                        validation_datasets['sintel.' + k] = (img1, img2, flow, mask)
...
...
def iterate_data(iq, dataset):
    if dataset_cfg.dataset.value == 'chairsSDHom' or dataset_cfg.dataset.value == "things3d":
        gen = index_generator(len(dataset[0]))
        while True:
            i = next(gen)
            data = [item[i] for item in dataset]
            if dataset_cfg.dataset.value == "chairsSDHom":
                data = [skimage.io.imread(data[0]),skimage.io.imread(data[1]),chairsSDHom.load(data[2]),skimage.io.imread(data[3])]
            elif dataset_cfg.dataset.value == "things3d":
                data = [cv2.imread(data[0]).astype('uint8'),skimage.io.imread(data[1]).astype('uint8'),things3d.load(data[2]).astype('float16')]
            space_x, space_y = data[0].shape[0] - orig_shape[0], data[0].shape[1] - orig_shape[1]
            crop_x, crop_y = space_x and np.random.randint(space_x), space_y and np.random.randint(space_y)
            data = [np.transpose(arr[crop_x: crop_x + orig_shape[0], crop_y: crop_y + orig_shape[1]], (2, 0, 1)) for arr in data]
            # horizontal flip: reverse the width axis and negate the x-component of the flow
            if np.random.randint(2):
                data = [arr[:, :, ::-1] for arr in data]
                data[2] = np.stack([-data[2][0, :, :], data[2][1, :, :]], axis = 0)
            iq.put(data)
    else:
        gen = index_generator(len(dataset[0]))
        while True:
            i = next(gen)
            data = [item[i] for item in dataset]
            space_x, space_y = data[0].shape[0] - orig_shape[0], data[0].shape[1] - orig_shape[1]
            crop_x, crop_y = space_x and np.random.randint(space_x), space_y and np.random.randint(space_y)
            data = [np.transpose(arr[crop_x: crop_x + orig_shape[0], crop_y: crop_y + orig_shape[1]], (2, 0, 1)) for arr in data]
            # horizontal flip: reverse the width axis and negate the x-component of the flow
            if np.random.randint(2):
                data = [arr[:, :, ::-1] for arr in data]
                data[2] = np.stack([-data[2][0, :, :], data[2][1, :, :]], axis = 0)
            iq.put(data)
...

Everything else is the same.

yet training

updated code.zip


Logs:

[2020/12/22 21:36:48] start=0, train=21670, val=224, host=ludwig, batch=3
[2020/12/22 21:36:48] batch=8, config='MaskFlownet_ft.yaml', dataset_cfg='chairsSDHom.yaml', shard=1, gpu_device='1', checkpoint='5adNov03', clear_steps=True, network='MaskFlownet', debug=False, valid=False, predict=False, resize=''
[2020/12/22 21:36:54] steps=1, epe=81.23613661839343, total_time=0.00
[2020/12/22 21:37:20] steps=1, sintel.clean=1.4036083221435547, sintel.final=**1.7385120391845703**
[2020/12/22 21:37:20] steps=2, epe=82.52426050579368, total_time=31.65
[2020/12/22 21:37:21] steps=3, epe=70.33922181313649, total_time=15.62
[2020/12/22 21:37:21] steps=4, epe=64.53729546698513, total_time=10.30
[2020/12/22 21:37:21] steps=5, epe=73.13790790314701, total_time=7.64
[2020/12/22 21:37:22] steps=6, epe=69.97008332644914, total_time=6.04
[2020/12/22 21:37:22] steps=7, epe=63.190831684866595, total_time=4.98
[2020/12/22 21:37:23] steps=8, epe=69.54386270096657, total_time=4.23
[2020/12/22 21:37:23] steps=9, epe=71.65906570549198, total_time=3.66
[2020/12/22 21:37:24] steps=10, epe=70.68287622669239, total_time=3.22
[2020/12/22 21:37:24] steps=11, epe=68.10887379487774, total_time=2.88
[2020/12/22 21:37:24] steps=12, epe=65.31357897717663, total_time=2.59
[2020/12/22 21:37:25] steps=13, epe=67.39865911195284, total_time=2.36
[2020/12/22 21:37:25] steps=14, epe=66.05316386284305, total_time=2.16
[2020/12/22 21:37:26] steps=15, epe=62.74090359794587, total_time=1.99
[2020/12/22 21:37:26] steps=16, epe=65.24516708995266, total_time=1.85
[2020/12/22 21:37:27] steps=17, epe=61.783343363284466, total_time=1.72
[2020/12/22 21:37:27] steps=18, epe=66.12157773880946, total_time=1.61
[2020/12/22 21:37:27] steps=19, epe=65.41601491031372, total_time=1.51
[2020/12/22 21:37:28] steps=20, epe=67.27401184191667, total_time=1.42
[2020/12/22 21:37:41] steps=50, epe=64.05605013410363, total_time=0.57
[2020/12/22 21:38:03] steps=100, epe=60.72789733634401, total_time=0.45
[2020/12/22 21:38:30] steps=100, sintel.clean=3.107024669647217, sintel.final=**3.6572041511535645**
[2020/12/22 21:38:51] steps=150, epe=58.168171286698964, total_time=0.55
[2020/12/22 21:39:14] steps=200, epe=55.366796654848244, total_time=0.45
[2020/12/22 21:39:41] steps=200, sintel.clean=4.636238098144531, sintel.final=**5.08129358291626**
[2020/12/22 21:40:03] steps=250, epe=52.92103477169547, total_time=0.56
[2020/12/22 21:40:25] steps=300, epe=50.651504112365515, total_time=0.45
[2020/12/22 21:40:52] steps=300, sintel.clean=5.46751070022583, sintel.final=**5.855245113372803**
[2020/12/22 21:41:13] steps=350, epe=48.90560261388807, total_time=0.55
[2020/12/22 21:41:36] steps=400, epe=47.090479957163055, total_time=0.45
[2020/12/22 21:42:02] steps=400, sintel.clean=6.850785255432129, sintel.final=**7.147568702697754**
[2020/12/22 21:42:24] steps=450, epe=45.47630244939083, total_time=0.55
[2020/12/22 21:42:47] steps=500, epe=43.721847967473224, total_time=0.45
[2020/12/22 21:43:14] steps=500, sintel.clean=7.392406940460205, sintel.final=**7.563663005828857**
[2020/12/22 21:43:36] steps=550, epe=41.861068025751216, total_time=0.56
[2020/12/22 21:43:59] steps=600, epe=40.728338542736246, total_time=0.45
[2020/12/22 21:44:25] steps=600, sintel.clean=8.37342643737793, sintel.final=**8.398472785949707**
[2020/12/22 21:44:47] steps=650, epe=39.22414651439415, total_time=0.55
[2020/12/22 21:45:09] steps=700, epe=38.01273616706755, total_time=0.45
[2020/12/22 21:45:36] steps=700, sintel.clean=8.904271125793457, sintel.final=**8.86906623840332**
[2020/12/22 21:45:57] steps=750, epe=36.68394209224638, total_time=0.55
[2020/12/22 21:46:20] steps=800, epe=35.51223404091925, total_time=0.45
[2020/12/22 21:46:46] steps=800, sintel.clean=9.723841667175293, sintel.final=**9.715934753417969**
[2020/12/22 21:47:08] steps=850, epe=34.441762749200876, total_time=0.55
[2020/12/22 21:47:30] steps=900, epe=33.21928807435762, total_time=0.45
[2020/12/22 21:47:56] steps=900, sintel.clean=10.129880905151367, sintel.final=**10.09166431427002**

Question 1) Any idea why the network output behaves like this, and how I might fix it?
Question 2) Do you see anything clearly wrong in the edits I have made?

Thank you so much. I highly appreciate your work. <3 :D
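
For reference, here is a small sketch of what the flip step in iterate_data does to arrays laid out as (channels, height, width). Reversing the last axis mirrors the image left to right, which is why only channel 0 of the flow (the x-component) changes sign. The function name and the array values are made up for illustration.

```python
import numpy as np

def flip_width(image_chw, flow_chw):
    """Mirror a (C, H, W) image and its (2, H, W) flow left to right."""
    image_flipped = image_chw[:, :, ::-1]
    flow_flipped = flow_chw[:, :, ::-1]
    # Mirroring reverses the sign of horizontal motion only.
    flow_flipped = np.stack([-flow_flipped[0], flow_flipped[1]], axis=0)
    return image_flipped, flow_flipped

image = np.arange(6, dtype=np.float32).reshape(1, 2, 3)
flow = np.stack([np.full((2, 3), 2.0), np.full((2, 3), -1.0)])
image_f, flow_f = flip_width(image, flow)
```

Note that this only applies cleanly to arrays that have a channel axis; a 2-D mask would need its own handling before the transpose in iterate_data.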

Memory corruption

Hi,

Your results are very impressive. Unfortunately, I'm getting the following error:

*** Error in `python3': double free or corruption (!prev): 0x000055c2f1f46400 ***
Aborted

Have you ever encountered this type of error, or do you have any idea how to fix it?

Thanks

What does 512 stand for?

MaskFlownet/predict.py

Line 64 in 2f796ec

pred[:, :, 2] = (64.0 * (flow[:, :, 0] + 512)).astype(np.uint16)

Hi Simon, why is 512 added, as in flow(:,:,1) + 512? What does the 512 stand for?
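
A note on the arithmetic: 64 * 512 = 32768 = 2**15, so the expression equals 64 * flow + 2**15. The sketch below assumes the KITTI flow PNG convention from the KITTI devkit (unsigned 16-bit values decoded as (value - 2**15) / 64); that convention is an assumption worth verifying against the devkit, and the function names are illustrative.

```python
import numpy as np

def encode_kitti_component(flow_component):
    """Map float flow (pixels) to uint16 via value = 64 * flow + 2**15."""
    return (64.0 * (flow_component + 512)).astype(np.uint16)

def decode_kitti_component(encoded):
    """Inverse mapping: flow = (value - 2**15) / 64."""
    return (encoded.astype(np.float64) - 2 ** 15) / 64.0

flow_u = np.array([-3.5, 0.0, 10.25])      # made-up flow values
encoded = encode_kitti_component(flow_u)
decoded = decode_kitti_component(encoded)
```

Under that reading, the offset of 512 keeps negative flow values representable in an unsigned integer, and the factor of 64 sets the precision to 1/64 of a pixel.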

bugfix for kitti

Hi Simon,
Here are some bug fixes.

reader/kitti.py, line 44 and line 99:
samples = None -> samples = -1
because main.py line 194 has samples = 32 if args.debug else -1.

reader/kitti.py, line 98:
num_files = (len(os.listdir(path_testing)) - 1) // 2 -> num_files = len(os.listdir(path_testing)) // 2

reader/kitti.py, line 105 and line 106:
img0 = cv2.resize(img0, resize)
img1 = cv2.resize(img1, resize)
->
adj_resize = (resize[1], resize[0])
img0 = cv2.resize(img0, adj_resize)
img1 = cv2.resize(img1, adj_resize)
because cv2.resize() takes its size argument as (width, height), i.e. (cols, rows).
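
The argument order in the last fix can be illustrated without OpenCV. This is a minimal sketch: hw_to_dsize is a hypothetical helper, and the example size is made up.

```python
def hw_to_dsize(shape_hw):
    """Convert (height, width) to the (width, height) order of cv2.resize's dsize."""
    height, width = shape_hw
    return (width, height)

resize = (384, 1280)                 # (height, width), example values only
adj_resize = hw_to_dsize(resize)     # (1280, 384)
# img0 = cv2.resize(img0, adj_resize)  # the result would have shape (384, 1280, channels)
```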

Confused about FlyingThings3D dataset preloading

Hi!
Thanks for sharing!
While reading the code, I noticed that all the training data seems to be preloaded into RAM. For training on the FlyingThings3D dataset, you have separated it into several parts and only load one part during training. I could not find the code that reloads the remaining parts. Could you please point it out for me? Thanks again!

Finetuning on KITTI dataset with sparse ground truth

Hi, authors,

Thanks for sharing the code; it is great work!

When fine-tuning on the KITTI dataset, only sparse ground truth is available. In this case, applying geometric transformations such as scaling and rotation with bilinear sampling causes a problem, because the unlabeled pixels contain many zeros. Besides, the binary mask is no longer binary after interpolation.

In your paper, you state that "for sparse ground-truth flow in KITTI, the augmented flow is weighted averaged based on the interpolated valid mask." But I cannot find where the code handles this. Could you please explain how to apply geometric transformations to datasets with sparse ground truth (e.g., using the interpolated valid mask)?

Thanks for your attention!
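
One way to read the quoted sentence is as normalized interpolation: resample flow * mask and mask separately, then divide. The sketch below is an interpretation, not the authors' code. It uses 2x2 block averaging as a stand-in for bilinear sampling, assumes even height and width, and the function name is hypothetical.

```python
import numpy as np

def downsample_sparse_flow(flow, valid):
    """Halve the resolution of sparse flow, averaging only valid pixels.

    flow:  (H, W, 2) float array; values where valid == 0 are ignored
    valid: (H, W) array of 0/1; H and W are assumed to be even
    """
    h, w = valid.shape
    valid = valid.astype(np.float64)
    weighted = flow * valid[..., None]              # zero out unlabeled pixels
    # Sum flow*mask and mask over each 2x2 block.
    w_sum = weighted.reshape(h // 2, 2, w // 2, 2, 2).sum(axis=(1, 3))
    m_sum = valid.reshape(h // 2, 2, w // 2, 2).sum(axis=(1, 3))
    new_valid = m_sum > 0
    new_flow = np.zeros_like(w_sum)
    # Weighted average: divide by the interpolated mask where it is non-zero.
    new_flow[new_valid] = w_sum[new_valid] / m_sum[new_valid][:, None]
    return new_flow, new_valid.astype(np.uint8)

# Made-up 2x2 example: two labeled pixels (u = 4 and u = 8), two unlabeled ones.
flow = np.zeros((2, 2, 2))
flow[:, :, 0] = [[4.0, 100.0], [100.0, 8.0]]
valid = np.array([[1, 0], [0, 1]])
new_flow, new_valid = downsample_sparse_flow(flow, valid)
```

In this example the unlabeled values (100) do not leak into the result; the output is the mean of the two labeled values. The mask is re-binarized by thresholding the interpolated mask at zero, which is one possible choice among several.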

There are no network configuration YAML files

Hi,
How can I use the network for inference or training if there is no network configuration file MaskFlownet.yaml?

How to set the checkpoint parameter for inference?

I think in kitti.py line 41, there is no need to subtract 1. There are 200 optical flow images.

Also, was the program developed on Windows?

Do you provide some checkpoint files or trained models to play with?

Some Questions About the Ablation Experiments

Thanks for your great work!
I understand you stress the unsupervised learning of the mask, and I read your code to confirm that the mask is indeed learned in an unsupervised manner. But I wonder whether the occlusion maps could be used as supervision when training MaskFlownet-S. Simply adding an EPE or cross-entropy loss might guide MaskFlownet-S to learn a better attention mask. I understand that generating all of the mask maps for these datasets would take a long time, which is indeed a problem.
Here are some questions about the intermediate results shown:

Did you compare supervised and unsupervised learning of the mask and their influence on the final result?
The mask seems right in the foreground-background case, but we don't really care about background flow that is far from the foreground. Do you have more visualizations of the mask for objects moving close to each other, like the third and seventh rows in Figure 12?

Thanks for your attention.

BGR vs RGB input

Hi,

I noticed that your code mainly uses 'cv2.imread' for image I/O, which reads images in BGR order. But in Sintel.py, I found 'skimage.io.imread' used as well, which reads RGB. Is this expected?
Although I found that both RGB and BGR work fine, could you clarify the expected input format for the network? Which input format did you use for benchmarking?

Thanks
Min
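
For readers who want to make the channel order consistent across readers, a minimal sketch: reversing the last axis of an (H, W, 3) array converts between BGR and RGB. The function name is hypothetical, and which order the released checkpoints expect is exactly the open question above.

```python
import numpy as np

def swap_red_blue(image_hwc):
    """Reverse the channel axis of an (H, W, 3) image: BGR <-> RGB."""
    return image_hwc[..., ::-1]

# One made-up pixel in BGR order: blue=255, green=0, red=10.
bgr_pixel = np.array([[[255, 0, 10]]], dtype=np.uint8)
rgb_pixel = swap_red_blue(bgr_pixel)
```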

Wrong inference time for VCN in your paper

The inference time reported for VCN in your paper is wrong: it should be 0.18 s, not 0.03 s.

TensorFlow or PyTorch port

Will a TensorFlow or PyTorch port be available any time soon?

Some problems about MXNetError

Hi,
When I run main.py with "MaskFlownet_S_sintel.yaml --dataset_cfg sintel_kitti2015_hd1k.yaml -g 0 -c dbbSep30-1206 --clear_steps", I get the following error:
Traceback (most recent call last):
  File "/MaskFlownet-master/main.py", line 537, in <module>
    train_log = pipe.train_batch(img1, img2, flow, geo_aug, color_aug, mask = mask)
  File "/MaskFlownet-master/network/pipeline.py", line 101, in train_batch
    img1s, img2s, labels, masks = geo_aug(img1s, img2s, labels, masks)
  File "/anaconda3/envs/maskflownet2/lib/python3.6/site-packages/mxnet/gluon/block.py", line 548, in __call__
    out = self.forward(*args)
  File "/anaconda3/envs/maskflownet2/lib/python3.6/site-packages/mxnet/gluon/block.py", line 915, in forward
    return self._call_cached_op(x, *args)
  File "/anaconda3/envs/maskflownet2/lib/python3.6/site-packages/mxnet/gluon/block.py", line 821, in _call_cached_op
    out = self._cached_op(*cargs)
  File "/anaconda3/envs/maskflownet2/lib/python3.6/site-packages/mxnet/_ctypes/ndarray.py", line 150, in __call__
    ctypes.byref(out_stypes)))
  File "/anaconda3/envs/maskflownet2/lib/python3.6/site-packages/mxnet/base.py", line 253, in check_call
    raise MXNetError(py_str(_LIB.MXGetLastError()))
mxnet.base.MXNetError: Error in operator geometryaugmentation0_bilinearsampler0: [14:00:56] src/operator/./bilinear_sampler-inl.h:158: Check failed: dshape[0] == lshape[0] (1 vs. 4) :
Could I ask for help? Thank you very much in advance.

How to do post-processing when submitting flow results to the KITTI eval server?

Hi @simon1727, I noticed you mentioned in #5 that the ground truth is forced to be dense in the training stage.
How should post-processing then be done in the prediction stage, when we submit to the KITTI eval server?
In other words, how do we obtain a sparse result from the dense flow prediction so that the evaluation results are good? The ground truth in the test dataset is also sparse.

RAM leak when training, at batch_queue.get(): where are resources released once a batch is trained?

I have noticed that your training loop leaks small amounts of RAM. Any idea what may have caused this?

time taken= 9.865329265594482 | steps= 1 | cpu= 51.8 | ram= 34.50078675328186 | gpu= [3101]
[5613]
time taken= 0.934636116027832 | steps= 2 | cpu= 27.0 | ram= 29.34866251942084 | gpu= [5613]
[3045]
time taken= 0.8695635795593262 | steps= 3 | cpu= 29.4 | ram= 29.217970957706278 | gpu= [3045]
[3021]
time taken= 0.8483304977416992 | steps= 4 | cpu= 29.8 | ram= 29.033316428574086 | gpu= [3021]
[2997]
time taken= 0.8630681037902832 | steps= 5 | cpu= 30.2 | ram= 28.87988403913803 | gpu= [2997]
[2997]
time taken= 0.8645083904266357 | steps= 6 | cpu= 29.4 | ram= 28.714746447210654 | gpu= [2997]
[2997]
time taken= 0.864253044128418 | steps= 7 | cpu= 29.3 | ram= 28.573093657739385 | gpu= [2997]
[2997]
time taken= 0.8693573474884033 | steps= 8 | cpu= 29.3 | ram= 28.389703885656044 | gpu= [2997]
[2997]
time taken= 0.8704898357391357 | steps= 9 | cpu= 29.4 | ram= 28.298690976454438 | gpu= [2997]
[2997]
time taken= 0.8670341968536377 | steps= 10 | cpu= 29.5 | ram= 28.13385097442091 | gpu= [2997]
[2997]
time taken= 0.8750414848327637 | steps= 11 | cpu= 29.5 | ram= 27.959884882309396 | gpu= [2997]
[2997]
time taken= 0.8624210357666016 | steps= 12 | cpu= 29.9 | ram= 27.784356443255188 | gpu= [2997]
[2997]
time taken= 0.8561670780181885 | steps= 13 | cpu= 29.8 | ram= 27.644241201568796 | gpu= [2997]
[2997]
time taken= 0.8609695434570312 | steps= 14 | cpu= 29.7 | ram= 27.51883186047002 | gpu= [2997]
[2997]
time taken= 0.8462607860565186 | steps= 15 | cpu= 29.7 | ram= 27.36641623650461 | gpu= [2997]
[2997]
time taken= 0.8624782562255859 | steps= 16 | cpu= 29.2 | ram= 27.23760941078441 | gpu= [2997]
[2997]
time taken= 0.8649694919586182 | steps= 17 | cpu= 29.4 | ram= 27.113514425050127 | gpu= [2997]
[2997]
time taken= 0.8661544322967529 | steps= 18 | cpu= 29.3 | ram= 27.004993310427178 | gpu= [2997]
[2997]
time taken= 0.8687705993652344 | steps= 19 | cpu= 29.8 | ram= 26.82090916192486 | gpu= [2997]
[2997]
time taken= 0.8823645114898682 | steps= 20 | cpu= 29.6 | ram= 26.688630454109777 | gpu= [2997]
[2997]
time taken= 0.8795809745788574 | steps= 21 | cpu= 29.4 | ram= 26.517987449146226 | gpu= [2997]
[2997]
time taken= 0.8857841491699219 | steps= 22 | cpu= 29.1 | ram= 26.40289455770082 | gpu= [2997]
[2997]
time taken= 0.8605339527130127 | steps= 23 | cpu= 29.5 | ram= 26.274509317663572 | gpu= [2997]
[2997]
time taken= 0.8524265289306641 | steps= 24 | cpu= 29.8 | ram= 26.16445065525575 | gpu= [2997]

Question regarding Occlusion-Aware Pyramid

Hi,

I have a question regarding the Occlusion-Aware Pyramid.

In the paper, it writes

in the code, it is

mask0 = Upsample(4)(mask2)  
mask0 = F.sigmoid(mask0) - 0.5  
c30 = c10  
c40 = self.warp(c20, Upsample(4)(flow2)*self.scale)  
# concat image 1 with zero mask
c30 = F.concat(c30, F.zeros_like(mask0), dim=1)  
# concat warped image 2 with occlusion mask
c40 = F.concat(c40, mask0, dim=1)

From my understanding, the occlusion mask is a probability map (where 1 stands for occlusion and 0 stands for non-occlusion), and after subtraction by 0.5, the range would be [-0.5, 0.5], and value 0, in this case, would mean "don't know whether there is occlusion or not".

Then the question is why image 1 I1 is concatenated with a zero mask instead of a -0.5 mask, or the same occlusion map as image 2 I2? Since the follow-up conv layers are shared for variables c30 and c40, shouldn't the concatenated occlusion mask have the same meaning for both I1 and I2 ?

Thanks a lot!
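
The range described above can be checked numerically. This sketch only evaluates sigmoid(x) - 0.5 at a few made-up inputs; it does not address why the zero mask was chosen for image 1.

```python
import math

def centered_sigmoid(x):
    """sigmoid(x) - 0.5, the transform applied to mask0 in the snippet above."""
    return 1.0 / (1.0 + math.exp(-x)) - 0.5

# Strongly negative, zero, and strongly positive pre-activation values.
values = [centered_sigmoid(x) for x in (-10.0, 0.0, 10.0)]
```

The three results lie just above -0.5, exactly at 0, and just below 0.5, so the output range is the open interval (-0.5, 0.5), and an output of 0 corresponds to a sigmoid value of 0.5.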

How to adjust the parameter?

I trained MaskFlownet on Sintel train + KITTI 2015 + HD1K without changing your code, but my model does not perform as well as your pretrained model "8caNov12". Is there any difference between the settings you used and the uploaded MaskFlownet_sintel.yaml and sintel_kitti2015_hd1k.yaml? Or do I need to adjust any other parameters? Thank you very much in advance for your help.

Typo in network code

In this line https://github.com/microsoft/MaskFlownet/blob/master/network/MaskFlownet.py#L306

Should it be c2s = [c21, c22, c23, c24, c25, c26] instead of c2s = [c21, c12, c13, c24, c25, c26] ? The latter version doesn't make much sense to me.

Is this 'kitti training'?

https://github.com/microsoft/MaskFlownet/blob/master/reader/kitti.py#L24

And why is samples set to -1?
https://github.com/microsoft/MaskFlownet/blob/master/main.py#L194

Does image size matter?

Hi, I really appreciate you releasing the code! May I ask a few questions?

Does the image size influence performance? In the training stage, image patches (896x320 for KITTI) are used. In testing, do you still use image patches, or the entire image (around 1240x370 for KITTI)?

I know cropping is for augmentation, but if you train on small patches and test on larger images, does this affect performance?

Thank you very much!
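
As a side note on full-image testing, here is a hedged sketch of how a test size could be rounded up. It assumes the network needs input sides divisible by 64, which is only inferred from the traceback in an earlier issue on this page (a 576x1024 input gives a 9x16 coarsest feature map); the helper name is hypothetical and this is not taken from the repository's predict code.

```python
def round_up_to_multiple(value, base=64):
    """Smallest multiple of `base` that is >= value."""
    return ((value + base - 1) // base) * base

kitti_hw = (370, 1240)               # approximate KITTI frame size from the question
padded_hw = tuple(round_up_to_multiple(v) for v in kitti_hw)   # (384, 1280)
```

Whether to pad or to resize to that size, and how the flow is rescaled afterwards, is a design choice the question leaves open.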

How can I infer my own data by net MaskFlownet_S?

AssertionError: Parameter 'maskflownet_s0_hybridsequential0_conv1aweight' is missing in file './MaskFlownet/weights/dbbSep30-1206_1000000.params', which contains parameters: 'maskflownet0_hybridsequential0_conv1aweight', 'maskflownet0_hybridsequential0_conv1abias', 'maskflownet0_hybridsequential1_conv1bweight', ..., 'maskflownet0_hybridsequential55_conv3fweight', 'maskflownet0_hybridsequential55_conv3fbias', 'maskflownet0_hybridsequential56_conv2fweight', 'maskflownet0_hybridsequential56_conv2fbias'. Please make sure source and target networks have the same prefix.

I get the error above. Also, how can I get occlusion masks like those in the paper?
This is the command I used:
python predict_new_data.py ./test.png MaskFlownet_S.yaml --image_1 ./image_1.png --image_2 ./image_2.png -g 0 -c 5adNov03
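
The parameter names in the message can be compared directly. The sketch below extracts the leading prefix from the two names quoted in the error; network_prefix is a hypothetical helper, and splitting at "hybridsequential" is an assumption based on the names shown.

```python
def network_prefix(param_name, marker="hybridsequential"):
    """Return the part of a Gluon parameter name that precedes `marker`."""
    index = param_name.find(marker)
    return param_name if index < 0 else param_name[:index]

expected = "maskflownet_s0_hybridsequential0_conv1aweight"   # name the network asks for
found = "maskflownet0_hybridsequential0_conv1aweight"        # name stored in the file

expected_prefix = network_prefix(expected)   # "maskflownet_s0_"
found_prefix = network_prefix(found)         # "maskflownet0_"
```

The differing prefixes would be consistent with a MaskFlownet checkpoint being loaded into a network built from MaskFlownet_S.yaml. That is an inference from the message, not something the authors confirm here.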

Questions about the provided checkpoints

Hi,

Thank you very much for providing the source code, it's really awesome.

I have several questions about the provided checkpoints,

For the provided dbbSep30-1206_1000000 checkpoint, the actual validation result seems to differ from the score mentioned in the README (i.e., 2.07 / 4.07 for Sintel). I ran it on the validation set and got 1.47 / 1.90 for Sintel Val.

There is also an inconsistency between the log file and the provided checkpoint, as the last line of dbbSep30-1206.log is correct.

I am guessing that this checkpoint was trained on the whole Sintel dataset. Am I correct?

Following the guess above, I uploaded the test results to see whether the dbbSep30-1206_1000000 checkpoint can reproduce the results reported in the paper and on the website. But there is a gap: I got 4.877 / 3.182 on FINAL and CLEAN, respectively, while the results reported in the paper and on the website are 4.38 / 2.77.

I would appreciate it if you could help to clarify these questions and provide the checkpoints which can reproduce the results.

Thank you for your time and consideration again!

Details about fine-tuning on FlyingThings3D

Can you tell me the number of training iterations used for fine-tuning on FlyingThings3D after pre-training on FlyingChairs? Did you use s-fine to train for 0.5M iterations after the 1.2M iterations on FlyingChairs, or restart using long + s-fine for a total of 1.7M iterations?

What is the variable F in your code?

Could you please explain what F is in this line of code:
x2_warp = self.deform5(x2, F.repeat(F.expand_dims(flow*self.scale/self.strides[1], axis=1), 9, axis=1).reshape((0, -3, -2)))

thank you
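
In MXNet Gluon, F is the first argument of HybridBlock.hybrid_forward and refers to either mxnet.ndarray or mxnet.symbol, depending on whether the block has been hybridized. The NumPy sketch below mimics only the shape manipulation of that line, without MXNet and without the scaling by self.scale / self.strides[1]. It assumes flow has shape (N, 2, H, W); in MXNet's reshape, 0 keeps a dimension, -3 merges two consecutive dimensions, and -2 keeps all remaining ones.

```python
import numpy as np

def repeat_flow_for_deform(flow, copies=9):
    """NumPy analogue of repeat(expand_dims(flow, axis=1), 9, axis=1).reshape((0, -3, -2))."""
    n, c, h, w = flow.shape
    expanded = flow[:, None]                          # (N, 1, 2, H, W)
    repeated = np.repeat(expanded, copies, axis=1)    # (N, 9, 2, H, W)
    return repeated.reshape(n, copies * c, h, w)      # (N, 18, H, W)

flow = np.random.rand(1, 2, 4, 5)     # made-up flow field
offsets = repeat_flow_for_deform(flow)
```

The result repeats the two flow channels nine times along the channel axis, presumably one copy per tap of a 3x3 deformable convolution kernel, though the issue itself does not say so.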

No activation function in FeaturePyramidExtractor?

Hi!
Thanks for your great work! I am about to port your work from MXNet to PyTorch. But I ran into a problem: unlike PWC-Net, there is no activation function in FeaturePyramidExtractor.
Is that a bug?

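Hello, I also work in this area. Could we get in touch?

Hello, I also work in this area: occlusion + optical flow. Last year I completed similar work that fed a predicted occlusion mask into the PWC-Net network as auxiliary information for the decoder branch. Unfortunately, I did not pursue it any further afterwards.
I have not read your paper in depth yet, but the idea seems nearly the same as mine was at the time. If it is convenient, could you share your contact details so we can discuss?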

Some problems about the prediction of optical flow and occlusion

Thanks for your great work!

When I run python main.py MaskFlownet.yaml -g 0 -c 000Mar17 --predict, I get outputs in /flows.

But they don't seem to be the right size.

The only thing I changed in the code was replacing imread from skimage.io with imread from cv2.

MaskFlownet/reader/sintel.py

Line 3 in 2f796ec

import skimage.io

My environment:
python 3.6.8
mxnetcu90-1.5.1
CUDA 9.0.176
cudnn 7.6.5

In addition, I would like to ask how to correctly visualize the occlusion mask in binary form.

A question on VCN performance in Table 1

I found an inconsistency in the running time of VCN in Table 1: 0.18 in the arXiv paper versus 0.03 in the official paper.

Could you let me know the exact runtime?

Thanks!

Could not find the YAML file

Hello, thanks for your work. I could not find the .yaml config file in your project, which I think is needed for inference. Could you please upload it? I want to use it to run inference on my own data with your pretrained model. Thank you.

Can you provide the .state file?

I want to fine-tune the pretrained MaskFlownet on ChairsSDHom.
I have written the dataset scripts for it, but I am getting this error while training.

Traceback (most recent call last):
  File "main.py", line 143, in <module>
    pipe.trainer.load_states(checkpoint.replace('params', 'states'))
  File "/home/mask/miniconda3/envs/mask/lib/python3.6/site-packages/mxnet/gluon/trainer.py", line 515, in load_states
    with open(fname, 'rb') as f:
FileNotFoundError: [Errno 2] No such file or directory: '/home/mask/maskflownet/MaskFlownet/weights/5adNov03-0005_1000000.states'

Could you please advise me on how to proceed?
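
A possible workaround, sketched under the assumption that it is acceptable to start fine-tuning with a fresh optimizer state when no .states file was released. Whether that matches the authors' intended procedure is not stated. The helper names are hypothetical; trainer.load_states is the call shown in the traceback.

```python
import os

def optimizer_state_path(checkpoint_path):
    """Derive the '.states' path the same way main.py does in the traceback."""
    return checkpoint_path.replace('params', 'states')

def maybe_load_states(trainer, checkpoint_path):
    """Load optimizer state only if the file exists; return True if it was loaded."""
    states_path = optimizer_state_path(checkpoint_path)
    if os.path.exists(states_path):
        trainer.load_states(states_path)
        return True
    # No saved optimizer state: keep the trainer's freshly initialized state.
    return False
```

In main.py, this would replace the direct call to pipe.trainer.load_states(...) with maybe_load_states(pipe.trainer, checkpoint).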

Is there a PyTorch implementation?

How to use inference with my own dataset?

Hi, thanks for sharing.
I am trying to test your model on a pair of images but could not get it working.
I installed Python 3.6.10 and MXNet 1.5 using Anaconda, along with all necessary modules.
Currently it crashes when loading the model; something is missing. Here is my command:
python main.py MaskFlownet_S.yaml -c 8caNov12 --predict --clear_steps --debug
and the result is:
[('C:\Users\cvestri\Work\Dev\RDVision\Code\MaskFlownet\logs\8caNov12-1532.log', '8caNov12-1532', '-1532')]
Default FLAGS..network.flow_multiplier to 1.0
Default FLAGS..network.deform_bias to True
Default FLAGS..network.upfeat_ch to [16, 16, 16, 16]
Default FLAGS..network.mw to [0.005, 0.01, 0.02, 0.08, 0.32]
Default FLAGS..optimizer.q to None
Default FLAGS..optimizer.learning_rate to None
Load Checkpoint C:\Users\cvestri\Work\Dev\RDVision\Code\MaskFlownet\weights\8caNov12-1532_300000.params
load the weight for the network
Traceback (most recent call last):
  File "main.py", line 136, in <module>
    pipe.load(checkpoint)
  File "C:\Users\cvestri\Work\Dev\RDVision\Code\MaskFlownet\network\pipeline.py", line 57, in load
    self.network.load_parameters(checkpoint, ctx=self.ctx)
  File "C:\Users\cvestri\AppData\Local\conda\conda\envs\py36_mxnet\lib\site-packages\mxnet\gluon\block.py", line 394, in load_parameters
    cast_dtype=cast_dtype, dtype_source=dtype_source)
  File "C:\Users\cvestri\AppData\Local\conda\conda\envs\py36_mxnet\lib\site-packages\mxnet\gluon\parameter.py", line 968, in load
    name[lprefix:], filename, _brief_print_list(arg_dict.keys()))
AssertionError: Parameter 'hybridsequential0_conv1aweight' is missing in file 'C:\Users\cvestri\Work\Dev\RDVision\Code\MaskFlownet\weights\8caNov12-1532_300000.params', which contains parameters: 'maskflownet_s0_maskflownet_s0_hybridsequential0_conv1aweight', 'maskflownet_s0_maskflownet_s0_hybridsequential0_conv1abias', 'maskflownet_s0_maskflownet_s0_hybridsequential1_conv1bweight', ..., 'maskflownet_s0_deform3weight', 'maskflownet_s0_deform3bias', 'maskflownet_s0_deform2weight', 'maskflownet_s0_deform2bias'. Please make sure source and target networks have the same prefix.

It is the same with MXNet 1.6.
Thanks

Can I use the code to reproduce the performance reported in the paper?

Hi!
I am really interested in your paper and this repo.
I trained MaskFlownet_S on the FlyingChairs dataset with the code provided in this repo. However, the EPE on MPI Sintel clean and final is 2.999 and 4.399, which leaves quite a gap to the reported EPE of 2.88 and 4.25.
So here are my questions: Is the code in this repo the code you used for the experiments in the paper? Can I use it to reproduce the performance reported in the paper without further modification? Is the reported performance from the best checkpoint or the final checkpoint?
Thanks!

Pretrained weights on the kitti dataset for Adv attacks

Hi authors,

I am trying to evaluate the performance of optical flow networks against adversarial attacks. Could the MaskFlownet model pretrained on the KITTI dataset be made available?

Best Regards
