naoto0804 / pytorch-inpainting-with-partial-conv

Unofficial pytorch implementation of 'Image Inpainting for Irregular Holes Using Partial Convolutions' [Liu+, ECCV2018]

License: MIT License

Topics: pytorch, inpainting, cnn

pytorch-inpainting-with-partial-conv's Introduction

The official implementation has been released by the authors.

Note that this is an ongoing re-implementation and I cannot fully reproduce the results. Suggestions and PRs are welcome!

This is an unofficial pytorch implementation of the paper Image Inpainting for Irregular Holes Using Partial Convolutions [Liu+, arXiv2018].

Requirements

  • Python 3.6+
  • Pytorch 0.4.1+
pip install -r requirements.txt

Usage

Preprocess

  • Download Places2 and place it somewhere. The dataset should contain data_large, val_large, and test_large as subdirectories. Don't forget to specify the dataset root with --root ROOT when using train.py or test.py (an example layout is sketched below).

  • Generate masks by following [1] (saved under ./masks by default). Note that the mask generation method differs from the original work.

python generate_data.py
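
For reference, here is one possible on-disk layout after these steps. It is only an illustration, assuming the dataset was extracted to <ROOT>; the only requirements stated above are the three subdirectories and the default ./masks output of generate_data.py.

<ROOT>/data_large/   # training images
<ROOT>/val_large/    # validation images
<ROOT>/test_large/   # test images
./masks/             # irregular masks written by generate_data.py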

Train

CUDA_VISIBLE_DEVICES=<gpu_id> python train.py

Fine-tune

CUDA_VISIBLE_DEVICES=<gpu_id> python train.py --finetune --resume <checkpoint_name>

Test

CUDA_VISIBLE_DEVICES=<gpu_id> python test.py --snapshot <snapshot_path>

Results

Here are some results on the test set after training for 500,000 iterations and fine-tuning (freezing BN in the encoder) for another 500,000 iterations. The model is available here, but I cannot guarantee its quality. (Top to bottom: input, mask, image generated by the network, image combined with the original non-masked regions, ground truth)

References

pytorch-inpainting-with-partial-conv's People

Contributors

naoto0804, thehamsta, violetamenendez


pytorch-inpainting-with-partial-conv's Issues

Slight scaling issue in PartialConv function

Hi, thanks for your great project. I just wanted to point out a potential issue with the implementation of the PartialConv function here, which is easily spotted if you run the following:

import torch
import torch.nn as nn

from net import PartialConv  # PartialConv from this repo's net.py

size = (1, 1, 10, 10)
X = torch.ones(size)  # > Input layer
Y = torch.ones(size)  # > Mask layer (= all elements are good to go)
convH0 = torch.nn.Conv2d(1, 1, 3, 1, 1, bias=False)
with torch.no_grad():  # > Manually set the weights of the convolution kernel
    convH0.weight = nn.Parameter(torch.FloatTensor([[[[ 0.2273,  0.1403, -1.0889],
                                                      [-0.0351, -0.2992,  0.2029],
                                                      [ 0.0280,  0.2878,  0.5101]]]]))
output0 = convH0(X)  # > Result from standard convolution kernel
PConv = PartialConv(1, 1, 3, 1, 1, bias=False)
with torch.no_grad():  # > Set weights of PConv layer equal to conv. layer
    PConv.input_conv.weight = nn.Parameter(torch.FloatTensor([[[[ 0.2273,  0.1403, -1.0889],
                                                                [-0.0351, -0.2992,  0.2029],
                                                                [ 0.0280,  0.2878,  0.5101]]]]))
output1, mask1 = PConv(X, Y)  # > Result from partial convolution layer

I would expect the result of both operations to be the same. However, output1 = output0 / 9! The cause of the error lies in the following line:

output_pre = (output - output_bias) / mask_sum + output_bias

where 'mask_sum' is a tensor mostly filled with the value 9. In the original paper, that corresponds to sum(M) in the denominator. But what is missing is the sum(1) numerator, which should cancel this factor of 9 again. I think it can be fixed by computing the following in the __init__ of PartialConv:

self.sumI = kernel_size**2*in_channels

and then in the forward call you compute

output_pre = (output - output_bias) * self.sumI / mask_sum + output_bias

These changes [assuming a square kernel -- otherwise I suppose you could compute self.sumI by multiplying the shape of the weights or something like that] also correctly fix the results in case holes are present. That is, it would then be fully in line with the original paper.
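
Pulling this together, a minimal sketch of the changed forward line, assuming the repo keeps the feature convolution as self.input_conv (an nn.Conv2d) as in the current net.py, and computing sum(1) from the convolution's own attributes so it also covers non-square kernels:

# sketch only; output, output_bias, and mask_sum are the existing names in forward
window_size = (self.input_conv.in_channels
               * self.input_conv.kernel_size[0]
               * self.input_conv.kernel_size[1])  # sum(1) over one window
output_pre = (output - output_bias) * window_size / mask_sum + output_bias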

I don't know how big the effect will be on training, but it could be non-zero.

Oops. I only just now see that this is the same as issue #44 ! Well, this time with some more background then.

Pre-trained Model

Is the pre-trained model trained on the Places2 or ImageNet dataset?

This is because in net.py a pre-trained model (VGG16) is loaded, and it is trained on ImageNet.

vgg16 = models.vgg16(pretrained=True)

And there is a python file called Places2.py, so I'm not sure which dataset the 100000.pth model is trained on.

Thank you for your time.

test input size

I have image masks with different sizes, but I find that I can't set the test input size arbitrarily.

How to load Places2 dataset?

How do I access the Places2 dataset in this code? Should I download it myself? If I download it in advance, where should it be placed? Thanks!

test.py uses Places2 Class incorrectly

Bug:

Places2-Class Signature in places.py

class Places2(torch.utils.data.Dataset):
    def __init__(self, img_root, mask_root, img_transform, mask_transform, split='train'): ....

How Places2 is called in test.py:

dataset_val = Places2(args.root, img_transform, mask_transform, 'val')

This call misses the mask_root argument.

Suggested Fix:
Either use a specific value in the image to create a mask, such as:

mask = np.zeros_like(img)
black_pixels_mask = np.all(img == [0, 0, 0], axis=-1)

or add a mask_root argument to the args (see the sketch below).
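
A minimal sketch of the first option, assuming the input is an RGB numpy array of shape (H, W, 3) and that the model expects a float mask with 0 for holes and 1 for valid pixels (the convention used elsewhere in this repo):

import numpy as np
import torch

def mask_from_black_pixels(img):
    # True where the pixel is pure black, i.e. treated as a hole
    black = np.all(img == [0, 0, 0], axis=-1)
    # 0 = hole, 1 = valid, replicated across 3 channels -> (3, H, W) float tensor
    mask = np.where(black, 0.0, 1.0).astype(np.float32)
    mask = np.repeat(mask[None, :, :], 3, axis=0)
    return torch.from_numpy(mask)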

Results

Can you share some of your training results, e.g. at 5,000 iterations, 10,000 iterations, and so on?

There is a patch in the image corner when training

I train on my own images, and I find that there is a patch like a QR-code pattern in the image corners.
It sometimes disappears when the image is not complex, and it is not always in the same location but can appear in any of the four corners.
(images)

Is there a wrong understanding of total variation?

I find this does not conform to the original paper's method. I think the sum of the absolute values should go into Loss(tv), and the TV loss is not the global difference over the whole picture; it should only cover the area around the holes (P is the region of 1-pixel dilation of the hole region).

def total_variation_loss(image):
    # shift one pixel and get difference (for both x and y direction)
    loss = torch.mean(torch.abs(image[:, :, :, :-1] - image[:, :, :, 1:])) + \
        torch.mean(torch.abs(image[:, :, :-1, :] - image[:, :, 1:, :]))
    return loss
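
For comparison, a minimal sketch of a TV loss restricted to P, computed by dilating the hole region by one pixel with a max-pool; this assumes the mask convention used in this repo (1 = valid, 0 = hole) and uses a mean instead of the paper's normalization, so it is an illustration rather than the exact formula:

import torch
import torch.nn.functional as F

def total_variation_loss_around_holes(image, mask):
    hole = 1 - mask
    # 1-pixel dilation of the hole region (P in the paper)
    dilated = F.max_pool2d(hole, kernel_size=3, stride=1, padding=1)
    dx = torch.abs(image[:, :, :, 1:] - image[:, :, :, :-1]) * dilated[:, :, :, 1:]
    dy = torch.abs(image[:, :, 1:, :] - image[:, :, :-1, :]) * dilated[:, :, 1:, :]
    return torch.mean(dx) + torch.mean(dy)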

About Model Size

Thanks for your code. I have a question: is it possible to compress the model (it is >393 MB now) to make it run faster?

When I run train.py

Traceback (most recent call last):
File "C:/Users/xxx/Downloads/pytorch-inpainting-with-partial-conv-master/train.py", line 85, in
num_workers=args.n_threads))
File "C:\Users\xxx\AppData\Local\Programs\Python\Python36\lib\site-packages\torch\utils\data\dataloader.py", line 501, in __iter__
return _DataLoaderIter(self)
File "C:\Users\xxx\AppData\Local\Programs\Python\Python36\lib\site-packages\torch\utils\data\dataloader.py", line 297, in __init__
self._put_indices()
File "C:\Users\xxx\AppData\Local\Programs\Python\Python36\lib\site-packages\torch\utils\data\dataloader.py", line 345, in _put_indices
indices = next(self.sample_iter, None)
File "C:\Users\xxx\AppData\Local\Programs\Python\Python36\lib\site-packages\torch\utils\data\sampler.py", line 138, in __iter__
for idx in self.sampler:
File "C:/Users/xxx/Downloads/pytorch-inpainting-with-partial-conv-master/train.py", line 34, in loop
yield order[i]
IndexError: index 0 is out of bounds for axis 0 with size 0

Inquiry about LICENSE

I would like to know the license of pytorch-inpainting-with-partial-conv.
Will it be the MIT license like your other repositories? If there are any restrictions, I would like to know them before using this code.

Thank you for your great work.

Bad examples

@naoto0804 Hi, I really appreciate your great work. When I test my images with your pretrained model, I get bad results:
(images)
How are your results? Can you help me?

Pretrained model issue

Hi,

I tried the pretrained model you shared here and ran it using the test.py code. However, there are many artifacts in the results. Could you point out where the problem is?

Regarding the mask update step

Thanks for sharing your work.
I have a question regarding how you update the mask; I really don't get it from the paper.
Your help is really appreciated.
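
For reference, a minimal sketch of the update rule as this implementation does it in net.py: after each partial convolution, an output location is marked valid (1) if its receptive field contained at least one valid input pixel, and stays a hole (0) otherwise. The window sums are computed with a fixed all-ones convolution:

import torch
import torch.nn as nn

mask = torch.ones(1, 3, 8, 8)
mask[:, :, 2:5, 2:5] = 0                      # a 3x3 hole

mask_conv = nn.Conv2d(3, 3, kernel_size=3, stride=1, padding=1, bias=False)
nn.init.constant_(mask_conv.weight, 1.0)      # fixed all-ones kernel, never trained
for p in mask_conv.parameters():
    p.requires_grad = False

with torch.no_grad():
    mask_sum = mask_conv(mask)                # sum(M) over each sliding window
    new_mask = (mask_sum > 0).float()         # updated mask with a shrunken hole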

Is the test.py script wrong?

Do I have to provide the corresponding masks when testing?
I find no mask dataset root in test.py.
In test.py:
dataset_val = Places2(args.root, img_transform, mask_transform, 'val') (no mask root)
In train.py:
dataset_val = Places2(args.root, args.mask_root, img_tf, mask_tf, 'val') (with mask root)
So, how exactly should 'test' be run?

A problem in net.py

Thanks for your awesome reproduction work!

However, I think there is an inadvertent error in net.py.

output_pre = (output - output_bias) / mask_sum + output_bias

I think it should be as follows:

output_pre = (output - output_bias) / mask_sum * (self.kernel_size * self.kernel_size * self.in_channels) + output_bias

test

image

When I use your code for testing, why does the output image show mask edges and artifacts in more than one place, such as the regions marked with red circles?

How to output all results on the test set of Places2?

Hi, I have tested the test.py file and it works. Now I am trying to test on all images in Places2 (in fact, only 50 images for the test), with evaluation.py modified as follows (default 8 -> len(dataset)):

import os

import torch
from torchvision.utils import make_grid
from torchvision.utils import save_image

from util.image import unnormalize


def evaluate(model, dataset, device, filename):
    image, mask, gt = zip(*[dataset[i] for i in range(len(dataset))])
    image = torch.stack(image)
    mask = torch.stack(mask)
    gt = torch.stack(gt)
    with torch.no_grad():
        output, _ = model(image.to(device), mask.to(device))
    output = output.to(torch.device('cpu'))
    output_comp = mask * image + (1 - mask) * output
    output_comp = unnormalize(output_comp)
    for j in range(len(dataset)):
        each_output_comp = output_comp[j]
        # save_result_dir and file_name are assumed to be defined elsewhere in my code
        absolut_file_name = os.path.join(save_result_dir, file_name[j])
        save_image(each_output_comp, absolut_file_name)

but it outputs errors:

Traceback (most recent call last):
  File "test.py", line 42, in <module>
    evaluate(model, dataset_val, device, save_result_dir)
  File "/home/user/FG/code/pytorch-inpainting-with-partial-conv/evaluation.py", line 15, in evaluate
    output, _ = model(image.to(device), mask.to(device))
  File "/home/user/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/user/FG/code/pytorch-inpainting-with-partial-conv/net.py", line 180, in forward
    h, h_mask = getattr(self, dec_l_key)(h, h_mask)
  File "/home/user/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/user/FG/code/pytorch-inpainting-with-partial-conv/net.py", line 117, in forward
    h, h_mask = self.conv(input, input_mask)
  File "/home/user/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/user/FG/code/pytorch-inpainting-with-partial-conv/net.py", line 87, in forward
    output_pre = (output - output_bias) / mask_sum + output_bias
RuntimeError: CUDA out of memory. Tried to allocate 640.00 MiB (GPU 0; 10.73 GiB total capacity; 9.23 GiB already allocated; 399.62 MiB free; 370.56 MiB cached)

Please help me. How can I solve this so that I can test several thousand images? Thank you.
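
One way around the OOM is to run the model in small batches instead of stacking the entire set into one tensor. A minimal sketch, assuming the dataset returns (image, mask, gt) tuples as above and that save_result_dir is a directory you define; the file names here are just a running index, not the original names:

import os

import torch
from torch.utils.data import DataLoader
from torchvision.utils import save_image

from util.image import unnormalize

def evaluate_all(model, dataset, device, save_result_dir, batch_size=8):
    loader = DataLoader(dataset, batch_size=batch_size, shuffle=False)
    model.eval()
    idx = 0
    for image, mask, gt in loader:
        with torch.no_grad():
            output, _ = model(image.to(device), mask.to(device))
        output = output.cpu()
        output_comp = mask * image + (1 - mask) * output
        output_comp = unnormalize(output_comp)
        for j in range(output_comp.size(0)):
            save_image(output_comp[j],
                       os.path.join(save_result_dir, '{:06d}.png'.format(idx)))
            idx += 1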

finetune lr parameter overwritten on resuming (last fix seems insufficient)

args.resume, [('model', model)], [('optimizer', optimizer)])

When you resume the model for finetuning, you overwrite the new optimizer with the old one, so the smaller lr is not applied. You would need something like this:

    for param_group in optimizer.param_groups:
        param_group['lr'] = lr

BTW, finetuning with bn=False and a smaller lr gives nice results in my experiments with a similar UNet architecture. I suggest trying it if you haven't already.

Running on CPU

Is it possible to use CPU for training instead of CUDA GPU?
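
In principle yes: PyTorch falls back to the CPU if you select the device accordingly and move the model and every input tensor to it, instead of hard-coding 'cuda'. A minimal, self-contained sketch (the Conv2d is only a stand-in for the inpainting network; the actual train.py/test.py would need the same .to(device) treatment wherever tensors are created):

import torch
import torch.nn as nn

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

model = nn.Conv2d(3, 3, 3, padding=1).to(device)   # stand-in model
image = torch.randn(1, 3, 256, 256).to(device)
output = model(image)
print(output.device)

Training on CPU will of course be much slower than on a GPU.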

where is the mask_root?

There is a parser.add_argument for 'mask_root' in train.py, but I did not find that directory in the cloned files. Is it needed?

Blurry problem in training

Hi @naoto0804, thanks for your helpful project. I'm opening this issue to ask whether you have met the problem of blurry artifacts during training. It seems that the results at 300k iterations are still blurry. Could you give some hints about the intermediate results during training?

Pytorch version issue & Pretrained weight

Hello,

First of all, thanks for sharing your code.

I have a couple of questions regarding the pytorch version, and a request for the pretrained weights.

Due to a dependency problem with other code, I am currently using pytorch 0.4.0 instead of 0.4.1, which is noted as the required version for this repo.

However, I found that replacing the F.interpolate function with F.upsample makes everything work. Is it okay to use this repo that way?
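
Expressed as code, that workaround could be a small compatibility shim like the following (an assumption about how it was done, not part of the repo); F.upsample in PyTorch 0.4.0 takes the same size/scale_factor/mode/align_corners arguments that this repo passes to F.interpolate:

import torch.nn.functional as F

# PyTorch 0.4.0 has no F.interpolate yet; alias it to F.upsample
if not hasattr(F, 'interpolate'):
    F.interpolate = F.upsample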

Moreover, could you please share your pretrained weights?

Looking forward to your reply.

Thanks.

About resizing the mask

Thanks for your sharing! @naoto0804
I found that you generate 512 * 512 masks and resize them to 256 * 256 using transforms.Resize(size=256).
However, the default downsampling method is bilinear interpolation, which causes the resized mask to contain not only 0 and 1 but also other values in [0, 1]. The resized mask is therefore not exactly the mask defined in the original paper.
As a result, when I use a 256 * 256 mask (without resizing, i.e. containing only 0 and 1) to test the 1000000.pth pre-trained model, the result is not good.
To avoid this, resizing the mask with nearest-neighbor downsampling when training the network might work.
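
A minimal sketch of that suggestion, using torchvision's Resize with nearest-neighbor interpolation so the resized mask stays strictly binary:

from PIL import Image
from torchvision import transforms

mask_transform = transforms.Compose([
    transforms.Resize(size=256, interpolation=Image.NEAREST),  # keeps values in {0, 1}
    transforms.ToTensor(),
])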

train.py error

After installing torch 0.4 with python 3.6, I ran train.py. At the beginning, the error was:
(image)
So I added a copy of interpolate, like below, into functional.py:

def interpolate(input, size=None, scale_factor=None, mode='nearest', align_corners=None):
    r"""Down/up samples the input to either the given :attr:`size` or the given
    :attr:`scale_factor`

    The algorithm used for interpolation is determined by :attr:`mode`.

    Currently temporal, spatial and volumetric sampling are supported, i.e.
    expected inputs are 3-D, 4-D or 5-D in shape.

    The input dimensions are interpreted in the form:
    mini-batch x channels x [optional depth] x [optional height] x width.

    The modes available for resizing are: nearest, linear (3D-only),
    bilinear (4D-only), trilinear (5D-only), area

    Args:
        input (Tensor): the input tensor
        size (int or Tuple[int] or Tuple[int, int] or Tuple[int, int, int]):
            output spatial size.
        scale_factor (float or Tuple[float]): multiplier for spatial size. Has to match input size if it is a tuple.
        mode (string): algorithm used for upsampling:
            'nearest' | 'linear' | 'bilinear' | 'trilinear' | 'area'. Default: 'nearest'
        align_corners (bool, optional): if True, the corner pixels of the input
            and output tensors are aligned, and thus preserving the values at
            those pixels. This only has effect when :attr:`mode` is linear,
            bilinear, or trilinear. Default: False

    .. warning::
        With align_corners = True, the linearly interpolating modes
        (linear, bilinear, and trilinear) don't proportionally align the
        output and input pixels, and thus the output values can depend on the
        input size. This was the default behavior for these modes up to version
        0.3.1. Since then, the default behavior is align_corners = False.
        See :class:`~torch.nn.Upsample` for concrete examples on how this
        affects the outputs.
    """
    from numbers import Integral
    from .modules.utils import _ntuple

    def _check_size_scale_factor(dim):
        if size is None and scale_factor is None:
            raise ValueError('either size or scale_factor should be defined')
        if size is not None and scale_factor is not None:
            raise ValueError('only one of size or scale_factor should be defined')
        if scale_factor is not None and isinstance(scale_factor, tuple) \
                and len(scale_factor) != dim:
            raise ValueError('scale_factor shape must match input shape. '
                             'Input is {}D, scale_factor size is {}'.format(dim, len(scale_factor)))

    def _output_size(dim):
        _check_size_scale_factor(dim)
        if size is not None:
            return size
        scale_factors = _ntuple(dim)(scale_factor)
        # math.floor might return float in py2.7
        return [int(math.floor(input.size(i + 2) * scale_factors[i])) for i in range(dim)]

    if mode in ('nearest', 'area'):
        if align_corners is not None:
            raise ValueError("align_corners option can only be set with the "
                             "interpolating modes: linear | bilinear | trilinear")
    else:
        if align_corners is None:
            warnings.warn("Default upsampling behavior when mode={} is changed "
                          "to align_corners=False since 0.4.0. Please specify "
                          "align_corners=True if the old behavior is desired. "
                          "See the documentation of nn.Upsample for details.".format(mode))
            align_corners = False

    if input.dim() == 3 and mode == 'nearest':
        return torch._C._nn.upsample_nearest1d(input, _output_size(1))
    elif input.dim() == 4 and mode == 'nearest':
        return torch._C._nn.upsample_nearest2d(input, _output_size(2))
    elif input.dim() == 5 and mode == 'nearest':
        return torch._C._nn.upsample_nearest3d(input, _output_size(3))
    elif input.dim() == 3 and mode == 'area':
        return adaptive_avg_pool1d(input, _output_size(1))
    elif input.dim() == 4 and mode == 'area':
        return adaptive_avg_pool2d(input, _output_size(2))
    elif input.dim() == 5 and mode == 'area':
        return adaptive_avg_pool3d(input, _output_size(3))
    elif input.dim() == 3 and mode == 'linear':
        return torch._C._nn.upsample_linear1d(input, _output_size(1), align_corners)
    elif input.dim() == 3 and mode == 'bilinear':
        raise NotImplementedError("Got 3D input, but bilinear mode needs 4D input")
    elif input.dim() == 3 and mode == 'trilinear':
        raise NotImplementedError("Got 3D input, but trilinear mode needs 5D input")
    elif input.dim() == 4 and mode == 'linear':
        raise NotImplementedError("Got 4D input, but linear mode needs 3D input")
    elif input.dim() == 4 and mode == 'bilinear':
        return torch._C._nn.upsample_bilinear2d(input, _output_size(2), align_corners)
    elif input.dim() == 4 and mode == 'trilinear':
        raise NotImplementedError("Got 4D input, but trilinear mode needs 5D input")
    elif input.dim() == 5 and mode == 'linear':
        raise NotImplementedError("Got 5D input, but linear mode needs 3D input")
    elif input.dim() == 5 and mode == 'bilinear':
        raise NotImplementedError("Got 5D input, but bilinear mode needs 4D input")
    elif input.dim() == 5 and mode == 'trilinear':
        return torch._C._nn.upsample_trilinear3d(input, _output_size(3), align_corners)
    else:
        raise NotImplementedError("Input Error: Only 3D, 4D and 5D input Tensors supported"
                                  " (got {}D) for the modes: nearest | linear | bilinear | trilinear"
                                  " (got {})".format(input.dim(), mode))
Now the error is: (see image). Could you help me?

Pretrained model for testing issue

(result image)
Hi
I tested the pretrained model you shared online using test.py, but the results are very different from yours (as you can see above). There are many artifacts in the masked region. Could you please help me figure this out? Maybe I missed something during the implementation?

Thank you very much!

Problem while using net.py

When I tried to use your model and loss function in a project of mine, I encountered this issue:
RuntimeError: Sizes of tensors must match except in dimension 1. Got 10 and 9 in dimension 2 (The offending index is 1)
at the line h = torch.cat([h, h_dict[enc_h_key]], dim=1)
The sizes of the tensors, in order, were:
torch.Size([32, 512, 10, 10])
torch.Size([32, 512, 9, 9])
Can you help me fix it? Thanks in advance.

Accuracy Function

Hello

I'm wondering if you have used an accuracy function for determining accuracy during training, because I couldn't find any metric function that tells how good the output is.

Also, does the code run over the whole dataset in each iteration (epoch) or does it pick 8 images randomly each iteration? (considering batch size=8)

Thank you in advance!

How to test this code on my own dataset?

Hi, I want to know how to use this code to test on my own dataset. I tried to run the command "python test.py --snapshot ./1000000.pth --root ./dataset", where "./dataset" is my own dataset (.jpg files). But it outputs this error:
Traceback (most recent call last):
File "test.py", line 32, in
dataset_val = Places2(args.root, img_transform, mask_transform, 'val')
File "/home/user/FG/code/pytorch-inpainting-with-partial-conv/places2.py", line 21, in __init__
self.mask_paths = glob('{:s}/*.jpg'.format(mask_root))
TypeError: unsupported format string passed to Compose.__format__

How can I solve this? Thank you very much.

Why are the generated masks different from those in the original paper?

Thanks for your work. I noticed that the masks generated by the code are different from those in the original paper.
The masks generated in the paper are as follows:
(image)

The masks generated by the code are as follows:
(image)

Do you follow the method of the paper, or do you use your own way of generating masks?

Misunderstanding about the UNet architecture in your work

Thanks for your awesome reproduction work! While reading your code, I became a little curious about the number of UNet layers. According to your code in net.py, you use a 14-layer UNet:

layer_size = 7
self.layer_size = layer_size
self.enc_1 = PCBActiv(input_channels, 64, bn=False, sample='down-7')
self.enc_2 = PCBActiv(64, 128, sample='down-5')
self.enc_3 = PCBActiv(128, 256, sample='down-5')
self.enc_4 = PCBActiv(256, 512, sample='down-3')
for i in range(4, self.layer_size):
    name = 'enc_{:d}'.format(i + 1)
    setattr(self, name, PCBActiv(512, 512, sample='down-3'))

It seems a little different from the paper, since the paper uses 16 layers in total, with 8 layers each in the encoder and decoder.
I am wondering whether this is a trick of yours for training 256*256 images, or just an inadvertent error. Thank you for your time.

Suggestions: Efficient Partial Convolution

Firstly, I would like to thank you for your implementation.

Below is my version. The main improvement is in using PyTorch's masked_fill_. I guess this is the fastest method without creating a customized C++ function.

import random

import torch
import torch.nn as nn


class PartialConv(nn.Module):
    # reference:
    # Image Inpainting for Irregular Holes Using Partial Convolutions
    # http://masc.cs.gmu.edu/wiki/partialconv/show?time=2018-05-24+21%3A41%3A10
    # https://github.com/naoto0804/pytorch-inpainting-with-partial-conv/blob/master/net.py
    # https://github.com/SeitaroShinagawa/chainer-partial_convolution_image_inpainting/blob/master/common/net.py
    # mask is binary, 0 is holes; 1 is not
    def __init__(self, in_channels, out_channels, kernel_size, stride=1,
                 padding=0, dilation=1, groups=1, bias=True):
        super(PartialConv, self).__init__()
        random.seed(0)
        self.feature_conv = nn.Conv2d(in_channels, out_channels, kernel_size, stride,
                                      padding, dilation, groups, bias)
        nn.init.kaiming_normal_(self.feature_conv.weight)

        self.mask_conv = nn.Conv2d(in_channels, out_channels, kernel_size, stride,
                                   padding, dilation, groups, bias=False)
        torch.nn.init.constant_(self.mask_conv.weight, 1.0)

        for param in self.mask_conv.parameters():
            param.requires_grad = False

    def forward(self, args):
        x, mask = args
        output = self.feature_conv(x * mask)
        if self.feature_conv.bias is not None:
            output_bias = self.feature_conv.bias.view(1, -1, 1, 1).expand_as(output)
        else: 
            output_bias = torch.zeros_like(output)

        with torch.no_grad():
            output_mask = self.mask_conv(mask)  # mask sums

        no_update_holes = output_mask == 0
        # because those values won't be used, assign an easy value to compute with
        mask_sum = output_mask.masked_fill_(no_update_holes, 1.0)

        output_pre = (output - output_bias) / mask_sum + output_bias
        output = output_pre.masked_fill_(no_update_holes, 0.0)

        new_mask = torch.ones_like(output)
        new_mask = new_mask.masked_fill_(no_update_holes, 0.0)
        return output, new_mask

Benchmark:

Your code:
Runtime: 1.7311532497406006
Memory increment on a forward pass: 125.9 MiB

My code:
Runtime: 0.3832552433013916
Memory increment on a forward pass: 57.1 MiB

Output feature difference: 0.0
Mask output difference: 0.0

Codes for the benchmark

import time
from memory_profiler import profile
import torch
from torch import nn
import random
from torch.nn import functional as F


def proftime(func):
    def timed(*args, **kw):
        ts = time.time()
        result = func(*args, **kw)
        te = time.time()
        print(f"Runtime: {te-ts}")
        return result

    return timed


class PConv2d(nn.Module):
    def __init__(self, in_ch, out_ch, kernel_size, stride=1, padding=0):
        super().__init__()
        random.seed(0)
        self.conv2d = nn.Conv2d(in_ch, out_ch, kernel_size, stride, padding)
        nn.init.kaiming_normal_(self.conv2d.weight)
        self.mask2d = nn.Conv2d(in_ch, out_ch, kernel_size, stride, padding)
        self.mask2d.weight.data.fill_(1.0)
        self.mask2d.bias.data.fill_(0.0)

        # mask is not updated
        for param in self.mask2d.parameters():
            param.requires_grad = False

    @profile
    @proftime
    def forward(self, input, input_mask):
        # http://masc.cs.gmu.edu/wiki/partialconv
        # C(X) = W^T * X + b, C(0) = b, D(M) = 1 * M + 0 = sum(M)
        # W^T* (M .* X) / sum(M) + b = [C(M .* X) – C(0)] / D(M) + C(0)

        input_0 = input.new_zeros(input.size())

        output = F.conv2d(
            input * input_mask, self.conv2d.weight, self.conv2d.bias,
            self.conv2d.stride, self.conv2d.padding, self.conv2d.dilation,
            self.conv2d.groups)

        output_0 = F.conv2d(input_0, self.conv2d.weight, self.conv2d.bias,
                            self.conv2d.stride, self.conv2d.padding,
                            self.conv2d.dilation, self.conv2d.groups)

        with torch.no_grad():
            output_mask = F.conv2d(
                input_mask, self.mask2d.weight, self.mask2d.bias,
                self.mask2d.stride, self.mask2d.padding, self.mask2d.dilation,
                self.mask2d.groups)

        n_z_ind = (output_mask != 0.0)
        z_ind = (output_mask == 0.0)  # skip all the computation

        output[n_z_ind] = \
            (output[n_z_ind] - output_0[n_z_ind]) / output_mask[n_z_ind] + \
            output_0[n_z_ind]
        output[z_ind] = 0.0

        output_mask[n_z_ind] = 1.0
        output_mask[z_ind] = 0.0

        return output, output_mask


class PartialConv(nn.Module):
    # reference:
    # Image Inpainting for Irregular Holes Using Partial Convolutions
    # http://masc.cs.gmu.edu/wiki/partialconv/show?time=2018-05-24+21%3A41%3A10
    # https://github.com/naoto0804/pytorch-inpainting-with-partial-conv/blob/master/net.py
    # https://github.com/SeitaroShinagawa/chainer-partial_convolution_image_inpainting/blob/master/common/net.py
    # mask is binary, 0 is holes; 1 is not
    def __init__(self, in_channels, out_channels, kernel_size, stride=1,
                 padding=0, dilation=1, groups=1, bias=True):
        super(PartialConv, self).__init__()
        random.seed(0)
        self.feature_conv = nn.Conv2d(in_channels, out_channels, kernel_size, stride,
                                      padding, dilation, groups, bias)
        nn.init.kaiming_normal_(self.feature_conv.weight)

        self.mask_conv = nn.Conv2d(in_channels, out_channels, kernel_size, stride,
                                   padding, dilation, groups, bias=False)
        torch.nn.init.constant_(self.mask_conv.weight, 1.0)

        for param in self.mask_conv.parameters():
            param.requires_grad = False

    @profile
    @proftime
    def forward(self, args):
        x, mask = args
        output = self.feature_conv(x * mask)
        if self.feature_conv.bias is not None:
            output_bias = self.feature_conv.bias.view(1, -1, 1, 1).expand_as(output)
        else:
            output_bias = torch.zeros_like(output)

        with torch.no_grad():
            output_mask = self.mask_conv(mask)  # mask sums

        no_update_holes = output_mask == 0
        # because those values won't be used , assign a easy value to compute
        mask_sum = output_mask.masked_fill_(no_update_holes, 1.0)

        output_pre = (output - output_bias) / mask_sum + output_bias
        output = output_pre.masked_fill_(no_update_holes, 0.0)

        new_mask = torch.ones_like(output)
        new_mask = new_mask.masked_fill_(no_update_holes, 0.0)
        return output, new_mask

# Your method
model1 = PConv2d(in_ch=256, out_ch=256, kernel_size=3, stride=1, padding=1)

# My method
model2 = PartialConv(in_channels=256, out_channels=256, kernel_size=3, stride=1,
                     padding=1, dilation=1, groups=1, bias=True)

# make sure all learnable convolutions share the same weights
model2.feature_conv.weight.data.copy_(model1.conv2d.weight.data)
model2.feature_conv.bias.data.copy_(model1.conv2d.bias.data)
random.seed(0)
x1 = torch.randn(1, 256, 64, 64)
x2 = x1.clone()
mask1 = torch.ones_like(x1)
mask1[:, :, 25:50, 25:50] = 0
mask2 = mask1.clone()
y1 = model1.forward(x1, mask1)
y2 = model2.forward((x2, mask2))

print(f"Output feature output difference {torch.sum(y2[0] - y1[0])}")
print(f'Mask output difference {torch.sum(y2[1] - y1[1])}')

Some comments:

I see you are using batch norm after the partial convolution. I would suggest disabling the bias term in any convolution placed right before batch norm, since batch norm already includes a bias (beta) term that offsets the convolution's bias.
In addition, I prefer in-place batch norm, which can save around 20%-40% of memory usage while maintaining fast computation.
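
A minimal sketch of the bias suggestion (not the repo's actual block definition): the BatchNorm beta already provides a learnable offset, so the convolution feeding it can drop its own bias:

import torch.nn as nn

block = nn.Sequential(
    nn.Conv2d(64, 128, kernel_size=3, stride=2, padding=1, bias=False),
    nn.BatchNorm2d(128),
    nn.ReLU(inplace=True),
)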

In your training script, the default learning rate is 2e-4. I highly recommend using a cyclical learning rate and PyTorch's implementation of it. I am using a 0.04-0.08 learning rate range. If you are able to train with a large batch size, the learning rate can move within [0.1, 1] or even larger, which is called super-convergence.
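
A minimal sketch of a cyclical learning rate using torch.optim.lr_scheduler.CyclicLR (available in PyTorch 1.1+, i.e. newer than the 0.4.x this repo targets); the 0.04-0.08 range is the one quoted above, and the model/optimizer here are only stand-ins:

import torch
import torch.nn as nn

model = nn.Conv2d(3, 3, 3, padding=1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.04, momentum=0.9)
scheduler = torch.optim.lr_scheduler.CyclicLR(
    optimizer, base_lr=0.04, max_lr=0.08, step_size_up=2000)

for step in range(10):          # training-loop sketch
    optimizer.step()
    scheduler.step()            # cycle the lr once per iteration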

Personal ad:

I am using partial convolution to create a manga inpainting tool: image segmentation figures out text locations, and then inpainting repairs the background & color. Suggestions and comments are very welcome.

Can I just run test.py without downloading the Places2 dataset?

Can I just download the trained model file without downloading the dataset? The dataset is really too big!! I just want to test the results (on some random pictures I found on the internet); can that work?

I downloaded the 1000000.pth file you trained and placed it at ./snapshots/default/1000000.pth,
then I changed test.py to parser.add_argument('--snapshot', type=str, default='./snapshots/default/1000000.pth').

I made a folder structure like this (without any pictures or dataset):
(screenshot error3)

I also ran generate_data.py and got the following:
(screenshot error4)

Then I ran "python test.py" and got the following errors:
(screenshot error1)

And I also found the same problem as #24: the number of input parameters does not match.

In test.py it is used like dataset_val = Places2(args.root, img_transform, mask_transform, 'val'),
while in places2.py:

(screenshot error2)

Can you please let me know whether there is a way to just test the model with some random pictures downloaded from the internet?

About the requirements.txt

Thanks for sharing your work.
Whenever I run pip install -r requirements.txt, I get the following error:

Could not find a version that satisfies the requirement pkg-resources==0.0.0 (from -r requirements.txt (line 3)) (from versions: )
No matching distribution found for pkg-resources==0.0.0 (from -r requirements.txt (line 3))

Can you help? Thanks.

Blurry results

Hi, thanks for your project.
I've trained your model on a face dataset.
After 40K iterations, the result is still blurry.
image

Here is the loss log
image
image

I'm wondering whether that is normal. How long will it take to get good results?
