dqiaole / zits_inpainting

[CVPR 2022] Incremental Transformer Structure Enhanced Image Inpainting with Masking Positional Encoding

License: Apache License 2.0

Python 74.53% HTML 2.81% CSS 0.46% JavaScript 22.20%

zits_inpainting's Introduction

Incremental Transformer Structure Enhanced Image Inpainting with Masking Positional Encoding

by Qiaole Dong*, Chenjie Cao*, Yanwei Fu

Paper and Supplemental Material (arXiv)


Our project page is available at https://dqiaole.github.io/ZITS_inpainting/.

🔥🔥🔥 News: Our extended version, ZITS++, has been accepted by TPAMI; the code and dataset have been released here.

Pipeline

The overview of our ZITS. First, the TSR model restores structures at low resolution. Then a simple CNN-based upsampler upsamples the edge and line maps. Finally, the upsampled sketch space is encoded and added to the FTR through ZeroRA to restore the textures.
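
To make this data flow concrete, here is a purely conceptual sketch of the three stages; it is not the repo's actual API, and the module interfaces (tsr, ssu, ftr) are hypothetical stand-ins.

# conceptual ZITS data flow (hypothetical interfaces, for illustration only)
import torch.nn.functional as F

def zits_inference(tsr, ssu, ftr, image, mask):
    """image, mask: (1, C, H, W) tensors at full resolution."""
    img_256 = F.interpolate(image, size=256, mode="bilinear", align_corners=False)
    mask_256 = F.interpolate(mask, size=256, mode="nearest")
    edge_256, line_256 = tsr(img_256, mask_256)   # 1) restore structures at low resolution
    edge, line = ssu(edge_256), ssu(line_256)     # 2) upsample edge/line maps with the CNN upsampler
    return ftr(image, mask, edge, line)           # 3) texture restoration, sketch injected via ZeroRA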

TO DO

  • Release inference code.
  • Release pre-trained models.
  • Release training code.

Preparation

  1. Preparing the environment:

    As there are some bugs when using the GP loss with DDP (link), we strongly recommend installing Apex without CUDA extensions and using torch 1.9.0 for multi-GPU training:

    conda create -n train_env python=3.6
    conda activate train_env
    pip install torch==1.9.0+cu111 torchvision==0.10.0+cu111 torchaudio==0.9.0 -f https://download.pytorch.org/whl/torch_stable.html
    pip install -r requirement.txt
    git clone https://github.com/NVIDIA/apex
    cd apex
    pip install -v --disable-pip-version-check --no-cache-dir --global-option="--cpp_ext" --no-build-isolation ./
    
  2. For training, MST provides irregular and segmentation masks (download) with different masking rates, and you should define the mask file lists before training, as in MST.

    The training masks we used are listed in coco_mask_list.txt and irregular_mask_list.txt; test_mask.zip additionally contains 1000 test masks (see the sketch below for one way to build such a list).
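
    A minimal sketch (not part of the repo) of building such a mask list: write one mask path per line to a text file, mirroring the format of the provided coco_mask_list.txt / irregular_mask_list.txt. The directory name below is an assumption; point it at wherever you unpacked the MST masks.

    # build_mask_list.py (hypothetical helper, not included in this repo)
    import glob
    import os

    mask_dir = "./masks/irregular"  # assumed location of the downloaded MST masks
    paths = sorted(glob.glob(os.path.join(mask_dir, "*.png")))

    with open("my_irregular_mask_list.txt", "w") as f:
        f.write("\n".join(os.path.abspath(p) for p in paths))

    print(f"wrote {len(paths)} mask paths")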

  3. Download the pretrained masked wireframe detection model to the './ckpt' folder: LSM-HAWP (MST ICCV 2021, retrained from HAWP CVPR 2020).

  4. Prepare the wireframes:

    Update: there is no need to prepare another environment anymore; just extract wireframes with the following command:

    conda activate train_env
    python lsm_hawp_inference.py --ckpt_path <best_lsm_hawp.pth> --input_path <input image path> --output_path <output image path> --gpu_ids '0'
    
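    The wireframe files written by lsm_hawp_inference.py are pickled; their exact internal structure is not documented here, so the following minimal sketch (the output directory is an assumption) only loads one file and prints what it contains.

    # inspect_wireframes.py (hypothetical helper, not included in this repo)
    import glob
    import pickle

    pkl_files = sorted(glob.glob("./wireframes/*.pkl"))  # assumed --output_path
    assert pkl_files, "no .pkl files found"
    with open(pkl_files[0], "rb") as f:
        wf = pickle.load(f)

    print(type(wf))
    if isinstance(wf, dict):
        for key, value in wf.items():
            print(key, type(value))
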
  5. If you need to train the model, please download the pretrained models for perceptual loss, provided by LaMa:

    mkdir -p ade20k/ade20k-resnet50dilated-ppm_deepsup/
    wget -P ade20k/ade20k-resnet50dilated-ppm_deepsup/ http://sceneparsing.csail.mit.edu/model/pytorch/ade20k-resnet50dilated-ppm_deepsup/encoder_epoch_20.pth
    
  6. Indoor Dataset and Test set of Places2 (Optional)

    To download the full Indoor dataset: BaiduDrive (password: hfok); Google Drive (link).

    The training and validation splits of Indoor can be found in indoor_train_list.txt and indoor_val_list.txt.

    The test set of our Places2 can be found in places2_test_list.txt.

Eval

Download pretrained models on Places2 here.

Link for BaiduDrive, password:qnm5

Batch Test

For batch testing, you need to complete steps 3 and 4 above.

Put the pretrained models in the './ckpt' folder, then modify the config file according to your image, mask, and wireframe paths (a rough sketch of overriding these paths follows).
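
The exact keys to change depend on the config you copy; as a hedged sketch (key names such as TEST_FLIST and TEST_MASK_FLIST follow the config excerpts quoted in the repo's issues and should be verified against your own copy), you could point a duplicated config at your data like this:

# point a copied config at your own images/masks (hypothetical paths)
import yaml

with open("./config_list/config_ZITS_places2.yml") as f:
    cfg = yaml.safe_load(f)

cfg["TEST_FLIST"] = "./data_list/my_test_list.txt"   # txt file listing test images
cfg["TEST_MASK_FLIST"] = "./my_masks/"               # mask folder or mask list file
# the key holding the precomputed wireframe path also needs updating; check your config

# note: re-dumping drops comments from the original config
with open("./config_list/my_config.yml", "w") as f:
    yaml.safe_dump(cfg, f)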

Test on 256×256 images:

conda activate train_env
python FTR_inference.py --path ./ckpt/zits_places2 --config_file ./config_list/config_ZITS_places2.yml --GPU_ids '0'

Test on 512×512 images:

conda activate train_env
python FTR_inference.py --path ./ckpt/zits_places2_hr --config_file ./config_list/config_ZITS_HR_places2.yml --GPU_ids '0'

Single Image Test

This script only supports square images (non-square inputs will be center cropped).

conda activate train_env
python single_image_test.py --path <ckpt_path> --config_file <config_path> \
 --GPU_ids '0' --img_path ./image.png --mask_path ./mask.png --save_path ./
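
If your image and mask are not square, you may want to crop and resize them yourself before running the script. The helper below is not part of the repo, just a hedged sketch using Pillow; the paths and the 512 target size are assumptions.

# square-crop and resize an image/mask pair before running single_image_test.py
from PIL import Image

def square_crop_resize(src, dst, size=512, resample=Image.BICUBIC):
    img = Image.open(src)
    side = min(img.size)
    left = (img.width - side) // 2
    top = (img.height - side) // 2
    img = img.crop((left, top, left + side, top + side)).resize((size, size), resample)
    img.save(dst)

square_crop_resize("./image.png", "./image_512.png")
square_crop_resize("./mask.png", "./mask_512.png", resample=Image.NEAREST)  # keep the mask binary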

Training

⚠️ Warning: The training code has not been fully tested after refactoring.

Training TSR

python TSR_train.py --name places2_continous_edgeline --data_path [training_data_path] \
 --train_line_path [training_wireframes_path] \
 --mask_path ['irregular_mask_list.txt', 'coco_mask_list.txt'] \
 --train_epoch 12 --validation_path [validation_data_path] \
 --val_line_path [validation_wireframes_path] \
 --valid_mask_path [validation_mask] --nodes 1 --gpus 1 --GPU_ids '0' --AMP
python TSR_train.py --name places2_continous_edgeline --data_path [training_data_path] \
 --train_line_path [training_wireframes_path] \
 --mask_path ['irregular_mask_list.txt', 'coco_mask_list.txt'] \
 --train_epoch 15 --validation_path [validation_data_path] \
 --val_line_path [validation_wireframes_path] \
 --valid_mask_path [validation_mask] --nodes 1 --gpus 1 --GPU_ids '0' --AMP --MaP

Train SSU

We recommend using the pretrained SSU. You can also train your own SSU by referring to https://github.com/ewrfcas/StructureUpsampling.

Training LaMa First

python FTR_train.py --nodes 1 --gpus 1 --GPU_ids '0' --path ./ckpt/lama_places2 \
--config_file ./config_list/config_LAMA.yml --lama

Training FTR

256:

python FTR_train.py --nodes 1 --gpus 2 --GPU_ids '0,1' --path ./ckpt/places2 \
--config_file ./config_list/config_ZITS_places2.yml --DDP

256~512:

python FTR_train.py --nodes 1 --gpus 2 --GPU_ids '0,1' --path ./ckpt/places2_HR \
--config_file ./config_list/config_ZITS_HR_places2.yml --DDP

More 1K Results

Acknowledgments

Cite

If you find our work helpful, please consider citing:

@inproceedings{dong2022incremental,
      title={Incremental Transformer Structure Enhanced Image Inpainting with Masking Positional Encoding}, 
      author={Qiaole Dong and Chenjie Cao and Yanwei Fu},
      booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
      year={2022}
}

zits_inpainting's People

Contributors

baijiuyang, dqiaole, ewrfcas


zits_inpainting's Issues

baseline

Hello, when you compared against other baseline models, did you retrain them using the masks generated in this paper and the corresponding dataset? For the LaMa model, for example, which type of mask did you use? Looking forward to your reply!

Bad results

I am getting some very poor results. I am using the single_image script and resizing images to 512×512.

Could some of the images and masks from the shown results be shared? That way I could verify whether I did something weird.

multi_gpu train

How can TSR support multi-GPU training? What is the training command? Thanks a lot.

Provide discriminator weights

The config file contains this line:
dis_weights_path0: './ckpt/lama_places2/InpaintingModel_dis.pth'

Can you please provide these weights? Or point me to where I can get them?

continue train

If I want to continue training on my own dataset, do I only need to train with FTR_train, or do TSR_train and SSU need to be trained as well?

lsm_hawp_inference.py_result_bad

I tried to use lsm_hawp_inference.py to generate the .pkl files for my dataset (Places365).
I used the best_lsm_hawp.pth that you provided.
But the results are really bad.
I tried reducing the threshold from 0.8 to 0.5, but the results are still bad.

Do you have a best_places365_lsm_hawp.pth?
Or how do we train our own HAWP?

The image is a sample from training (14001.jpg).

MPE module


Hello, how is the ablation experiment for the MPE module (the table shown in the paper) designed?
(1) Is it the result of fine-tuning with ReZero?
(2) Or is the MPE module added directly to the LaMa model for training?
Looking forward to your reply! Thanks!

pre trained model error

Hi, when I train FTR to 1001 iterations, I am prompted for InpaintingModel_best_gen_HR.pth. But the pretrained models you provided do not include this one. Can you provide this model? Thank you very much.
Looking forward to your reply!

wrong training

Hello, the error occurs in a class inheriting from nn.Module; kwargs is set to config.training_model. Where are the training_model parameters from the LaMa config actually used? What is the cause of the error?

root@dl-1558365045-pod-jupyter-8489459f68-rjkx2:~/share/program/ZITS_inpainting-main# python FTR_train.py --nodes 1 --gpus 1 --GPU_ids '0' --path ./ckpt/lama_places2 --config_file ./config_list/config_LAMA.yml --lama
here {'visualize_each_iters': 1000, 'concat_mask': True, 'store_discr_outputs_for_vis': True}
Traceback (most recent call last):
File "FTR_train.py", line 101, in
mp.spawn(main_worker, nprocs=args.world_size, args=(args,))
File "/root/.local/lib/python3.7/site-packages/torch/multiprocessing/spawn.py", line 240, in spawn
return start_processes(fn, args, nprocs, join, daemon, start_method='spawn')
File "/root/.local/lib/python3.7/site-packages/torch/multiprocessing/spawn.py", line 198, in start_processes
while not context.join():
File "/root/.local/lib/python3.7/site-packages/torch/multiprocessing/spawn.py", line 160, in join
raise ProcessRaisedException(msg, error_index, failed_process.pid)
torch.multiprocessing.spawn.ProcessRaisedException:

-- Process 0 terminated with the following error:
Traceback (most recent call last):
File "/root/.local/lib/python3.7/site-packages/torch/multiprocessing/spawn.py", line 69, in _wrap
fn(i, *args)
File "/root/share/program/ZITS_inpainting-main/FTR_train.py", line 52, in main_worker
model = LaMa(config, gpu, rank)
File "/root/share/program/ZITS_inpainting-main/src/FTR_trainer.py", line 26, in init
self.inpaint_model = LaMaInpaintingTrainingModule(config, gpu=gpu, rank=rank, test=test, **kwargs).to(gpu)
File "/root/share/program/ZITS_inpainting-main/src/models/FTR_model.py", line 320, in init
super().init(*args, gpu=gpu, name='InpaintingModel', rank=rank, test=test, **kwargs)
File "/root/share/program/ZITS_inpainting-main/src/models/FTR_model.py", line 34, in init
super().init(*args, **kwargs)
TypeError: init() got an unexpected keyword argument 'visualize_each_iters'

ERROR: Could not find a version that satisfies the requirement torch==1.3.1

Hi,
You recommend running the wireframe inference with torch 1.3.1 in the README, but pip cannot find that version.

ERROR: Could not find a version that satisfies the requirement torch==1.3.1 (from versions: 1.4.0, 1.5.0, 1.5.1, 1.6.0, 1.7.0, 1.7.1, 1.8.0, 1.8.1, 1.9.0, 1.9.1, 1.10.0, 1.10.1, 1.10.2, 1.11.0)
ERROR: No matching distribution found for torch==1.3.1

How did you install the old version?

Questions/Issues about batch test evaluation.

Mask Image Correspondence
If I am batch evaluating on a custom dataset, do I need to have the same number of masks as the number of evaluation images? How does the code decide correspondence between an image and mask? Is there any naming convention? I have generated the wireframes for all evaluation images. I have 1200 images. But the provided test masks are only 1000. I am wondering if the batch evaluation will work.

Possible bug
https://github.com/DQiaole/ZITS_inpainting/blob/main/src/FTR_trainer.py#L271
If test is True (batch eval), the self.val_dataset is never created.

Wouldn't https://github.com/DQiaole/ZITS_inpainting/blob/main/src/FTR_trainer.py#L443 throw an error?

the path in config

modify the image path
# original images?
TRAIN_FLIST: ./data_list/sp_large_train_list.txt
VAL_FLIST: ./data_list/sp_large_val_list.txt
TEST_FLIST: ./data_list/sp_large_val_list.txt

set the GT images folder for metrics computation
# original val images?
GT_Val_FOLDER: './datasets/inpaint_data/val_images/'

modify the mask path
# randomly generated masks?
TRAIN_MASK_FLIST: [ './data_list/mask_large_train_list.txt',
'./data_list/mask_large_train_list.txt' ]

# the real object masks used for object removal?
TEST_MASK_FLIST: ./datasets/inpaint_data/val_SH_binary_masks/

Could you tell me whether my understanding of these paths is correct?

FileNotFoundError

Hello,

thanks a lot for your great work!

When I run the single image test:

conda activate wireframes_inference_env
python single_image_test.py --path <ckpt_path> --config_file <config_path> \
 --GPU_ids '0' --img_path ./image.png --mask_path ./mask.png --save_path ./

I get an error
FileNotFoundError: [Errno 2] No such file or directory: '/home/wmlce/places365_standard/places2_all/train_list.txt'

It is triggered when single_image_test.py is executing model = ZITS(config, 0, 0, True). This path is defined in the config files. Can the script be run without this file?

Thank you!

Pretrained Indoor Model

Hi,
Can you upload the pretrained Indoor data model - the results of which you share in your paper? Also, can you share the trained models of the comparative methods you show results for in your paper?

Thank you.

The path problems

Hi there! I just read your CVPR paper, and I am confused about 'valid_mask_path' in TSR_train.py (line 86); I don't know what it is supposed to be.

  • Here is my understanding of the training paths; please correct me if I am wrong.
  • 1. data_path and validation_path are the train and val sets of the dataset (Indoor & Places2).
  • 2. train_line_path and val_line_path are the train and val wireframes extracted from the datasets with lsm_hawp_inference.py.
  • 3. For valid_mask_path, I didn't see a mask_val_list in your code, so I guess it refers to the test masks?

I would also like to know how many GPUs you used and how long the model takes to train.

Thanks for your great work and looking forward to your reply!

TSR_inference.py quesetion

Code lines 47-48:
test_dataset = ContinuousEdgeLineDatasetMask(opts.image_url, test_mask_path=opts.mask_url, is_train=False,
                                             image_size=opts.image_size)

Does the code miss 'line_path'?
I think it should be like this:

parser.add_argument('--test_line_path', type=str, default=' ', help='Indicate where is the wireframes of test set')

test_dataset = ContinuousEdgeLineDatasetMask(opts.image_url, test_mask_path=opts.mask_url, is_train=False,
                                             image_size=opts.image_size, line_path=opts.test_line_path)

If I am wrong, please tell me.

single_image_test

I debugged single_image_test.py with Visual Studio Code.

config_path = os.path.join(args.path, 'config.yml')

Exception raised: TypeError
expected str, bytes or os.PathLike object, not NoneType
File "D:\pm\python\inpaint\ZITS_inpainting-main\ZITS_inpainting-main\single_image_test.py", line 289, in <module>
config_path = os.path.join(args.path, 'config.yml')

ValueError: axes don't match array in inpainting_metrics.py

Hello, the problem above occurs at this line of the file (screenshot).
If the input is an original image from the Indoor dataset, which is not 256*256, the code runs normally after processing the images with the method from https://stackoverflow.com/questions/37747021/create-numpy-array-of-images, but I don't know whether this affects the results.
Have you encountered this problem? Could you give me some advice? Looking forward to your reply, thanks.

Question about loss and activation function

Hi,
I have questions about activation function and loss.

  1. Why do you calculate the loss before the activation function?
    According to your code, the cross-entropy loss is calculated before the sigmoid function. In a typical CNN, I think the loss is calculated after the activation. Could you tell me why?

  2. Why do you use only cross-entropy loss?
    According to your code, only the cross-entropy loss is used in the TSR. I wonder if you could use other losses (L1 loss, feature matching loss) after upsampling, because there are convolution layers after the transformer blocks.

Question about Wireframe extraction difference Single vs Batch mode

It seems for Single Image Test, the wireframe extraction is done for masked images.
https://github.com/DQiaole/ZITS_inpainting/blob/main/single_image_test.py#L173 (Masking)
https://github.com/DQiaole/ZITS_inpainting/blob/main/single_image_test.py#L194 (Wireframe inference)
https://github.com/DQiaole/ZITS_inpainting/blob/main/single_image_test.py#L219 (obj_remove False, so use lines_masked and scores_masked).
Side Q. When should obj_remove be used? (The code also calculates wireframes for original image but it is not used if obj_remove is false).

But, for Batch test, the wireframe extraction is done on original images (it is recommended to precompute the wireframes).
https://github.com/DQiaole/ZITS_inpainting#batch-test (precompute wireframes)
Then, the image, edge and line is masked before passing through transformer.
https://github.com/DQiaole/ZITS_inpainting/blob/main/src/utils.py#L273

Q. I am not an expert in wireframe extraction. But, wouldn't passing a masked image for wireframe extraction vs passing the full image and then masking give different results? Or is it the same and it doesn't matter? Ideally, for inpainting, we wouldn't have access to the original unmasked images and cannot extract wireframes on them. Why this difference in implementation?

Finetuning on custom dataset

Hi,

I used your pre-trained models on my custom dataset and the inpainting results were not great.
I was thinking of finetuning the pretrained weights on my custom dataset to improve inpainting quality.

What should be the steps/commands? Which models do I need to finetune? TSR and FTR?

It would be great if you could provide some suggestions.

wireframe model is irrelevant

Hi,

I've been playing quite a bit with your model due to the amazing results. Something that I've noticed is that the wireframe model is irrelevant. If I return an all-zeros tensor of the same shape as the actual lines_tensor output in wf_inference_test, I get the same final outputs. Is there a bug somewhere?

To replicate:

return torch.zeros_like(lines_tensor.detach()) in wf_inference_test

Update:

It seems that also the edges seem to be useless.

  batch["line_256"] = torch.zeros_like(batch["mask_256"])
  batch["line"] = torch.zeros_like(batch["mask_512"]) 
  batch["edge"] = torch.zeros_like(batch["mask_512"])

Making this change gives me the same results.

Let me know if I'm doing something wrong.

SFI-Swin: Symmetric Face Inpainting

Dear researchers, please also consider checking our newly introduced face inpainting method, which addresses the symmetry problems of general inpainting methods by using a Swin transformer and semantic-aware discriminators.
Our proposed method shows better results in terms of FID score and a newly proposed metric focusing on face symmetry, compared to some state-of-the-art methods including LaMa.
Our paper is available at:
https://www.researchgate.net/publication/366984165_SFI-Swin_Symmetric_Face_Inpainting_with_Swin_Transformer_by_Distinctly_Learning_Face_Components_Distributions

The code will also be published at:
https://github.com/mohammadrezanaderi4/SFI-Swin

Question about image size

Thank you for sharing your great works.

I am planning to integrate your pre-trained model into lama-cleaner. However, I noticed that the single_image_test.py script only supports square images. Is this a limitation of the network structure, or just a limitation of this test script?


AttributeError: module 'torch.distributed' has no attribute '_reduce_scatter_base'

An error occurs when running TSR_train.py:
File "TSR_train.py", line 7, in <module>
from src.TSR_trainer import TrainerConfig, TrainerForContinuousEdgeLine, TrainerForEdgeLineFinetune
File "D:\AIworkspace\ZITS_inpainting-main\src\TSR_trainer.py", line 14, in <module>
from apex import amp
File "D:\Users\lcx\anaconda3\envs\train_env\lib\site-packages\apex\__init__.py", line 27, in <module>
from . import transformer
File "D:\Users\lcx\anaconda3\envs\train_env\lib\site-packages\apex\transformer\__init__.py", line 4, in <module>
from apex.transformer import pipeline_parallel
File "D:\Users\lcx\anaconda3\envs\train_env\lib\site-packages\apex\transformer\pipeline_parallel\__init__.py", line 1, in <module>
from apex.transformer.pipeline_parallel.schedules import get_forward_backward_func
File "D:\Users\lcx\anaconda3\envs\train_env\lib\site-packages\apex\transformer\pipeline_parallel\schedules\__init__.py", line 3, in <module>
from apex.transformer.pipeline_parallel.schedules.fwd_bwd_no_pipelining import (
File "D:\Users\lcx\anaconda3\envs\train_env\lib\site-packages\apex\transformer\pipeline_parallel\schedules\fwd_bwd_no_pipelining.py", line 10, in <module>
from apex.transformer.pipeline_parallel.schedules.common import Batch
File "D:\Users\lcx\anaconda3\envs\train_env\lib\site-packages\apex\transformer\pipeline_parallel\schedules\common.py", line 14, in <module>
from apex.transformer.tensor_parallel.layers import (
File "D:\Users\lcx\anaconda3\envs\train_env\lib\site-packages\apex\transformer\tensor_parallel\__init__.py", line 21, in <module>
from apex.transformer.tensor_parallel.layers import (
File "D:\Users\lcx\anaconda3\envs\train_env\lib\site-packages\apex\transformer\tensor_parallel\layers.py", line 32, in <module>
from apex.transformer.tensor_parallel.mappings import (
File "D:\Users\lcx\anaconda3\envs\train_env\lib\site-packages\apex\transformer\tensor_parallel\mappings.py", line 29, in <module>
torch.distributed.reduce_scatter_tensor = torch.distributed._reduce_scatter_base
AttributeError: module 'torch.distributed' has no attribute '_reduce_scatter_base'
My environment is torch==1.9.0+cu111, CUDA 11.1. How can this be resolved?
Thanks!

Upgrading project to Pytorch 1.12.1 + CUDA 11.6 - No CUDA GPUs are available

Hi!

I've been testing ZITS_inpainting using my CPU very successfully but I have an RTX 3060 on Windows 10 and cannot seem to get it running on my GPU using the original setup that you describe in the 'Preparations' section.

I decided to upgrade the project dependencies as best as I am able to. The settings below worked for me.

conda create -n train_env python=3.10
conda activate train_env

conda install pytorch torchvision torchaudio cudatoolkit=11.6 -c pytorch -c conda-forge
pip install -r requirement.txt (All set to latest versions)

However when running the single image test using the settings below and GPU_ids='0', I get the following output:

python single_image_test.py --img_path="D:\GANProjects\ZITS_inpainting-main\Tests\Test_Image_1.png" --mask_path="D:\GANProjects\ZITS_inpainting-main\Tests\Test_Image_Mask.png" --save_path="D:\GANProjects\ZITS_inpainting-main\Tests\Results\Test_Image_1.png" --config_file ./ckpt/config.yml --GPU_ids='0'

File "D:\GANProjects\ZITS_inpainting-main\single_image_test.py", line 356, in <module>
    model = ZITS(config, 0, 0, True, True)
  File "D:\GANProjects\ZITS_inpainting-main\src\FTR_trainer.py", line 256, in __init__
    self.inpaint_model = DefaultInpaintingTrainingModule(config, gpu=gpu, rank=rank, test=test, **kwargs).to(gpu)
  File "D:\GANProjects\ZITS_inpainting-main\src\models\FTR_model.py", line 424, in __init__
    super().__init__(*args, gpu=gpu, name='InpaintingModel', rank=rank, test=test, **kwargs)
  File "D:\GANProjects\ZITS_inpainting-main\src\models\FTR_model.py", line 156, in __init__
    self.str_encoder = StructureEncoder(config).cuda(gpu)
  File "C:\Miniconda3\envs\train_env\lib\site-packages\torch\nn\modules\module.py", line 747, in cuda
    return self._apply(lambda t: t.cuda(device))
  File "C:\Miniconda3\envs\train_env\lib\site-packages\torch\nn\modules\module.py", line 639, in _apply
    module._apply(fn)
  File "C:\Miniconda3\envs\train_env\lib\site-packages\torch\nn\modules\module.py", line 639, in _apply
    module._apply(fn)
  File "C:\Miniconda3\envs\train_env\lib\site-packages\torch\nn\modules\module.py", line 662, in _apply
    param_applied = fn(param)
  File "C:\Miniconda3\envs\train_env\lib\site-packages\torch\nn\modules\module.py", line 747, in <lambda>
    return self._apply(lambda t: t.cuda(device))
  File "C:\Miniconda3\envs\train_env\lib\site-packages\torch\cuda\__init__.py", line 227, in _lazy_init
    torch._C._cuda_init()
RuntimeError: No CUDA GPUs are available

Pytorch 1.12.1 and CUDA 11.6 are installed and my GPU is visible and recognized inside the train_env environment.
I'm having a difficult time figuring out what exactly the issue is.

As you folks are most familiar with the codebase, could you point me in the right direction in order to help me get this running on my GPU?

Mask Question

Hello, I want to ask why there are two masks in data_TSR.py? Will using only one mask affect the experimental results?
Looking forward to your reply.

Any way to retain RGB values greater than 1.0?

First off, fantastic work on this model!
I've been having a great time digging through the code and figuring out how everything works.
I've added the ability to write out EXR files along with color management using opencolorio.

One thing that I can't seem to figure out is how to retain image float values above 1.0 or below 0.0.
They seem to be getting clamped somewhere between loading the image and writing the predicted image to an image file.
I've gone through everything to make sure that image values aren't being clipped, but I think I may have missed something.

Any help with this would be very much appreciated.

Access to the pre-trained model

Loved the paper! The results compared to LaMa are amazing.
Can I have access to the lightest pre-trained model? (Benchmarking on mobile devices)

Best regards,
Roi

ZITS++

Hello! I read your team's ZITS++ paper; when will the code be released? Very much looking forward to it!

Single image test

Hello, your work is really nice; I just have a few questions about testing the code. For the configuration below:
python single_image_test.py --path <ckpt_path> --config_file <config_path>
--GPU_ids '0' --img_path ./image.png --mask_path ./mask.png --save_path ./
Which checkpoint should be used for --path, and which file for --config_file? My own settings are as follows:
python single_image_test.py --path ./ckpt/zits_places2_hr --config_file ./config_list/config_ZITS_HR_places2.yml --GPU_ids '0' --img_path ./test_i/img1.png --mask_path ./test_i/mask1.png --save_path ./test_i/
But I got the following error:
Traceback (most recent call last):
File "single_image_test.py", line 322, in <module>
model = ZITS(config, 0, 0, True)
File "D:\pythonProject\7_4\inpaint\ZITS_inpainting-main\src\FTR_trainer.py", line 296, in __init__
min_sigma=min_sigma, max_sigma=max_sigma)
File "D:\pythonProject\7_4\inpaint\ZITS_inpainting-main\datasets\dataset_FTR.py", line 178, in __init__
f = open(flist, 'r')
FileNotFoundError: [Errno 2] No such file or directory: '/home/wmlce/places365_standard/places2_all/test_sub_list.txt'

Does the single image test also require the same setup as the dataset? I hope the test steps can be made more detailed. Looking forward to your reply, thank you very much.

About Transformer Block

Thank you for sharing your great works!!

I have two questions.

  1. Could you explain your transformer block in Fig.2 of your paper?
    According to your code, the transformer block consists of the following layers.
    AxialAttention -> my_Block_2(CausalSelfAttention + MLP)
    I think the first feedforward refers to CausalSelfAttention and Vanilla Attention refers to MLP, but what is the last feedforward?

  2. What are the differences between your transformer block and the ICT transformer block?

Thank you in advance.

How to make the Upsampling model?

Thank you for sharing your great work!!

I am trying to train the network from scratch.
TSR_train.py created best.pth, latest.pth and log.txt in the ckpt directory. After that, I tried to run FTR_train.py, but there is an error.
FileNotFoundError: [Errno 2] No such file or directory: './ckpt/StructureUpsampling.pth'

How to make the Upsampling model?

Data problem

Hi, I tried FTR_inference.py, but I don't know what the training and validation image sets in config.yml should be. Are the edge and line maps generated by TSR? I hope to receive your explanation.
Looking forward to your reply,thanks.

Could you upload a sample .pth in advance so it can be debugged directly?

config_ZITS_places2.yml

transformer_ckpt_path: './ckpt/best_transformer_places2.pth'
gen_weights_path0: './ckpt/lama_places2/InpaintingModel_gen.pth' # Not required at the time of eval
dis_weights_path0: './ckpt/lama_places2/InpaintingModel_dis.pth' # Not required at the time of eval
structure_upsample_path: './ckpt/StructureUpsampling.pth'

D:\pm\python\inpaint\ZITS_inpainting-main\ZITS_inpainting-main\src\models\FTR_model.py

data = torch.load(config.structure_upsample_path, map_location='cpu')

Exception raised: AttributeError
'NoneType' object has no attribute 'seek'. You can only torch.load from a file that is seekable. Please pre-load the data into a buffer like io.BytesIO and try to load from it instead.

During handling of the above exception, another exception occurred:

File "D:\pm\python\inpaint\ZITS_inpainting-main\ZITS_inpainting-main\src\models\FTR_model.py", line 165, in init
data = torch.load(config.structure_upsample_path, map_location='cpu')
File "D:\pm\python\inpaint\ZITS_inpainting-main\ZITS_inpainting-main\src\models\FTR_model.py", line 427, in init
super().init(*args, gpu=gpu, name='InpaintingModel', rank=rank, test=test, **kwargs)
File "D:\pm\python\inpaint\ZITS_inpainting-main\ZITS_inpainting-main\src\FTR_trainer.py", line 256, in init
self.inpaint_model = DefaultInpaintingTrainingModule(config, gpu=gpu, rank=rank, test=test, **kwargs).to(gpu)
File "D:\pm\python\inpaint\ZITS_inpainting-main\ZITS_inpainting-main\single_image_test.py", line 323, in
model = ZITS(config, 0, 0, True)


PS D:\pm\python\inpaint\ZITS_inpainting-main\ZITS_inpainting-main> & 'D:\pm\python\python38\python.exe' 'c:\Users\Administrator.vscode\extensions\ms-python.python-2022.4.1\pythonFiles\lib\python\debugpy\launcher' '40191' '--' 'd:\pm\python\inpaint\ZITS_inpainting-main\ZITS_inpainting-main\single_image_test.py' '--path=D:\pm\python\lama\LaMa_models\lama-places\lama-fourier\models' '--config_file=D:\pm\python\inpaint\ZITS_inpainting-main\ZITS_inpainting-main\config_list\config_ZITS_places2.yml' '--GPU_ids=-1' '--img_path=D:\pm\python\inpaint\ZITS_inpainting-main\ZITS_inpainting-main\imgs\y\i1.png' '--mask_path=D:\pm\python\inpaint\ZITS_inpainting-main\ZITS_inpainting-main\imgs\mask\i1.png' '--save_path=D:\pm\python\inpaint\ZITS_inpainting-main\ZITS_inpainting-main\imgs'
Backend TkAgg is interactive backend. Turning interactive mode on.
BaseInpaintingTrainingModule init called
Loading InpaintingModel StructureUpsampling...

why the cited fid values are different from that in the original paper?

The FID values cited from MST in your paper differ from the FID values reported in the original MST paper (screenshots attached). Why are they different?

Another question: is the code for computing the FID score correct? The results I get with the given code are very high. Can you provide a few reference results? (The results in the paper do not include ground truth.)

Config files, prediction steps and seeds.

Hi there!

I've been looking for parameters to manipulate both in the config files and code base that will allow me to control inpaint generation variables, such as how many steps go into predicting an image and how to change the seed in order to get a variation of the generated inpaint image. I've tried modifying the config files and setting the seed via the function provided but it doesn't seem to have any effect on the result.

Any assistance with getting me started in the right direction would be appreciated 🙏
