megvii-research / dcls-sr
Official PyTorch implementation of the paper "Deep Constrained Least Squares for Blind Image Super-Resolution", CVPR 2022.
License: MIT License
I'd like to ask the authors: what is the main role of the Fourier transforms in the kernel reformulation formula and in the final H of DCLS? Is it low-pass filtering?
Why do I only get a PSNR of about 23 when doing 4x super-resolution on the Set5 dataset? Hoping for an answer.
Hello, I cannot download the pretrained weights from the provided link. Is there a Baidu Netdisk or other mirror available?
I am a new researcher in super-resolution. For blind image super-resolution, I assume we cannot access the HR image or the true kernel. But your work uses this information, so is it really blind super-resolution?
File "../../utils/dcls_utils.py", line 50, in inv_fft_kernel_est
+ ker_p[:, :, :, :, 1] * ker_p[:, :, :, :, 1]
IndexError: too many indices for tensor of dimension 4
Hello, could you help me look into this problem?
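For context, this IndexError is typical of running the pre-1.8 `torch.rfft` code path on a newer PyTorch: `torch.fft.fftn` returns a complex tensor with no trailing (real, imag) dimension, so indexing `[..., 1]` fails. A minimal sketch of the layout difference, assuming a 4-D NCHW input:

```python
import torch

x = torch.randn(1, 1, 8, 8)

# New API: complex tensor, same shape as the input -- indexing
# f[:, :, :, :, 1] as the legacy code does raises IndexError.
f = torch.fft.fftn(x, dim=(-3, -2, -1))

# Recover the legacy onesided=False layout with an explicit
# trailing (real, imag) dimension of size 2.
f_old = torch.stack((f.real, f.imag), dim=-1)

print(f.shape)      # torch.Size([1, 1, 8, 8])
print(f_old.shape)  # torch.Size([1, 1, 8, 8, 2])
```

With this `f_old` layout, the legacy `ker_p[:, :, :, :, 1]` indexing in `inv_fft_kernel_est` works again.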
Hello, for comparison methods such as IKC in the paper, were the models retrained with the same degradation settings, or did you use the officially released pretrained models?
Ri is the deblurred feature map. If we obtain the feature map PRi with a Laplacian operator, its norm should be larger, because edges are emphasized. I don't understand why the paper says PRi is minimized as the objective.
Thanks for your interesting work! While reproducing it, I found that no noise is added during anisotropic-kernel training, although the paper claims it is. Is this a bug?
Hello, I have a few questions about Section 3.3 of your paper:
1. How is Eq. (10) derived? Shouldn't the quantity to minimize be ||G_i X↓s − R_i||², as in Eq. (4a) of Deep Wiener Deconvolution?
2. Why might the smoothing filter P and the Lagrange multiplier be inconsistent in feature space? How should this be understood?
3. Why can a neural network (self.grad_filter in the code) predict a set of smoothing filters with implicit Lagrange multipliers? How is this reflected, and what is the underlying principle?
Hello, and thank you for your work.
My understanding from the paper is that the blur kernel is estimated only in the LR space, unlike DAN, which feeds the reconstructed image back into the kernel estimator and optimizes recursively. Is this understanding correct? And during DCLS training, is the blur kernel estimated only once per image? Looking forward to your reply.
Hello, can this model generalize to real-world images?
I tested the weights from the paper on real images and the results were not very good. Is this because the Setting1 and Setting2 weights were trained with blur only, without noise?
Also, which weights were used for the real-image results in Figure 11 of the paper? Thanks!
Hello, the forward() method of CLS contains a statement that loops 16 times:
for i in range(feature_pad.shape[1]):
    # process each of the 16 channels separately
    feature_ch = feature_pad[:, i:i+1, :, :]
    clear_feature_ch = get_uperleft_denominator(feature_ch, kernel, kernel_P[:, i:i+1, :, :])
    clear_features[:, i:i+1, :, :] = clear_feature_ch[:, :, ks:-ks, ks:-ks]
I have read get_uperleft_denominator(feature_ch, kernel, kernel_P[:, i:i+1, :, :]) many times and still cannot understand it; I only know that it applies Fourier-transform operations to the padded feature map, the predicted smoothing filter, and the blur kernel. Could you explain what lines 4-8 below are doing? Many thanks!
PS: the circular-shift operation inside the function called on line 4 is the most confusing part; I can follow the code but don't understand why it is done that way.
1. # ------------------------------------------------------
2. # -----------Constraint Least Square Filter-------------
3. def get_uperleft_denominator(img, kernel, grad_kernel):
4.     ker_f = convert_psf2otf(kernel, img.size())       # discrete Fourier transform of the blur kernel
5.     ker_p = convert_psf2otf(grad_kernel, img.size())  # discrete Fourier transform of the smoothing filter
6.     denominator = inv_fft_kernel_est(ker_f, ker_p)
7.     numerator = torch.rfft(img, 3, onesided=False)
8.     deblur = deconv(denominator, numerator)
9.     return deblur
Is the result compared with Real-ESRGAN? Does it outperform Real-ESRGAN on blind deblurring and blind super-resolution tasks?
Because of the newer PyTorch version, I looked up migration guides and found that for image data people replace torch.rfft(img, 3, onesided=False) with dim=2, not 3. Does this difference have any effect?
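One observation that may help: in this codebase the FFT is applied per channel (the slice in the loop has shape N×1×H×W), and an FFT over the last three dims differs from one over the last two only by a length-1 transform along the singleton channel axis, which is the identity. A small check under that singleton-channel assumption:

```python
import torch

x = torch.randn(2, 1, 8, 8)  # per-channel slice: N x 1 x H x W

f3 = torch.fft.fftn(x, dim=(-3, -2, -1))  # legacy rfft(img, 3) layout
f2 = torch.fft.fft2(x)                    # transform over H, W only

# A length-1 FFT along the channel axis is the identity,
# so both transforms agree exactly here.
print(torch.allclose(f3, f2))  # True
```

If the channel dimension were larger than 1, the two would genuinely differ, so the equivalence relies on the per-channel loop.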
Hello, following the Setting1 configuration I trained for 500,000 iterations, validating every 2,000 iterations, and the saved validation results leave me baffled. Taking 'baby' from Set5 as an example, below are the saved results for the LR input and for iterations 2000, 4000, 6000, 8000, and 10000. Every image after iteration 10000 looks the same as at 10000: completely black:
original LR:
2000 iter:
4000 iter:
6000 iter:
8000 iter:
10000 iter:
From iteration 10000 all the way to 200,000 the saved results are entirely black images. Did the model overfit after iteration 8000? The results at 4000 and 6000 iterations are also very strange. Please help; why do you think this happens?
Thanks for sharing.
Regarding Eqs. (10) and (11): why is the first term minimized? A walkthrough on Zhihu says this "makes the image tend toward smoothness, which means the image goes from blurry to clear."
Minimizing it does smooth the image, but doesn't smoother mean blurrier? Why does smoothing imply a clearer image here? I don't quite follow; thank you.
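A note on why minimizing the smoothness term sharpens rather than blurs: in the classical constrained least squares filter, which the paper lifts into feature space, the high-pass response is minimized only subject to a data-fidelity constraint. This is a textbook sketch, not the paper's exact notation:

```latex
\min_{X}\ \|P X\|^2
\quad \text{s.t.} \quad
\|Y - K X\|^2 = c
\;\;\Longrightarrow\;\;
\hat{X}(u,v) = \frac{\overline{K(u,v)}\, Y(u,v)}{|K(u,v)|^2 + \gamma\,|P(u,v)|^2}
```

Here $P$ is high-pass (e.g. a Laplacian), so the penalty $\|PX\|^2$ suppresses the high-frequency noise and ringing that naive inverse filtering by $1/K$ would amplify, while the constraint forces $X$ to stay consistent with the blurry observation $Y$. The net effect is a deblurred $X$, with $\gamma$ acting as an implicit Lagrange multiplier.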
Hello, when I run train.py, the following error occurs in dcls_arch.py:
File "/home/wu/DCLS/codes/config/DCLS/models/modules/dcls_arch.py", line 87, in forward
    clear_features[:, i:i+1, :, :] = clear_feature_ch[:, :, ks:-ks, ks:-ks]
RuntimeError: expand(torch.cuda.FloatTensor{[64, 1, 64, 64, 2, 2]}, size=[64, 1, 64, 64]): the number of sizes provided (4) must be greater or equal to the number of dimensions in the tensor (6)
The relevant source code is here:
class CLS(nn.Module):
    def __init__(self, nf, reduction=4):
        super().__init__()
        self.reduce_feature = nn.Conv2d(nf, nf//reduction, 1, 1, 0)
        self.grad_filter = nn.Sequential(
            nn.Conv2d(nf//reduction, nf//reduction, 3),
            nn.LeakyReLU(0.1, inplace=True),
            nn.Conv2d(nf//reduction, nf//reduction, 3),
            nn.LeakyReLU(0.1, inplace=True),
            nn.Conv2d(nf//reduction, nf//reduction, 3),
            nn.AdaptiveAvgPool2d((3, 3)),
            nn.Conv2d(nf//reduction, nf//reduction, 1),
        )
        self.expand_feature = nn.Conv2d(nf//reduction, nf, 1, 1, 0)

    def forward(self, x, kernel):
        cls_feats = self.reduce_feature(x)
        kernel_P = torch.exp(self.grad_filter(cls_feats))
        kernel_P = kernel_P - kernel_P.mean(dim=(2, 3), keepdim=True)
        clear_features = torch.zeros(cls_feats.size()).to(x.device)
        print(clear_features.shape)
        ks = kernel.shape[-1]
        dim = (ks, ks, ks, ks)
        feature_pad = F.pad(cls_feats, dim, "replicate")
        for i in range(feature_pad.shape[1]):
            feature_ch = feature_pad[:, i:i+1, :, :]
            print(feature_ch.shape)
            clear_feature_ch = get_uperleft_denominator(feature_ch, kernel, kernel_P[:, i:i+1, :, :])
            print(clear_feature_ch)
            clear_features[:, i:i+1, :, :] = clear_feature_ch[:, :, ks:-ks, ks:-ks]
        x = self.expand_feature(clear_features)
        return x
The line clear_feature_ch = get_uperleft_denominator(feature_ch, kernel, kernel_P[:, i:i+1, :, :]) in this code calls the function get_uperleft_denominator, which is defined as follows:
def get_uperleft_denominator(img, kernel, grad_kernel):
    ker_f = convert_psf2otf(kernel, img.size())       # discrete fourier transform of kernel
    ker_p = convert_psf2otf(grad_kernel, img.size())  # discrete fourier transform of kernel
    denominator = inv_fft_kernel_est(ker_f, ker_p)
    numerator = torch.fft.fftn(img, dim=(-3, -2, -1))
    numerator = torch.stack((numerator.real, numerator.imag), -1)
    # numerator = torch.fft.ifft2(torch.complex(img[..., 0], img[..., 1]), dim=(-3, -2, -1))
    deblur = deconv(denominator, numerator)
    return deblur
where the function convert_psf2otf is as follows:
def convert_psf2otf(ker, size):
    psf = torch.zeros(size).cuda()
    # circularly shift
    centre = ker.shape[2]//2 + 1
    psf[:, :, :centre, :centre] = ker[:, :, (centre-1):, (centre-1):]
    psf[:, :, :centre, -(centre-1):] = ker[:, :, (centre-1):, :(centre-1)]
    psf[:, :, -(centre-1):, :centre] = ker[:, :, :(centre-1), (centre-1):]
    psf[:, :, -(centre-1):, -(centre-1):] = ker[:, :, :(centre-1), :(centre-1)]
    # compute the otf
    # otf = torch.rfft(psf, 3, onesided=False)
    # otf = torch.fft.ifft2(torch.complex(psf[..., 0], psf[..., 1]), dim=(-3, -2, -1))
    otf = torch.fft.fftn(psf, dim=(-3, -2, -1))
    otf = torch.stack((otf.real, otf.imag), -1)
    return otf
Printing .shape for clear_features, feature_ch, and clear_feature_ch gives, respectively:
torch.Size([64, 16, 64, 64])
torch.Size([64, 1, 106, 106])
torch.Size([64, 1, 106, 106, 2, 2])
Why does this happen, and how should I modify this part? Thanks!
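The trailing 2×2 dimensions come from stacking (real, imag) pairs in `convert_psf2otf` and again on the numerator while `deconv` still expects the legacy layout. One way to avoid mixing layouts is to port the whole pipeline to native complex tensors (PyTorch ≥ 1.8). The sketch below is hypothetical, not the authors' code: `psf2otf_c` and `cls_deblur` are made-up names standing in for `convert_psf2otf`, `inv_fft_kernel_est`, and `deconv` combined, assuming `deconv` implements the CLS division conj(K)·Y / (|K|² + |P|²):

```python
import torch

def psf2otf_c(ker, size):
    # Zero-pad the PSF to the image size, circularly shift its centre
    # to the origin, then take the 2-D FFT (complex output, no stacking).
    psf = torch.zeros(size, dtype=ker.dtype, device=ker.device)
    centre = ker.shape[-1] // 2 + 1
    psf[..., :centre, :centre] = ker[..., centre - 1:, centre - 1:]
    psf[..., :centre, -(centre - 1):] = ker[..., centre - 1:, :centre - 1]
    psf[..., -(centre - 1):, :centre] = ker[..., :centre - 1, centre - 1:]
    psf[..., -(centre - 1):, -(centre - 1):] = ker[..., :centre - 1, :centre - 1]
    return torch.fft.fft2(psf)

def cls_deblur(img, kernel, grad_kernel):
    # Constrained-least-squares deconvolution in complex dtype:
    # X = conj(K) * Y / (|K|^2 + |P|^2); output stays 4-D (N, C, H, W).
    ker_f = psf2otf_c(kernel, img.shape)
    ker_p = psf2otf_c(grad_kernel, img.shape)
    img_f = torch.fft.fft2(img)
    deblur_f = ker_f.conj() * img_f / (ker_f.abs() ** 2 + ker_p.abs() ** 2)
    return torch.fft.ifft2(deblur_f).real
```

Because the result is real 4-D, the assignment `clear_features[:, i:i+1, :, :] = ...[:, :, ks:-ks, ks:-ks]` no longer sees extra dimensions. As a sanity check, a centred delta kernel makes `psf2otf_c` return an all-ones OTF, and with a zero `grad_kernel` the deblurring reduces to the identity.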
Hello, could you share the method you used to visualize the deblurred feature map R?
I have tried a few approaches, but the visualizations look very strange.
Hi, when I train the model under setting1_x2, I cannot obtain a good result. Maybe I missed something in my training; could you give some advice?
#### general settings
name: DCLSx2_setting1
use_tb_logger: true
model: blind
distortion: sr
scale: 2
gpu_ids: [0, 1, 2, 3]
pca_matrix_path: ../../../pca_matrix/DCLS/pca_matrix.pth

degradation:
  random_kernel: True
  ksize: 21
  code_length: 10
  sig_min: 0.2
  sig_max: 2.0
  rate_iso: 1.0
  random_disturb: false

#### datasets
datasets:
  train:
    name: DIV2K
    mode: GT
    dataroot_GT: /datasets/DF2K/HR/x2HR.lmdb
    use_shuffle: true
    n_workers: 4  # per GPU
    batch_size: 64
    GT_size: 128
    LR_size: 64
    use_flip: true
    use_rot: true
    color: RGB
  val:
    name: Set5
    mode: LQGT
    dataroot_GT: /datasets/Set5/x2HR.lmdb
    dataroot_LQ: /datasets/Set5/x2LRblur.lmdb

#### network structures
network_G:
  which_model_G: DCLS
  setting:
    nf: 64
    nb: 10
    ng: 5
    input_para: 256
    kernel_size: 21

#### path
path:
  pretrain_model_G: ~
  strict_load: true
  resume_state: ~

#### training settings: learning rate scheme, loss
train:
  lr_G: !!float 4e-4
  lr_E: !!float 4e-4
  lr_scheme: MultiStepLR
  beta1: 0.9
  beta2: 0.99
  niter: 500000
  warmup_iter: -1  # no warm up
  lr_steps: [200000, 400000]
  lr_gamma: 0.5
  eta_min: !!float 1e-7
  pixel_criterion: l1
  pixel_weight: 1.0
  manual_seed: 0
  val_freq: !!float 100

#### logger
logger:
  print_freq: 20
  save_checkpoint_freq: !!float 1000
These are my config settings. Thank you for your reply!
I have just started learning deep learning. Which file is the pretrained model? Is it DCLSx4_setting1.pth? I cannot find a file named DCLSx4_setting1.pth anywhere in the repository.
I'd like to ask: since the order of downsampling and blurring is swapped, deblurring happens before upsampling. How is aliasing handled then? Won't the interpolation during upsampling introduce aliasing artifacts?
Hello, I noticed that your data synthesis uses isotropic Gaussian blur kernels, and there is no code implementing anisotropic Gaussian kernels. Do you consider isotropic Gaussian kernels to be more general?
Hello,
MSU Graphics & Media Lab Video Group has recently launched two new Super-Resolution Benchmarks.
If you are interested in participating, you can add your algorithm following the submission steps:
We would be grateful for your feedback on our work!
Hello, the "Dataset Preparation" section of your README says "To transform datasets to binary files for efficient IO, run:", but when I run create_lmdb.py I don't know what training-data path img_folder should point to. In other words, when generating training data, should I set img_folder to the DIV2K HR images or to the LR images under DIV2Kx4?
Hello, could you tell me why I get the following error?
Traceback (most recent call last):
File "test.py", line 24, in
opt = option.parse(parser.parse_args().opt, is_train=False)
File "D:\pythonProject\7_4\DCLS-SR-master\DCLS-SR-master\codes\config\DCLS\options.py", line 60, in parse
config_dir = path.split("/")[-2]
IndexError: list index out of range
Why does this happen? It keeps failing. One more question: where should the prior (kernel estimation) model be placed? Could you describe the test procedure in detail? Testing doesn't require the training dataset, right? Thanks.
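The traceback suggests options.py derives the config directory by splitting the path on "/", which yields a single element for a Windows-style path like D:\pythonProject\...\options.py, so index [-2] is out of range. A hypothetical portable rewrite of that one line (the helper name is made up for illustration):

```python
# options.py computes config_dir = path.split("/")[-2]; on Windows the
# separator is "\\", so the split returns one element and [-2] raises
# IndexError. Normalising separators first makes both styles work.
def config_dir_from_opt(opt_path):
    parts = opt_path.replace("\\", "/").split("/")
    return parts[-2]

print(config_dir_from_opt("codes/config/DCLS/test_setting1.yml"))     # DCLS
print(config_dir_from_opt("codes\\config\\DCLS\\test_setting1.yml"))  # DCLS
```

Passing the -opt argument with forward slashes should also sidestep the error without code changes.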
Hello, thank you very much for your work!
In the paper you mention that the kernel estimation stage is supervised with the reformulated kernel, so the estimate should also be a reformulated kernel. But Figure 7 and Table 4 seem to compare kernels in the original kernel domain. How do you transform the estimated kernel back to the original kernel domain?
Hello, I am a beginner who just started with deep learning. My training keeps getting interrupted and I have never completed a run. How can I resume training from where it last stopped?
I see the path section of the options file contains:
pretrain_model_G: ~
strict_load: true
resume_state: ~ # true
Also, how do I recover the iteration count and epoch at the point of interruption, and what should I change in the code?
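In BasicSR-style codebases like this one, resuming usually requires no code change: point `resume_state` at a saved `.state` file and the trainer restores the optimizer, epoch, and iteration from it. A sketch under that assumption (the experiment name and iteration number below are placeholders; check the actual folder created by your previous run):

```yaml
path:
  pretrain_model_G: ~
  strict_load: true
  # Point at the last saved training state; the iteration and epoch
  # are stored inside the .state file and restored automatically.
  resume_state: ../../../experiments/DCLSx2_setting1/training_state/100000.state
```

The exact directory layout depends on where the trainer writes checkpoints, typically an experiments/<name>/training_state folder alongside the saved models.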
Hello, I'd like to ask how to handle rfft and irfft with torch 1.8 on an RTX 3090. After patching the code following online posts, it no longer raises errors, but the loss is always NaN. On my laptop's 2060 with the same torch 1.8, it runs without errors and the loss is normal.
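NaN losses after a hand-rolled rfft/irfft port are often caused by wrappers that are not exact inverses of each other (for example, transforming over two dims in one direction and three in the other, or dropping the imaginary part before the inverse transform). A round-trip-safe sketch of the commonly used compatibility wrappers, assuming the legacy onesided=False layout with a trailing (real, imag) dimension:

```python
import torch

def rfft_compat(x, signal_ndim):
    # Legacy torch.rfft(x, signal_ndim, onesided=False): full complex
    # spectrum returned as a real tensor with a trailing dim of size 2.
    dims = tuple(range(-signal_ndim, 0))
    f = torch.fft.fftn(x, dim=dims)
    return torch.stack((f.real, f.imag), dim=-1)

def irfft_compat(x, signal_ndim):
    # Exact inverse of the above; keep only the real part of the result.
    dims = tuple(range(-signal_ndim, 0))
    c = torch.complex(x[..., 0], x[..., 1])
    return torch.fft.ifftn(c, dim=dims).real
```

A quick diagnostic is to verify on the 3090 itself that `irfft_compat(rfft_compat(x, 2), 2)` reproduces `x` up to float error; if the round trip holds there but the loss is still NaN, the FFT wrappers are probably not the culprit.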
Hello, when I test Set5 with the model you provided, the PSNR is only about 7, and the images in the result folder are completely black. What could be the problem?