felixfuyihui / uformer

Uformer: A Unet based dilated complex & real dual-path conformer network for simultaneous speech enhancement and dereverberation

Python 100.00%

uformer's Introduction

👋 Hi, I’m 付艺辉/Yihui Fu
👀 I’m interested in Speech processing
🌱 I’m currently learning Quantitative Trading
💞️ I’m looking to collaborate on alcohol
📫 How to reach me [email protected]

uformer's People

Contributors

Stargazers

Watchers

Forkers

ishine shaun95 wangtianrui fragrantrookie alanliudx royandzoe yaoao2017 ouleiwa newoneincntk fixernhc birdyfun laozhanger

uformer's Issues

Can the Uformer run in real time

Hi,
I am unsure whether Uformer can run in real time.
Best Regards!

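Question about the parameter count of the reference models in the paper

Hello, I have read your paper and would like to ask a question:
In Table 1 of the Uformer paper, DCCRN is listed with 8.99M parameters, but the original DCCRN paper reports 3.7M. Could you tell me what parameter configuration you used for DCCRN in your paper? Does the DCCRN in your paper outperform the one in the original DCCRN paper (judging from the results, DCCRN even surpasses DCCRN+)? Thank you!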

Input error when running uformer.py

The paper is one of the best out there, congrats!
I am trying to run uformer.py but I get the following error:

RuntimeError: Given normalized_shape=[12], expected input with shape [*, 12], but got input of size[10, 64, 2, 749, 5]

Am I missing something here?
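
For context on the message itself: PyTorch's LayerNorm requires the trailing dimensions of its input to equal normalized_shape, so normalized_shape=[12] needs a last dimension of 12, while the tensor reported here ends in 5. The plain-Python sketch below mirrors only that shape rule. layer_norm_accepts is a hypothetical helper written for illustration; it is not part of PyTorch or of this repository, and it does not explain why the dimension is 5 instead of 12.

```python
def layer_norm_accepts(input_shape, normalized_shape):
    """Return True if the trailing dims of input_shape equal normalized_shape.

    Hypothetical helper: it mirrors the shape rule behind the error above,
    not the normalization itself.
    """
    input_shape = tuple(input_shape)
    normalized_shape = tuple(normalized_shape)
    k = len(normalized_shape)
    if k == 0 or k > len(input_shape):
        return False
    return input_shape[-k:] == normalized_shape


# Shape reported in the error: the last dimension is 5, not 12.
print(layer_norm_accepts((10, 64, 2, 749, 5), (12,)))   # False
# The same layout with 12 entries on the last axis would pass the check.
print(layer_norm_accepts((10, 64, 2, 749, 12), (12,)))  # True
```

Printing the tensor shape right before the failing call and comparing it with the value passed to LayerNorm is one quick way to see which of the two is off.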

Training

Is there a script I can run for training? Could you add the training script to the repository?

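Could the DCCRN PLUS implementation be open-sourced?

Hello,
I see that your paper compares against DCCRN plus. I have not been able to find an open-source implementation of it. Would you be able to release yours? Thank you!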

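Question about the loss

calloss_magmse, the full-band magnitude-spectrum loss, ends by dividing by the batch and frequency-bin dimensions, whereas the sub-band magnitude-spectrum loss (calloss_magmse_subband) divides by the batch size and the number of frames, since output_mag.shape[2] should be the T dimension.
Also, why is the loss not averaged over the T dimension? Using F.l1_loss with reduction set to mean would normally average over batch, F, and T. Does this choice affect the results?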
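
The difference between the two normalizations can be worked through on a toy example. The sketch below assumes a squared-error tensor laid out as [B, F, T] and compares summing and dividing by B*F against a mean over all of B, F, and T. The function names and values are illustrative only and are not taken from the repository's calloss_magmse.

```python
def sum_over_bf(err, B, F):
    """Sum every element of err[b][f][t], then divide by B * F only."""
    total = sum(e for b in err for f in b for e in f)
    return total / (B * F)


def mean_over_bft(err, B, F, T):
    """Average over batch, frequency and time (what reduction='mean' does)."""
    total = sum(e for b in err for f in b for e in f)
    return total / (B * F * T)


B, F, T = 2, 3, 4
# Toy squared errors with a constant value of 0.5 everywhere.
err = [[[0.5 for _ in range(T)] for _ in range(F)] for _ in range(B)]

a = sum_over_bf(err, B, F)       # 0.5 * T = 2.0
b = mean_over_bft(err, B, F, T)  # 0.5
print(a, b, a / b)               # 2.0 0.5 4.0
```

Under these assumptions the two values differ by exactly a factor of T. For fixed-length training segments this rescales the loss and its gradient by a constant, which mainly matters when the term is weighted against other loss terms. With variable-length inputs, longer utterances contribute more when T is not divided out.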

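Problem encountered during training

Hello, while reproducing Uformer I found that, during validation, the PESQ, STOI, and other metrics of the enhanced output are lower than those of the noisy input, and at test time the model outputs silence. I did not modify the network. Have you run into this before? If not, where do you think the problem might lie? I look forward to your reply.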

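Problem encountered when reproducing the network

Hello, the issue I opened yesterday seems to have disappeared. I received your reply by email; the training loss is normal. I am still looking for the root cause. If you could release the source code used to test the model, it would help me a great deal. One more question: why does the loss computation return three values? I look forward to your reply.

Yesterday's issue was: the model outputs silence at test time.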

Question about torch version

Hi felix,
Thanks for sharing this project, great work!
I encountered some issues when running your code directly with:
python uformer.py
I have tried torch versions from 1.8.1 to 1.10.

In trans.py, I need to change line 85 from
K = th.fft.fft(I / S, 1)
to
K = th.view_as_real(th.fft.fft(I / S, 1)).squeeze(2)

Then, in dsconv2d_cplx.py, I get:
Traceback (most recent call last):
  File "uformer.py", line 451, in <module>
    outputs = net(inputs,inputs)
  File "/data/tools/anaconda3/envs/uformer/lib/python3.7/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "uformer.py", line 336, in forward
    out, mag = self.conformer1(out, mag)
  File "/data/tools/anaconda3/envs/uformer/lib/python3.7/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/apdcephfs/share_1149801/speech_user/tomasyu/experiment/frontend/Enhancement-Paas/ENHPAAS/model/UFORMER/Uformer/dilated_dualpath_conformer.py", line 118, in forward
    cplx = self.dsconv_cplxidx
  File "/data/tools/anaconda3/envs/uformer/lib/python3.7/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/apdcephfs/share_1149801/speech_user/tomasyu/experiment/frontend/Enhancement-Paas/ENHPAAS/model/UFORMER/Uformer/dsconv2d_cplx.py", line 48, in forward
    y = self.layernorm_conv1(x.transpose(2,4)).transpose(2,4)
  File "/data/tools/anaconda3/envs/uformer/lib/python3.7/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/data/tools/anaconda3/envs/uformer/lib/python3.7/site-packages/torch/nn/modules/normalization.py", line 171, in forward
    input, self.normalized_shape, self.weight, self.bias, self.eps)
  File "/data/tools/anaconda3/envs/uformer/lib/python3.7/site-packages/torch/nn/functional.py", line 2205, in layer_norm
    return torch.layer_norm(input, normalized_shape, weight, bias, eps, torch.backends.cudnn.enabled)
RuntimeError: Given normalized_shape=[12], expected input with shape [*, 12], but got input of size[10, 64, 2, 749, 5]

Could you give me some clues?

Best,
tomasyu
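
One detail that may be worth checking in the trans.py change: in torch 1.8 and later, torch.fft is a module, and the second positional argument of torch.fft.fft is n, the signal length to trim or zero-pad to, whereas the legacy torch.fft function took signal_ndim in that position. The plain-Python sketch below imitates only that n behavior with a naive DFT. naive_dft is a hypothetical stand-in, not a PyTorch call, and whether this matters for trans.py depends on the shape of I, which is not shown in the report.

```python
import cmath


def naive_dft(signal, n=None):
    """Naive 1-D DFT; n trims or zero-pads the input, as in torch.fft.fft."""
    if n is None:
        n = len(signal)
    # Trim to n samples, or zero-pad when the signal is shorter than n.
    x = list(signal[:n]) + [0.0] * (n - len(signal))
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


row = [0.0, 1.0, 0.0, 0.0]    # one row of an identity matrix
full = naive_dft(row)         # transform over all 4 samples
short = naive_dft(row, 1)     # n=1 keeps only the first sample

print(len(full), len(short))  # 4 1
```

If the 1 in the original call was carried over from the legacy signal_ndim argument, it would now be read as a length of 1, so the result may need to be compared numerically against the older torch version and not only by shape.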

your work's torch version

Thanks for your creative work. I encountered a problem while reproducing it, and I suspect it is a version issue. The error is as follows:
  File "D:\DLSPEECHENHANCEMENT\Uformer\Uformer-main\uformer\conv2d_cplx.py", line 37, in forward
    out_real1 = self.real_conv(inputs_real)
  File "D:\Anaconda\envs\TCNN\lib\site-packages\torch\nn\modules\module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "D:\Anaconda\envs\TCNN\lib\site-packages\torch\nn\modules\conv.py", line 457, in forward
    return self._conv_forward(input, self.weight, self.bias)
  File "D:\Anaconda\envs\TCNN\lib\site-packages\torch\nn\modules\conv.py", line 453, in _conv_forward
    return F.conv2d(input, weight, bias, self.stride,
RuntimeError: y.get_desc().is_nhwc() INTERNAL ASSERT FAILED at "C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\mkldnn\Conv.cpp":143, please report a bug to PyTorch.
Training with batchsize=8 runs without error, but the error appears during validation.
