uformer's Introduction

  • 👋 Hi, I’m 付艺辉/Yihui Fu
  • 👀 I’m interested in Speech processing
  • 🌱 I’m currently learning Quantitative Trading
  • 💞️ I’m looking to collaborate on alcohol
  • 📫 How to reach me [email protected]

uformer's People

Contributors

felixfuyihui

uformer's Issues

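Question about the parameter count of the reference model in the paper

Hello, I read your paper and would like to ask a question:
In Table 1 of the UFormer paper, the parameter count of DCCRN is listed as 8.99M, but the original DCCRN paper reports 3.7M. What parameter configuration did you use for DCCRN in your paper? Does the DCCRN in your paper outperform the one in the original DCCRN paper (judging from the results, DCCRN even surpasses DCCRN+)? Thank you!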

Input error when running uformer.py

The paper is one of the best out there, congrats!
I am trying to run uformer.py but I get the following error:

RuntimeError: Given normalized_shape=[12], expected input with shape [*, 12], but got input of size[10, 64, 2, 749, 5]

Am I missing something here?
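
For readers hitting the same message: layer normalization requires the trailing dimensions of its input to equal `normalized_shape`. The sketch below reproduces only that shape rule in plain Python so the mismatch in the error can be read off directly. `layer_norm_shape_error` is a hypothetical helper written for illustration; it is not part of PyTorch or of this repository, and it does not explain why the tensor ends up with this shape.

```python
def layer_norm_shape_error(input_shape, normalized_shape):
    """Return None if the trailing dims match, else a short description.

    Mirrors only the shape rule of layer normalization: the last
    len(normalized_shape) dimensions of the input must equal normalized_shape.
    """
    input_shape = list(input_shape)
    normalized_shape = list(normalized_shape)
    start = max(len(input_shape) - len(normalized_shape), 0)
    trailing = input_shape[start:]
    if trailing == normalized_shape:
        return None
    return f"expected trailing dims {normalized_shape}, got {trailing}"


# Shape taken from the error message in this issue.
reported = [10, 64, 2, 749, 5]
print(layer_norm_shape_error(reported, [12]))

# Hypothetical shape whose last dimension is 12, for comparison.
print(layer_norm_shape_error([10, 64, 2, 749, 12], [12]))
```

In the reported case the last dimension is 5 while the layer expects 12, so the tensor reaching the normalization layer has a different layout from the one the layer was built for.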

Training

Is there any script to run and use for training? Can you update the training script?

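Can the DCCRN PLUS implementation be open-sourced?

Hello,
I see that your paper compares against DCCRN plus. So far I have not been able to find an open-source implementation of it. Could you open-source yours? Thank you!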

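Question about the loss

  1. The full-band magnitude-spectrum loss, calloss_magmse, is finally divided by the batch and frequency-bin dimensions, whereas the sub-band magnitude-spectrum loss (calloss_magmse_subband) is divided by the batch size and the number of frames, since output_mag.shape[2] should be the T dimension;
  2. Also, why not average over the T dimension as well? Usually, F.l1_loss with reduction set to mean averages over the batch, F, and T dimensions. Does this have any effect on the results?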
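
To make the question concrete, here is a minimal sketch of the two normalizations being compared: summing the per-element error and dividing by batch and frequency bins only, versus averaging over batch, frequency, and time. It uses absolute error on nested `[B][F][T]` lists as a stand-in; the function names are hypothetical and the repository's actual loss functions are not reproduced here.

```python
def abs_error_sum(est, ref):
    """Sum of |est - ref| over nested [B][F][T] lists."""
    return sum(
        abs(e - r)
        for est_b, ref_b in zip(est, ref)
        for est_f, ref_f in zip(est_b, ref_b)
        for e, r in zip(est_f, ref_f)
    )


def loss_sum_over_bf(est, ref):
    """Total error divided by batch size and number of frequency bins only."""
    B, F = len(est), len(est[0])
    return abs_error_sum(est, ref) / (B * F)


def loss_mean(est, ref):
    """Total error divided by B * F * T, i.e. a plain mean over all elements."""
    B, F, T = len(est), len(est[0]), len(est[0][0])
    return abs_error_sum(est, ref) / (B * F * T)


# Toy data: every estimate is off by 0.5.
B, F, T = 2, 3, 4
ref = [[[0.0] * T for _ in range(F)] for _ in range(B)]
est = [[[0.5] * T for _ in range(F)] for _ in range(B)]

print(loss_sum_over_bf(est, ref))  # 2.0
print(loss_mean(est, ref))         # 0.5
```

With a fixed number of frames, the two values differ only by the constant factor T, so the gradients are scaled by the same factor. Whether that matters in practice depends on the optimizer, the learning rate, and whether T varies between batches.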

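Problem encountered during training

Dear author, while reproducing Uformer I found that during validation the evaluation metrics of the enhanced speech, such as PESQ and STOI, are lower than those of the noisy input, and at test time the model outputs silence. I made no changes to the network. Have you run into this situation? Or where do you think the problem lies? I look forward to your early reply.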

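Problem encountered when reproducing the network

Dear author, yesterday's question has somehow disappeared. I received your reply by email, and the training loss is normal. I am still looking for the root cause. If you could release the source code for testing the model, it would help me a great deal. One thing I would like to ask: why does the loss computation return three values? I look forward to your reply.

Yesterday's question was: the model outputs silence at test time.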

Question about torch version

Hi felix,
Thanks for sharing this project, great work!
I encountered some issues when running your code directly with:
python uformer.py
I have tried torch versions from 1.8.1 to 1.10.

In trans.py, I needed to change line 85 from
K = th.fft.fft(I / S, 1)
to
K = th.view_as_real(th.fft.fft(I / S, 1)).squeeze(2)

Then, in dsconv2d_cplx.py, I get:
Traceback (most recent call last):
  File "uformer.py", line 451, in <module>
    outputs = net(inputs,inputs)
  File "/data/tools/anaconda3/envs/uformer/lib/python3.7/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "uformer.py", line 336, in forward
    out, mag = self.conformer1(out, mag)
  File "/data/tools/anaconda3/envs/uformer/lib/python3.7/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/apdcephfs/share_1149801/speech_user/tomasyu/experiment/frontend/Enhancement-Paas/ENHPAAS/model/UFORMER/Uformer/dilated_dualpath_conformer.py", line 118, in forward
    cplx = self.dsconv_cplxidx
  File "/data/tools/anaconda3/envs/uformer/lib/python3.7/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/apdcephfs/share_1149801/speech_user/tomasyu/experiment/frontend/Enhancement-Paas/ENHPAAS/model/UFORMER/Uformer/dsconv2d_cplx.py", line 48, in forward
    y = self.layernorm_conv1(x.transpose(2,4)).transpose(2,4)
  File "/data/tools/anaconda3/envs/uformer/lib/python3.7/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/data/tools/anaconda3/envs/uformer/lib/python3.7/site-packages/torch/nn/modules/normalization.py", line 171, in forward
    input, self.normalized_shape, self.weight, self.bias, self.eps)
  File "/data/tools/anaconda3/envs/uformer/lib/python3.7/site-packages/torch/nn/functional.py", line 2205, in layer_norm
    return torch.layer_norm(input, normalized_shape, weight, bias, eps, torch.backends.cudnn.enabled)
RuntimeError: Given normalized_shape=[12], expected input with shape [*, 12], but got input of size[10, 64, 2, 749, 5]

Can you give me some clues?

Best,
tomasyu
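
The workaround in this issue relies on `th.view_as_real`, which returns a real-valued view of a complex tensor with an extra trailing dimension of size 2 holding the real and imaginary parts. The plain-Python sketch below shows only that layout on a list of complex numbers; `as_real_pairs` is a hypothetical helper for illustration, not the PyTorch implementation.

```python
def as_real_pairs(values):
    """Map a list of complex numbers to [real, imag] pairs.

    A list of length n becomes an n x 2 nested list, mirroring the extra
    trailing dimension of size 2 in a real-valued view of complex data.
    """
    return [[z.real, z.imag] for z in values]


spectrum = [1 + 2j, 3 - 4j, 0.5j]
pairs = as_real_pairs(spectrum)

print(pairs)                      # [[1.0, 2.0], [3.0, -4.0], [0.0, 0.5]]
print(len(pairs), len(pairs[0]))  # 3 2
```

Code written against a real-valued layout with a trailing size-2 dimension would expect this shape, which is consistent with the change made in trans.py. The issue itself does not establish whether the later layer normalization error has the same origin.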

your work's torch version

Thanks for your creative work. I encountered a problem while reproducing it, and I suspect it is a version problem. The error is as follows:
  File "D:\DLSPEECHENHANCEMENT\Uformer\Uformer-main\uformer\conv2d_cplx.py", line 37, in forward
    out_real1 = self.real_conv(inputs_real)
  File "D:\Anaconda\envs\TCNN\lib\site-packages\torch\nn\modules\module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "D:\Anaconda\envs\TCNN\lib\site-packages\torch\nn\modules\conv.py", line 457, in forward
    return self._conv_forward(input, self.weight, self.bias)
  File "D:\Anaconda\envs\TCNN\lib\site-packages\torch\nn\modules\conv.py", line 453, in _conv_forward
    return F.conv2d(input, weight, bias, self.stride,
RuntimeError: y.get_desc().is_nhwc() INTERNAL ASSERT FAILED at "C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\mkldnn\Conv.cpp":143, please report a bug to PyTorch.
There is no error during training with a batch size of 8, but the error occurs during validation.
