wenet-e2e / wekws

Production First and Production Ready End-to-End Keyword Spotting Toolkit

License: Apache License 2.0

Python 76.27% Shell 2.39% C++ 15.17% CMake 0.96% Java 5.21%

wekws's Issues

Finetuning support

Hi,
Is there any way to train a KWS model for a new wake word without much data?

Or can I finetune an existing model, one trained on a good amount of data, on the new wake word? If so, how?

Thanks

Demo Code ?

Hello,

Is there any demo code for real-time wake word usage?

Thanks for these great lightweight models.

Error while applying RIRS augmentation

I'm getting the following error while applying RIRS augmentation to my training dataset.
ValueError: volume and kernel should have the same dimensionality

I downloaded the RIRS dataset, processed it, and created the lmdb file successfully; the error occurs during training.

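Error when initializing a WAV file with wenet::WavReader

Hello. I built the onnxruntime project on a Linux machine and produced the kws_main executable. I recorded a 17 s clip myself (sample rate 16000).
When I run kws_main, loading the WAV fails: num_channel_ and sample_rate in the header are both 0.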
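
One way to narrow this down is to inspect the file header independently of the C++ reader. The sketch below uses Python's standard `wave` module to print the channel count, sample rate, and bit depth. It assumes, without confirmation from this report, that the recording is expected to be plain 16-bit PCM; the `write_silence` helper exists only to make the example self-contained.

```python
import os
import tempfile
import wave

def describe_wav(path):
    """Return the main header fields of a PCM WAV file as a dict."""
    with wave.open(path, "rb") as wav:
        return {
            "channels": wav.getnchannels(),
            "sample_rate": wav.getframerate(),
            "bits_per_sample": wav.getsampwidth() * 8,
            "duration_s": wav.getnframes() / wav.getframerate(),
        }

def write_silence(path, sample_rate=16000, seconds=1):
    """Write a silent mono 16-bit PCM file, only to exercise describe_wav."""
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 2 bytes per sample = 16-bit PCM
        wav.setframerate(sample_rate)
        wav.writeframes(b"\x00\x00" * sample_rate * seconds)

# Demo on a generated file; replace demo_path with the real recording.
demo_path = os.path.join(tempfile.mkdtemp(), "demo.wav")
write_silence(demo_path)
info = describe_wav(demo_path)
```

If `wave.open` raises `wave.Error` on the real recording, the file is probably not uncompressed PCM, and re-exporting it as 16-bit PCM may be worth trying before debugging the reader itself.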

After step 2 prints "start training", the following error keeps appearing:
Traceback (most recent call last):
File "wekws/bin/export_onnx.py", line 96, in
main()
File "wekws/bin/export_onnx.py", line 46, in main
load_checkpoint(model, args.checkpoint)
File "/home/ying.liu12/wekws/wekws/utils/checkpoint.py", line 27, in load_checkpoint
checkpoint = torch.load(path)
File "/home/ying.liu12/anaconda3/envs/wekws/lib/python3.8/site-packages/torch/serialization.py", line 594, in load
with _open_file_like(f, 'rb') as opened_file:
File "/home/ying.liu12/anaconda3/envs/wekws/lib/python3.8/site-packages/torch/serialization.py", line 230, in _open_file_like
return _open_file(name_or_buffer, mode)
File "/home/ying.liu12/anaconda3/envs/wekws/lib/python3.8/site-packages/torch/serialization.py", line 211, in init
super(_open_file, self).init(open(name, mode))
FileNotFoundError: [Errno 2] No such file or directory: '/home/ying.liu12/wekws/examples/hi_xiaowen/s0/exp/ds_tcn/avg_30.pt'

The paths in run.sh are set as follows:
config=/home/ying.liu12/wekws/examples/hi_xiaowen/s0/conf/ds_tcn.yaml
norm_mean=true
norm_var=true
gpus="0"

checkpoint=
dir=/home/ying.liu12/wekws/examples/hi_xiaowen/s0/exp/ds_tcn

num_average=30
score_checkpoint=$dir/avg_${num_average}.pt

download_dir=/data/ying.liu/mobvoi_hotword_dataset_detail/mobvoi_hotword_dataset

The directory path/to/s0/exp/ds_tcn/ does exist. What should I do about this error?
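
The traceback shows that the export step cannot find avg_30.pt, so a useful first check is whether training ever wrote enough epoch checkpoints for the averaging step to produce that file. The sketch below is a hypothetical pre-flight check, not part of wekws. It assumes epoch checkpoints are named `<epoch>.pt` (as in `exp/mdtc_small/85.pt` elsewhere on this page) and that the averaged model is named `avg_<num_average>.pt`, as in the run.sh settings above.

```python
import os
import re
import tempfile

def checkpoint_report(exp_dir, num_average):
    """Summarise which checkpoint files exist in an experiment directory."""
    names = os.listdir(exp_dir) if os.path.isdir(exp_dir) else []
    # Assumption: epoch checkpoints are named "<epoch>.pt", e.g. "85.pt".
    epochs = sorted(int(n[:-3]) for n in names if re.fullmatch(r"\d+\.pt", n))
    avg_name = f"avg_{num_average}.pt"
    return {
        "epoch_checkpoints": len(epochs),
        "enough_to_average": len(epochs) >= num_average,
        "average_exists": avg_name in names,
    }

# Demo on a temporary directory that holds only three epoch checkpoints.
demo_dir = tempfile.mkdtemp()
for epoch in range(3):
    open(os.path.join(demo_dir, f"{epoch}.pt"), "wb").close()
report = checkpoint_report(demo_dir, num_average=30)
```

If `enough_to_average` is false for the real directory, the earlier training stage most likely stopped early, and its log is the place to look rather than the export script.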

MDTC causal config is missing, which causes training to fail

Traceback (most recent call last):
File "kws/bin/train.py", line 230, in
main()
File "kws/bin/train.py", line 141, in main
model = init_model(configs['model'])
File "/home/pengteng.spt/wekws-master/kws/model/kws_model.py", line 125, in init_model
causal = configs['backbone']['causal']
KeyError: 'causal'
Traceback (most recent call last):
File "kws/bin/train.py", line 230, in
main()
File "kws/bin/train.py", line 141, in main
model = init_model(configs['model'])
File "/home/pengteng.spt/wekws-master/kws/model/kws_model.py", line 125, in init_model
causal = configs['backbone']['causal']
KeyError: 'causal'
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 1) local_rank: 0 (pid: 101738) of binary: /home/pengteng.spt/miniconda2/envs/wenet/bin/python
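
The KeyError means the `backbone` section of the loaded config has no `causal` entry. Two obvious workarounds are to add a `causal:` line under `backbone` in the YAML file, or to read the key with a fallback. The sketch below illustrates the second option on plain dictionaries. The helper name and the default of `False` are assumptions made for illustration, not the maintainers' fix.

```python
def read_causal_flag(model_configs, default=False):
    """Return backbone.causal, falling back to a default when the key is absent."""
    backbone = model_configs.get("backbone", {})
    return bool(backbone.get("causal", default))

# A config without the key (as in the traceback) and one that sets it.
old_style = {"backbone": {"type": "mdtc"}}
new_style = {"backbone": {"type": "mdtc", "causal": True}}
flags = (read_causal_flag(old_style), read_causal_flag(new_style))
```

Whether a silent default is appropriate depends on the model: a streaming deployment that needs causal convolutions should probably set the key explicitly rather than rely on a fallback.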

memory leak using LookupCustomMetadataMap

Describe the bug
pointer returned by LookupCustomMetadataMap must be released using allocator.Free();
(https://github.com/wenet-e2e/wekws/blob/main/runtime/core/kws/keyword_spotting.cc#L38-L41)

WeKws Roadmap 2.0

WeKws is a community-driven project and we love your feedback and proposals on where we should be heading.
Feel free to volunteer if you are interested in trying out some items (they do not have to be on the list).

The following items are in 2.0:

Robustness: improve robustness by learning acoustic features rather than other features of the keywords.
Various on-device chips support.
Unsupervised model or pretrained model exploration.

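A few questions about using CMVN; guidance would be appreciated

1. In the provided examples, hey snips and hi xiaowen use CMVN normalization, while speech command v1 does not. Is there a reason for this?
2. If the data distribution of the training set differs from that of the test scenario, what do you recommend when using CMVN?
(For example, the training set was collected with various phones, while the test scenario is audio captured by low-end mics, for speech recognition on toys.)
Should we collect some data from the test scenario and compute CMVN offline? That approach cannot enumerate every type of mic, and audio captured by different mics has different distributions. How should this be handled? Is there a good trick?
Or should the CMVN values be computed dynamically?
Or should we simply use the CMVN values from the training set?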

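Error when exporting the fsmn_ctc model to ONNX!

Describe the bug
When I run run_fsmn_ctc.sh to train, everything works up to that point; when it reaches stage 4 and exports the ONNX model, it fails with an error:

Judging from the error message, the cache seems to be configured incorrectly, but I don't know how to set it correctly.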

Question: time stamp of recognized keyword

Hi Team, thanks for this interesting project.

Can we get a timestamp for a recognized keyword?
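
If per-frame keyword scores are available, an approximate timestamp can be derived from the index of the first frame that crosses the detection threshold. The sketch below assumes one score per feature frame, a 10 ms frame shift, no frame subsampling, and a threshold of 0.5. All of these are illustrative assumptions, not values taken from wekws.

```python
def first_trigger_time(scores, threshold=0.5, frame_shift_ms=10):
    """Return the time in seconds of the first frame whose score reaches threshold.

    Returns None when no frame triggers.
    """
    for index, score in enumerate(scores):
        if score >= threshold:
            return index * frame_shift_ms / 1000.0
    return None

# Hypothetical per-frame scores for one utterance.
frame_scores = [0.01, 0.02, 0.10, 0.70, 0.95, 0.40]
trigger_s = first_trigger_time(frame_scores)
```

Note that this marks the moment the detector fires, which usually falls near the end of the keyword rather than at its start.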

Error when running the Android demo

GRU-based model?

I wonder why there isn't any GRU-based model implemented here.

Reading your paper, especially Fig. 1-4, I have the impression that the GRU-based model performed better than its TCN-based counterpart. I understand that a production system has more considerations and constraints (model size, FLOPS/energy, etc.) than just FAR/FRR. I just want to make sure I didn't miss anything.

Some questions about the example "Nihao, wenwen"..

I think wenet-kws is a pretty good project, but I ran the example and found some problems.

First, I don't understand why there is an assert kernel_size % 2 == 0 at line 189 of kws/model/mdtc.py,
given that the config in mdtc_small.yaml sets kernel_size = 5.

Then I commented out the assert kernel_size % 2 == 0, and the result is below:

2021-12-02 06:53:06,227 DEBUG CV Batch 99/350 loss 0.00088373 acc 1.00000000 history loss 0.00451618
2021-12-02 06:53:07,493 DEBUG CV Batch 99/360 loss 0.01370203 acc 0.99000000 history loss 0.00447106
2021-12-02 06:53:08,590 DEBUG CV Batch 99/370 loss 0.00051790 acc 1.00000000 history loss 0.00438003

Namespace(dst_model='exp/mdtc_small/avg_10.pt', max_epoch=65536, min_epoch=0, num=10, src_path='exp/mdtc_small', val_best=True)
exp/mdtc_small/84.yaml {'cv_loss': 0.004494945056033555, 'epoch': 84, 'lr': 1e-06}
exp/mdtc_small/85.yaml {'cv_loss': 0.00420595490825506, 'epoch': 85, 'lr': 1e-06}
exp/mdtc_small/80.yaml {'cv_loss': 0.004422268416391168, 'epoch': 80, 'lr': 1.953125e-06}
exp/mdtc_small/86.yaml {'cv_loss': 0.00426848801635275, 'epoch': 86, 'lr': 1e-06}
exp/mdtc_small/40.yaml {'cv_loss': 0.005622736757912678, 'epoch': 40, 'lr': 0.000125}
exp/mdtc_small/12.yaml {'cv_loss': 0.014624565885012986, 'epoch': 12, 'lr': 0.0005}
exp/mdtc_small/19.yaml {'cv_loss': 0.01066642991893427, 'epoch': 19, 'lr': 0.0005}
exp/mdtc_small/67.yaml {'cv_loss': 0.004372600030651831, 'epoch': 67, 'lr': 7.8125e-06}
exp/mdtc_small/3.yaml {'cv_loss': 0.021692731020592233, 'epoch': 3, 'lr': 0.001}
exp/mdtc_small/64.yaml {'cv_loss': 0.004502788117951097, 'epoch': 64, 'lr': 1.5625e-05}
exp/mdtc_small/79.yaml {'cv_loss': 0.00467467649684301, 'epoch': 79, 'lr': 1.953125e-06}
exp/mdtc_small/61.yaml {'cv_loss': 0.005073365263766749, 'epoch': 61, 'lr': 1.5625e-05}
exp/mdtc_small/54.yaml {'cv_loss': 0.00454669024873601, 'epoch': 54, 'lr': 3.125e-05}
exp/mdtc_small/58.yaml {'cv_loss': 0.004749447677743553, 'epoch': 58, 'lr': 3.125e-05}
exp/mdtc_small/49.yaml {'cv_loss': 0.005313400925359438, 'epoch': 49, 'lr': 6.25e-05}
exp/mdtc_small/1.yaml {'cv_loss': 0.026169386569775077, 'epoch': 1, 'lr': 0.001}
exp/mdtc_small/29.yaml {'cv_loss': 0.008564605423066796, 'epoch': 29, 'lr': 0.00025}
exp/mdtc_small/11.yaml {'cv_loss': 0.015435188782970496, 'epoch': 11, 'lr': 0.0005}
exp/mdtc_small/63.yaml {'cv_loss': 0.0047399615163172355, 'epoch': 63, 'lr': 1.5625e-05}
exp/mdtc_small/82.yaml {'cv_loss': 0.004607287351367818, 'epoch': 82, 'lr': 1e-06}
exp/mdtc_small/83.yaml {'cv_loss': 0.004391176717157161, 'epoch': 83, 'lr': 1e-06}
exp/mdtc_small/97.yaml {'cv_loss': 0.0043467484049330655, 'epoch': 97, 'lr': 1e-06}
exp/mdtc_small/28.yaml {'cv_loss': 0.007137542134704971, 'epoch': 28, 'lr': 0.00025}
exp/mdtc_small/98.yaml {'cv_loss': 0.004537665521999014, 'epoch': 98, 'lr': 1e-06}
exp/mdtc_small/52.yaml {'cv_loss': 0.004649514840120405, 'epoch': 52, 'lr': 3.125e-05}
exp/mdtc_small/56.yaml {'cv_loss': 0.00448786976629719, 'epoch': 56, 'lr': 3.125e-05}
exp/mdtc_small/39.yaml {'cv_loss': 0.005829682733898623, 'epoch': 39, 'lr': 0.000125}

exp/mdtc_small/71.yaml {'cv_loss': 0.004532397123410156, 'epoch': 71, 'lr': 3.90625e-06}
exp/mdtc_small/65.yaml {'cv_loss': 0.004342411889022515, 'epoch': 65, 'lr': 7.8125e-06}
exp/mdtc_small/35.yaml {'cv_loss': 0.007356636884470326, 'epoch': 35, 'lr': 0.00025}
exp/mdtc_small/92.yaml {'cv_loss': 0.004499009573627954, 'epoch': 92, 'lr': 1e-06}
exp/mdtc_small/73.yaml {'cv_loss': 0.004482646395611368, 'epoch': 73, 'lr': 3.90625e-06}
exp/mdtc_small/72.yaml {'cv_loss': 0.004230009907933799, 'epoch': 72, 'lr': 3.90625e-06}
exp/mdtc_small/25.yaml {'cv_loss': 0.007332921293077766, 'epoch': 25, 'lr': 0.00025}
exp/mdtc_small/62.yaml {'cv_loss': 0.005137007407078835, 'epoch': 62, 'lr': 1.5625e-05}
exp/mdtc_small/13.yaml {'cv_loss': 0.010610442829166993, 'epoch': 13, 'lr': 0.0005}
exp/mdtc_small/55.yaml {'cv_loss': 0.004722100089821484, 'epoch': 55, 'lr': 3.125e-05}
exp/mdtc_small/99.yaml {'cv_loss': 0.004347019938513233, 'epoch': 99, 'lr': 1e-06}
exp/mdtc_small/26.yaml {'cv_loss': 0.007253867137949056, 'epoch': 26, 'lr': 0.00025}
exp/mdtc_small/21.yaml {'cv_loss': 0.010421987414134434, 'epoch': 21, 'lr': 0.0005}
exp/mdtc_small/32.yaml {'cv_loss': 0.0077251731347400005, 'epoch': 32, 'lr': 0.00025}
best val scores = [0.00420595 0.00423001 0.00423915 0.00424814 0.00426849 0.00432958
 0.00434241 0.00434298 0.00434621 0.00434675]
selected epochs = [85 72 77 81 86 89 65 87 95 97]
['exp/mdtc_small/85.pt', 'exp/mdtc_small/72.pt', 'exp/mdtc_small/77.pt', 'exp/mdtc_small/81.pt', 'exp/mdtc_small/86.pt', 'exp/mdtc_small/89.pt', 'exp/mdtc_small/65.pt', 'exp/mdtc_small/87.pt', 'exp/mdtc_small/95.pt', 'exp/mdtc_small/97.pt']
Processing exp/mdtc_small/85.pt
Processing exp/mdtc_small/72.pt
Processing exp/mdtc_small/77.pt
Processing exp/mdtc_small/81.pt
Processing exp/mdtc_small/86.pt
Processing exp/mdtc_small/89.pt
Processing exp/mdtc_small/65.pt
Processing exp/mdtc_small/87.pt
Processing exp/mdtc_small/95.pt
Processing exp/mdtc_small/97.pt
Saving to exp/mdtc_small/avg_10.pt
kernel_size = 5
stack_size = 4
num_stack = 3
hidden_dim = 32
Receptive Fields: 184
2021-12-02 06:53:14,000 INFO Checkpoint: loading from checkpoint exp/mdtc_small/avg_10.pt for GPU
Progress batch 0
Progress batch 10
Progress batch 20
Progress batch 30
Progress batch 40
Progress batch 50
Progress batch 60
Progress batch 70
...
Progress batch 280

I have three questions about the result:
Does cv_loss represent FRR? How is cv_loss defined?

How can I check the FRR results for "nihao wenwen" and "hi, xiaowen" separately? Only one result is printed.

How is FAR fixed at one false alarm per hour?
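
For context on these questions: cv_loss in the log is the loss on the cross-validation set, whereas FRR and false alarms per hour are measured by applying a decision threshold to the model's scores. The sketch below shows the generic calculation and how a threshold can be chosen to meet a budget of one false alarm per hour. It is not the wekws scoring script; the function names, the per-utterance peak scores, and the 0.01 threshold grid are all illustrative assumptions.

```python
def frr_and_fa_per_hour(keyword_scores, filler_scores, filler_hours, threshold):
    """Compute false reject rate and false alarms per hour at one threshold.

    keyword_scores: peak score of each utterance that contains the keyword.
    filler_scores:  peak score of each utterance that does not contain it.
    filler_hours:   total duration of the non-keyword audio, in hours.
    """
    false_rejects = sum(1 for s in keyword_scores if s < threshold)
    false_alarms = sum(1 for s in filler_scores if s >= threshold)
    return false_rejects / len(keyword_scores), false_alarms / filler_hours

def threshold_for_fa_rate(keyword_scores, filler_scores, filler_hours, max_fa_per_hour):
    """Scan thresholds from low to high; return the first that meets the FA budget."""
    for step in range(101):
        threshold = step / 100
        frr, fa = frr_and_fa_per_hour(
            keyword_scores, filler_scores, filler_hours, threshold)
        if fa <= max_fa_per_hour:
            return threshold, frr, fa
    return None

# Hypothetical scores: four keyword utterances, five fillers totalling 2 hours.
keyword_scores = [0.9, 0.8, 0.4, 0.95]
filler_scores = [0.1, 0.6, 0.2, 0.3, 0.05]
operating_point = threshold_for_fa_rate(keyword_scores, filler_scores, 2.0, 1.0)
```

With two keywords, the same calculation would be run once per keyword on that keyword's own scores, which yields one FRR per keyword at the chosen false-alarm rate.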

Thank you very much for taking the time to answer these questions! I believe wenet-kws will keep getting better!
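
The log in this report prints "Receptive Fields: 184" for kernel_size = 5, stack_size = 4, and num_stack = 3. That figure is consistent with the arithmetic sketched below, which assumes that dilations double within each stack (1, 2, 4, 8) and that one undilated preprocessing convolution with the same kernel size precedes the stacks. This structure is inferred from the numbers and has not been checked against mdtc.py.

```python
def mdtc_receptive_field(kernel_size, stack_size, num_stack):
    """Receptive field under the assumed layout: one undilated convolution,
    then num_stack stacks whose layers use dilations 1, 2, 4, ..."""
    span = kernel_size - 1  # frames added by one undilated convolution
    per_stack = sum(span * (2 ** i) for i in range(stack_size))
    return span + num_stack * per_stack

receptive_field = mdtc_receptive_field(kernel_size=5, stack_size=4, num_stack=3)
```

Under these assumptions each stack contributes 4 * (1 + 2 + 4 + 8) = 60 frames, so three stacks plus the first convolution give 180 + 4 = 184.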

Server socket error when training while another task is already running

Describe the bug

I have a server with 4 GTX 1080 GPUs running Ubuntu 16.04.

When I start training with run.sh while another training task is already running, the following error occurs:

Start training ...
[W socket.cpp:401] [c10d] The server socket has failed to bind to [::]:29400 (errno: 98 - Address already in use).
[W socket.cpp:401] [c10d] The server socket has failed to bind to 0.0.0.0:29400 (errno: 98 - Address already in use).
[E socket.cpp:435] [c10d] The server socket has failed to listen on any local network address.

How can I solve this?
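
errno 98 indicates that port 29400 is already held, presumably by the first training job, so the second job needs a different rendezvous port. The sketch below asks the operating system for an unused TCP port. How that port is then passed to the launcher inside run.sh (for example through torchrun's `--master_port` or `--rdzv_endpoint` options) is an assumption about the local setup and is not shown in this report.

```python
import socket

def find_free_port(host="127.0.0.1"):
    """Ask the OS for a TCP port that is currently unused on this host."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind((host, 0))  # port 0 lets the kernel choose a free port
        return sock.getsockname()[1]

port = find_free_port()
```

The port is released as soon as the function returns, so another process could in principle claim it before the launcher starts; for a handful of jobs on one machine this race is rarely a problem.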

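run_fsmn_ctc.sh: problem loading base.pt during training

Hello,
While training hi_xiaowen with run_fsmn_ctc.sh, during the training stage (stage 1), after fetching the base model via git, the following problem occurs when loading the checkpoint: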

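How do I obtain a recognition model for a custom "keyword"?

I generated data for my own "keyword" using TTS (around 100 speakers), but the results after training were not very good.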

Max pooling loss is slow

I created a model with 100 keywords and found that the max_pooling loss calculation is very slow: the loss calculation takes 8 seconds and loss.backward() takes about 16 seconds per batch (256).
Is there any plan to speed up this loss?

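Prone to overfitting

Hello, your project is excellent: it brings together small, high-quality wake-word models and proposes the innovative max_pooling loss. From running your models on our own data, however, they seem to overfit rather easily. Specifically:
1. The training loss converges too quickly, and training accuracy reaches over 95% too quickly, in roughly two steps.

2. When the validation data differs slightly from the training data, the loss is large and validation acc = 0. If we instead split off part of the same dataset as the validation set and train on the rest, the loss is normal and acc also exceeds 95%.

3. On test sets similar to the validation set (including purely clean data), results are also poor: activation is very weak, and for some sets the activation rate is 0.

4. From our experiments, the final test set has to resemble the training set as closely as possible; even a small gap makes the results collapse, with recognition rates in the single digits.

5. Have you seen this as well, or is there some technical point we have missed? Are there any solutions?
Thank you; we look forward to your reply.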

How to run demo with keywords "hi wenwen"

Dear author:
I noticed you have published the pretrained "hey wenwen" checkpoints, but I don't know how to use your pretrained models. Could you kindly provide a simple guide? Thank you very much.

Add acknowledgement to Mobvoi

The TCN model and the max-pooling loss are basically the same as the ones used inside Mobvoi.
Also, one of the contributors did his internship at Mobvoi.

I would recommend adding acknowledgement to Mobvoi in README.md.

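Training was interrupted unexpectedly; can I resume training from the model saved at the point of interruption?

As the title says.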

deployment of wekws on microcontroller

Hi @robin1001,
I'd like to customize your model for my own wake word and then deploy it on a microcontroller.
Can this model be customized and deployed on ESPRESSIF ESP32-S3 boards in streaming mode?

Thank you very much in advance.

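Training and inference workflow for the project

Hello, could you provide the training and inference workflow for this project? I couldn't find it in the documentation.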

Android cannot build a release APK

CMVN expression in cmvn.py

When I read the code in the source files compute_cmvn_stats.py and cmvn.py, it seems that the variance is calculated as

var = sum(x_i^2)/n - mean(x)^2

Is there something wrong, or is there another explanation for this?
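
The expression quoted above is the standard one-pass form of the population variance: expanding sum((x_i - mean)^2)/n gives sum(x_i^2)/n - mean^2. The sketch below checks the identity numerically on a small list. It is a generic illustration in plain Python, not the code from cmvn.py.

```python
def variance_from_sums(values):
    """One-pass population variance from the running sum and sum of squares."""
    n = len(values)
    total = sum(values)
    total_sq = sum(v * v for v in values)
    mean = total / n
    return total_sq / n - mean * mean

def variance_two_pass(values):
    """Textbook population variance: mean of squared deviations."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)

feature = [1.0, 2.0, 4.0, 7.0]
one_pass = variance_from_sums(feature)
two_pass = variance_two_pass(feature)
```

Both functions return the same value up to floating-point rounding. The one-pass form can lose precision when the mean is large relative to the spread, which is one reason implementations often clamp the result to a small positive floor.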

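Using a finger snap as the wake trigger

Is your feature request related to a problem? Please describe.
Waking the device with a finger snap

Describe the solution you'd like
Snap your fingers once and use that as the wake sound.

Describe alternatives you've considered

Additional context
Let the user record a wake sound, such as a finger snap, or the user saying a name such as "Zhang San".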

conda install error

Running the command

$ conda install pytorch=1.10.0 torchaudio=0.10.0 cudatoolkit=11.1 -c pytorch -c conda-forge

prints: Solving environment: failed with initial frozen solve. Retrying with flexible solve.

Add the Common Voice Single Word Dataset

https://commonvoice.mozilla.org/en/datasets

Finding single-word datasets for English is hard, and the Single Word Dataset from Common Voice is a rarity in being multinational.
It also has a very useful sample, 'Hey', that can be concatenated with other keywords.
I used SoX to arrange by pitch and trim, then concatenated 'Hey' from Common Voice with 'Marvin' from the Google dataset to create 'Hey Marvin', a good phonetically unique keyword.

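Why does CMVN use the sum of squares?

I see that the CMVN computation uses the sum of squares rather than the variance. Is there a special reason for this? Does using the sum of squares work better?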
torch.sum(torch.square(mat), axis=0)
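
One plausible reason for storing the sum of squares, offered here as an illustration rather than as the maintainers' stated rationale, is that counts, sums, and sums of squares are additive: statistics gathered from separate utterances or worker processes can be merged by simple addition, and the variance is derived only once at the end. The sketch below shows this on scalar features with hypothetical helper names.

```python
def accumulate(frames):
    """Return (count, sum, sum of squares) for one batch of scalar features."""
    return len(frames), sum(frames), sum(f * f for f in frames)

def merge(a, b):
    """Combine two sets of statistics; each field simply adds."""
    return tuple(x + y for x, y in zip(a, b))

def mean_and_var(stats):
    """Derive mean and population variance from accumulated statistics."""
    count, total, total_sq = stats
    mean = total / count
    return mean, total_sq / count - mean * mean

part_a = accumulate([1.0, 2.0])
part_b = accumulate([4.0, 7.0])
merged = mean_and_var(merge(part_a, part_b))
whole = mean_and_var(accumulate([1.0, 2.0, 4.0, 7.0]))
```

Per-batch variances, by contrast, cannot be averaged directly, because each is centred on its own batch mean.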

Issue during testing while training with 80 Fbins hyperparameter

Hi,
I trained a new model after changing Fbins to 80. The model trained successfully, but when I tried to test it on some test cases, I got the following error:

RuntimeError: The size of tensor a (80) must match the size of tensor b (40) at non-singleton dimension 2

I only changed the hyperparameter in the conf file. Apart from this, do we need to change anything else in the code?
Is something hard-coded?

Plotted DET Curve is empty

Hi,
I trained a model on my custom dataset with a single keyword.
Using compute_accuracy.py, I get 97% accuracy on the test dataset.

But when I tried to plot the DET curve using plot_det_curve.py, the output DET curve image is empty. (Before running this, I had already computed score.txt and stats.txt.)

Thanks

ds_tcn parameter count incorrect?

2022-09-21 12:14:09,323 WARNING the number of model params: 287490
according to log, it's about 287K

Regarding KWS model generation for web browser

I used the speech commands example for KWS
https://github.com/wenet-e2e/wekws/tree/main/examples/speechcommand_v1/s0
to generate a model. I exported the model in ONNX format, but I do not know how to proceed after that. I see that this model is not compatible with the Vosk web browser script. Please let me know if there is a better approach.

Overtraining and Bias towards keywords

Hi,

I'm facing the following issues using wekws:

It overfits after only 2-4 epochs (even when trained on hundreds or thousands of hours of data).
High false-positive rate. When there are many keywords, such as 20, the model confuses them and is biased towards keywords over the Freetext (-1) class.
It confuses similar-sounding words and predicts freetext as a keyword.

Can you please suggest any solution for it?

Thanks

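The Android project appears to have a memory leak

Is your feature request related to a problem? Please describe.
Running the Android demo appears to leak memory: after a few hours on a mobile device, native memory reaches several hundred MB.
On inspection, the problem seems to be in runtime/android/app/src/main/cpp/wekws.cc: accept_waveform at line 51 needs to call
env->ReleaseShortArrayElements(jWaveform, waveform,0) to release the memory.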


About personalized wake-up word.

Hi, I looked through the code but could not find a voiceprint component.
Does this project currently support personalized wake-up words with voiceprint?

Step wise guidelines

Hi team,
This is awesome. I want to recognize a small set of keywords from live audio and print them. I am planning to use my own training dataset. I don't understand the workflow. What inputs (I mean arguments) should I provide, and what is the expected output?

FFT result different with kaldi SRFFT

Hello
I compared the results of the WeKws FFT and Kaldi's SRFFT and found that they differ. Could you tell me which FFT method is used? The project is clean and light; I like it!

Thanks

Questions about optimizer.zero_grad()

Hello, this repository is an amazing KWS project.
While reading the source code, I noticed that in the training stage optimizer.zero_grad() is not called when optimizer.step() is called. This differs from other projects I have seen, and I am wondering whether it is a deliberate trick or there is some other reason.
I'm very much looking forward to your reply.

Converting onnx model to ort format

Is there a built-in function in the scripts to convert an ONNX model to ORT format?
If yes, where is it? If not, how can I convert it?
Thanks
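
Whether wekws ships its own conversion script is not stated on this page. Independently of that, the ONNX Runtime Python package provides a converter that can be run as the module `onnxruntime.tools.convert_onnx_models_to_ort`. The sketch below only assembles the command line; the model path is a hypothetical placeholder, and running the command requires onnxruntime to be installed.

```python
import sys

def ort_conversion_command(onnx_path):
    """Build the command line for ONNX Runtime's ONNX-to-ORT converter."""
    return [
        sys.executable, "-m",
        "onnxruntime.tools.convert_onnx_models_to_ort",
        onnx_path,
    ]

# Hypothetical model path, used only to show the resulting command.
command = ort_conversion_command("exp/ds_tcn/avg_30.onnx")
# To execute it: subprocess.run(command, check=True)
```

The converter's command-line options vary between ONNX Runtime releases, so it is worth checking its `--help` output for the installed version.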

How to prepare dataset for RIR & Musan Augmentation

Hi,
I would like to know how to prepare the dataset for RIR and MUSAN augmentation.
I went through the script and understand that it needs data in .mdb format inside an lmdb folder. I have raw audio files; how do I prepare them?
Also, is there a flag in the configuration file that enables or disables augmentation?

Thanks

AttributeError: 'MDTC' object has no attribute 'padding'

When I ran step 4 of "examples/speechcommand_v1/s0/run.sh", this error was reported.

hi_xiaowen-mdtc performance can't be reproduced

I followed the stages in "$root_path/examples/hi_xiaowen/s0/run.sh"
and tried to reproduce the network performance using config/mdtc.yaml, num_average=10, max_epoch=80.
After training, I checked the results (score.txt, stats.0.txt, stats.1.txt), and the scores, FA, and recall look very strange:

The highest score appears at the first 1/2 frame, before the keyword has been spoken.
Most scores are lower than 0.5.
We configured two keywords, but most positive wavs get a high score at both output nodes.
This is strange; when I switch to the ds_tcn config, the scores look correct.

missing token_file and lexicon_file

Hi, thanks for your work. When I run wekws/bin/stream_kws_ctc.py, token_file and lexicon_file are missing. Could you provide an
example?

Error converting the PyTorch model to an ONNX model

I ran examples/speechcommand_v1/s0/run.sh and got an error when converting to the ONNX model. Which ONNX version is required?

/home/ycwang/wekws/wekws/model/mdtc.py:247: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  if in_cache.size(0) > 0:
/home/ycwang/wekws/wekws/model/mdtc.py:106: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  if cache.size(0) == 0:
/home/ycwang/wekws/wekws/model/mdtc.py:110: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  assert outputs.size(2) > self.padding
/home/ycwang/wekws/wekws/model/mdtc.py:257: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  if in_cache.size(0) > 0:
/home/ycwang/wekws/wekws/model/mdtc.py:187: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  if in_cache.size(0) > 0:
Export to onnx succeed, but pytorch/onnx have different
                 outputs when given the same input, please check!!!

Is there any pretrained model?

Can you provide some pretrained model for us to test?

When will MDTC add cache support?


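L2 regularization

Hello, does the wekws optimizer apply L2 regularization to all parameters by default? Are the bias and BN parameters L2-regularized as well?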

Roadmap file missing?

The Roadmap link on the homepage is invalid.
