athena-team / athena-signal
License: Apache License 2.0
How about publishing a release version, so that we can install it directly with pip?
Hello, in dios_ssp_mvdr_header.c, when the frequency-domain result is passed to the IFFT, there is the following code:
ptr_mvdr->fft_in[0] = ptr_mvdr->m_mvdr_out_re[0];
ptr_mvdr->fft_in[ptr_mvdr->m_fft_size / 2] = ptr_mvdr->m_mvdr_out_re[ptr_mvdr->m_fft_size / 2];
for (i = 1; i < ptr_mvdr->m_fft_size / 2; i++)
{
ptr_mvdr->fft_in[i] = ptr_mvdr->m_mvdr_out_re[i];
ptr_mvdr->fft_in[ptr_mvdr->m_fft_size - i] = -ptr_mvdr->m_mvdr_out_im[i];
}
dios_ssp_share_irfft_process(ptr_mvdr->mvdr_fft, ptr_mvdr->fft_in, ptr_mvdr->m_win_data);
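From the code, the first half of ptr_mvdr->fft_in holds only the real part m_mvdr_out_re, and the second half holds only the negated imaginary part -ptr_mvdr->m_mvdr_out_im. Why is it laid out this way? My understanding is that the second half normally holds the conjugate of the first half. I would appreciate an expert's explanation!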
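One possible reading of the loop above is that the inverse routine takes a packed ("half-complex") real-FFT layout: because the spectrum of a real signal is conjugate-symmetric, only bins 0..N/2 need to be stored, with real parts in the first half of the array and imaginary parts in the second. The sketch below illustrates that idea in plain Python. The helper names are hypothetical, and whether dios_ssp_share_irfft_process uses exactly this layout and sign convention is an assumption, not something confirmed in this thread.

```python
import cmath

def dft(x):
    """Naive forward DFT of a real sequence; returns N complex bins."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def pack_like_issue(re, im, n):
    """Pack bins 0..N/2 the way the quoted loop does:
    real parts at index i, negated imaginary parts at index N - i."""
    packed = [0.0] * n
    packed[0] = re[0]
    packed[n // 2] = re[n // 2]
    for i in range(1, n // 2):
        packed[i] = re[i]
        packed[n - i] = -im[i]
    return packed

def irfft_packed(packed, imag_sign=-1.0):
    """Inverse real FFT from the packed layout (hypothetical helper).
    imag_sign=-1 undoes the negation applied when packing (assumed convention)."""
    n = len(packed)
    spec = [0j] * n
    spec[0] = complex(packed[0], 0.0)
    spec[n // 2] = complex(packed[n // 2], 0.0)
    for i in range(1, n // 2):
        spec[i] = complex(packed[i], imag_sign * packed[n - i])
        spec[n - i] = spec[i].conjugate()  # the conjugate half is implied, not stored
    return [(sum(spec[k] * cmath.exp(2j * cmath.pi * k * t / n)
                 for k in range(n)) / n).real
            for t in range(n)]

signal = [0.5, -1.0, 2.0, 0.25, -0.75, 1.5, 0.0, -2.0]
bins = dft(signal)
packed = pack_like_issue([b.real for b in bins], [b.imag for b in bins], len(signal))
restored = irfft_packed(packed)
```

In this sketch the N real values in `packed` carry the same information as the full conjugate-symmetric spectrum, so `restored` matches `signal` up to rounding.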
Hi
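I would like to ask: what is the latency of the AEC algorithm itself? I estimated that, for the sample audio file, the near-end speech appears roughly 175 ms after AEC processing. How can this 175 ms be shortened?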
Thanks
# Test AGC
input_file = ["agc_testaudio3.wav"]
out_file = ["agc_testaudio3_athenaout.wav"]
# config = {"add_AGC": 1, "add_NS": 0, "add_AEC": 0}
config = {"add_AGC": 1}
athena_signal_process(input_file, out_file, config)
I want to test the AGC algorithm in 'athena_signal_test.py', as shown above.
However, an error occurred:
Any advice on how to solve this problem?
Looking forward to your reply.
Installing collected packages: athena-signal
Successfully installed athena-signal-0.1.0
Traceback (most recent call last):
File "/home/pi/athena-signal/athena_signal/dios_signal.py", line 14, in swig_import_helper
return importlib.import_module(mname)
File "/usr/lib/python3.7/importlib/__init__.py", line 127, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
File "<frozen importlib._bootstrap>", line 1006, in _gcd_import
File "<frozen importlib._bootstrap>", line 983, in _find_and_load
File "<frozen importlib._bootstrap>", line 965, in _find_and_load_unlocked
ModuleNotFoundError: No module named 'athena_signal._dios_signal'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "examples/athena_signal_test.py", line 19, in <module>
from athena_signal.dios_ssp_api import athena_signal_process
File "/home/pi/athena-signal/athena_signal/__init__.py", line 18, in <module>
from athena_signal import dios_ssp_api
File "/home/pi/athena-signal/athena_signal/dios_ssp_api.py", line 19, in <module>
from athena_signal.dios_signal import dios_ssp_v1
File "/home/pi/athena-signal/athena_signal/dios_signal.py", line 17, in <module>
_dios_signal = swig_import_helper()
File "/home/pi/athena-signal/athena_signal/dios_signal.py", line 16, in swig_import_helper
return importlib.import_module('_dios_signal')
File "/usr/lib/python3.7/importlib/__init__.py", line 127, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
ModuleNotFoundError: No module named '_dios_signal'
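Both exceptions in the traceback above say that the compiled extension `_dios_signal` cannot be found, which usually means the native module was not built or is not on the import path. A minimal sketch for checking this before importing the package is shown below. `extension_available` is a hypothetical helper, and the module names are taken from the traceback; the sketch does not claim to reproduce the project's build steps.

```python
import importlib.util

def extension_available(module_name):
    """Return True if `module_name` can be located on the import path.

    For a dotted name, find_spec imports the parent package first, so a
    package whose __init__ fails to import is reported as unavailable.
    """
    try:
        return importlib.util.find_spec(module_name) is not None
    except (ImportError, ValueError):
        # ImportError covers ModuleNotFoundError for a missing or broken parent.
        return False

if __name__ == "__main__":
    # Names as they appear in the traceback.
    for name in ("athena_signal._dios_signal", "_dios_signal"):
        print(name, "found" if extension_available(name) else "missing")
```

If both names are reported as missing, the Python wrapper files are present but the compiled shared library is not.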
Is it possible to provide an interface to ALSA or PulseAudio that feeds a multichannel PCM device into PCM streams,
so that you can stream directly from the microphone and use the output as the reference channel?
Dear athena-signal author:
I want to make the DOA respond only to human voice. Could you give me any pointers?
Regards,
Thanks
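With data from a triangular (3-microphone) array, the DOA tends to jump between the correct direction and the diagonally opposite direction. For example, if the correct direction is 60 degrees and the source does not move, the estimate occasionally points to 240 degrees; that is, it tends to switch between angle a and (a + 180). What is the cause? Thank you!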
Thanks for sharing the noise reduction code. I have some questions about the noise reduction module (the file dios_ssp_ns_api.c) that confuse me.
In the following code from dios_ssp_ns_api.c (computing the MMSE gain), can you tell me the meaning of the variable "pSAP", or where I can find it in a research paper?
Thanks very much!
for (i = 0; i < srv->m_sp_size; ++i)
{
    vk = srv->m_sp_snr[i] * srv->m_gammak[i] / (1 + srv->m_sp_snr[i]);
    j00 = first_modified_Bessel(0, vk / 2);
    j11 = first_modified_Bessel(1, vk / 2);
    tmpC = (float)exp(-0.5 * vk);
    if (srv->m_gammak[i] < 1.0e-3)
    {
        tmpA = 0; // Limitation
    }
    else
    {
        tmpA = (float)sqrt(PI) / 2 * (float)pow(vk, 0.5) * tmpC / srv->m_gammak[i];
    }
    tmpB = (1 + vk) * j00 + vk * j11;
    evk = (float)exp(vk);
    Lambda = (1 - 0.3f) / 0.3f * evk / (1 + srv->m_sp_snr[i]);
    pSAP = Lambda / (Lambda + 1);
    tmp = tmpA * tmpB * pSAP;
    srv->m_gain[i] = tmp;
}
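The last few lines of the loop have the shape of a soft-decision weight: a likelihood ratio Lambda = (1 - q) / q * exp(vk) / (1 + xi) mapped to Lambda / (1 + Lambda), as used with MMSE short-time spectral amplitude estimators under signal-presence uncertainty. The sketch below mirrors only those lines. Reading m_sp_snr as the a priori SNR, m_gammak as the a posteriori SNR, and 0.3 as an a priori speech-absence probability q are assumptions on my part, not statements from the authors.

```python
import math

def speech_presence_weight(xi, gamma, q=0.3):
    """Soft-decision weight mirroring the quoted C lines.

    xi    : srv->m_sp_snr[i], assumed to be the a priori SNR
    gamma : srv->m_gammak[i], assumed to be the a posteriori SNR
    q     : assumed a priori probability of speech absence (0.3 in the code)
    """
    vk = xi * gamma / (1.0 + xi)
    likelihood_ratio = (1.0 - q) / q * math.exp(vk) / (1.0 + xi)  # "Lambda"
    return likelihood_ratio / (likelihood_ratio + 1.0)            # "pSAP"
```

Under this reading the value grows toward 1 as the a posteriori SNR rises, so despite its name "pSAP" behaves like a speech-presence probability that scales the gain.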
athena_signal/dios_signal.h
line 24
int dios_v1(int argc, char **argv, int *fe_switch, size_t m, float *mic_coord, size_t n, int mic_num, int ref_num, float loc_phi);
should be
int dios_ssp_v1(int argc, char **argv, int *fe_switch, size_t m, float *mic_coord, size_t n, int mic_num, int ref_num, float loc_phi);
athena_signal/dios_ssp_api.py
line 88
if self.feature_switch[4] == 1 and mic_coord is not None:
should be
if self.feature_switch[4] >= 1 and mic_coord is not None:
When I use MVDR, a "The matrix is singular" error appears. Have you encountered this problem, and how can I solve it?
In line 119 of dios_ssp_share_noiselevel.c, srv->noise_level_first is updated only when in_energy is low enough, so srv->noise_level_first should be lower than srv->noise_level_second, which is updated under all conditions.
But in line 151, in_energy is compared with 20.0f * srv->noise_level_second and 20.0f * srv->noise_level_first separately.
If srv->noise_level_first < srv->noise_level_second is always true, can this simply be written as
if ((in_energy > 20.0f * srv->noise_level_second) )
So what is the real intention of updating srv->noise_level_first when we already have srv->noise_level_second? Maybe there is a good trick I haven't realized yet. Thanks
In file
/kernels/dios_ssp_doa/dios_ssp_doa_api.c
line 125: ptr_doa->m_doa_fid = (int*)calloc(ptr_doa->m_angle_num, sizeof(int));
Only ptr_doa->m_angle_num elements are allocated, but the array is then used at
lines 169-172:
for(i = 0; i < ptr_doa->m_frq_bin_num; ++i)
{
    ptr_doa->m_doa_fid[i] = ptr_doa->m_low_fid + (i * ptr_doa->m_frq_sp * ptr_doa->m_fft_size) / ptr_doa->m_fs;
}
When ptr_doa->m_frq_bin_num is larger than ptr_doa->m_angle_num,
there is a risk of writing past the end of the allocated buffer.
It seems I cannot use .wav files directly as input. What type of inputs should I use? Thanks a lot for your help!
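When a .wav file is rejected, a useful first step is to check its actual PCM parameters, since tools often expect a specific sample rate, bit depth, and channel count. The sketch below uses only the standard-library `wave` module. Which formats athena-signal accepts is not stated in this thread, so the function reports the parameters without judging them.

```python
import wave

def describe_wav(path):
    """Return the basic PCM parameters of a WAV file."""
    with wave.open(path, "rb") as wav:
        return {
            "channels": wav.getnchannels(),
            "sample_rate_hz": wav.getframerate(),
            "bits_per_sample": 8 * wav.getsampwidth(),
            "frames": wav.getnframes(),
        }
```

Note that `wave` only reads uncompressed PCM; a file in another encoding raises `wave.Error`, which is itself a hint about why a file might not be usable as input.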
In the struct below, is x expressed in millimeters or in centimeters?
typedef struct
{
float x; // left to right
float y; // near to far
float z; // low to high
} PlaneCoord;
How can the angle of incidence of the sound source be set manually for MVDR?
Hi athena author:
Could you tell me which algorithm the VAD uses?
Thank you.
Thanks for sharing the GSC processing code! Some of the code in "dios_ssp_gsc_abm.c" confuses me.
Line 338, in the function "dios_ssp_gsc_gscabm_processonedatablock":
/* compute error signal in time-domain with circular convolution constraint e = [0 | new] */
for (i = 0; i < gscabm->fftsize / 2; i++)
{
gscabm->e[gscabm->fftsize / 2 + i] = gscabm->xrefdline[i] - gscabm->ytmp[gscabm->fftsize / 2 + i];
}
After checking the code carefully, I found that
"gscabm->e" is the output of the ABM module, "gscabm->xrefdline" comes from the fixed beamforming output, and "gscabm->ytmp" is the ABM filter output with the steering output as its input.
However, I think that "gscabm->xrefdline" should come from the steering output, and "gscabm->ytmp" should be the ABM filter output with the fixed beamforming output as its input. Maybe I have misunderstood; please help me understand it. The reference paper is "A robust adaptive beamformer for microphone arrays with a blocking matrix using constrained adaptive filters".
Thanks very much!
I found that the C code does not match the formula. The function dios_ssp_mvdr_cal_weights_adpmvdr computes w, but I only see R^-1; I don't see vk, vk^H, etc. Does it use a different formula?
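For reference, the textbook MVDR weight vector at frequency bin k, for a steering vector v_k and covariance matrix R_k, is written out below. That "vk" in the question denotes this steering vector is my assumption, and this is the standard formula, not a description of what dios_ssp_mvdr_cal_weights_adpmvdr actually computes.

```latex
\mathbf{w}_k = \frac{\mathbf{R}_k^{-1}\,\mathbf{v}_k}{\mathbf{v}_k^{H}\,\mathbf{R}_k^{-1}\,\mathbf{v}_k}
```

If the steering vector is folded into the covariance estimate or applied in a separate step, the code may show only the R_k^{-1} term explicitly.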
Hi, I found that this project only supports 8 ms audio frames. Does it support 10 ms frames?
Good luck.
Hi All
I found that the athena-signal AEC is similar to the WebRTC AEC. What are the differences between them, and where has it been improved? Thank you.
Dear Athena:
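First of all, thank you to your team for the work. In my tests, I found that GSC loses high-frequency content. Is there a way to improve this?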
Is the subband implementation in the code based on Version 4 in the document below, with D=128, M=256, L=768? Is this an NPR (near-perfect-reconstruction) filter bank?
http://www.ws.binghamton.edu/fowler/fowler%20personal%20page/EE521_files/IV-08%20Uniform%20DFT%20Filter%20Bank_2007.pdf
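In Version 3 of the document, the analysis stage uses window coefficients w[i], so the synthesis stage uses 1/w[i]. Why is this not done in Version 4? Does it use reconstruction coefficients instead?
How is subband_filter_coef generated? Can any common window function (for example Hanning, Hamming, or Kaiser) be used, or does it have to be designed to satisfy some nonlinear NPR equations? Is there a reference document? I have read many documents and have not seen anyone do it the way you do; the ones I found all solve a nonlinear problem with a pile of formulas.
I watched the video carefully, but its filter bank section does not go very deep, and after reading many documents myself I still do not fully understand this part.
Thanks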
Hello, the function of "dios_ssp_matrix_inv_process" is tested using the following code:
#include "dios_ssp_share_cinv.h"
int main(int argc, char *argv[])
{
    /* 3x3 matrix, presumably stored as interleaved (re, im) pairs */
    float R[18] = {
        258919244, 0.000000000000000, -683065112, 0.000000000000000, 123292480, 0.000000000000000,
        -683065112, 0.000000000000000, 1802021921, 0.000000000000000, -325262900, 0.000000000000000,
        123292480, 0.000000000000000, -325262900, 0.000000000000000, 58709685, 0.000000000000000,
    };
    float Rinv[18] = { 0.0 };
    void *handle = dios_ssp_matrix_inv_init(3);
    dios_ssp_matrix_inv_process(handle, R, Rinv);
    return 0;
}
However, the inverse matrix differs from the result of the "inv" function in MATLAB. Is there anything we should pay special attention to? I found that when I use the MVDR MATLAB code (identical to the C code except for the "inv" function), the final noise reduction performance degrades.
Look forward to your reply, Thanks!
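One thing worth checking with this particular test matrix is its conditioning. Its rows are almost proportional to each other, and the entries are larger than 2^24, so they cannot all be stored exactly in a C float (24-bit significand). The sketch below, which is only an illustration and says nothing about the library's algorithm, uses exact integer arithmetic to show how close the matrix is to singular and how much each entry moves when rounded to single precision.

```python
import struct

def to_float32(value):
    """Round a Python float to the nearest IEEE-754 single-precision value."""
    return struct.unpack("f", struct.pack("f", value))[0]

# Real parts of the 3x3 matrix from the test program (imaginary parts are 0).
R = [
    [258919244, -683065112, 123292480],
    [-683065112, 1802021921, -325262900],
    [123292480, -325262900, 58709685],
]

def det3(m):
    """Exact determinant of a 3x3 integer matrix (Python ints do not overflow)."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

# Leading 2x2 minor relative to the product of its diagonal entries.
minor = R[0][0] * R[1][1] - R[0][1] * R[1][0]
relative_minor = minor / (R[0][0] * R[1][1])

# Rounding error introduced by merely storing each entry as a float.
rounding_errors = [[to_float32(float(v)) - v for v in row] for row in R]

print("relative 2x2 minor:", relative_minor)
print("exact determinant :", det3(R))
print("float32 rounding  :", rounding_errors)
```

When the relative minor is comparable to single-precision rounding error, a float-based inverse and MATLAB's double-precision "inv" can legitimately disagree, so comparing the two on a better-conditioned matrix may be a fairer test.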
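In dios_ssp_aec_process_api, when the two RES steps are performed (lines 506-517 and lines 534-543), dios_ssp_aec_res_process is called in a loop over ref_num. This seems unnecessary because the inputs are identical: est_echo already sums over all references. In your RES design, the processing should be independent of the number of references.
Hello, could you tell me which paper or papers the AEC implementation in the current repo is based on? Thanks.
Hi, I want to know which paper the AGC algorithm is based on. Are there any metrics for evaluating AGC?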
At https://github.com/athena-team/athena-signal/blob/master/athena_signal/kernels/dios_ssp_ns/dios_ssp_ns_api.c#L281, m_norm_win is indexed out of bounds when i >= m_shift_size.
Should m_fft_size on line 276 be m_shift_size?
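From the perspective of speech quality assessment, there are two kinds of objective evaluation methods: (1) full-reference quality assessment and (2) no-reference quality assessment.
Full-reference methods include PESQ, which can probably only be used during testing and simulation.
No-reference methods include P.563, which should be usable in real scenarios, but I am not sure whether it is suitable for evaluating this kind of speech enhancement algorithm. Are there other methods that can evaluate front-end signal processing algorithms in real scenarios?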
@songhui5561
Thanks!
If I use the AEC module, how should the value of ref_num be set?
mic_num = 1;
ref_num=?
Hi dear athena developer,
I'm a beginner in audio processing and want to learn about AEC. Could you tell me which AEC and DTD algorithms athena-signal uses? Which papers can I refer to?
Dear friends:
I tried your DOA feature, and every DOA result on your data is 0 degrees. I also tried my own data and got the same result. I think there may be a problem; could you please check your code?
Thanks
python3 examples/athena_signal_test.py succeeded
But in the interactive environment of Python3, the following error occurs
from athena_signal.dios_ssp_api import athena_signal_process
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/Users/xxxx/athena-signal/athena_signal/__init__.py", line 18, in <module>
from athena_signal import dios_ssp_api
File "/Users/xxxx/athena-signal/athena_signal/dios_ssp_api.py", line 19, in <module>
from athena_signal.dios_signal import dios_ssp_v1
File "/Users/xxxx/athena-signal/athena_signal/dios_signal.py", line 13, in <module>
from . import _dios_signal
ImportError: cannot import name '_dios_signal' from 'athena_signal' (/Users/xxxx/athena-signal/athena_signal/__init__.py)
Dear friends:
1. I notice that you currently only support processing 8 ms buffers. I want to change to 10 ms buffers. I tried changing some parameters, but the program crashed and the output was all zeros. What should I change for a 10 ms buffer?
2. I notice that you define subband_filter_coef[] in dios_ssp_share_subband.c, but I do not know how you obtained this filter. Could you describe your algorithm? I see that it is not just a Hanning or Hamming window. I think that if I want to change to a 10 ms buffer I have to change it, correct?
Thanks
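One workaround that avoids touching the 8 ms core or the subband filter is to rebuffer: accumulate incoming 10 ms chunks and hand the processor fixed 8 ms frames. The sketch below shows the idea. It assumes a 16 kHz sample rate (so 8 ms is 128 samples and 10 ms is 160), `Rebuffer` is a hypothetical helper rather than part of athena-signal, and the approach adds a little buffering latency.

```python
def frame_length(duration_ms, sample_rate_hz=16000):
    """Number of samples in a frame; 16 kHz is an assumed sample rate."""
    return sample_rate_hz * duration_ms // 1000

class Rebuffer:
    """Collect arbitrary-size chunks and emit fixed-size frames (hypothetical helper)."""

    def __init__(self, frame_size):
        self.frame_size = frame_size
        self._pending = []

    def push(self, chunk):
        """Append samples and return every complete frame now available."""
        self._pending.extend(chunk)
        frames = []
        while len(self._pending) >= self.frame_size:
            frames.append(self._pending[:self.frame_size])
            del self._pending[:self.frame_size]
        return frames

    def pending(self):
        """Number of samples waiting for the next complete frame."""
        return len(self._pending)
```

Each frame returned by `push` would be passed to the 8 ms processing call, and the same structure can be used on the output side to regroup processed frames into 10 ms chunks.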
Can you provide some benchmarks for the modules?
When I try to run the script in example/, it says segmentation fault. Can anyone help me out? Thanks a lot!
The entire output is like below:
ubuntu@ip-172-31-21-152:~/Judy/athena-signal/examples$ python3 athena_signal_test.py
#################################################
The configurations are: add_AEC: 1, add_NS: 1, add_AGC: 0, add_HPF: 0, add_BF: 0, add_DOA: 0
The number of microphones is: 1
The number of reference channels is: 1
#################################################
Segmentation fault (core dumped)
Which angle does MVDR refer to? Is it the angle relative to the x, y, or z axis?