speechalgorithms's Introduction

Speech Algorithms

There is an English README

Download a single folder

Contents

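Speech front-end algorithms

Title Code
A first look at speech denoising: spectral subtraction Code
Mask-based speech separation Code
Generating mixed speech samples with noise, echo, reverberation, or howling Code
Adaptive-filter echo cancellation explained Code
Generating VAD labels with the AMR codec Code
Sound source localization with TDOA Code
Resampling speech signals to an arbitrary rate Code
Embedding and extracting digital audio watermarks Code
Speech time-stretching and pitch-shifting Code
Framing, windowing, and the DFT Code
WebRTC VAD pipeline walkthrough Code
Kalman-filter-based echo cancellation Code
WebRTC ANR pipeline walkthrough Code
WebRTC AGC pipeline walkthrough Code
WebRTC AEC pipeline walkthrough Code
Audio alignment using cross-correlation Code
A song-recognition system based on audio fingerprinting Code
The Goertzel algorithm Code
DNN single-channel speech enhancement Code
Endpoint detection with an LSTM Code
CGMM-MVDR Code
N data augmentation methods for AI-based denoising Code
Generating a rich set of howling samples Code
Transient noise suppression Code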

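Speech back-end algorithms

Title Code
Simple command recognition with a CNN Code
Speaker gender recognition Code
Environmental sound classification with XGBoost Code
Generating the sound of rain Code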

Speech codecs

Title Code
Deep-learning-based speech codecs Code
Speech codec archaeology: G.711 Code

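Speech evaluation metrics

Title Code
Objective speech evaluation metrics: speech quality assessment Code
Speech intelligibility evaluation (1): articulation-index-based methods Code
Speech similarity evaluation Code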

speechalgorithms's People

Contributors

ryuk17, toughmanl

speechalgorithms's Issues

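Audio alignment using cross-correlation has errors

When I delay an audio file manually, the computed alignment error is 0 to 2 ms.
When I play the audio and record it, the error in the estimated delay reaches 20 ms.
Is there any way to reduce this error, especially in the second case? Have you looked into this further? Thanks!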
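
For reference, the kind of measurement described in the first case can be sketched as a plain cross-correlation peak search. This is a minimal sketch and not the repository's implementation: the function name `estimate_delay` and its arguments are hypothetical, and it resolves the delay only to the nearest sample.

```python
import numpy as np

def estimate_delay(ref, rec, sample_rate):
    """Delay of `rec` relative to `ref` in seconds (positive means rec lags)."""
    ref = np.asarray(ref, dtype=np.float64)
    rec = np.asarray(rec, dtype=np.float64)
    n = len(ref) + len(rec) - 1
    n_fft = 1 << (n - 1).bit_length()  # power of two >= n, avoids wrap-around
    # Cross-correlation via FFT: corr[k] = sum_t rec[t + k] * ref[t]
    spec = np.fft.rfft(rec, n_fft) * np.conj(np.fft.rfft(ref, n_fft))
    corr = np.fft.irfft(spec, n_fft)
    # Put negative lags first so index 0 corresponds to lag -(len(ref) - 1)
    corr = np.concatenate((corr[n_fft - (len(ref) - 1):], corr[:len(rec)]))
    lag = int(np.argmax(corr)) - (len(ref) - 1)
    return lag / sample_rate
```

With a recording that is an exact shifted copy of the reference, the peak falls on the true lag. A played-back and re-recorded signal also passes through the loudspeaker, room, and microphone, so its correlation peak can be broader or displaced.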

AEC: Kalman filter, posterior error matrix update

I have a doubt about the posterior error matrix update formula "Rmu = (IL - K @ X.T) * Rm". When I derive this formula, I get Rmu = (IL - K @ X.T) * Rm * (IL - K @ X.T)' + V (near-end noise).
Can you help me?
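
One way to compare the two update formulas in this question is a numerical check. The sketch below assumes a scalar near-end observation d = x^T w + v with noise variance `r`; these names are illustrative and not taken from the repository. It also writes the noise term of the long form as r K K^T, which is the textbook Joseph form rather than the bare "+ V" quoted above.

```python
import numpy as np

def kalman_gain(P, x, r):
    """Optimal gain for a scalar observation d = x^T w + v, var(v) = r."""
    return P @ x / (x.T @ P @ x + r)

def posterior_short(P, K, x):
    """Short form (I - K x^T) P; exact only when K is the optimal gain."""
    I = np.eye(P.shape[0])
    return (I - K @ x.T) @ P

def posterior_joseph(P, K, x, r):
    """Joseph form (I - K x^T) P (I - K x^T)^T + r K K^T; holds for any gain."""
    A = np.eye(P.shape[0]) - K @ x.T
    return A @ P @ A.T + r * (K @ K.T)

rng = np.random.default_rng(0)
L = 4
M = rng.standard_normal((L, L))
P = M @ M.T + np.eye(L)          # symmetric positive-definite prior covariance
x = rng.standard_normal((L, 1))  # far-end input as a column vector
r = 0.1
K = kalman_gain(P, x, r)
same = np.allclose(posterior_short(P, K, x), posterior_joseph(P, K, x, r))
```

Under these assumptions the two forms agree whenever K is the optimal gain, because the extra terms of the Joseph form cancel. They differ for any other gain, which is when the longer form is needed.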

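Question about the mapping-based speech enhancement method

The input is 5 frames and the output is 1 frame, so how should the speech signal be processed at enhancement time? Is it equivalent to sliding a 5-frame window over the speech? If so, are the first two frames and the last two frames discarded?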
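
A common way to avoid dropping the edge frames is to pad the sequence before sliding the window, so that every frame gets a full 5-frame context. The sketch below is one such option and makes no claim about what the repository does. The helper name `stack_context` is hypothetical, and repeating the first and last frame is an assumed padding choice (zero padding is another).

```python
import numpy as np

def stack_context(frames, left=2, right=2):
    """Turn (T, F) frames into (T, (left + 1 + right) * F) context windows."""
    frames = np.asarray(frames)
    T = frames.shape[0]
    # Repeat the first and last frame so edge frames also get a full window.
    padded = np.pad(frames, ((left, right), (0, 0)), mode="edge")
    # Shifted views of the padded sequence, one per position in the window.
    windows = [padded[i:i + T] for i in range(left + 1 + right)]
    return np.concatenate(windows, axis=1)
```

Row t of the result holds frames t-2 to t+2 concatenated, so the network produces one enhanced frame per input frame and the output has the same number of frames as the input.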

Praat is used differently on macOS than on Linux

After downloading the macOS version of Praat, I don't know how to use it.
/Applications/AudioAlignment  ./praat estimate_delay ref.wav delay.wav 0 30

zsh: no such file or directory: ./praat
✘  /Applications/AudioAlignment  ./Praat.app estimate_delay ref.wav delay.wav 0 30

zsh: permission denied: ./Praat.app
✘  /Applications/AudioAlignment  sudo ./Praat.app estimate_delay ref.wav delay.wav 0 30

Password:
sudo: ./Praat.app: command not found

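Label range in Mapping.py does not match the network output range

In SpeechEnhancement/Mapping.py, during training the label is taken directly from the clean_mag magnitudes, which are not normalized to [0, 1], yet the model's last layer uses a sigmoid activation. Doesn't that make the network's output range inconsistent with the label range? I'm not sure whether this is a real problem or whether I have misunderstood something.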
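
One way to make the ranges agree while keeping a sigmoid output is to scale the targets into [0, 1] for training and undo the scaling at inference. The sketch below illustrates that with min-max scaling. Only `clean_mag` comes from the issue; the helper names are hypothetical, and replacing the sigmoid with a linear or ReLU output layer would be an alternative.

```python
import numpy as np

def fit_minmax(mag):
    """Return (lo, hi) statistics computed on the training magnitudes."""
    lo, hi = float(np.min(mag)), float(np.max(mag))
    if hi == lo:
        raise ValueError("magnitudes are constant; cannot min-max scale")
    return lo, hi

def normalize(mag, lo, hi):
    """Map magnitudes into [0, 1] so they match a sigmoid output range."""
    return (mag - lo) / (hi - lo)

def denormalize(pred, lo, hi):
    """Map network outputs in [0, 1] back to the magnitude scale."""
    return pred * (hi - lo) + lo

# Toy stand-in for a clean magnitude spectrogram (values are illustrative).
clean_mag = np.array([[0.0, 3.5], [12.0, 7.25]])
lo, hi = fit_minmax(clean_mag)
target = normalize(clean_mag, lo, hi)
restored = denormalize(target, lo, hi)
```

The statistics should come from the training set only and be reused unchanged at inference, otherwise the restored magnitudes are on a different scale than the ones the network was trained against.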

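AEC delay estimation

The article "WebRTC AEC pipeline walkthrough" says the delay estimation method is WebRtc_DelayEstimatorProcessFloat, but in the code it appears to be used only for logging: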
https://github.com/Ryuk17/SpeechAlgorithms/blob/master/WebRTC_AEC/src/aec_core.c#L896

if (aec->delay_logging_enabled) {
  int delay_estimate = 0;
  if (WebRtc_AddFarSpectrumFloat(
          aec->delay_estimator_farend, abs_far_spectrum, PART_LEN1) == 0) {
    delay_estimate = WebRtc_DelayEstimatorProcessFloat(
        aec->delay_estimator, abs_near_spectrum, PART_LEN1);
    if (delay_estimate >= 0) {
      // Update delay estimate buffer.
      aec->delay_histogram[delay_estimate]++;
    }
  }
}
