Topic: speech

Something interesting about speech

👇 Here are 1555 public repositories matching this topic...

ahmetoner / whisper-asr-webservice

speech,OpenAI Whisper ASR Webservice API

User: ahmetoner

Home Page: https://ahmetoner.github.io/whisper-asr-webservice

automatic-speech-recognition speech-recognition speech-to-text openai-whisper docker asr speech

aigc-audio / audiogpt

speech,AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head

Organization: aigc-audio

Home Page: https://huggingface.co/spaces/AIGC-Audio/AudioGPT

audio gpt music sound speech talking-head

avinashkranjan / amazing-python-scripts

speech,🚀 Curated collection of Amazing Python scripts from Basics to Advanced with automation task scripts.

User: avinashkranjan

Home Page: https://amazing-python-scripts.avinashranjan.com

projects python-projects machine-learning artificial-intelligence python speech webcam python-scripts hacktoberfest

babysor / mockingbird

speech,🚀 AI voice cloning: clone a voice in 5 seconds to generate arbitrary speech in real time

User: babysor

ai speech pytorch deep-learning text-to-speech tts

bytedance / salmonn

speech,SALMONN: Speech Audio Language Music Open Neural Network

Organization: bytedance

Home Page: https://bytedance.github.io/SALMONN/

audio audio-processing large-language-models multi-modal speech speech-recognition bytedance tsinghua-university music iclr2024

coqui-ai / tts

speech,🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production

Organization: coqui-ai

Home Page: http://coqui.ai

python text-to-speech deep-learning speech pytorch tts vocoder tacotron glow-tts melgan

csteinmetz1 / ai-audio-startups

speech,Community list of startups working with AI in audio and music technology

User: csteinmetz1

Home Page: https://csteinmetz1.github.io/ai-audio-startups/

audio list startups music speech

delta-ml / delta

speech,DELTA is a deep learning based natural language and speech processing platform.

Organization: delta-ml

Home Page: https://delta-didi.readthedocs.io/

asr custom-ops deep-learning emotion-recognition front-end inference nlp nlu ops seq2seq sequence-to-sequence serving speaker-verification speech speech-recognition tensorflow tensorflow-lite tensorflow-serving text-classification text-generation

dengbocong / nlp-paper

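speech,Papers in the field of natural language processing (with reading notes), reproduced models, data processing, and more (code in both TensorFlow and PyTorch versions)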

User: dengbocong

dialogue speech nlp-machine-learning paper tensorflow2 pytorch nlp bert

hahahumble / speechgpt

speech,💬 SpeechGPT is a web application that enables you to converse with ChatGPT.

User: hahahumble

Home Page: https://speechgpt.app

chatbot chatgpt language-learning speech chat conversation

haoheliu / voicefixer

speech,General Speech Restoration

User: haoheliu

Home Page: https://haoheliu.github.io/demopage-voicefixer/

speech-processing speech-synthesis speech-enhancement speech-analysis speech tts declipping dereverberation denoise super-resolution

huggingface / datasets

speech,🤗 The largest hub of ready-to-use datasets for ML models with fast, easy-to-use and efficient data manipulation tools

Organization: huggingface

Home Page: https://huggingface.co/docs/datasets

nlp datasets pytorch tensorflow pandas numpy natural-language-processing computer-vision machine-learning deep-learning

iahispano / applio

speech,VITS-based Voice Conversion focused on simplicity, quality and performance

Organization: iahispano

Home Page: https://applio.org

rvc vc vits voice ai voice-cloning voice-conversion applio voice-clone pytorch

idea-research / grounded-segment-anything

speech,Grounded-SAM: Marrying Grounding-DINO with Segment Anything & Stable Diffusion & Recognize Anything - Automatically Detect, Segment and Generate Anything

Organization: idea-research

Home Page: https://arxiv.org/abs/2401.14159

open-vocabulary-detection open-vocabulary-segmentation data-generation automatic-labeling-system caption speech 3d-whole-body-pose-estimation image-editing

jarikomppa / soloud

speech,Free, easy, portable audio engine for games

User: jarikomppa

Home Page: http://soloud-audio.com

audio game-development engine sound sound-effects synthesizer game portable mp3 ogg flac opensl-es python c cpp ruby gamemaker blitzmax speech speech-to-text

jianchang512 / stt

speech,Voice Recognition to Text Tool / an offline, locally run speech-to-text service that outputs JSON, SRT subtitles with timestamps, and plain text

User: jianchang512

Home Page: https://v.wonyes.org

speech speech-recognition speech-to-text stt

jtkim-kaist / vad

speech,Voice activity detection (VAD) toolkit including DNN, bDNN, LSTM and ACAM based VAD. We also provide our directly recorded dataset.

User: jtkim-kaist

acam attention bdnn data dnn lstm speech speech-activity-detection speech-recognition vad voice-activity-detection voice-detection

julius-speech / julius

speech,Open-Source Large Vocabulary Continuous Speech Recognition Engine

Organization: julius-speech

speech recognition audio-processing speech-recognition

kaldi-asr / kaldi

speech,kaldi-asr/kaldi is the official location of the Kaldi project.

Organization: kaldi-asr

Home Page: http://kaldi-asr.org

kaldi c-plus-plus cuda shell speech-recognition speech-to-text speaker-verification speaker-id speech

kyubyong / dc_tts

speech,A TensorFlow Implementation of DC-TTS: yet another text-to-speech model

User: kyubyong

speech speech-to-text tts

kyubyong / tacotron

speech,A TensorFlow Implementation of Tacotron: A Fully End-to-End Text-To-Speech Synthesis Model

User: kyubyong

tts tensorflow speech-synthesis-model speech

lhotse-speech / lhotse

speech,Tools for handling speech data in machine learning projects.

Organization: lhotse-speech

Home Page: https://lhotse.readthedocs.io/en/latest/

speech audio kaldi machine-learning ai deep-learning pytorch data python speech-recognition

linto-ai / whisper-timestamped

speech,Multilingual Automatic Speech Recognition with word-level timestamps and confidence

Organization: linto-ai

deep-learning speech speech-recognition speech-to-text asr machine-learning python python3 pytorch attention-is-all-you-need

m-bain / whisperx

speech,WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)

User: m-bain

asr speech speech-recognition speech-to-text whisper

mahmoudashraf97 / whisper-diarization

speech,Automatic Speech Recognition with Speaker Diarization based on OpenAI Whisper

User: mahmoudashraf97

asr speaker-diarization speech speech-recognition speech-to-text whisper

metavoiceio / metavoice-src

speech,Foundational model for human-like, expressive TTS

Organization: metavoiceio

Home Page: https://themetavoice.xyz/

text-to-speech ai deep-learning pytorch speech speech-synthesis tts voice-clone zero-shot-tts

miteshputhran / speech-emotion-analyzer

speech,The neural network model is capable of detecting five different male/female emotions from audio speeches. (Deep Learning, NLP, Python)

User: miteshputhran

audio-files data-science deep-learning deep-neural-networks emotion emotion-recognition keras natural-language-processing natural-language-understanding neural-network python3 speech speech-emotion-recognition speech-recognition voice

modelscope / modelscope

speech,ModelScope: bring the notion of Model-as-a-Service to life.

Organization: modelscope

Home Page: https://www.modelscope.cn/

nlp cv speech multi-modal science deep-learning machine-learning python

mozilla / tts

speech,🤖 💬 Deep learning for Text to Speech (Discussion forum: https://discourse.mozilla.org/c/tts)

Organization: mozilla

deep-learning text-to-speech python pytorch tacotron tts speaker-encoder dataset-analysis tacotron2 tensorflow2

mravanelli / pytorch-kaldi

speech,pytorch-kaldi is a project for developing state-of-the-art DNN/RNN hybrid speech recognition systems. The DNN part is managed by pytorch, while feature extraction, label computation, and decoding are performed with the kaldi toolkit.

User: mravanelli

speech-recognition gru dnn kaldi rnn-model pytorch timit deep-learning deep-neural-networks recurrent-neural-networks

natspeech / natspeech

speech,A Non-Autoregressive Text-to-Speech (NAR-TTS) framework, including official PyTorch implementation of PortaSpeech (NeurIPS 2021) and DiffSpeech (AAAI 2022)

Organization: natspeech

speech-synthesis pytorch tts speech huggingface portaspeech diffsinger diffspeech

netease-youdao / emotivoice

speech,EmotiVoice 😊: a Multi-Voice and Prompt-Controlled TTS Engine

User: netease-youdao

pytorch speech speech-synthesis tts multi-speaker text-to-speech deep-learning prompt emotivoice ai

ovidijusparsiunas / deep-chat

speech,Fully customizable AI chatbot component for your website

User: ovidijusparsiunas

Home Page: https://deepchat.dev

ai angular chat chatbot chatgpt component openai react solid svelte

paddlepaddle / models

speech,Officially maintained, supported by PaddlePaddle, including CV, NLP, Speech, Rec, TS, big models and so on.

Organization: paddlepaddle

paddlepaddle deep-learning neural-network computer-vision natural-language-processing recommendation speech nlp cv models

pndurette / gtts

speech,Python library and CLI tool to interface with Google Translate's text-to-speech API

User: pndurette

Home Page: http://gtts.readthedocs.org/

speech python tts text-to-speech gtts speech-api cli python-library pypi

praat / praat

speech,Praat: Doing Phonetics By Computer

Organization: praat

Home Page: http://www.praat.org

speech phonetics acoustics speech-analysis

pykaldi / pykaldi

speech,A Python wrapper for Kaldi

Organization: pykaldi

Home Page: https://pykaldi.github.io

python wrapper kaldi openfst asr speech-recognition speech language-model feature-extraction clif

pytorch / audio

speech,Data manipulation and transformation for audio signal processing, powered by PyTorch

Organization: pytorch

Home Page: https://pytorch.org/audio

audio python io speech machine-learning pytorch audio-processing

r9y9 / wavenet_vocoder

speech,WaveNet vocoder

User: r9y9

Home Page: https://r9y9.github.io/wavenet_vocoder/

wavenet speech-synthesis speech-processing pytorch python wavenet-vocoder neural-vocoder speech

readbeyond / aeneas

speech,aeneas is a Python/C library and a set of tools to automagically synchronize audio and text (aka forced alignment)

User: readbeyond

Home Page: http://www.readbeyond.it/aeneas/

speech alignment tts python linux macos windows nlp espeak espeak-ng

rikorose / deepfilternet

speech,Noise suppression using deep filtering

User: rikorose

Home Page: https://huggingface.co/spaces/hshr/DeepFilterNet2

pytorch audio deep-learning speech-enhancement noise-suppression speech rust

roatienza / deep-learning-experiments

speech,Videos, notes and experiments to understand deep learning

User: roatienza

deep-learning deep-learning-tutorial artificial-intelligence pytorch vision speech nlp

santi-pdp / segan

speech,Speech Enhancement Generative Adversarial Network in TensorFlow

User: santi-pdp

speech gan tensorflow deep-learning deep-neural-networks generative-model generative-adversarial-networks

shu223 / ios-10-sampler

speech,Code examples for new APIs of iOS 10.

User: shu223

ios ios10 swift-3 swift-4 speech metal cnn image-recognition convolutional-neural-networks demo

snakers4 / silero-models

speech,Silero Models: pre-trained speech-to-text, text-to-speech and text-enhancement models made embarrassingly simple

User: snakers4

speech-recognition speech-to-text stt asr pretrained-models english german spanish stt-benchmark pytorch

sooftware / conformer

speech,[Unofficial] PyTorch implementation of "Conformer: Convolution-augmented Transformer for Speech Recognition" (INTERSPEECH 2020)

User: sooftware

conformer transformer cnn transformer-xl asr speech-recognition pytorch conv convolution augmented

svc-develop-team / so-vits-svc

speech,SoftVC VITS Singing Voice Conversion

Organization: svc-develop-team

ai audio-analysis generative-adversarial-network singing-voice-conversion so-vits-svc sovits variational-inference vc vits voice

talater / annyang

speech,💬 Speech recognition for your site

User: talater

Home Page: https://www.talater.com/annyang/

speech-recognition speech speech-to-text voice hacktoberfest

tensorflow / lingvo

speech,Lingvo

Organization: tensorflow

speech-recognition translation speech-to-text machine-translation mnist seq2seq language-model tts asr lm

yeyupiaoling / ppasr

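speech,End-to-end Chinese speech recognition built on PaddlePaddle, from getting started to hands-on practice, with very simple introductory examples and highly practical enterprise projects. Supports the currently most popular models: DeepSpeech2, Conformer, and Squeezeformer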

User: yeyupiaoling

asr paddlepaddle deep-learning chinese speech-to-text speech speech-recognition streaming-asr conformer squeezeformer
