Topic: speech
Something interesting about speech
speech,🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
Organization: coqui-ai
Home Page: http://coqui.ai
speech,🚀Clone a voice in 5 seconds to generate arbitrary speech in real-time
User: babysor
speech,SoftVC VITS Singing Voice Conversion
Organization: svc-develop-team
speech,🤗 The largest hub of ready-to-use datasets for AI models with fast, easy-to-use and efficient data manipulation tools
Organization: huggingface
Home Page: https://huggingface.co/docs/datasets
speech,WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)
User: m-bain
speech,Grounded SAM: Marrying Grounding DINO with Segment Anything & Stable Diffusion & Recognize Anything - Automatically Detect, Segment and Generate Anything
Organization: idea-research
Home Page: https://arxiv.org/abs/2401.14159
speech,kaldi-asr/kaldi is the official location of the Kaldi project.
Organization: kaldi-asr
Home Page: http://kaldi-asr.org
speech,AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head
Organization: aigc-audio
Home Page: https://huggingface.co/spaces/AIGC-Audio/AudioGPT
speech,🤖 💬 Deep learning for Text to Speech (Discussion forum: https://discourse.mozilla.org/c/tts)
Organization: mozilla
speech,ModelScope: bring the notion of Model-as-a-Service to life.
Organization: modelscope
Home Page: https://www.modelscope.cn/
speech,Silero VAD: pre-trained enterprise-grade Voice Activity Detector
User: snakers4
speech,EmotiVoice 😊: a Multi-Voice and Prompt-Controlled TTS Engine
User: netease-youdao
speech,Officially maintained, supported by PaddlePaddle, including CV, NLP, Speech, Rec, TS, big models and so on.
Organization: paddlepaddle
speech,💬 Speech recognition for your site
User: talater
Home Page: https://www.talater.com/annyang/
speech,VoxCPM: Tokenizer-Free TTS for Context-Aware Speech Generation and True-to-Life Voice Cloning
Organization: openbmb
speech,Silero Models: pre-trained text-to-speech models made embarrassingly simple
User: snakers4
speech,Automatic Speech Recognition with Speaker Diarization based on OpenAI Whisper
User: mahmoudashraf97
speech,Build local voice agents with open-source models
Organization: huggingface
speech,Low-latency AI engine for mobile devices & wearables
Organization: cactus-compute
Home Page: https://cactuscompute.com
speech,Voice Recognition to Text Tool / An offline, locally run tool that converts audio and video to subtitles, with output in JSON, SRT subtitle, and plain-text formats
User: jianchang512
Home Page: https://pyvideotrans.com
speech,Foundational model for human-like, expressive TTS
Organization: metavoiceio
Home Page: https://themetavoice.xyz/
speech,An AI-Powered Speech Processing Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Enhancement, Separation, and Target Speaker Extraction, etc.
Organization: modelscope
speech,Noise suppression using deep filtering
User: rikorose
Home Page: https://huggingface.co/spaces/hshr/DeepFilterNet2
speech,🚀 Curated collection of Amazing Python scripts from Basic to Advanced, with automation task scripts.
User: avinashkranjan
Home Page: https://amazing-python-scripts.avinashranjan.com
speech,Code examples for new APIs of iOS 10.
User: shu223
speech,OpenAI Whisper ASR Webservice API
User: ahmetoner
Home Page: https://ahmetoner.github.io/whisper-asr-webservice
speech,A simple, high-quality voice conversion tool focused on ease of use and performance.
Organization: iahispano
Home Page: https://applio.org
speech,Lingvo
Organization: tensorflow
speech,Data manipulation and transformation for audio signal processing, powered by PyTorch
Organization: pytorch
Home Page: https://pytorch.org/audio
speech,MARS5 speech model (TTS) from CAMB.AI
Organization: camb-ai
Home Page: https://www.camb.ai
speech,aeneas is a Python/C library and a set of tools to automagically synchronize audio and text (aka forced alignment)
User: readbeyond
Home Page: http://www.readbeyond.it/aeneas/
speech,Multilingual Automatic Speech Recognition with word-level timestamps and confidence
Organization: linto-ai
speech,💬 SpeechGPT is a web application that enables you to converse with ChatGPT.
User: hahahumble
Home Page: https://speechgpt.app
speech,Python library and CLI tool to interface with Google Translate's text-to-speech API
User: pndurette
Home Page: http://gtts.readthedocs.org/
speech,pytorch-kaldi is a project for developing state-of-the-art DNN/RNN hybrid speech recognition systems. The DNN part is managed by pytorch, while feature extraction, label computation, and decoding are performed with the kaldi toolkit.
User: mravanelli
speech,WaveNet vocoder
User: r9y9
Home Page: https://r9y9.github.io/wavenet_vocoder/
speech,Controllable and fast Text-to-Speech for over 7000 languages!
Organization: digitalphonetics
speech,Free, easy, portable audio engine for games
User: jarikomppa
Home Page: http://soloud-audio.com
speech,Voice Activity Detector (VAD) : low-latency, high-performance and lightweight
Organization: ten-framework
Home Page: https://huggingface.co/TEN-framework/ten-vad
speech,Open-Source Large Vocabulary Continuous Speech Recognition Engine
Organization: julius-speech
speech,Praat: Doing Phonetics By Computer
Organization: praat
Home Page: https://praat.org
speech,A TensorFlow Implementation of Tacotron: A Fully End-to-End Text-To-Speech Synthesis Model
User: kyubyong
speech,Community list of startups working with AI in audio and music technology
User: csteinmetz1
Home Page: https://csteinmetz1.github.io/ai-audio-startups/
speech,Free, high-quality text-to-speech API endpoint to replace OpenAI, Azure, or ElevenLabs
User: travisvn
Home Page: https://tts.travisvn.com
speech,DELTA is a deep learning based natural language and speech processing platform. LF AI & DATA Projects: https://lfaidata.foundation/projects/delta/
Organization: delta-ml
Home Page: https://delta-didi.readthedocs.io/
speech,The neural network model is capable of detecting five different male/female emotions from speech audio. (Deep Learning, NLP, Python)
User: miteshputhran
speech,SALMONN family: A suite of advanced multi-modal LLMs
Organization: bytedance
Home Page: https://bytedance.github.io/SALMONN/
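speech,Papers in the field of natural language processing (with reading notes), reproduced models, data processing, and more (code provided in both TensorFlow and PyTorch versions)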
User: dengbocong
speech,General Speech Restoration
User: haoheliu
Home Page: https://haoheliu.github.io/demopage-voicefixer/