nangongmujd
Type: User
Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, music, and speech generation research and development.
An open-source implementation of a sequence-to-sequence based speech processing engine
🔊 Text-Prompted Generative Audio Model
VITS2 backbone with BERT
An implementation of DenoiseNet https://arxiv.org/pdf/1701.01687.pdf
DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism (SVS & TTS); AAAI 2022; Official code
3D face swapping implemented in Python
⏩ Generating speech in a single forward pass without any attention!
PyTorch implementation of "FullSubNet: A Full-Band and Sub-Band Fusion Model for Real-Time Single-Channel Speech Enhancement."
PyTorch Implementation of GenerSpeech (NeurIPS'22): a text-to-speech model towards zero-shot style transfer of OOD custom voice.
1 minute of voice data can also be used to train a good TTS model!
HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis
iSTFTNet: Fast and Lightweight Mel-spectrogram Vocoder Incorporating Inverse Short-time Fourier Transform
Unofficial implementation of Megatts2
MiniGPT-4: Enhancing Vision-language Understanding with Advanced Large Language Models
Command line utility for forced alignment using Kaldi
A Demo of Mandarin/Chinese TTS frontend
A Non-Autoregressive Text-to-Speech (NAR-TTS) framework, including official PyTorch implementation of PortaSpeech (NeurIPS 2021) and DiffSpeech (AAAI 2022)
[unmaintained] An open-source convolutional neural networks platform for research in medical image analysis and image-guided therapy
A No-Recurrence Sequence-to-Sequence Model for Speech Recognition
Easy-to-use Speech Toolkit including SOTA/Streaming ASR with punctuation, influential TTS with text frontend, Speaker Verification System, End-to-End Speech Translation and Keyword Spotting. Won NAACL2022 Best Demo Award.
Multilingual text (NLP) processing toolkit
A Python package to analyze and compare voices with deep learning
SoftVC VITS Singing Voice Conversion
This repo contains code for speech vs music vs noise classification
Training RNNs as Fast as CNNs (https://arxiv.org/abs/1709.02755)
Official implementation of Meta-StyleSpeech and StyleSpeech
StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models
SyntaSpeech: Syntax-aware Generative Adversarial Text-to-Speech; IJCAI 2022; Official code
🤖💬 Transformer TTS: Implementation of a non-autoregressive Transformer based neural network for text to speech.