xxsuper,github

agents

Build real-time multimodal AI applications 🤖🎙️📹

aniportrait

AniPortrait: Audio-Driven Synthesis of Photorealistic Portrait Animation

ansj_seg

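Ansj word segmentation: a true Java implementation of ICT. Both segmentation quality and speed exceed the open-source version of ICT. Chinese word segmentation, person-name recognition, part-of-speech tagging, user-defined dictionaries.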

audio2photoreal

Code and dataset for photorealistic Codec Avatars driven from audio

bark-voice-cloning

Bark Voice Cloning and Voice Cloning for Chinese Speech

CatVTON is a simple and efficient virtual try-on diffusion model with 1) Lightweight Network (899.06M parameters in total), 2) Parameter-Efficient Training (49.57M trainable parameters) and 3) Simplified Inference (< 8G VRAM for 1024x768 resolution).

chattts

ChatTTS is a generative speech model for daily dialogue.

cland-websitemanage

Admin back end for news management

comfyui

The most powerful and modular stable diffusion GUI, api and backend with a graph/nodes interface.

conversational-ai-livekit

A real-time conversational application built on Alibaba Cloud's TTS, LLM, and STT models

cosyvoice

LLM based TTS model, providing inference/training/deployment full-stack ability.

crawl-xigua-video

Crawler for Xigua short videos

dagscheduler

deepfacelab

DeepFaceLab is the leading software for creating deepfakes.

diffsinger

DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism (SVS & TTS); AAAI 2022; Official code

digital_human_video_player

A digital-human video player with an HTTP API; it uses the Gradio API to connect to Easy-Wav2Lip, Sadtalker, and GeneFacePlusPlus

dinet

The source code of "DINet: deformation inpainting network for realistic face visually dubbing on high resolution video."

dubbo

Apache Dubbo is a high-performance, Java-based, open source RPC framework.

dubbokeeper

A service management and monitoring system for Dubbo

emotion2vec

Official PyTorch code for extracting features and training downstream models with emotion2vec: Self-Supervised Pre-Training for Speech Emotion Representation

er-nerf

[ICCV'23] Efficient Region-Aware Neural Radiance Fields for High-Fidelity Talking Portrait Synthesis

example.v2

An example project for book 'Go Programming & Concurrency in Practice, 2nd edition' (《Go并发编程实战》第2版).

facefusion

Next generation face swapper and enhancer

fish-speech

Brand new TTS solution

funasr

A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.

genefaceplusplus

GeneFace++: Generalized and Stable Real-Time 3D Talking Face Generation; Official Code

ginserver

A web service framework based on go-gin

gpt-sovits

1 min voice data can also be used to train a good TTS model! (few shot voice cloning)

grok-1

Grok open release
