kinghuin / AIGC-progress
Follow the rapid development of AIGC models and applications. 🚀
☆80Updated last year
Related projects
Alternatives and complementary repositories for AIGC-progress
- a text-conditional diffusion probabilistic model capable of generating high fidelity audio.☆127Updated 5 months ago
- High-resolution clothing swap, virtual try-on, and object replacement for people in 4K video☆51Updated 2 years ago
- Anim-400K: A dataset designed from the ground up for automated dubbing of video☆99Updated 5 months ago
- The project page repo for Neural Dubber.☆29Updated last year
- Implementation for the paper "Can Language Models Learn to Listen?"☆59Updated last year
- ☆13Updated 8 months ago
- ☆35Updated 5 months ago
- [NCMMSC'2024] Emotion-Aware Prosodic Phrasing for Expressive Text-to-Speech☆22Updated 3 months ago
- Project page for "Improving Few-shot Learning for Talking Face System with TTS Data Augmentation" for ICASSP2023☆83Updated last year
- The demo page of InstructTTS☆155Updated 9 months ago
- SenseVoice-python: An enterprise-grade open-source multilingual ASR system from FunASR, run with onnxruntime☆75Updated 2 months ago
- ☆47Updated 4 months ago
- SpeechAgents: Human-Communication Simulation with Multi-Modal Multi-Agent Systems☆77Updated 10 months ago
- official code for CVPR'24 paper Diff-BGM☆47Updated last month
- Keras implementation of Finite Scalar Quantization☆64Updated last year
- ☆19Updated 5 months ago
- Code for Talk With Human-like Agents: Empathetic Dialogue Through Perceptible Acoustic Reception and Reaction (ACL24)☆33Updated 3 months ago
- Separately maintained Chinese TTS☆35Updated 2 years ago
- Make-An-Audio-3: Transforming Text/Video into Audio via Flow-based Large Diffusion Transformers☆83Updated 3 weeks ago
- Voice Conversion Experiments for THUHCSI Course : <Digital Processing of Speech Signals>☆8Updated last year
- GPT-style network for phonemization with durations of text☆62Updated 8 months ago
- Official codebase for "Schrodinger Bridges Beat Diffusion Models on Text-to-Speech Synthesis" (https://arxiv.org/abs/2312.03491).☆122Updated 4 months ago
- Awesome Colab Projects Collection☆25Updated 10 months ago
- flow mirror models from JZX AI Labs☆40Updated last month
- ☆14Updated 2 months ago
- ☆166Updated 4 months ago
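- 百聆 (Bailing) is a GPT-4o-like voice dialogue bot built from ASR + LLM + TTS, with latency as low as 800 ms; it runs on low-end hardware and supports interruption☆43Updated last week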
- Extension of ChatTTS, 3x Faster on Windows, Support Voice Cloning and Mobile Deployment☆26Updated 2 weeks ago
- ✨✨Freeze-Omni: A Smart and Low Latency Speech-to-speech Dialogue Model with Frozen LLM☆104Updated 2 weeks ago
- ☆158Updated 3 weeks ago