fudan-generative-vision / hallo
Hallo: Hierarchical Audio-Driven Visual Synthesis for Portrait Image Animation
☆9,496 Updated 2 months ago
Related projects
Alternatives and complementary repositories for hallo
- Champ: Controllable and Consistent Human Image Animation with 3D Parametric Guidance ☆4,746 Updated 4 months ago
- Hallo2: Long-Duration and High-Resolution Audio-driven Portrait Image Animation ☆4,195 Updated 2 weeks ago
- V-Express aims to generate a talking head video under the control of a reference image, an audio clip, and a sequence of V-Kps images. ☆2,255 Updated last week
- Unofficial Implementation of Animate Anyone ☆2,940 Updated 4 months ago
- MagicTime: Time-lapse Video Generation Models as Metamorphic Simulators ☆1,300 Updated 3 months ago
- A UI-Focused Agent for Windows OS Interaction. ☆7,921 Updated 2 weeks ago
- Code for Paper "UniAnimate: Taming Unified Video Diffusion Models for Consistent Human Image Animation". ☆1,026 Updated 3 months ago
- open-source multimodal large language model that can hear, talk while thinking. Featuring real-time end-to-end speech input and streaming… ☆3,092 Updated 2 weeks ago
- MuseV: Infinite-length and High Fidelity Virtual Human Video Generation with Visual Conditioned Parallel Denoising ☆2,465 Updated 4 months ago
- [ACM MM 2024] This is the official code for "AniTalker: Animate Vivid and Diverse Talking Faces through Identity-Decoupled Facial Motion … ☆1,450 Updated 3 months ago
- MiniCPM3-4B: An edge-side LLM that surpasses GPT-3.5-Turbo. ☆7,135 Updated 2 weeks ago
- The open source platform for AI-native application development. ☆6,216 Updated this week
- MiniCPM-V 2.6: A GPT-4V Level MLLM for Single Image, Multi Image and Video on Your Phone ☆12,642 Updated 3 weeks ago
- Nexa SDK is a comprehensive toolkit for supporting ONNX and GGML models. It supports text generation, image generation, vision-language m… ☆3,896 Updated this week
- MusePose: a Pose-Driven Image-to-Video Framework for Virtual Human Generation ☆2,278 Updated 3 months ago
- BISHENG is an open LLM devops platform for next generation Enterprise AI applications. Powerful and comprehensive features include: GenAI… ☆8,905 Updated this week
- [ECCV 2024] The official implementation of paper "BrushNet: A Plug-and-Play Image Inpainting Model with Decomposed Dual-Branch Diffusion" ☆1,440 Updated 4 months ago
- "LightRAG: Simple and Fast Retrieval-Augmented Generation" ☆8,824 Updated this week
- MuseTalk: Real-Time High Quality Lip Synchronization with Latent Space Inpainting ☆2,859 Updated this week
- High-quality multi-lingual text-to-speech library by MyShell.ai. Support English, Spanish, French, Chinese, Japanese and Korean. ☆4,827 Updated 3 months ago
- AniPortrait: Audio-Driven Synthesis of Photorealistic Portrait Animation ☆4,652 Updated 4 months ago
- Accepted as [NeurIPS 2024] Spotlight Presentation Paper ☆5,955 Updated last month
- Towards Open-source GPT-4o with Vision, Speech and Duplex Capabilities. ☆1,565 Updated 2 weeks ago
- ☆759 Updated last month
- Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching" ☆7,251 Updated this week
- Official implementations for paper: DreamTalk: When Expressive Talking Head Generation Meets Diffusion Probabilistic Models ☆1,617 Updated 10 months ago
- Enjoy the magic of Diffusion models! ☆6,589 Updated this week
- [SIGGRAPH Asia 2022] VideoReTalking: Audio-based Lip Synchronization for Talking Head Video Editing In the Wild ☆6,652 Updated 3 months ago
- EchoMimic: Lifelike Audio-Driven Portrait Animations through Editable Landmark Conditioning ☆2,951 Updated this week