SoulX-FlashTalk is the first 14B model to achieve sub-second start-up latency (0.87s) while maintaining a real-time throughput of 32 FPS on an 8xH800 node.
☆1,023 Feb 27, 2026 Updated last week
Alternatives and similar repositories for SoulX-FlashTalk
Users who are interested in SoulX-FlashTalk are comparing it to the repositories listed below
- SoulX-FlashHead: A unified 1.3B-parameter framework designed for high-fidelity, infinite-length, and real-time streaming portrait video g… ☆458 Updated this week
- ☆1,794 Aug 6, 2025 Updated 7 months ago
- Implementation of "Live Avatar: Streaming Real-time Audio-Driven Avatar Generation with Infinite Length" ☆1,853 Jan 30, 2026 Updated last month
- Unlimited-length talking video generation that supports image-to-video and video-to-video generation ☆4,937 Dec 18, 2025 Updated 2 months ago
- Align Anything: Training All-modality Model with Feedback ☆4,635 Nov 27, 2025 Updated 3 months ago
- The next-generation deep reinforcement learning toolkit ☆3,462 Jun 16, 2023 Updated 2 years ago
- [ACM MM 2025] FantasyTalking: Realistic Talking Portrait Generation via Coherent Motion Synthesis ☆1,622 Jan 26, 2026 Updated last month
- Hallo: Hierarchical Audio-Driven Visual Synthesis for Portrait Image Animation ☆8,645 Sep 14, 2024 Updated last year
- [ICLR 2025] Hallo2: Long-Duration and High-Resolution Audio-driven Portrait Image Animation ☆3,683 Feb 27, 2025 Updated last year
- [SIGGRAPH 2025] LAM: Large Avatar Model for One-shot Animatable Gaussian Head ☆929 Sep 11, 2025 Updated 5 months ago
- 💰The only genuine version💰 minerproxy: mining pool fee skimming, mining pool proxy, mining pool relay, mining pool… ☆3,882 Updated this week
- [CVPR 2025] Hallo3: Highly Dynamic and Realistic Portrait Image Animation with Video Diffusion Transformer ☆1,371 Mar 13, 2025 Updated 11 months ago
- FantasyPortrait: Enhancing Multi-Character Portrait Animation with Expression-Augmented Diffusion Transformers ☆503 Aug 20, 2025 Updated 6 months ago
- 🔥minerproxy: mining pool fee skimming, mining pool relay, built for mining farm operations ☆3,296 Updated this week
- LinguaLinker: Audio-Driven Portraits Animation with Implicit Facial Control Enhancement ☆75 Jul 29, 2024 Updated last year
- [AAAI 2026] EchoMimicV3: 1.3B Parameters are All You Need for Unified Multi-Modal and Multi-Task Human Animation ☆794 Updated this week
- PromptEnhancer is a prompt-rewriting tool, refining prompts into clearer, structured versions for better image generation. ☆3,546 Jan 26, 2026 Updated last month
- Official implementation of EMOPortraits: Emotion-enhanced Multimodal One-shot Head Avatars ☆395 Apr 8, 2025 Updated 10 months ago
- A2V: Next-Gen AI Value Compute Protocol. ☆1,201 Nov 12, 2025 Updated 3 months ago
- The first open autoregressive foundational video AI model. ☆2,891 Oct 14, 2024 Updated last year
- FlashCosyVoice: A lightweight vLLM implementation built from scratch for CosyVoice. ☆242 Feb 25, 2026 Updated last week
- [ACM MM 2025] Ditto: Motion-Space Diffusion for Controllable Realtime Talking Head Synthesis ☆714 Nov 12, 2025 Updated 3 months ago
- A Docker-free offline version of HeyGem; Python and Linux are all you need! ☆429 Jan 12, 2026 Updated last month
- Run frontier LLMs and VLMs with day-0 model support across GPU, NPU, and CPU, with comprehensive runtime coverage for PC (Python/C++), mo… ☆7,766 Feb 26, 2026 Updated last week
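- Digital Base (数字底座) is a unified, secure management support platform for the digital transformation of large governments and enterprises, built on identity authentication, organizational structure, positions and duties, application systems, resource roles, data catalogs, security controls, and related functions. Based on a three-administrator management model, it offers microservices, multi-tenancy, containerization, and support for domestic technology stacks, lets users quickly build their own business applications with a code generator, and can also be linked with various… ☆2,576 Updated this week
- 🚀 The best real-time interactive AI avatar (digital human) with on-premise deployment and <1.5 s latency. ☆7,871 Dec 31, 2025 Updated 2 months ago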
- A Benchmark and Evaluation Suite for Zero-shot Singing Voice Synthesis ☆23 Feb 11, 2026 Updated 3 weeks ago
- ✨✨VITA: Towards Open-Source Interactive Omni Multimodal LLM ☆11 Jun 16, 2025 Updated 8 months ago
- ☆899 Dec 11, 2024 Updated last year
- Streaming Text to Speech Web UI ☆22 May 6, 2024 Updated last year
- SkyReels-A1: Expressive Portrait Animation in Video Diffusion Transformers ☆584 Jun 5, 2025 Updated 9 months ago
- Memory-Guided Diffusion for Expressive Talking Video Generation ☆1,073 Aug 6, 2025 Updated 7 months ago
- ICLR 2025 paper X-NeMo & Project X-Portrait2 ☆117 Aug 7, 2025 Updated 6 months ago
- A high-performance IM server. ☆4,246 Updated this week
- Skywork-R1V is an advanced multimodal AI model series developed by Skywork AI, specializing in vision-language reasoning. ☆3,150 Dec 15, 2025 Updated 2 months ago
- MuseTalk: Real-Time High Quality Lip Synchronization with Latent Space Inpainting ☆5,366 Sep 26, 2025 Updated 5 months ago
- MOVA: Towards Scalable and Synchronized Video–Audio Generation ☆793 Updated this week
- Taming Stable Diffusion for Lip Sync! ☆5,456 Jun 20, 2025 Updated 8 months ago
- ☆156 Dec 23, 2025 Updated 2 months ago