GuijiAI / duix.ai

☆4,199

Related projects:

fudan-generative-vision / hallo
Hallo: Hierarchical Audio-Driven Visual Synthesis for Portrait Image Animation
☆9,151 Updated this week
dataelement / bisheng
BISHENG is an open LLM devops platform for next generation Enterprise AI applications. Powerful and comprehensive features include: GenAI…
☆8,587 Updated this week
BadToBest / EchoMimic
Lifelike Audio-Driven Portrait Animations through Editable Landmark Conditioning
☆2,382 Updated last month
modelscope / FunClip
Open-source, accurate, and easy-to-use video speech recognition and clipping tool, with integrated LLM-based AI clipping.
☆3,267 Updated 3 weeks ago
PeterH0323 / Streamer-Sales
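Streamer-Sales (Sales Champion): a sales-livestreamer LLM 🛒🎁 that presents products from the angle of stimulating users' desire to buy, based on the given product features. 🚀⭐ Includes a detailed data generation pipeline ❗ 📦 Also integrates LMDeploy accelerated inference 🚀, RAG (retrieval-augmented generation) 📚, TTS…
☆2,325 Updated this week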
OpenBMB / MiniCPM
MiniCPM3-4B: An edge-side LLM that surpasses GPT-3.5-Turbo.
☆6,824 Updated last week
fudan-generative-vision / champ
Champ: Controllable and Consistent Human Image Animation with 3D Parametric Guidance
☆3,909 Updated 2 months ago
lipku / metahuman-stream
Real time interactive streaming digital human
☆3,462 Updated last week
TMElyralab / MuseV
MuseV: Infinite-length and High Fidelity Virtual Human Video Generation with Visual Conditioned Parallel Denoising
☆2,318 Updated 2 months ago
Kedreamix / Linly-Talker
Digital Avatar Conversational System - Linly-Talker. 😄✨ Linly-Talker is an intelligent AI system that combines large language models (LL…
☆1,772 Updated 2 weeks ago
Tencent / HunyuanDiT
Hunyuan-DiT: A Powerful Multi-Resolution Diffusion Transformer with Fine-Grained Chinese Understanding
☆3,277 Updated last month
FunAudioLLM / CosyVoice
Multi-lingual large voice generation model, providing inference, training and deployment full-stack ability.
☆4,768 Updated last week
OpenBMB / MiniCPM-V
MiniCPM-V 2.6: A GPT-4V Level MLLM for Single Image, Multi Image and Video on Your Phone
☆11,907 Updated this week
TeamWiseFlow / wiseflow
Wiseflow is an agile information mining tool that extracts concise messages from various sources such as websites, WeChat official accoun…
☆3,723 Updated 2 weeks ago
TMElyralab / MuseTalk
MuseTalk: Real-Time High Quality Lip Synchronization with Latent Space Inpainting
☆2,406 Updated last month
jianchang512 / ChatTTS-ui
A simple local web interface that uses ChatTTS to synthesize text into speech and also exposes an API for external use.
☆5,868 Updated 3 weeks ago
xszyou / Fay
Fay is an open-source digital human framework integrating language models and digital characters. It offers retail, assistant, and agent …
☆8,892 Updated this week
1Panel-dev / MaxKB
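🚀 A knowledge-base question-answering system built on large language models and RAG. It works out of the box, is model-agnostic, supports flexible orchestration, and can be quickly embedded into third-party business systems.
☆9,966 Updated this week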
6drf21e / ChatTTS_colab
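🚀 One-click deployment (offline bundle included)! Built on ChatTTS, with support for streaming output, random voice-timbre sampling, long-audio generation, and multi-role reading. Simple to use, with no complicated installation.
☆1,897 Updated 2 months ago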
ali-vilab / MimicBrush
Official implementations for paper: Zero-shot Image Editing with Reference Imitation
☆1,064 Updated 3 months ago
cs-lazy-tools / ChatGPT-On-CS
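An LLM-based intelligent customer-service chat tool. It supports integration with WeChat, Pinduoduo, Qianniu, Bilibili, Douyin Enterprise accounts, Douyin, Doudian, Weibo chat, Xiaohongshu professional account operations, Xiaohongshu, Zhihu, and other platforms; lets you choose GPT3.5/GPT4.0/Lazy Treasure Box (more platforms will be supported later); handles text, voice, and images; and uses plugins to access the operating system, the internet, and other external…
☆2,275 Updated last week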
modelscope / DiffSynth-Studio
Enjoy the magic of Diffusion models!
☆6,349 Updated this week
LLM-Red-Team / kimi-free-api
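🚀 Reverse-engineered API for the KIMI AI long-context LLM, for free-use testing [specialty: interpreting and organizing long text]. Supports high-speed streaming output, agent conversations, web search, long-document interpretation, image OCR, and multi-turn dialogue, with zero-configuration deployment, multi-token support, and automatic cleanup of session traces.
☆3,640 Updated 2 months ago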
jianchang512 / clone-voice
A voice cloning tool with a web interface that uses your voice or any sound to record audio.
☆7,202 Updated 3 weeks ago
FunAudioLLM / SenseVoice
Multilingual Voice Understanding Model
☆2,625 Updated 2 weeks ago
guoqincode / Open-AnimateAnyone
Unofficial Implementation of Animate Anyone
☆2,900 Updated 2 months ago
CrazyBoyM / llama3-Chinese-chat
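Chinese-language repository for Llama3 and Llama3.1 (being written alongside a book... interesting weights from fine-tuned and modified versions by community members and vendors & tutorial videos and docs for training, inference, evaluation, and deployment)
☆3,921 Updated last month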
Zejun-Yang / AniPortrait
AniPortrait: Audio-Driven Synthesis of Photorealistic Portrait Animation
☆4,498 Updated 2 months ago
gpt-omni / mini-omni
open-source multimodal large language model that can hear, talk while thinking. Featuring real-time end-to-end speech input and streaming…
☆2,425 Updated this week
BytedanceSpeech / seed-tts-eval
☆934 Updated 3 months ago