Tencent / Tencent-Hunyuan-Large
☆1,353 Updated last month
Alternatives and similar repositories for Tencent-Hunyuan-Large:
Users who are interested in Tencent-Hunyuan-Large compare it to the repositories listed below.
- DeepSeek-VL2: Mixture-of-Experts Vision-Language Models for Advanced Multimodal Understanding ☆797 Updated this week
- Hunyuan-DiT: A Powerful Multi-Resolution Diffusion Transformer with Fine-Grained Chinese Understanding ☆3,805 Updated this week
- ☆1,044 Updated this week
- ✨✨VITA-1.5: Towards GPT-4o Level Real-Time Vision and Speech Interaction ☆1,932 Updated this week
- Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI. ☆2,372 Updated this week
- Next-Token Prediction is All You Need ☆1,965 Updated 2 months ago
- SEED-Story: Multimodal Long Story Generation with Large Language Model ☆783 Updated 3 months ago
- GPT4V-level open-source multi-modal model based on Llama3-8B ☆2,204 Updated 4 months ago
- Janus-Series: Unified Multimodal Understanding and Generation Models ☆1,327 Updated 2 months ago
- Qwen2-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud. ☆4,207 Updated this week
- FastVideo is a lightweight framework for accelerating large video diffusion models. ☆859 Updated this week
- Taming Stable Diffusion for Lip Sync! ☆1,856 Updated this week
- 📺 An End-to-End Solution for High-Resolution and Long Video Generation Based on Transformer Diffusion ☆1,680 Updated this week
- ☆902 Updated 6 months ago
- DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model ☆4,035 Updated 3 months ago
- GLM-4-Voice | End-to-end Chinese-English speech dialogue model ☆2,565 Updated last month
- InternLM-XComposer2.5-OmniLive: A Comprehensive Multimodal System for Long-term Streaming Video and Audio Interactions ☆2,709 Updated 3 weeks ago
- Kolors Team ☆4,108 Updated 2 months ago
- The official repo of Qwen2-Audio chat & pretrained large audio language model proposed by Alibaba Cloud. ☆1,395 Updated 5 months ago
- DeepSeek LLM: Let there be answers ☆1,850 Updated 11 months ago
- Code of Pyramidal Flow Matching for Efficient Video Generative Modeling ☆2,701 Updated 3 weeks ago
- An Open Large Reasoning Model for Real-World Solutions ☆1,378 Updated last month
- Repository for Meta Chameleon, a mixed-modal early-fusion foundation model from FAIR. ☆1,908 Updated 5 months ago
- VideoSys: An easy and efficient system for video generation ☆1,875 Updated 2 weeks ago
- open-source multimodal large language model that can hear, talk while thinking. Featuring real-time end-to-end speech input and streaming… ☆3,066 Updated 2 months ago
- StreamingT2V: Consistent, Dynamic, and Extendable Long Video Generation from Text ☆1,481 Updated last month
- 🔥🔥 LLaVA++: Extending LLaVA with Phi-3 and LLaMA-3 (LLaVA LLaMA-3, LLaVA Phi-3) ☆822 Updated 6 months ago
- An efficient, flexible and full-featured toolkit for fine-tuning LLM (InternLM2, Llama3, Phi3, Qwen, Mistral, ...) ☆4,158 Updated this week
- High-Quality Human Motion Video Generation with Confidence-aware Pose Guidance ☆2,092 Updated 3 months ago
- Scalable RL solution for advanced reasoning of language models ☆873 Updated this week