Tencent / Tencent-Hunyuan-Large
☆982 Updated this week
Related projects
Alternatives and complementary repositories for Tencent-Hunyuan-Large
- Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI. ☆1,110 Updated 3 months ago
- ☆873 Updated 4 months ago
- Yi-1.5 is an upgraded version of Yi, delivering stronger performance in coding, math, reasoning, and instruction-following capability. ☆514 Updated 4 months ago
- The official repo of Qwen2-Audio chat & pretrained large audio language model proposed by Alibaba Cloud. ☆1,202 Updated 2 months ago
- ☆781 Updated 3 weeks ago
- Chinese Mixtral-8x7B (Chinese-Mixtral-8x7B) ☆641 Updated 2 months ago
- LongWriter: Unleashing 10,000+ Word Generation from Long Context LLMs ☆1,462 Updated 2 weeks ago
- Reaching LLaMA2 Performance with 0.1M Dollars ☆960 Updated 3 months ago
- DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model ☆3,578 Updated last month
- DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models ☆1,001 Updated 9 months ago
- Janus: Decoupling Visual Encoding for Unified Multimodal Understanding and Generation ☆917 Updated last week
- A novel Multimodal Large Language Model (MLLM) architecture, designed to structurally align visual and textual embeddings. ☆509 Updated last week
- An efficient, flexible and full-featured toolkit for fine-tuning LLMs (InternLM2, Llama3, Phi3, Qwen, Mistral, ...) ☆3,944 Updated this week
- A series of math-specific large language models of our Qwen2 series. ☆578 Updated 2 weeks ago
- DataComp for Language Models ☆1,150 Updated 2 weeks ago
- [NeurIPS'24 Spotlight] To speed up long-context LLM inference, computes attention approximately with dynamic sparsity, which reduces in… ☆776 Updated this week
- An LLM-based Web Navigating Agent (KDD'24) ☆734 Updated last month
- Efficient AI Inference & Serving ☆456 Updated 10 months ago
- Empowering RAG with a memory-based data interface for all-purpose applications! ☆1,185 Updated this week
- DeepSeek LLM: Let there be answers ☆1,438 Updated 9 months ago
- O1 Replication Journey: A Strategic Progress Report – Part I ☆1,261 Updated 2 weeks ago
- MINT-1T: A one trillion token multimodal interleaved dataset. ☆770 Updated 3 months ago
- 🔥🔥 LLaVA++: Extending LLaVA with Phi-3 and LLaMA-3 (LLaVA LLaMA-3, LLaVA Phi-3) ☆809 Updated 4 months ago
- MobileLLM: Optimizing Sub-billion Parameter Language Models for On-Device Use Cases. In ICML 2024. ☆1,121 Updated this week
- From-scratch implementation of a sparse mixture of experts language model inspired by Andrej Karpathy's makemore :) ☆594 Updated last week
- Chinese Mixtral mixture-of-experts large language models (Chinese Mixtral MoE LLMs) ☆582 Updated 6 months ago
- Qwen2-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud. ☆3,000 Updated last month
- VideoSys: An easy and efficient system for video generation ☆1,759 Updated this week
- GPT4V-level open-source multi-modal model based on Llama3-8B ☆2,105 Updated 2 months ago
- Skywork series models are pre-trained on 3.2TB of high-quality multilingual (mainly Chinese and English) and code data. We have open-sour… ☆1,223 Updated 7 months ago