MoonshotAI / Kimi-k1.5
☆3,242 Updated 3 weeks ago
Alternatives and similar repositories for Kimi-k1.5:
Users who are interested in Kimi-k1.5 are comparing it to the repositories listed below.
- Democratizing Reinforcement Learning for LLMs ☆2,113 Updated last month
- Witness the aha moment of VLM with less than $3. ☆3,376 Updated 3 weeks ago
- verl: Volcano Engine Reinforcement Learning for LLMs ☆5,693 Updated this week
- The official repo of MiniMax-Text-01 and MiniMax-VL-01, large-language-model & vision-language-model based on Linear Attention ☆2,415 Updated last week
- s1: Simple test-time scaling ☆6,051 Updated 3 weeks ago
- Fully open data curation for reasoning models ☆1,576 Updated last week
- Sky-T1: Train your own O1 preview model within $450 ☆3,149 Updated this week
- Search-R1: An Efficient, Scalable RL Training Framework for Reasoning & Search Engine Calling interleaved LLM based on veRL ☆1,389 Updated this week
- This is a replication of DeepSeek-R1-Zero and DeepSeek-R1 training on small models with limited data ☆3,223 Updated this week
- An Open Large Reasoning Model for Real-World Solutions ☆1,475 Updated 3 weeks ago
- Qwen2.5-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud. ☆9,282 Updated this week
- Clean, minimal, accessible reproduction of DeepSeek R1-Zero ☆11,339 Updated 2 weeks ago
- A live stream development of RL tuning for LLM agents ☆1,883 Updated this week
- ☆1,348 Updated 4 months ago
- ☆3,340 Updated last month
- A collection of LLM papers, blogs, and projects, with a focus on OpenAI o1 🍓 and reasoning techniques. ☆6,607 Updated this week
- An open source deep research clone. AI Agent that reasons over large amounts of web data extracted with Firecrawl ☆5,145 Updated last month
- DeepSeek-VL2: Mixture-of-Experts Vision-Language Models for Advanced Multimodal Understanding ☆4,628 Updated last month
- MoBA: Mixture of Block Attention for Long-Context LLMs ☆1,687 Updated 3 weeks ago
- Keep searching, reading webpages, and reasoning until it finds the answer (or exceeds the token budget) ☆3,653 Updated this week
- Qwen2.5-Coder is the code version of Qwen2.5, the large language model series developed by Qwen team, Alibaba Cloud. ☆4,715 Updated last week
- RAGEN leverages reinforcement learning to train LLM reasoning agents in interactive, stochastic environments. ☆1,210 Updated this week
- Scalable RL solution for advanced reasoning of language models ☆1,419 Updated last week
- Official Repo for Open-Reasoner-Zero ☆1,667 Updated 3 weeks ago
- g1: Using Llama-3.1 70b on Groq to create o1-like reasoning chains ☆4,201 Updated 2 months ago
- DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models ☆1,609 Updated last year
- DeepEP: an efficient expert-parallel communication library ☆7,289 Updated last week
- A bidirectional pipeline parallelism algorithm for computation-communication overlap in V3/R1 training. ☆2,656 Updated 2 weeks ago
- Production-tested AI infrastructure tools for efficient AGI development and community-driven innovation ☆6,926 Updated 3 weeks ago
- Fully open reproduction of DeepSeek-R1 ☆23,242 Updated this week