deepseek-ai / ESFT
Expert Specialized Fine-Tuning
☆643 Updated last month
Alternatives and similar repositories for ESFT
Users who are interested in ESFT are comparing it to the libraries listed below
- DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models ☆1,741 Updated last year
- ☆529 Updated 10 months ago
- DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models ☆2,790 Updated last year
- DeepSeek-VL: Towards Real-World Vision-Language Understanding ☆3,900 Updated last year
- Muon is Scalable for LLM Training ☆1,091 Updated 3 months ago
- A curated list of open-source projects related to DeepSeek Coder ☆709 Updated last year
- LIMO: Less is More for Reasoning ☆975 Updated 3 months ago
- An Open Large Reasoning Model for Real-World Solutions ☆1,502 Updated last month
- Scalable RL solution for advanced reasoning of language models ☆1,642 Updated 3 months ago
- ☆1,356 Updated 7 months ago
- ☆580 Updated 2 months ago
- Large Reasoning Models ☆805 Updated 7 months ago
- Official Repo for Open-Reasoner-Zero ☆1,983 Updated last month
- OLMoE: Open Mixture-of-Experts Language Models ☆792 Updated 3 months ago
- AllenAI's post-training codebase ☆3,033 Updated this week
- Official repository for the paper "LiveCodeBench: Holistic and Contamination Free Evaluation of Large Language Models for Code" ☆571 Updated 2 weeks ago
- The official repo of MiniMax-Text-01 and MiniMax-VL-01, large-language-model & vision-language-model based on Linear Attention ☆2,989 Updated 2 weeks ago
- RAGEN leverages reinforcement learning to train LLM reasoning agents in interactive, stochastic environments. ☆2,075 Updated last week
- Releases from OpenAI Preparedness ☆790 Updated last month
- Official codebase for "SWE-RL: Advancing LLM Reasoning via Reinforcement Learning on Open Software Evolution" ☆561 Updated 3 months ago
- Fully open data curation for reasoning models ☆1,959 Updated last month
- MoBA: Mixture of Block Attention for Long-Context LLMs ☆1,813 Updated 3 months ago
- DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model ☆4,921 Updated 9 months ago
- Analyze computation-communication overlap in V3/R1. ☆1,075 Updated 3 months ago
- Democratizing Reinforcement Learning for LLMs ☆3,600 Updated this week
- Expert Parallelism Load Balancer ☆1,226 Updated 3 months ago
- Memory layers use a trainable key-value lookup mechanism to add extra parameters to a model without increasing FLOPs. Conceptually, spars… ☆342 Updated 6 months ago
- Official codebase for "Can 1B LLM Surpass 405B LLM? Rethinking Compute-Optimal Test-Time Scaling". ☆267 Updated 4 months ago
- The official implementation of Self-Play Fine-Tuning (SPIN) ☆1,167 Updated last year
- An Open-source RL System from ByteDance Seed and Tsinghua AIR ☆1,406 Updated last month