deepseek-ai / ESFT
Expert Specialized Fine-Tuning
☆710Updated 6 months ago
Alternatives and similar repositories for ESFT
Users who are interested in ESFT are comparing it to the libraries listed below.
- DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models☆1,836Updated last year
- OLMoE: Open Mixture-of-Experts Language Models☆916Updated 2 months ago
- ☆543Updated last year
- DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models☆2,989Updated last year
- ☆819Updated 5 months ago
- An Open Large Reasoning Model for Real-World Solutions☆1,528Updated 5 months ago
- ☆1,344Updated 2 months ago
- [ICML 2025 Oral] CodeI/O: Condensing Reasoning Patterns via Code Input-Output Prediction☆561Updated 6 months ago
- Muon is Scalable for LLM Training☆1,365Updated 3 months ago
- Large Reasoning Models☆807Updated 11 months ago
- A project to improve skills of large language models☆619Updated this week
- LongRoPE is a novel method that can extend the context window of pre-trained LLMs to an impressive 2048k tokens.☆272Updated last month
- Scalable toolkit for efficient model reinforcement☆1,048Updated this week
- ☆964Updated 10 months ago
- DeepSeek-VL: Towards Real-World Vision-Language Understanding☆4,008Updated last year
- ☆1,348Updated last year
- Parallel Scaling Law for Language Model — Beyond Parameter and Inference Time Scaling☆456Updated 6 months ago
- Official repository for the paper "LiveCodeBench: Holistic and Contamination Free Evaluation of Large Language Models for Code"☆715Updated 4 months ago
- An Open-Source Large-Scale Reinforcement Learning Project for Search Agents☆497Updated last month
- Unleashing the Power of Reinforcement Learning for Math and Code Reasoners☆731Updated 5 months ago
- Memory optimization and training recipes to extrapolate language models' context length to 1 million tokens, with minimal hardware.☆750Updated last year
- A curated list of open-source projects related to DeepSeek Coder☆729Updated 2 weeks ago
- [NeurIPS'24 Spotlight, ICLR'25, ICML'25] To speed up Long-context LLMs' inference, approximate and dynamic sparse calculate the attention…☆1,157Updated last month
- [NeurIPS'25] Official codebase for "SWE-RL: Advancing LLM Reasoning via Reinforcement Learning on Open Software Evolution"☆623Updated 8 months ago
- Arena-Hard-Auto: An automatic LLM benchmark.☆960Updated 5 months ago
- Reaching LLaMA2 Performance with 0.1M Dollars☆987Updated last year
- [ICLR 2025] Alignment Data Synthesis from Scratch by Prompting Aligned LLMs with Nothing. Your efficient and high-quality synthetic data …☆793Updated 8 months ago
- An Open Source Toolkit For LLM Distillation☆780Updated 4 months ago
- OpenSeek aims to unite the global open source community to drive collaborative innovation in algorithms, data and systems to develop next…☆240Updated 2 weeks ago
- Scalable RL solution for advanced reasoning of language models☆1,774Updated 8 months ago