deepseek-ai / ESFT
Expert Specialized Fine-Tuning
☆601 · Updated 7 months ago
Alternatives and similar repositories for ESFT:
Users who are interested in ESFT are comparing it to the libraries listed below.
- ☆496 · Updated 8 months ago
- DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models ☆2,654 · Updated last year
- DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models ☆1,664 · Updated last year
- An Open Large Reasoning Model for Real-World Solutions ☆1,483 · Updated last month
- Muon is Scalable for LLM Training ☆1,029 · Updated last month
- A curated list of open-source projects related to DeepSeek Coder ☆679 · Updated last year
- OLMoE: Open Mixture-of-Experts Language Models ☆723 · Updated last month
- DeepSeek-VL: Towards Real-World Vision-Language Understanding ☆3,802 · Updated last year
- CodeI/O: Condensing Reasoning Patterns via Code Input-Output Prediction ☆507 · Updated 2 months ago
- ☆716 · Updated last week
- An Open-source RL System from ByteDance Seed and Tsinghua AIR ☆1,171 · Updated 2 weeks ago
- ☆1,355 · Updated 5 months ago
- Large Reasoning Models ☆802 · Updated 4 months ago
- Scalable RL solution for advanced reasoning of language models ☆1,504 · Updated last month
- Analyze computation-communication overlap in V3/R1. ☆1,005 · Updated last month
- ☆922 · Updated 3 months ago
- ReasonFlux: Hierarchical LLM Reasoning via Scaling Thought Templates ☆376 · Updated 3 weeks ago
- The official repo of MiniMax-Text-01 and MiniMax-VL-01, large-language-model & vision-language-model based on Linear Attention ☆2,557 · Updated 2 weeks ago
- Transformers-compatible library for applying various compression algorithms to LLMs for optimized deployment with vLLM ☆1,251 · Updated this week
- AN O1 REPLICATION FOR CODING ☆334 · Updated 4 months ago
- Official Repo for Open-Reasoner-Zero ☆1,887 · Updated 2 weeks ago
- Democratizing Reinforcement Learning for LLMs ☆3,123 · Updated 2 weeks ago
- ☆519 · Updated last week
- Expert Parallelism Load Balancer ☆1,153 · Updated last month
- ☆673 · Updated last week
- A fast communication-overlapping library for tensor/expert parallelism on GPUs. ☆905 · Updated last week
- Towards Economical Inference: Enabling DeepSeek's Multi-Head Latent Attention in Any Transformer-based LLMs ☆161 · Updated this week
- Training Large Language Model to Reason in a Continuous Latent Space ☆1,076 · Updated 3 months ago
- [NeurIPS'24 Spotlight, ICLR'25] To speed up Long-context LLMs' inference, approximate and dynamic sparse calculate the attention, which r… ☆985 · Updated this week
- RAGEN leverages reinforcement learning to train LLM reasoning agents in interactive, stochastic environments. ☆1,462 · Updated this week