liangyuwang / Tiny-DeepSpeed
Tiny-DeepSpeed, a minimalistic re-implementation of the DeepSpeed library
☆41 · Updated last week
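For context, Tiny-DeepSpeed re-implements the core training-engine workflow of DeepSpeed in miniature. The sketch below shows that workflow using the upstream `deepspeed` package for illustration only; Tiny-DeepSpeed's own API and config keys may differ, and the toy model and config values are assumptions.

```python
import torch
import deepspeed

# Toy model standing in for a real network (illustrative only).
model = torch.nn.Linear(1024, 1024)

# Minimal DeepSpeed config: ZeRO stage-2 optimizer-state/gradient partitioning.
ds_config = {
    "train_micro_batch_size_per_gpu": 8,
    "zero_optimization": {"stage": 2},
    "optimizer": {"type": "Adam", "params": {"lr": 1e-4}},
}

# deepspeed.initialize wraps the model in an engine that owns the optimizer,
# gradient partitioning, and (optionally) mixed-precision state.
engine, optimizer, _, _ = deepspeed.initialize(
    model=model,
    model_parameters=model.parameters(),
    config=ds_config,
)

x = torch.randn(8, 1024, device=engine.device)
loss = engine(x).pow(2).mean()  # forward pass through the engine
engine.backward(loss)           # engine-managed backward (scaling, reduction)
engine.step()                   # optimizer step + ZeRO bookkeeping
```

Run under the DeepSpeed launcher (e.g. `deepspeed train.py`) so the distributed backend is initialized before `deepspeed.initialize` is called.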
Alternatives and similar repositories for Tiny-DeepSpeed
Users interested in Tiny-DeepSpeed are comparing it to the libraries listed below.
- ☆113 · Updated 2 months ago
- qwen-nsa ☆70 · Updated 3 months ago
- ☆65 · Updated 8 months ago
- [ICLR 2025] PEARL: Parallel Speculative Decoding with Adaptive Draft Length ☆102 · Updated 3 months ago
- Efficient Mixture of Experts for LLM Paper List ☆87 · Updated 7 months ago
- Rethinking RL Scaling for Vision Language Models: A Transparent, From-Scratch Framework and Comprehensive Evaluation Scheme ☆138 · Updated 4 months ago
- Official implementation of "Fast-dLLM: Training-free Acceleration of Diffusion LLM by Enabling KV Cache and Parallel Decoding" ☆331 · Updated last week
- DeepSeek Native Sparse Attention pytorch implementation ☆83 · Updated 5 months ago
- ☆198 · Updated 3 months ago
- ☆140 · Updated last month
- SeerAttention: Learning Intrinsic Sparse Attention in Your LLMs ☆141 · Updated last week
- A repo showcasing the use of MCTS with LLMs to solve GSM8K problems ☆85 · Updated 4 months ago
- Super-Efficient RLHF Training of LLMs with Parameter Reallocation ☆307 · Updated 3 months ago
- ☆205 · Updated 5 months ago
- siiRL: Shanghai Innovation Institute RL Framework for Advanced LLMs and Multi-Agent Systems ☆152 · Updated this week
- "what, how, where, and how well? a survey on test-time scaling in large language models" repository☆56Updated this week
- [NeurIPS 2024] Fast Best-of-N Decoding via Speculative Rejection ☆49 · Updated 9 months ago
- TokenSkip: Controllable Chain-of-Thought Compression in LLMs ☆171 · Updated last month
- Trinity-RFT is a general-purpose, flexible and scalable framework designed for reinforcement fine-tuning (RFT) of large language models (… ☆204 · Updated this week
- Exploring the Limit of Outcome Reward for Learning Mathematical Reasoning ☆188 · Updated 4 months ago
- The Entropy Mechanism of Reinforcement Learning for Large Language Model Reasoning ☆282 · Updated 3 weeks ago
- [ICLR 2025] COAT: Compressing Optimizer States and Activation for Memory-Efficient FP8 Training ☆221 · Updated last month
- Towards Economical Inference: Enabling DeepSeek's Multi-Head Latent Attention in Any Transformer-based LLMs ☆183 · Updated last month
- PoC for "SpecReason: Fast and Accurate Inference-Time Compute via Speculative Reasoning" [arXiv '25] ☆43 · Updated 3 weeks ago
- Official Implementation of "Learning Harmonized Representations for Speculative Sampling" (HASS) ☆43 · Updated 4 months ago
- LongSpec: Long-Context Lossless Speculative Decoding with Efficient Drafting and Verification ☆61 · Updated 3 weeks ago
- ☆263 · Updated 2 months ago
- Spec-Bench: A Comprehensive Benchmark and Unified Evaluation Platform for Speculative Decoding (ACL 2024 Findings) ☆303 · Updated 3 months ago
- A lightweight reproduction of DeepSeek-R1-Zero with in-depth analysis of self-reflection behavior ☆245 · Updated 3 months ago
- L1: Controlling How Long A Reasoning Model Thinks With Reinforcement Learning ☆234 · Updated 2 months ago