keli-wen / AGI-Study
Blog posts, reading reports, and code examples for AGI/LLM-related knowledge.
☆36 · Updated 2 months ago
Alternatives and similar repositories for AGI-Study:
Users interested in AGI-Study are comparing it to the libraries listed below.
- ☆113 · Updated last week
- A survey of long-context LLMs from four perspectives: architecture, infrastructure, training, and evaluation ☆45 · Updated 2 weeks ago
- Multi-Candidate Speculative Decoding ☆35 · Updated 11 months ago
- ☆151 · Updated this week
- ☆75 · Updated 3 weeks ago
- VeOmni: Scaling any Modality Model Training to any Accelerators with PyTorch native Training Framework ☆285 · Updated last week
- Efficient Mixture of Experts for LLM Paper List ☆60 · Updated 4 months ago
- [ICLR 2025] PEARL: Parallel Speculative Decoding with Adaptive Draft Length ☆75 · Updated this week
- Code for the paper [ICLR 2025 Oral] FlexPrefill: A Context-Aware Sparse Attention Mechanism for Efficient Long-Sequence Inference ☆82 · Updated last week
- Rethinking RL Scaling for Vision Language Models: A Transparent, From-Scratch Framework and Comprehensive Evaluation Scheme ☆102 · Updated last week
- SeerAttention: Learning Intrinsic Sparse Attention in Your LLMs ☆97 · Updated last week
- ☆130 · Updated last month
- CoT-Valve: Length-Compressible Chain-of-Thought Tuning ☆62 · Updated 2 months ago
- The Official Implementation of Ada-KV: Optimizing KV Cache Eviction by Adaptive Budget Allocation for Efficient LLM Inference ☆71 · Updated 2 months ago
- ☆14 · Updated this week
- ☆104 · Updated last year
- [ICML 2024] Quest: Query-Aware Sparsity for Efficient Long-Context LLM Inference ☆269 · Updated 4 months ago
- A lightweight reproduction of DeepSeek-R1-Zero with in-depth analysis of self-reflection behavior ☆222 · Updated last week
- LongSpec: Long-Context Speculative Decoding with Efficient Drafting and Verification ☆44 · Updated last month
- [ACL 2024] Long-Context Language Modeling with Parallel Encodings ☆153 · Updated 10 months ago
- Reproducing R1 for Code with Reliable Rewards ☆167 · Updated last week
- Code release for the book "Efficient Training in PyTorch" ☆59 · Updated last week
- ☆74 · Updated 3 weeks ago
- Code associated with the paper "Draft & Verify: Lossless Large Language Model Acceleration via Self-Speculative Decoding" ☆181 · Updated 2 months ago
- TokenSkip: Controllable Chain-of-Thought Compression in LLMs ☆120 · Updated last month
- ☆190 · Updated 5 months ago
- ☆62 · Updated 4 months ago
- Skywork-MoE: A Deep Dive into Training Techniques for Mixture-of-Experts Language Models ☆131 · Updated 10 months ago
- Implementation of Speculative Sampling as described in "Accelerating Large Language Model Decoding with Speculative Sampling" by DeepMind ☆92 · Updated last year (a minimal sketch of the technique follows this list)
- 🔥 A minimal training framework for scaling FLA models ☆101 · Updated this week
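Speculative decoding recurs throughout this list (Multi-Candidate Speculative Decoding, PEARL, LongSpec, Draft & Verify, and the DeepMind speculative sampling implementation above). For orientation, here is a minimal sketch of the core accept/reject loop from the DeepMind paper. The `draft_probs_fn` / `target_probs_fn` callables and all names are illustrative stand-ins, not the API of any repository listed here.

```python
# Minimal sketch of speculative sampling ("Accelerating Large Language Model
# Decoding with Speculative Sampling", DeepMind). Toy interfaces, not any
# repo's actual API: a "model" maps a token prefix to a next-token
# distribution (1-D numpy array summing to 1).
import numpy as np

def sample(probs, rng):
    """Draw one token id from a categorical distribution."""
    return int(rng.choice(len(probs), p=probs))

def speculative_step(prefix, draft_probs_fn, target_probs_fn, k, rng):
    """One round of speculative sampling; returns the extended prefix."""
    # 1. The cheap draft model proposes k tokens autoregressively.
    drafted, q_dists, ctx = [], [], list(prefix)
    for _ in range(k):
        q = draft_probs_fn(ctx)
        tok = sample(q, rng)
        drafted.append(tok)
        q_dists.append(q)
        ctx.append(tok)

    # 2. The target model scores all k+1 positions (one batched forward
    #    pass in a real implementation; a loop here for clarity).
    p_dists = [target_probs_fn(list(prefix) + drafted[:i]) for i in range(k + 1)]

    # 3. Accept drafted token x with probability min(1, p(x) / q(x)).
    out = list(prefix)
    for i, tok in enumerate(drafted):
        p, q = p_dists[i][tok], q_dists[i][tok]
        if rng.random() < min(1.0, p / q):
            out.append(tok)  # accepted
        else:
            # Rejected: resample from the residual max(0, p - q), renormalized,
            # which keeps the overall output distribution equal to the target's.
            residual = np.maximum(p_dists[i] - q_dists[i], 0.0)
            out.append(sample(residual / residual.sum(), rng))
            return out
    # All k accepted: take one bonus token from the target's last distribution.
    out.append(sample(p_dists[k], rng))
    return out

# Toy usage: both "models" are fixed categorical distributions over 8 tokens.
rng = np.random.default_rng(0)
vocab = 8
target = rng.dirichlet(np.ones(vocab))
draft = 0.5 * target + 0.5 / vocab  # a blunter approximation of the target
seq = [0]
for _ in range(4):
    seq = speculative_step(seq, lambda c: draft, lambda c: target, k=3, rng=rng)
print(seq)
```

The key property is that accepted and resampled tokens are distributed exactly as if sampled from the target model alone; the speedup comes from the target model verifying all k drafted positions in one batched pass instead of k sequential decoding steps.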