keli-wen / AGI-StudyLinks
Blog posts, reading reports, and code examples for AGI/LLM-related knowledge.
☆40 · Updated 6 months ago
Alternatives and similar repositories for AGI-StudyLinks
Users interested in AGI-StudyLinks are comparing it to the repositories listed below.
- ☆140 · Updated last month
- [ICLR 2025] PEARL: Parallel Speculative Decoding with Adaptive Draft Length ☆102 · Updated 3 months ago
- [ICML 2024] Quest: Query-Aware Sparsity for Efficient Long-Context LLM Inference ☆311 · Updated 3 weeks ago
- Paper list on efficient Mixture-of-Experts (MoE) for LLMs ☆87 · Updated 7 months ago
- ☆198 · Updated 3 months ago
- qwen-nsa ☆70 · Updated 3 months ago
- ☆145 · Updated 5 months ago
- [NeurIPS'23] H2O: Heavy-Hitter Oracle for Efficient Generative Inference of Large Language Models ☆463 · Updated last year
- Code associated with the paper **Draft & Verify: Lossless Large Language Model Acceleration via Self-Speculative Decoding** ☆199 · Updated 5 months ago
- Trinity-RFT is a general-purpose, flexible, and scalable framework designed for reinforcement fine-tuning (RFT) of large language models (… ☆204 · Updated this week
- Tiny-DeepSpeed, a minimal re-implementation of the DeepSpeed library ☆38 · Updated last week
- ☆113 · Updated 2 months ago
- siiRL: Shanghai Innovation Institute RL Framework for Advanced LLMs and Multi-Agent Systems ☆152 · Updated this week
- Train speculative decoding models effortlessly and port them smoothly to SGLang serving ☆281 · Updated this week
- The official implementation of Ada-KV: Optimizing KV Cache Eviction by Adaptive Budget Allocation for Efficient LLM Inference ☆87 · Updated last month
- Super-Efficient RLHF Training of LLMs with Parameter Reallocation ☆307 · Updated 3 months ago
- Spec-Bench: A Comprehensive Benchmark and Unified Evaluation Platform for Speculative Decoding (ACL 2024 Findings) ☆299 · Updated 3 months ago
- ☆268 · Updated 3 weeks ago
- Implementations of several LLM KV-cache sparsity methods ☆35 · Updated last year
- ☆78 · Updated 3 months ago
- [ACL 2025 main] FR-Spec: Frequency-Ranked Speculative Sampling ☆36 · Updated 3 weeks ago
- Multi-Candidate Speculative Decoding ☆36 · Updated last year
- A survey of long-context LLMs from four perspectives: architecture, infrastructure, training, and evaluation ☆55 · Updated 4 months ago
- Code for the paper [ICLR 2025 Oral] FlexPrefill: A Context-Aware Sparse Attention Mechanism for Efficient Long-Sequence Inference ☆124 · Updated 2 months ago
- Source code of the paper "KVSharer: Efficient Inference via Layer-Wise Dissimilar KV Cache Sharing" ☆27 · Updated 9 months ago
- A Comprehensive Survey on Long Context Language Modeling ☆169 · Updated 3 weeks ago
- This repository serves as a comprehensive survey of LLM development, featuring numerous research papers along with their corresponding co… ☆172 · Updated last week
- VeOmni: Scaling any Modality Model Training to any Accelerators with a PyTorch-native Training Framework ☆399 · Updated this week
- PoC for "SpecReason: Fast and Accurate Inference-Time Compute via Speculative Reasoning" [arXiv '25] ☆43 · Updated 3 weeks ago
- Implementation of speculative sampling as described in "Accelerating Large Language Model Decoding with Speculative Sampling" by DeepMind ☆99 · Updated last year
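Many of the repositories above (PEARL, Draft & Verify, Spec-Bench, FR-Spec, Multi-Candidate Speculative Decoding, and the DeepMind speculative-sampling implementation) build on the same core accept/reject loop. As a rough orientation only, here is a minimal sketch of that loop over toy categorical distributions; the distributions, vocabulary, and function names are illustrative assumptions, not code from any listed repository, and real systems work with LLM logits and KV caches.

```python
import random

# Toy vocabulary; real systems use the tokenizer's vocabulary.
VOCAB = [0, 1, 2, 3]

def sample(dist, rng):
    """Draw one token from a categorical distribution over VOCAB."""
    return rng.choices(VOCAB, weights=dist, k=1)[0]

def speculative_step(draft_dist, target_dist, gamma, rng):
    """One speculative-decoding step (illustrative sketch).

    Draft up to `gamma` tokens from the cheap model, then accept each
    with probability min(1, p/q), where q is the draft probability and
    p the target probability of the drafted token. On the first
    rejection, resample from the residual max(p - q, 0) and stop.
    """
    out = []
    for _ in range(gamma):
        x = sample(draft_dist, rng)
        q, p = draft_dist[x], target_dist[x]
        if rng.random() < min(1.0, p / q):
            out.append(x)  # accepted: keep the drafted token
        else:
            # Rejected: resample from the normalized residual distribution.
            resid = [max(tp - dp, 0.0) for tp, dp in zip(target_dist, draft_dist)]
            z = sum(resid)
            out.append(sample([r / z for r in resid], rng))
            break  # stop drafting after the first rejection
    return out
```

Under this rule the kept tokens are distributed as if sampled from the target model alone, which is why the listed projects can call the speedup "lossless"; each step emits between 1 and `gamma` tokens per target-model verification.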