thunlp / FR-Spec
FR-Spec: Frequency-Ranked Speculative Sampling
☆12 · Updated 2 weeks ago
Alternatives and similar repositories for FR-Spec:
Users who are interested in FR-Spec are comparing it to the repositories listed below:
- [ICLR 2025] SWIFT: On-the-Fly Self-Speculative Decoding for LLM Inference Acceleration ☆45 · Updated last month
- Code associated with the paper **Draft & Verify: Lossless Large Language Model Acceleration via Self-Speculative Decoding** ☆177 · Updated last month
- TokenSkip: Controllable Chain-of-Thought Compression in LLMs ☆103 · Updated 3 weeks ago
- LongProc: Benchmarking Long-Context Language Models on Long Procedural Generation ☆20 · Updated 2 weeks ago
- Multi-Candidate Speculative Decoding ☆35 · Updated 11 months ago
- Chain of Thoughts (CoT) is so hot! so long! We need short reasoning process! ☆43 · Updated 3 weeks ago
- The Official Implementation of Ada-KV: Optimizing KV Cache Eviction by Adaptive Budget Allocation for Efficient LLM Inference ☆68 · Updated 2 months ago
- Official Implementation of SAM-Decoding: Speculative Decoding via Suffix Automaton ☆23 · Updated last month
- Ouroboros: Speculative Decoding with Large Model Enhanced Drafting (EMNLP 2024 main) ☆90 · Updated 2 weeks ago
- [ICLR 2025] PEARL: Parallel Speculative Decoding with Adaptive Draft Length ☆67 · Updated 3 weeks ago
- The repo for In-context Autoencoder ☆118 · Updated 10 months ago
- Awesome-Long2short-on-LRMs is a collection of state-of-the-art, novel, exciting long2short methods on large reasoning models. It contains… ☆171 · Updated last week
- [ACL 2024 (Oral)] A Prospector of Long-Dependency Data for Large Language Models ☆54 · Updated 8 months ago
- Code for "Retaining Key Information under High Compression Rates: Query-Guided Compressor for LLMs" (ACL 2024) ☆17 · Updated 9 months ago
- Official code for GliDe with a CaPE ☆13 · Updated 7 months ago
- [EMNLP 2023] MQuAKE: Assessing Knowledge Editing in Language Models via Multi-Hop Questions ☆107 · Updated 6 months ago
- Homepage for ProLong (Princeton long-context language models) and paper "How to Train Long-Context Language Models (Effectively)" ☆171 · Updated 3 weeks ago
- [EMNLP 2024] Source code for the paper "Learning Planning-based Reasoning with Trajectory Collection and Process Rewards Synthesizing". ☆74 · Updated 2 months ago
- The HELMET Benchmark ☆123 · Updated 2 weeks ago
- [NeurIPS 2024] Fast Best-of-N Decoding via Speculative Rejection ☆40 · Updated 5 months ago
- [NeurIPS 2024] Can Language Models Learn to Skip Steps? ☆15 · Updated 2 months ago