real-absolute-AI/RAPID

[ICML 2025 Spotlight] RAPID: Long-Context Inference with Retrieval-Augmented Speculative Decoding

☆23

Alternatives and similar repositories for RAPID

Users who are interested in RAPID are comparing it to the repositories listed below.

66RING / CritiPrefill
Code repo for "CritiPrefill: A Segment-wise Criticality-based Approach for Prefilling Acceleration in LLMs".
☆17 · Sep 15, 2024 · Updated last year
DAMO-NLP-SG / RemeMo
[EMNLP 2023] Once Upon a *Time* in *Graph*: Relative-Time Pretraining for Complex Temporal Reasoning
☆17 · Oct 31, 2023 · Updated 2 years ago
Anonymous1252022 / Megatron-DeepSpeed
☆18 · Sep 22, 2024 · Updated last year
norakassner / mlama
☆25 · Jan 22, 2024 · Updated 2 years ago
naimengye / speculative-action
☆30 · Mar 9, 2026 · Updated 4 months ago
amazon-science / graph-lm-ensemble
☆15 · Jun 2, 2025 · Updated last year
jenni-ai / T2FW
Fine-Tuning Pre-trained Transformers into Decaying Fast Weights
☆20 · Oct 9, 2022 · Updated 3 years ago
real-absolute-AI / LongRLVR
[ICLR 2026] LongRLVR: Long-Context Reinforcement Learning Requires Verifiable Context Rewards.
☆19 · Mar 16, 2026 · Updated 4 months ago
huangyuxiang03 / Locret
☆14 · Oct 3, 2024 · Updated last year
IntelliSys-Lab / FineMoE-EuroSys26
☆15 · Sep 25, 2025 · Updated 9 months ago
furiosa-ai / draft-based-approx-llm
[ICLR 2026] Draft-based Approximate Inference for LLMs
☆21 · Mar 10, 2026 · Updated 4 months ago
yuezhouhu / adaspec
A selective knowledge distillation algorithm for efficient speculative decoders
☆39 · Nov 27, 2025 · Updated 7 months ago
ChengZhang-98 / LQER
Official implementation of ICML'24 paper "LQER: Low-Rank Quantization Error Reconstruction for LLMs"
☆19 · Jul 11, 2024 · Updated 2 years ago
DAMO-NLP-SG / LongPO
[ICLR 2025] LongPO: Long Context Self-Evolution of Large Language Models through Short-to-Long Preference Optimization
☆43 · Feb 27, 2025 · Updated last year
mansicer / Q-Adapter
Implementation of ICLR 2025 paper "Q-Adapter: Customizing Pre-trained LLMs to New Preferences with Forgetting Mitigation"
☆18 · Oct 5, 2024 · Updated last year
xvyaward / owq
Code for the AAAI 2024 Oral paper "OWQ: Outlier-Aware Weight Quantization for Efficient Fine-Tuning and Inference of Large Language Model…
☆72 · Mar 7, 2024 · Updated 2 years ago
krafton-ai / lexico
KV cache compression via sparse coding
☆17 · Oct 26, 2025 · Updated 8 months ago
kdu4108 / semiring-backprop-exps
☆16 · Jul 10, 2023 · Updated 3 years ago
LINs-lab / M3
[ICLR 2024] Towards Robust Multi-Modal Reasoning via Model Selection
☆14 · Mar 7, 2024 · Updated 2 years ago
Prithiviraj7R / Chat-with-PDF
☆14 · Feb 28, 2024 · Updated 2 years ago
sail-sg / ActivePRM
☆21 · Apr 16, 2025 · Updated last year
zhzihao / Learning-to-Draft
Official implementation of "Learning To Draft: Adaptive Speculative Decoding with Reinforcement Learning" (ICLR 2026)
☆19 · Mar 1, 2026 · Updated 4 months ago
penhunt / full-quantization-DNN
PyTorch code for full quantization of DNN using BCGD
☆14 · Jul 24, 2019 · Updated 6 years ago
yxin98 / EMNLP_2022
☆13 · Jun 7, 2022 · Updated 4 years ago
Jingyu6 / speculative_prefill
☆63 · May 19, 2025 · Updated last year
aaai17 / geo_teaser
☆10 · Sep 17, 2016 · Updated 9 years ago
nikikilbertus / blind-justice
Blind Justice Code for the paper "Blind Justice: Fairness with Encrypted Sensitive Attributes", ICML 2018
☆14 · Mar 20, 2019 · Updated 7 years ago
D3Mlab / cr-lt-kgqa
CR-LT KGQA Dataset Repository
☆10 · Jun 1, 2025 · Updated last year
GraphPKU / CoI
Chain of Images for Intuitively Reasoning
☆10 · Nov 29, 2023 · Updated 2 years ago
aaronserianni / attention-iou
[CVPR'25] Attention IoU: Examining Biases in CelebA using Attention Maps
☆13 · Mar 26, 2025 · Updated last year
Yifan-Gao / open_retrieval_conversational_machine_reading
Open-Retrieval Conversational Machine Reading: A new setting & OR-ShARC dataset
☆13 · Nov 19, 2022 · Updated 3 years ago
hemingkx / Whisper
[ACL 2026] Enabling Efficient Reasoning in LLMs via Black-box Persuasive Prompting
☆22 · Jan 9, 2026 · Updated 6 months ago
microsoft / DGT
Learning Accurate Decision Trees with Bandit Feedback via Quantized Gradient Descent
☆16 · Sep 8, 2022 · Updated 3 years ago
xyliugo / ODMT
[MM 2023 Oral] Online Distillation-enhanced Multi-modal Transformer for Sequential Recommendation
☆17 · Jan 10, 2024 · Updated 2 years ago
charbel-sakr / Fixed-Point-Training
Code needed to reproduce results from my ICLR 2019 paper on fixed-point quantization of the backprop algorithm.
☆10 · Jan 24, 2019 · Updated 7 years ago
GATECH-EIC / LaCache
[ICML 2025] LaCache: Ladder-Shaped KV Caching for Efficient Long-Context Modeling of Large Language Models
☆17 · Nov 4, 2025 · Updated 8 months ago
nemanja-rakicevic / conference_historical_data_analysis
Analysing ML conference data and plotting interesting statistics.
☆11 · Aug 4, 2023 · Updated 2 years ago
SUSTechBruce / LOOK-M
[EMNLP 2024 Findings🔥] Official implementation of ": LOOK-M: Look-Once Optimization in KV Cache for Efficient Multimodal Long-Context In…
☆103 · Nov 9, 2024 · Updated last year
ywh187 / FitPrune
☆68 · Jan 23, 2026 · Updated 5 months ago