dvlab-research / Q-LLM

This is the official repo of "QuickLLaMA: Query-aware Inference Acceleration for Large Language Models"

☆55

Alternatives and similar repositories for Q-LLM

Users who are interested in Q-LLM are comparing it to the repositories listed below.

HKUNLP / critic-rl
[ICML 2025] Teaching Language Models to Critique via Reinforcement Learning
☆118Updated 7 months ago
DAMO-NLP-SG / LongPO
[ICLR 2025] LongPO: Long Context Self-Evolution of Large Language Models through Short-to-Long Preference Optimization
☆43Updated 9 months ago
hemingkx / SWIFT
[ICLR 2025] SWIFT: On-the-Fly Self-Speculative Decoding for LLM Inference Acceleration
☆61Updated 10 months ago
dvlab-research / Mr-Ben
This is the repo for our paper "Mr-Ben: A Comprehensive Meta-Reasoning Benchmark for Large Language Models"
☆52Updated last year
zjunlp / LightThinker
[EMNLP 2025] LightThinker: Thinking Step-by-Step Compression
☆126Updated 8 months ago
SihengLi99 / SEALONG
Large Language Models Can Self-Improve in Long-context Reasoning
☆73Updated last year
VITA-Group / Ms-PoE
"Found in the Middle: How Language Models Use Long Contexts Better via Plug-and-Play Positional Encoding" Zhenyu Zhang, Runjin Chen, Shiw…
☆31Updated last year
GeniusHTX / TALE
☆140Updated 3 months ago
SalesforceAIResearch / GemFilter
☆85Updated last month
Zanette-Labs / SpeculativeRejection
[NeurIPS 2024] Fast Best-of-N Decoding via Speculative Rejection
☆52Updated last year
DRSY / EasyKV
Easy control for Key-Value Constrained Generative LLM Inference (https://arxiv.org/abs/2402.06262)
☆63Updated last year
yyDing1 / ScaleQuest
[ACL 2025] We introduce ScaleQuest, a scalable, novel and cost-effective data synthesis method to unleash the reasoning capability of LLM…
☆68Updated last year
GAIR-NLP / weak-to-strong-reasoning
☆58Updated last year
RM-R1-UIUC / RM-R1
RM-R1: Unleashing the Reasoning Potential of Reward Models
☆154Updated 6 months ago
NuoJohnChen / JudgeLRM
JudgeLRM: Large Reasoning Models as a Judge
☆40Updated 2 weeks ago
TemporaryLoRA / Block-Attention
☆41Updated 9 months ago
YangLing0818 / SuperCorrect-llm
[ICLR 2025] SuperCorrect: Advancing Small LLM Reasoning with Thought Template Distillation and Self-Correction
☆86Updated 9 months ago
wwxu21 / CUT
Source code of "Reasons to Reject? Aligning Language Models with Judgments"
☆58Updated last year
Infini-AI-Lab / gsm_infinite
☆60Updated 6 months ago
TIGER-AI-Lab / General-Reasoner
General Reasoner: Advancing LLM Reasoning Across All Domains [NeurIPS25]
☆210Updated 3 weeks ago
GAIR-NLP / OctoThinker
Revisiting Mid-training in the Era of Reinforcement Learning Scaling
☆182Updated 5 months ago
THU-KEG / RM-Bench
[ICLR 25 Oral] RM-Bench: Benchmarking Reward Models of Language Models with Subtlety and Style
☆73Updated 5 months ago
open-compass / Ada-LEval
The official implementation of "Ada-LEval: Evaluating long-context LLMs with length-adaptable benchmarks"
☆55Updated 7 months ago
chenllliang / MMEvalPro
[NAACL 2025] Source code for MMEvalPro, a more trustworthy and efficient benchmark for evaluating LMMs
☆24Updated last year
tianyi-lab / MoE-Embedding
[ICLR 2025 Oral] "Your Mixture-of-Experts LLM Is Secretly an Embedding Model For Free"
☆83Updated last year
TIGER-AI-Lab / LongICLBench
Code and Data for "Long-context LLMs Struggle with Long In-context Learning" [TMLR2025]
☆110Updated 10 months ago
Shwai-He / MEO
The source code of "Merging Experts into One: Improving Computational Efficiency of Mixture of Experts" (EMNLP 2023)
☆44Updated last year
Kwai-Klear / KlearReasoner
Klear-Reasoner: Advancing Reasoning Capability via Gradient-Preserving Clipping Policy Optimization
☆80Updated 2 months ago
jiwonsong-dev / ReasoningPathCompression
[NeurIPS 2025] Official implementation of "Reasoning Path Compression: Compressing Generation Trajectories for Efficient LLM Reasoning"
☆26Updated 2 months ago
RUCAIBox / JiuZhang3.0
The code and data for the paper JiuZhang3.0
☆49Updated last year