xinzhel / LLM-Search
Survey on LLM Inference via Search (TMLR 2025)
☆14Updated 8 months ago
Alternatives and similar repositories for LLM-Search
Users that are interested in LLM-Search are comparing it to the libraries listed below
- [ICML'25] Our study systematically investigates massive values in LLMs' attention mechanisms. First, we observe massive values are concen…☆85Updated 6 months ago
- ☆201Updated 2 weeks ago
- PoC for "SpecReason: Fast and Accurate Inference-Time Compute via Speculative Reasoning" [NeurIPS '25]☆61Updated 3 months ago
- ☆55Updated 2 years ago
- [TMLR 2025] Efficient Reasoning Models: A Survey☆290Updated last week
- ☆34Updated 8 months ago
- [AI4MATH@ICML2025] Do Not Let Low-Probability Tokens Over-Dominate in RL for LLMs☆41Updated 7 months ago
- CoT-Valve: Length-Compressible Chain-of-Thought Tuning☆88Updated 10 months ago
- [ICLR 2025 Workshop] "Landscape of Thoughts: Visualizing the Reasoning Process of Large Language Models"☆44Updated 4 months ago
- [ICLR 2025] Dynamic Mixture of Experts: An Auto-Tuning Approach for Efficient Transformer Models☆153Updated 6 months ago
- ICLR 2025 Agent-Related Papers☆74Updated last year
- Accepted LLM Papers in NeurIPS 2024☆37Updated last year
- [NeurIPS 2024] The official implementation of ZipCache: Accurate and Efficient KV Cache Quantization with Salient Token Identification☆32Updated 9 months ago
- A lightweight Inference Engine built for block diffusion models☆39Updated last month
- A Sober Look at Language Model Reasoning☆92Updated last month
- "what, how, where, and how well? a survey on test-time scaling in large language models" repository☆83Updated last week
- This repository contains a regularly updated paper list for LLMs-reasoning-in-latent-space.☆253Updated last week
- Official PyTorch code for ICLR 2025 paper "Gnothi Seauton: Empowering Faithful Self-Interpretability in Black-Box Models"☆24Updated 10 months ago
- Official Repository of "Learning what reinforcement learning can't"☆75Updated last week
- [ICML 2024] Unveiling and Harnessing Hidden Attention Sinks: Enhancing Large Language Models without Training through Attention Calibrati…☆46Updated last year
- Code repo for "Harnessing Negative Signals: Reinforcement Distillation from Teacher Data for LLM Reasoning"☆30Updated 5 months ago
- ☆83Updated last year
- Must-read papers and blogs about parametric knowledge mechanism in LLMs.☆34Updated 8 months ago
- [NeurIPS 2025] Think Silently, Think Fast: Dynamic Latent Compression of LLM Reasoning Chains☆67Updated 5 months ago
- Code for the paper "VTool-R1: VLMs Learn to Think with Images via Reinforcement Learning on Multimodal Tool Use"☆145Updated 5 months ago
- Official PyTorch implementation of the paper "dLLM-Cache: Accelerating Diffusion Large Language Models with Adaptive Caching" (dLLM-Cache…☆191Updated last month
- [arXiv:2508.00410] "Co-rewarding: Stable Self-supervised RL for Eliciting Reasoning in Large Language Models"☆30Updated 3 months ago
- 😎 A Survey of Efficient Reasoning for Large Reasoning Models: Language, Multimodality, Agent, and Beyond☆325Updated last week
- ☆60Updated 5 months ago
- This is the official Python version of CoreInfer: Accelerating Large Language Model Inference with Semantics-Inspired Adaptive Sparse Act…☆17Updated last year