xinzhel / LLM-Search
Survey on LLM Inference via Search (TMLR 2025)
☆14Updated 5 months ago
Alternatives and similar repositories for LLM-Search
Users who are interested in LLM-Search are comparing it to the repositories listed below.
- Official PyTorch code for ICLR 2025 paper "Gnothi Seauton: Empowering Faithful Self-Interpretability in Black-Box Models"☆20Updated 7 months ago
- [ICML'25] Our study systematically investigates massive values in LLMs' attention mechanisms. First, we observe massive values are concen…☆79Updated 3 months ago
- ☆171Updated 5 months ago
- Code for the paper "VTool-R1: VLMs Learn to Think with Images via Reinforcement Learning on Multimodal Tool Use"☆131Updated 2 months ago
- Official PyTorch implementation of the paper "dLLM-Cache: Accelerating Diffusion Large Language Models with Adaptive Caching" (dLLM-Cache…☆164Updated last month
- [TMLR 2025] Efficient Reasoning Models: A Survey☆271Updated this week
- [ICLR 2025] When Attention Sink Emerges in Language Models: An Empirical View (Spotlight)☆126Updated 3 months ago
- Code repo for "Harnessing Negative Signals: Reinforcement Distillation from Teacher Data for LLM Reasoning"☆29Updated 2 months ago
- [ICLR 2025] Dynamic Mixture of Experts: An Auto-Tuning Approach for Efficient Transformer Models☆135Updated 3 months ago
- [ICLR 2025 Workshop] "Landscape of Thoughts: Visualizing the Reasoning Process of Large Language Models"☆36Updated 2 months ago
- A lightweight Inference Engine built for block diffusion models☆30Updated last week
- "What, How, Where, and How Well? A Survey on Test-Time Scaling in Large Language Models" repository☆71Updated last week
- CoT-Valve: Length-Compressible Chain-of-Thought Tuning☆86Updated 8 months ago
- [AI4MATH@ICML2025] Do Not Let Low-Probability Tokens Over-Dominate in RL for LLMs☆40Updated 4 months ago
- A comprehensive framework for benchmarking single and multi-agent systems across a wide range of tasks—evaluating performance, accuracy, …☆32Updated last month
- ☆53Updated 2 years ago
- Paper List of Inference/Test Time Scaling/Computing☆313Updated last month
- ☆129Updated 7 months ago
- [ICML'24] Official code for the paper "Revisiting Zeroth-Order Optimization for Memory-Efficient LLM Fine-Tuning: A Benchmark".☆111Updated 3 months ago
- One-shot Entropy Minimization☆185Updated 4 months ago
- [ICLR 2025] DeFT: Decoding with Flash Tree-attention for Efficient Tree-structured LLM Inference☆38Updated 4 months ago
- A Sober Look at Language Model Reasoning☆84Updated last week
- Accepted LLM Papers in NeurIPS 2024☆37Updated last year
- Repo for the paper https://arxiv.org/abs/2504.13837☆199Updated 3 months ago
- This is the official Python version of CoreInfer: Accelerating Large Language Model Inference with Semantics-Inspired Adaptive Sparse Act…☆17Updated 11 months ago
- PoC for "SpecReason: Fast and Accurate Inference-Time Compute via Speculative Reasoning" [NeurIPS '25]☆53Updated 2 weeks ago
- This repository contains a regularly updated paper list for LLMs-reasoning-in-latent-space.☆170Updated last week
- PhyX: Does Your Model Have the "Wits" for Physical Reasoning?☆46Updated this week
- ☆297Updated 4 months ago
- A curated list of awesome papers on dataset reduction, including dataset distillation (dataset condensation) and dataset pruning (coreset…☆58Updated 9 months ago