hao-ai-lab / LookaheadReasoning
[NeurIPS 2025] Scaling Speculative Decoding with Lookahead Reasoning
☆65 Oct 31, 2025 Updated 3 months ago
Alternatives and similar repositories for LookaheadReasoning
Users who are interested in LookaheadReasoning are comparing it to the libraries listed below.
- Release doc/tutorial/wheels for poseidon-tf ☆10 Jan 18, 2018 Updated 8 years ago
- [NeurIPS 2025, Spotlight]: Ambient-o: Training Good models with Bad Data. ☆30 Jan 21, 2026 Updated 3 weeks ago
- A curated list of recent papers on efficient video attention for video diffusion models, including sparsification, quantization, and cach… ☆56 Oct 27, 2025 Updated 3 months ago
- Distributed DRL by Ray and TensorFlow Tutorial. ☆10 Dec 26, 2019 Updated 6 years ago
- 🔥 LLM-powered GPU kernel synthesis: Train models to convert PyTorch ops into optimized Triton kernels via SFT+RL. Multi-turn compilation… ☆116 Nov 10, 2025 Updated 3 months ago
- ☆22 Mar 7, 2025 Updated 11 months ago
- ☆19 Nov 22, 2017 Updated 8 years ago
- [NeurIPS 2025] Simple extension on vLLM to help you speed up reasoning model without training. ☆220 May 31, 2025 Updated 8 months ago
- Artifacts for SOSP'19 paper Optimizing Deep Learning Computation with Automatic Generation of Graph Substitutions ☆21 Apr 15, 2022 Updated 3 years ago
- Slidev implementation ☆19 Dec 15, 2025 Updated last month
- Xmixers: A collection of SOTA efficient token/channel mixers ☆28 Sep 4, 2025 Updated 5 months ago
- ☆52 May 19, 2025 Updated 8 months ago
- yyb fan site, let's worship yyb together! ☆21 Aug 3, 2019 Updated 6 years ago
- Codeforces Rating System (third party implementation) ☆19 Feb 26, 2018 Updated 7 years ago
- High Performance KV Cache Store for LLM ☆45 Feb 7, 2026 Updated last week
- [NeurIPS 2024] Efficient LLM Scheduling by Learning to Rank ☆69 Nov 4, 2024 Updated last year
- Implementation for FP8/INT8 Rollout for RL training without performance drop. ☆290 Nov 7, 2025 Updated 3 months ago
- ☆22 Sep 4, 2024 Updated last year
- Deadline-based hyperparameter tuning on RayTune. ☆32 Jan 16, 2020 Updated 6 years ago
- Fully automated group-chat/Bot framework based on Napcat ☆22 Jan 2, 2026 Updated last month
- Forked from https://gitlab.com/MatejB/PrePoMax ☆12 Jan 8, 2024 Updated 2 years ago
- Battery data analysis tools ☆14 Aug 1, 2024 Updated last year
- [WIP] Better (FP8) attention for Hopper ☆32 Feb 24, 2025 Updated 11 months ago
- ☆84 Feb 6, 2026 Updated last week
- Official Repo for "SplitQuant / LLM-PQ: Resource-Efficient LLM Offline Serving on Heterogeneous GPUs via Phase-Aware Model Partition and … ☆36 Aug 29, 2025 Updated 5 months ago
- Artifact for "Marconi: Prefix Caching for the Era of Hybrid LLMs" [MLSys '25 Outstanding Paper Award, Honorable Mention] ☆49 Mar 5, 2025 Updated 11 months ago
- An exact algorithm for the maximum clique problem (MCP) which improves over state-of-the-art approaches in some cases by orders of magnit… ☆14 Nov 15, 2025 Updated 2 months ago
- Some microbenchmarks and design docs before commencement ☆12 Feb 1, 2021 Updated 5 years ago
- A Cluster-Wide Model Manager to Accelerate DNN Training via Automated Training Warmup ☆35 Jan 9, 2023 Updated 3 years ago
- d3LLM: Ultra-Fast Diffusion LLM 🚀 ☆91 Feb 4, 2026 Updated last week
- ☆84 Dec 2, 2022 Updated 3 years ago
- An OnlineJudge System for OI and ACM/ICPC ☆34 Sep 17, 2018 Updated 7 years ago
- ☆131 May 29, 2025 Updated 8 months ago
- RaBitQ Rust implementation ☆10 Feb 4, 2026 Updated last week
- A linter for the Ruby language for VS Code ☆11 May 14, 2016 Updated 9 years ago
- Datacenter simulation toolkit for the OpenDC project ☆10 Aug 24, 2020 Updated 5 years ago
- Data-driven Battery Model Identification in LPV Framework using Python ☆11 Dec 12, 2025 Updated 2 months ago
- Accelerate LLM preference tuning via prefix sharing with a single line of code ☆51 Jul 4, 2025 Updated 7 months ago
- ☆75 Jun 28, 2025 Updated 7 months ago