☆40Sep 13, 2025Updated 6 months ago
Alternatives and similar repositories for Dynamic-Speculative-Planning
Users that are interested in Dynamic-Speculative-Planning are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.
Sorting:
- ☆40Feb 6, 2026Updated last month
- PoC for "SpecReason: Fast and Accurate Inference-Time Compute via Speculative Reasoning" [NeurIPS '25]☆67Oct 2, 2025Updated 5 months ago
- Microbenchmark that unveals the mechanisms behind power readings reported by nvidia-smi on your NVIDIA GPU.☆14Dec 12, 2024Updated last year
- [ACL 2025 main] FR-Spec: Frequency-Ranked Speculative Sampling☆54Jul 15, 2025Updated 8 months ago
- A standalone CXL-enabled system simulator.☆20Jan 10, 2026Updated 2 months ago
- DigitalOcean Gradient AI Platform • AdBuild production-ready AI agents using customizable tools or access multiple LLMs through a single endpoint. Create custom knowledge bases or connect external data.
- [NAACL'25 🏆 SAC Award] Official code for "Advancing MoE Efficiency: A Collaboration-Constrained Routing (C2R) Strategy for Better Expert…☆16Feb 4, 2025Updated last year
- ☆10Feb 22, 2023Updated 3 years ago
- [ICML‘25] Official code for paper "Occult: Optimizing Collaborative Communication across Experts for Accelerated Parallel MoE Training an…☆14Apr 17, 2025Updated 11 months ago
- ☆27May 30, 2025Updated 9 months ago
- ScalaAFA: Constructing User-Space All-Flash Array Engine with Holistic Designs (USENIX ATC 2024).☆16Nov 25, 2024Updated last year
- [NeurIPS 2024] Fast Best-of-N Decoding via Speculative Rejection☆55Oct 29, 2024Updated last year
- [SatML 2024] Shake to Leak: Fine-tuning Diffusion Models Can Amplify the Generative Privacy Risk☆15Mar 15, 2025Updated last year
- [ASPLOS'26] Taming the Long-Tail: Efficient Reasoning RL Training with Adaptive Drafter☆157Feb 27, 2026Updated last month
- [HPCA'24] Smart-Infinity: Fast Large Language Model Training using Near-Storage Processing on a Real System☆53Jul 21, 2025Updated 8 months ago
- Open source password manager - Proton Pass • AdSecurely store, share, and autofill your credentials with Proton Pass, the end-to-end encrypted password manager trusted by millions.
- The official repository for the experiments included in the paper titled "Patch-level Routing in Mixture-of-Experts is Provably Sample-ef…☆14Feb 12, 2026Updated last month
- ☆13May 9, 2024Updated last year
- [COLM 2025] Official code for "When To Solve, When To Verify: Compute-Optimal Problem Solving and Generative Verification for LLM Reasoni…☆15Oct 31, 2025Updated 4 months ago
- SJTU 中文简约 LaTeX 报告模板☆10Jun 7, 2021Updated 4 years ago
- This repo implements an interface to GTAV for SCENIC language.☆11Dec 7, 2019Updated 6 years ago
- Exploring CXL on QEMU Emulation☆37Mar 4, 2025Updated last year
- A list of post-GPT-era (2022-2026) Best Paper award winners from ICLR/NeurIPS/ICML/ACL/EMNLP/NAACL/AAAI/CVPR/ECCV.☆51Feb 23, 2026Updated last month
- Repository for the COLM 2025 paper SpecDec++: Boosting Speculative Decoding via Adaptive Candidate Lengths☆17Jul 10, 2025Updated 8 months ago
- Official pytorch code for "APP: Anytime Progressive Pruning" (DyNN @ ICML, 2022; CLL @ ACML, 2022, SNN @ ICML, 2022 and SlowDNN 2023)☆16Nov 22, 2022Updated 3 years ago
- GPU virtual machines on DigitalOcean Gradient AI • AdGet to production fast with high-performance AMD and NVIDIA GPUs you can spin up in seconds. The definition of operational simplicity.
- B站下载姬~召唤你喜爱的番剧吧!~☆12Feb 23, 2020Updated 6 years ago
- [CVPRW2024, Official Code] for paper "Exploring AIGC Video Quality: A Focus on Visual Harmony, Video-Text Consistency and Domain Distribu…☆14Jun 14, 2024Updated last year
- Code implementation for paper "On the Efficacy of Small Self-Supervised Contrastive Models without Distillation Signals".☆17Dec 15, 2021Updated 4 years ago
- ☆12Oct 16, 2022Updated 3 years ago
- ☆23Dec 25, 2025Updated 3 months ago
- Code accompanying our EMNLP 2019 paper: "Revisiting the Evaluation of Theory of Mind through Question Answering"☆27Aug 9, 2020Updated 5 years ago
- ☆26Apr 14, 2025Updated 11 months ago
- [WWW] Kardia-R1: Unleashing LLMs to Reason toward Understanding and Empathy for Emotional Support via Rubric-as-Judge Reinforcement Learn …☆54Mar 2, 2026Updated 3 weeks ago
- Official Implementation of "GRIFFIN: Effective Token Alignment for Faster Speculative Decoding"[NeurIPS 2025]☆18May 12, 2025Updated 10 months ago
- 1-Click AI Models by DigitalOcean Gradient • AdDeploy popular AI models on DigitalOcean Gradient GPU virtual machines with just a single click and start building anything your business needs.
- [ICML 2025] Code for "R2-T2: Re-Routing in Test-Time for Multimodal Mixture-of-Experts"☆19Mar 10, 2025Updated last year
- ☆14Nov 25, 2024Updated last year
- Official repo of dataset-decomposition paper [NeurIPS 2024]☆21Jan 8, 2025Updated last year
- TokenSim is a tool for simulating the behavior of large language models (LLMs) in a distributed environment.☆20Sep 20, 2025Updated 6 months ago
- GPU TopK Benchmark☆18Dec 19, 2024Updated last year
- PyTorch implementation of "Arbitrary Style Transfer in Real-time with Adaptive Instance Normalization"☆11Jul 24, 2023Updated 2 years ago
- [NeurIPS 2024] Efficient LLM Scheduling by Learning to Rank☆77Nov 4, 2024Updated last year