This repository is the official implementation of "Jakiro: Boosting Speculative Decoding with Decoupled Multi-Head via MoE" [ACL 2026 Main Accepted]
☆37Oct 5, 2025Updated 7 months ago
Alternatives and similar repositories for Jakiro
Users that are interested in Jakiro are comparing it to the libraries listed below.
- ☆68Updated this week
- ☆39Nov 18, 2025Updated 6 months ago
- ☆20Jun 17, 2024Updated last year
- ☆16Jul 31, 2025Updated 9 months ago
- [ACL 2026 (Main)] LongSpec: Long-Context Lossless Speculative Decoding with Efficient Drafting and Verification☆83Jul 14, 2025Updated 10 months ago
- ☆46May 27, 2025Updated last year
- ☆12Jan 12, 2016Updated 10 years ago
- Official Implementation of LANTERN (ICLR'25) and LANTERN++(ICLRW-SCOPE'25)☆21Mar 5, 2025Updated last year
- 📰 Must-read papers and blogs on Speculative Decoding ⚡️☆1,234May 11, 2026Updated 2 weeks ago
- Image reconstruction from human brain activity by VAE and adversarial learning☆12May 21, 2022Updated 4 years ago
- (ACL2025 oral) SCOPE: Optimizing KV Cache Compression in Long-context Generation☆35May 28, 2025Updated last year
- [ICML 2024] When Linear Attention Meets Autoregressive Decoding: Towards More Effective and Efficient Linearized Large Language Models☆35Jun 12, 2024Updated last year
- [NeurIPS 2024] The official implementation of "Kangaroo: Lossless Self-Speculative Decoding for Accelerating LLMs via Double Early Exitin…☆69Jun 26, 2024Updated last year
- Aligning Agentic World Models via Knowledgeable Experience Learning☆35May 15, 2026Updated 2 weeks ago
- Riemannian Geometry-Based Spatial Filtering (RSF)☆21Jun 22, 2025Updated 11 months ago
- UQ: Assessing Language Models on Unsolved Questions☆30Aug 26, 2025Updated 9 months ago
- ☆13Aug 23, 2017Updated 8 years ago
- [ICML'25] Official code for paper "Occult: Optimizing Collaborative Communication across Experts for Accelerated Parallel MoE Training an…☆13Apr 17, 2025Updated last year
- GPU topology-aware scheduler☆13Jul 7, 2017Updated 8 years ago
- Rhetorical sentence classification using LLMs☆11Oct 26, 2025Updated 7 months ago
- ☆22Oct 10, 2025Updated 7 months ago
- ☆51Mar 20, 2026Updated 2 months ago
- LaTeX template for dissertation proposals in Peking University Shenzhen.☆16Feb 23, 2022Updated 4 years ago
- An ITK implementation of the GraphCut framework. See 'Graph cuts and efficient ND image segmentation' by Boykov and Funka-Lea and 'Intera…☆12Sep 18, 2017Updated 8 years ago
- Intelligent Resource Requirement Estimation and Scheduling for Deep Learning Jobs on Distributed GPU Clusters☆15Nov 18, 2021Updated 4 years ago
- Reading notes on Speculative Decoding papers☆34Apr 16, 2026Updated last month
- [COLM 2024] TriForce: Lossless Acceleration of Long Sequence Generation with Hierarchical Speculative Decoding☆278Aug 31, 2024Updated last year
- ☆31Jul 21, 2025Updated 10 months ago
- Information extraction from unstructured text to build a knowledge graph using techniques from traditional NLP to pre-trained transformer…☆16Jan 13, 2026Updated 4 months ago
- This project leverages advanced AI agents from crewAI to assist doctors in diagnosing medical conditions and recommending treatment plans…☆15Nov 16, 2024Updated last year
- A curated list of research papers, resources, and advancements on Diffusion Cache and related efficient diffusion model acceleration tech…☆83Nov 4, 2025Updated 6 months ago
- ☆13Jan 14, 2020Updated 6 years ago
- [Main EMNLP'25] LLMs do Multi-Label Classification Differently☆15Feb 28, 2026Updated 3 months ago
- Fast inference from large language models via speculative decoding☆915Aug 22, 2024Updated last year
- ITKGrowCut is a remote module for ITK. It segments a 3D image from user-provided foreground and background seeds.☆16Updated this week
- ☆18May 14, 2024Updated 2 years ago
- [ICLR2025] Code and data for paper: Not All Heads Matter: A Head-Level KV Cache Compression Method with Integrated Retrieval and Reasonin…☆43Mar 10, 2025Updated last year
- ☆13Mar 6, 2023Updated 3 years ago
- ☆27Jun 22, 2024Updated last year