DuoDecoding: Hardware-aware Heterogeneous Speculative Decoding with Dynamic Multi-Sequence Drafting
☆17 Mar 4, 2025 Updated 11 months ago
Alternatives and similar repositories for DuoDecoding
Users that are interested in DuoDecoding are comparing it to the libraries listed below
- code for [ACL23] An AMR-based Link Prediction Approach for Document-level Event Argument Extraction ☆24 Oct 2, 2023 Updated 2 years ago
- Towards Systematic Measurement for Long Text Quality ☆37 Sep 5, 2024 Updated last year
- This is the code repo for the paper <UTC-IE: A Unified Token-pair Classification Architecture for Information Extraction> ☆15 Aug 10, 2023 Updated 2 years ago
- FireQ: Fast INT4-FP8 Kernel and RoPE-aware Quantization for LLM Inference Acceleration ☆20 Jun 27, 2025 Updated 8 months ago
- [NeurIPS 2025] Official implementation of "Reasoning Path Compression: Compressing Generation Trajectories for Efficient LLM Reasoning" ☆30 Oct 20, 2025 Updated 4 months ago
- [EMNLP 2023] Plan, Verify and Switch: Integrated Reasoning with Diverse X-of-Thoughts ☆27 Nov 4, 2023 Updated 2 years ago
- [ICML 2024] Sparse Model Inversion: Efficient Inversion of Vision Transformers with Less Hallucination ☆13 Apr 29, 2025 Updated 10 months ago
- Notes of my introduction about NLP in Fudan University ☆37 Jul 6, 2021 Updated 4 years ago
- [CVPR 2025 Highlight] FIMA-Q: Post-Training Quantization for Vision Transformers by Fisher Information Matrix Approximation ☆26 Jun 16, 2025 Updated 8 months ago
- MOSS-Audio-Tokenizer is a Causal Transformer-based audio tokenizer built on the CAT architecture. Trained on 3M hours of diverse audio, i… ☆126 Feb 13, 2026 Updated 2 weeks ago
- [Findings of ACL'2023] Improving Contrastive Learning of Sentence Embeddings from AI Feedback ☆40 Aug 14, 2023 Updated 2 years ago
- [CVPR 2025] QuartDepth ☆17 Mar 24, 2025 Updated 11 months ago
- [EMNLP 2022] RLET: A Reinforcement Learning Based Approach for Explainable QA with Entailment Trees ☆11 Jul 15, 2023 Updated 2 years ago
- ☆15 Jan 12, 2026 Updated last month
- [ICML 2025] Closed-Loop Long-Horizon Robotic Planning via Equilibrium Sequence Modeling ☆11 May 5, 2025 Updated 9 months ago
- Source code of our TNNLS paper "Boosting Convolutional Neural Networks with Middle Spectrum Grouped Convolution" ☆12 Apr 14, 2023 Updated 2 years ago
- Official implementation of "Modeling Multi-Task Model Merging as Adaptive Projective Gradient Descent". ☆21 May 23, 2025 Updated 9 months ago
- ☆11 Apr 5, 2023 Updated 2 years ago
- ☆13 Jul 14, 2025 Updated 7 months ago
- Official Implementation of Robustifying and Boosting Training-Free Neural Architecture Search ☆10 Mar 12, 2024 Updated last year
- [ICML 2025] MxMoE: Mixed-precision Quantization for MoE with Accuracy and Performance Co-Design ☆22 Jul 4, 2025 Updated 7 months ago
- [ICML 2025] Official PyTorch implementation of "NegMerge: Sign-Consensual Weight Merging for Machine Unlearning" ☆14 Nov 25, 2025 Updated 3 months ago
- Understanding ComfyUI seed ☆16 May 25, 2024 Updated last year
- This is the official implementation for LiveHPS++: Robust and Coherent Motion Capture in Dynamic Free Environment. ☆14 Sep 6, 2024 Updated last year
- An implementation for MetGen: A Module-Based Entailment Tree Generation Framework for Answer Explanation. ☆13 Jul 21, 2022 Updated 3 years ago
- Official implementation of Self-Taught Agentic Long Context Understanding (ACL 2025). ☆12 Sep 22, 2025 Updated 5 months ago
- First Latency-Aware Competitive LLM Agent Benchmark ☆26 Jun 3, 2025 Updated 8 months ago
- ☆11 Sep 20, 2024 Updated last year
- [ICCAD 2025] Squant ☆15 Jul 3, 2025 Updated 7 months ago
- The implementation for FREE-Merging: Fourier Transform for Model Merging with Lightweight Experts (ICCV25) ☆14 Jun 26, 2025 Updated 8 months ago
- ☆17 Mar 10, 2025 Updated 11 months ago
- ☆13 Jul 25, 2024 Updated last year
- ☆12 Jul 6, 2023 Updated 2 years ago
- fastNLP reimplementation of the paper "A Novel Cascade Binary Tagging Framework for Relational Triple Extraction" ☆11 Dec 11, 2020 Updated 5 years ago
- Official PyTorch implementation of the paper entitled 'Self Attentive Pooling for Efficient Deep Learning'. ☆13 May 3, 2024 Updated last year
- ☆12 Jul 30, 2025 Updated 7 months ago
- ☆26 Feb 27, 2025 Updated last year
- ☆16 Dec 9, 2023 Updated 2 years ago
- MICRO 2024 Evaluation Artifact for FuseMax ☆16 Aug 26, 2024 Updated last year