Repo for SpecEE: Accelerating Large Language Model Inference with Speculative Early Exiting (ISCA25)
☆72 · Apr 25, 2025 · Updated 11 months ago
Alternatives and similar repositories for SpecEE
Users who are interested in SpecEE are comparing it to the libraries listed below.
- A prefill & decode disaggregated LLM serving framework with shared GPU memory and fine-grained compute isolation. ☆124 · Dec 25, 2025 · Updated 3 months ago
- [HPCA 2026 Best Paper Candidate] Official implementation of "Focus: A Streaming Concentration Architecture for Efficient Vision-Language … ☆41 · Feb 8, 2026 · Updated last month
- ☆13 · Sep 30, 2023 · Updated 2 years ago
- How to plot for papers, slides, demos, etc. ☆10 · Apr 7, 2022 · Updated 3 years ago
- PyTorch-Based Fast and Efficient Processing for Various Machine Learning Applications with Diverse Sparsity ☆121 · Mar 16, 2026 · Updated last week
- including compiler to encode DGL GNN model to instructions, runtime software to transfer data and control the accelerator, and hardware v… ☆14 · Nov 19, 2023 · Updated 2 years ago
- ☆11 · Jul 1, 2025 · Updated 8 months ago
- Extending BookSim2.0 and HotSpot6.0 for Power, Performance and Thermal evaluation of 3D NoC Architectures ☆13 · Aug 9, 2019 · Updated 6 years ago
- GPU-accelerated LLM Training Simulator ☆18 · Jun 26, 2025 · Updated 9 months ago
- ☆12 · Jan 9, 2026 · Updated 2 months ago
- A tool for checking the contract satisfaction for hardware designs ☆12 · Nov 4, 2025 · Updated 4 months ago
- ☆17 · May 10, 2024 · Updated last year
- Sample programs for the LLVM PTX back-end ☆41 · Aug 27, 2015 · Updated 10 years ago
- ☆43 · Oct 15, 2025 · Updated 5 months ago
- ☆118 · Nov 17, 2023 · Updated 2 years ago
- A benchmark suited especially for deep learning operators ☆42 · Feb 13, 2023 · Updated 3 years ago
- LongSpec: Long-Context Lossless Speculative Decoding with Efficient Drafting and Verification ☆76 · Jul 14, 2025 · Updated 8 months ago
- ☆11 · Apr 16, 2023 · Updated 2 years ago
- DeepGate3 for ICCAD2024 ☆13 · May 26, 2025 · Updated 10 months ago
- analyse problems of AI with Math and Code ☆27 · Jul 28, 2025 · Updated 7 months ago
- ☆16 · Dec 9, 2023 · Updated 2 years ago
- ☆18 · Jan 27, 2025 · Updated last year
- Sys, but no longer in Haskell ☆19 · Mar 14, 2022 · Updated 4 years ago
- Tool for compiling Lean to WASM ☆24 · Mar 17, 2024 · Updated 2 years ago
- Venus Collective Communication Library, supported by SII and Infrawaves. ☆141 · Mar 4, 2026 · Updated 3 weeks ago
- The open-source project for "Mandheling: Mixed-Precision On-Device DNN Training with DSP Offloading" [MobiCom'2022] ☆19 · Aug 4, 2022 · Updated 3 years ago
- A Direct Memory Access Controller (DMAC) with AHB-lite bus interface ☆17 · Oct 6, 2024 · Updated last year
- [HPCA 2022] GCoD: Graph Convolutional Network Acceleration via Dedicated Algorithm and Accelerator Co-Design ☆39 · Mar 30, 2022 · Updated 3 years ago
- The wafer-native AI accelerator simulation platform and inference engine. ☆52 · Jan 1, 2026 · Updated 2 months ago
- This repository contains the artifact for the SOSP'23 paper: Sishuai Gong, Dinglan Peng, Deniz Altınbüken, Pedro Fonseca, Petros Maniati… ☆15 · Oct 24, 2023 · Updated 2 years ago
- PyTorch code for our paper "Progressive Binarization with Semi-Structured Pruning for LLMs" ☆13 · Mar 11, 2026 · Updated 2 weeks ago
- SpecLLM: Exploring Generation and Review of VLSI Design Specification with Large Language Model ☆16 · Jan 29, 2024 · Updated 2 years ago
- ALPS: An Adaptive Learning, Priority OS Scheduler for Serverless Functions (USENIX ATC'24) ☆13 · Jun 20, 2024 · Updated last year
- [DAC'25] Official implementation of "HybriMoE: Hybrid CPU-GPU Scheduling and Cache Management for Efficient MoE Inference" ☆104 · Dec 15, 2025 · Updated 3 months ago
- [ICLR25] STBLLM: Breaking the 1-Bit Barrier with Structured Binary LLMs ☆19 · Jun 3, 2025 · Updated 9 months ago
- MICRO 2023 Evaluation Artifact for TeAAL ☆10 · Oct 26, 2023 · Updated 2 years ago
- λFS: an elastic, high-performance, serverless-function-based metadata service for large-scale distributed file systems (ACM ASPLOS'23) ☆14 · Apr 2, 2025 · Updated 11 months ago
- a high performance server framework ☆12 · Dec 11, 2022 · Updated 3 years ago
- Simulator for HDD/SSD, derived from the CMU PDL DiskSim, with the SSD-add-on patch from Microsoft Research applied. ☆15 · Dec 30, 2019 · Updated 6 years ago