☆29Feb 3, 2026Updated 3 months ago
Alternatives and similar repositories for SpecOffload-public
Users who are interested in SpecOffload-public are comparing it to the libraries listed below.
- vLLM-based code for the paper “Cost-Efficient Large Language Model Serving for Multi-turn Conversations with CachedAttention”.☆11Sep 19, 2024Updated last year
- ☆21Jun 9, 2025Updated 11 months ago
- ☆39Nov 28, 2024Updated last year
- Experimental repository for GSoC 2024.☆15Aug 29, 2024Updated last year
- [DAC'25] Official implementation of "HybriMoE: Hybrid CPU-GPU Scheduling and Cache Management for Efficient MoE Inference"☆116Dec 15, 2025Updated 5 months ago
- ☆15Jan 28, 2024Updated 2 years ago
- ☆15Jun 26, 2024Updated last year
- FlexZNS: Building High-Performance ZNS SSDs with Size-Flexible and Parity-Protected Zones (ICCD'23)☆14Dec 28, 2023Updated 2 years ago
- ☆13Mar 6, 2023Updated 3 years ago
- ☆19Feb 18, 2025Updated last year
- [ICLR'25] Fast Inference of MoE Models with CPU-GPU Orchestration☆265Nov 18, 2024Updated last year
- The official implementation for the intra-stage fusion technique introduced in https://arxiv.org/abs/2409.13221☆31Apr 22, 2025Updated last year
- ☆10May 14, 2023Updated 3 years ago
- [NeurIPS 2022] ASPiRe: Adaptive Skill Priors for Reinforcement Learning☆13Oct 19, 2022Updated 3 years ago
- This repository is the accompanying code for the paper CFVFP. This paper presents a new algorithm for solving incomplete information game…☆15Feb 23, 2025Updated last year
- ☆17Nov 9, 2024Updated last year
- Mamba-Spike (CGI 2024)☆14Dec 3, 2025Updated 5 months ago
- [ICML 2023] SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models☆23Mar 15, 2024Updated 2 years ago
- Code for Federated Neuromorphic Learning of Spiking Neural Networks for Low-Power Edge Intelligence☆18Dec 9, 2020Updated 5 years ago
- Library to interface Compilers and ML models for ML-Enabled Compiler Optimizations☆20Oct 19, 2025Updated 7 months ago
- [MLSys 2026] AccelOpt: Self-improving Agents for AI Accelerator Kernel Optimization☆44May 15, 2026Updated 2 weeks ago
- Whisper inference with TensorRT-LLM☆25Sep 22, 2023Updated 2 years ago
- Code for training binary and WTA SNNs☆17Mar 25, 2022Updated 4 years ago
- ☆30Jul 22, 2024Updated last year
- ☆24Oct 7, 2025Updated 7 months ago
- CS294 AI Systems Class Website☆18Apr 25, 2022Updated 4 years ago
- ☆44Oct 16, 2025Updated 7 months ago
- Topic models for microblogging content☆10Sep 23, 2015Updated 10 years ago
- ☆26Mar 29, 2025Updated last year
- iGniter, an interference-aware GPU resource provisioning framework for achieving predictable performance of DNN inference in the cloud.☆39Jun 11, 2024Updated last year
- Large-scale exact string matching tool☆17Mar 7, 2025Updated last year
- Classical Chinese poetry generation based on pytorch_rnn☆11Oct 24, 2021Updated 4 years ago
- Huazhong University of Science and Technology, School of Computer Science, System Capability Training 2019: Virtual Machine☆19Dec 12, 2019Updated 6 years ago
- Repo for my Master Thesis at ULiège in 2019 (Machine learning under resource constraints)☆10Jun 29, 2019Updated 6 years ago
- Implemented transformer NN block for machine translation, text classification, natural language inference, as well as machine reading compr…☆11Mar 1, 2026Updated 2 months ago
- ☆30Sep 29, 2021Updated 4 years ago
- ☆59Mar 11, 2025Updated last year
- SPEC-RL: Accelerating On-Policy Reinforcement Learning via Speculative Rollouts☆65Dec 1, 2025Updated 5 months ago
- Convolutional Neural Network for Text Classification in Tensorflow☆10Apr 3, 2017Updated 9 years ago