MobiSense/SpecOffload-public

☆30

Alternatives and similar repositories for SpecOffload-public

Users interested in SpecOffload-public are comparing it to the repositories listed below.

MobiSense / Ziggo-Device
Time-sensitive network performance evaluation toolkit, based on Zynq7000 FPGA architecture.
☆30 · May 21, 2024 · Updated 2 years ago
Brett-z / LayerEditing
A model-agnostic function to directly remove specified layers from an LLM
☆10 · May 23, 2024 · Updated 2 years ago
AISys-01 / vllm-CachedAttention
Code based on vLLM for the paper “Cost-Efficient Large Language Model Serving for Multi-turn Conversations with CachedAttention”.
☆11 · Sep 19, 2024 · Updated last year
GoodwillComputingLab / CLITE
☆10 · Mar 14, 2020 · Updated 6 years ago
MobiSense / Ziggo-CaaS-Switch
HW&SW Switch implementation enabling Control-as-a-Service industrial network paradigm
☆17 · Aug 22, 2025 · Updated 11 months ago
PKU-SEC-Lab / HybriMoE
[DAC'25] Official implementation of "HybriMoE: Hybrid CPU-GPU Scheduling and Cache Management for Efficient MoE Inference"
☆118 · Dec 15, 2025 · Updated 7 months ago
yandex-research / specexec
☆68 · Nov 4, 2024 · Updated last year
Leosang-lx / FlowSpec
Continuous Pipelined Speculative Decoding
☆21 · May 25, 2026 · Updated 2 months ago
eddiegaoo / Apt-Serve
☆21 · Jun 9, 2025 · Updated last year
ZiyueHuang / SimpleDB
A simple database including Common Operators, Query Optimization, Transactions, Rollback and Recovery
☆13 · Aug 8, 2017 · Updated 8 years ago
zjregee / ovfs
Experimental repository for GSoC 2024.
☆15 · Aug 29, 2024 · Updated last year
icloud-ecnu / igniter
iGniter, an interference-aware GPU resource provisioning framework for achieving predictable performance of DNN inference in the cloud.
☆39 · Jun 11, 2024 · Updated 2 years ago
JL-Cheng / SERE
[ICLR 2026] SERE: Similarity-Based Expert Re-routing for Efficient Batch Decoding in MoE Models
☆18 · Feb 4, 2026 · Updated 5 months ago
uservan / speculative_thinking
☆34 · Oct 13, 2025 · Updated 9 months ago
yuantailing / tunet-python
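A pure-Python implementation of the TUNet 2018 authentication protocol, covering auth4 / auth6 / net authentication. Suited to automatic authentication on servers with no user interaction.
☆153 · Jul 29, 2024 · Updated 2 years ago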
EfficientLLMSys / MuxServe
☆15 · Jun 26, 2024 · Updated 2 years ago
CASE-Lab-UMD / Capacity-Aware-MoE
The official implementation of the paper "Capacity-Aware Inference: Mitigating the Straggler Effect in Mixture of Experts" (ICLR 2026).
☆20 · May 31, 2026 · Updated last month
lm-playpen / playpen
All you need to get started with the LM Playpen Environment for Learning in Interaction.
☆16 · Jun 22, 2026 · Updated last month
richardzhuang0412 / EmbedLLM
Repo for EmbedLLM: Learning Compact Representations of Large Language Models
☆32 · Sep 25, 2025 · Updated 10 months ago
caoshiyi / artifacts
☆40 · Nov 28, 2024 · Updated last year
TrustAIRLab / HarmfulSkillBench
The official repository for the paper "HarmfulSkillBench: How Do Harmful Skills Weaponize Your Agents?"
☆15 · May 2, 2026 · Updated 2 months ago
LMCache / lmcache-agent-trace
Agent application/benchmark/workload traces should be placed here.
☆15 · Apr 13, 2026 · Updated 3 months ago
ucamrl / xrlflow
☆13 · Mar 6, 2023 · Updated 3 years ago
hipersys-team / TopoOpt
[NSDI 2023] TopoOpt: Optimizing the Network Topology for Distributed DNN Training
☆42 · Sep 10, 2024 · Updated last year
efeslab / fiddler
[ICLR'25] Fast Inference of MoE Models with CPU-GPU Orchestration
☆267 · Nov 18, 2024 · Updated last year
vuhpdc / jellyfish
Source code for Jellyfish, a soft real-time inference serving system
☆15 · Dec 20, 2022 · Updated 3 years ago
Infini-AI-Lab / MagicDec
[ICLR 2025] Breaking Throughput-Latency Trade-off for Long Sequences with Speculative Decoding
☆156 · Dec 4, 2024 · Updated last year
xpan413 / FSMoE
☆16 · Jan 14, 2025 · Updated last year
deepinsight / some-resources
☆10 · May 14, 2023 · Updated 3 years ago
reconfigurable-ml-pipeline / ipa
Source code of IPA, https://escholarship.org/uc/item/2p0805dq
☆12 · Jun 27, 2024 · Updated 2 years ago
wu-lichao / NeuroStrike-Neuron-Level-Attacks-on-Aligned-LLMs
☆17 · Jan 9, 2026 · Updated 6 months ago
SLDGroup / LBP-WHT
☆13 · Apr 27, 2024 · Updated 2 years ago
kclip / FL-SNN
Code for Federated Neuromorphic Learning of Spiking Neural Networks for Low-Power Edge Intelligence
☆18 · Dec 9, 2020 · Updated 5 years ago
aifinlab / Spider-Sense
☆21 · Feb 6, 2026 · Updated 5 months ago
LiuYuHan31 / FPS
☆21 · Jul 16, 2024 · Updated 2 years ago
Zhou-David / HUST-DataStructure-CourseDesign
Huazhong University of Science and Technology data structures course project: a SAT-based binary Sudoku solver
☆12 · Jun 28, 2021 · Updated 5 years ago
uw-mad-dash / decoding-speculative-decoding
☆16 · Aug 19, 2024 · Updated last year
DS3Lab / Decentralized_FM_alpha
☆18 · May 4, 2023 · Updated 3 years ago
DS-100 / sp17
Spring 2017 Course Website
☆10 · Apr 1, 2026 · Updated 3 months ago