☆29 · Feb 3, 2026 · Updated 3 months ago
Alternatives and similar repositories for SpecOffload-public
Users that are interested in SpecOffload-public are comparing it to the libraries listed below.
- The code based on vLLM for the paper “Cost-Efficient Large Language Model Serving for Multi-turn Conversations with CachedAttention”. ☆11 · Sep 19, 2024 · Updated last year
- ☆21 · Jun 9, 2025 · Updated 11 months ago
- ☆10 · Mar 14, 2020 · Updated 6 years ago
- Ever wondered how popular your GitHub repo is compared to others? ☆16 · Feb 14, 2026 · Updated 2 months ago
- ☆18 · Apr 11, 2025 · Updated last year
- An experiment based on the KDD '09 paper "Beyond Blacklists: Learning to Detect Malicious Web Sites from Suspicious URLs". ☆10 · Jan 8, 2019 · Updated 7 years ago
- ☆21 · Oct 2, 2024 · Updated last year
- ☆16 · Aug 9, 2025 · Updated 9 months ago
- [DAC'25] Official implementation of "HybriMoE: Hybrid CPU-GPU Scheduling and Cache Management for Efficient MoE Inference" ☆113 · Dec 15, 2025 · Updated 4 months ago
- [USENIX Security '25] My ZIP isn’t your ZIP: Identifying and Exploiting Semantic Gaps Between ZIP Parsers ☆38 · Mar 20, 2026 · Updated last month
- ☆11 · Apr 13, 2026 · Updated 3 weeks ago
- ☆15 · Jan 28, 2024 · Updated 2 years ago
- ☆15 · Jun 26, 2024 · Updated last year
- ☆20 · Oct 29, 2025 · Updated 6 months ago
- ☆22 · Jul 16, 2024 · Updated last year
- ☆13 · Mar 6, 2023 · Updated 3 years ago
- ☆19 · Feb 18, 2025 · Updated last year
- [ICLR'25] Fast Inference of MoE Models with CPU-GPU Orchestration ☆265 · Nov 18, 2024 · Updated last year
- ArXiv Today: Get arXiv daily papers right in your Lark (飞书) via bot. ☆38 · Sep 17, 2025 · Updated 7 months ago
- The official implementation for the intra-stage fusion technique introduced in https://arxiv.org/abs/2409.13221 ☆31 · Apr 22, 2025 · Updated last year
- [NeurIPS 2022] ASPiRe: Adaptive Skill Priors for Reinforcement Learning ☆13 · Oct 19, 2022 · Updated 3 years ago
- ☆28 · Oct 11, 2022 · Updated 3 years ago
- The author's shared notes on AI security papers, including PPT and PDF versions and the original papers. Hopefully it is helpful to you. ☆32 · Jan 9, 2025 · Updated last year
- Mamba-Spike (CGI 2024) ☆14 · Dec 3, 2025 · Updated 5 months ago
- [ICML 2023] SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models ☆23 · Mar 15, 2024 · Updated 2 years ago
- AccelOpt: Self-improving Agents for AI Accelerator Kernel Optimization ☆37 · Apr 18, 2026 · Updated 3 weeks ago
- Code for training binary and WTA SNNs ☆17 · Mar 25, 2022 · Updated 4 years ago
- ☆30 · Jul 22, 2024 · Updated last year
- PipeInfer: Accelerating LLM Inference using Asynchronous Pipelined Speculation ☆32 · Nov 16, 2024 · Updated last year
- Code and data for the paper: On the Resilience of LLM-Based Multi-Agent Collaboration with Faulty Agents ☆46 · Dec 15, 2025 · Updated 4 months ago
- LLM Inference with Microscaling Format ☆34 · Nov 12, 2024 · Updated last year
- CS294 AI Systems Class Website ☆18 · Apr 25, 2022 · Updated 4 years ago
- Deep Neural Network Compression based on Student-Teacher Network ☆14 · Jul 6, 2023 · Updated 2 years ago
- ☆42 · Oct 16, 2025 · Updated 6 months ago
- PyTorch code for full quantization of DNN using BCGD ☆14 · Jul 24, 2019 · Updated 6 years ago
- Topic models for microblogging content ☆10 · Sep 23, 2015 · Updated 10 years ago
- ☆27 · Mar 29, 2025 · Updated last year
- iGniter, an interference-aware GPU resource provisioning framework for achieving predictable performance of DNN inference in the cloud. ☆39 · Jun 11, 2024 · Updated last year
- Fine-tune Chinese BERT with sentence-transformers ☆11 · May 8, 2021 · Updated 5 years ago