eddiegaoo / Apt-Serve
☆20 · Jun 9, 2025 · Updated 8 months ago
Alternatives and similar repositories for Apt-Serve
Users who are interested in Apt-Serve are comparing it to the libraries listed below.
- The code based on vLLM for the paper “Cost-Efficient Large Language Model Serving for Multi-turn Conversations with CachedAttention”. ☆11 · Sep 19, 2024 · Updated last year
- A reading list on popular MLSys topics ☆21 · Mar 20, 2025 · Updated 10 months ago
- A fast text search engine built for SSDs, written in C++. ☆11 · Aug 29, 2022 · Updated 3 years ago
- [NeurIPS 2024] Efficient LLM Scheduling by Learning to Rank ☆70 · Nov 4, 2024 · Updated last year
- Efficient Interactive LLM Serving with Proxy Model-based Sequence Length Prediction | A tiny BERT model can tell you the verbosity of an … ☆46 · Jun 1, 2024 · Updated last year
- (ACL 2025 Main) Code for MultiAgentBench: Evaluating the Collaboration and Competition of LLM agents https://www.arxiv.org/pdf/2503.019… ☆32 · Jun 21, 2025 · Updated 7 months ago
- Self-host LLMs with LMDeploy and BentoML ☆22 · Dec 26, 2025 · Updated last month
- ☆165 · Jul 15, 2025 · Updated 7 months ago
- Artifact for "Apparate: Rethinking Early Exits to Tame Latency-Throughput Tensions in ML Serving" [SOSP '24] ☆25 · Nov 21, 2024 · Updated last year
- A benchmark suite for evaluating FaaS schedulers. ☆23 · Nov 5, 2022 · Updated 3 years ago
- A user-friendly Command & Control (C&C) web platform for remote monitoring, management, and task automation across multiple devices. ☆14 · Dec 15, 2024 · Updated last year
- ☆26 · Mar 31, 2022 · Updated 3 years ago
- Asynchronous pipeline parallel optimization ☆19 · Feb 2, 2026 · Updated 2 weeks ago
- Medusa: Accelerating Serverless LLM Inference with Materialization [ASPLOS'25] ☆41 · May 13, 2025 · Updated 9 months ago
- Tutorial for Ray ☆36 · Mar 31, 2024 · Updated last year
- ☆85 · Apr 18, 2025 · Updated 9 months ago
- Symphony: A decentralized multi-agent framework that enables intelligent agents to collaborate seamlessly across heterogeneous edge devi… ☆30 · Oct 30, 2025 · Updated 3 months ago
- d3LLM: Ultra-Fast Diffusion LLM 🚀 ☆91 · Feb 4, 2026 · Updated last week
- A simple MIPS CPU for BUAA CO course (and now NSCSCC). ☆10 · May 15, 2021 · Updated 4 years ago
- LITS: An Optimized Learned Index for Strings ☆13 · Jun 18, 2025 · Updated 7 months ago
- Code for an agentic RAG method with a dynamic workflow. ☆13 · Jan 22, 2026 · Updated 3 weeks ago
- [ICDCS 2023] Evaluation and Optimization of Gradient Compression for Distributed Deep Learning ☆10 · Apr 28, 2023 · Updated 2 years ago
- ☆14 · Jun 10, 2025 · Updated 8 months ago
- Efficient Long-context Language Model Training by Core Attention Disaggregation ☆89 · Jan 29, 2026 · Updated 2 weeks ago
- Source code for Spitfire: A Three-Tier Buffer Manager for Volatile and Non-Volatile Memory ☆39 · Jan 7, 2023 · Updated 3 years ago
- A simple operating system ☆10 · Sep 9, 2022 · Updated 3 years ago
- UnitEval is a benchmarking and evaluation tool for AutoDev Coder. ☆13 · Jan 2, 2024 · Updated 2 years ago
- CV approach aimed at removing moving objects in videos (dynamic and static camera) ☆11 · Mar 21, 2021 · Updated 4 years ago
- A 100% locally run AI web tool for generating WeChat replies using the RWKV runner ☆10 · Oct 29, 2024 · Updated last year
- Provides a minimal API for "fadian" (发电) meme text and some Yunzai (云崽) plugins ☆11 · Jan 17, 2025 · Updated last year
- Paper and code for New Directions in Cloud Programming, CIDR 2021 ☆11 · Feb 17, 2021 · Updated 4 years ago
- ACM Class 2017 Computer Architecture ☆10 · Jan 11, 2018 · Updated 8 years ago
- A simple Mali 6xx/7xx register interface model that doesn't do any rendering. ☆13 · Jan 29, 2016 · Updated 10 years ago
- 🤖 Basic tools built for scraping and embedding text. The main technologies include OpenAI embeddings, Supabase, and Next.js. ☆16 · Apr 12, 2023 · Updated 2 years ago
- VIVOTO is a simple Android video and photo editor that can remove any object you want. In this app, you can use trim… ☆11 · Jun 16, 2020 · Updated 5 years ago
- API server for F5-TTS ☆19 · Jan 24, 2026 · Updated 3 weeks ago
- Containerized self-hosted REST API for vision classification, utilizing Hugging Face transformers. ☆10 · Dec 5, 2024 · Updated last year
- ☆11 · Sep 12, 2023 · Updated 2 years ago
- A library for simplifying training with multi-GPU setups in the Hugging Face / PyTorch ecosystem. ☆16 · Jan 9, 2026 · Updated last month