thustorage / Medusa
Medusa: Accelerating Serverless LLM Inference with Materialization [ASPLOS'25]
☆16 Updated 2 months ago
Alternatives and similar repositories for Medusa:
Users interested in Medusa are comparing it to the repositories listed below:
- PetPS: Supporting Huge Embedding Models with Tiered Memory ☆30 Updated 8 months ago
- A Program-Behavior-Guided Far Memory System ☆34 Updated last year
- This is the implementation repository of our SOSP'24 paper: Aceso: Achieving Efficient Fault Tolerance in Memory-Disaggregated Key-Value … ☆17 Updated 3 months ago
- ☆23 Updated last year
- Artifacts of EuroSys'24 paper "Exploring Performance and Cost Optimization with ASIC-Based CXL Memory" ☆23 Updated 11 months ago
- Scaling Up Memory Disaggregated Applications with SMART ☆26 Updated 9 months ago
- Canvas: Isolated and Adaptive Swapping for Multi-Applications on Remote Memory ☆37 Updated last year
- This is the implementation repository of our FAST'23 paper: FUSEE: A Fully Memory-Disaggregated Key-Value Store. ☆54 Updated 2 years ago
- ☆14 Updated 7 months ago
- ☆31 Updated 8 months ago
- GPU-accelerated vector query processing system that supports large vector datasets beyond GPU memory. ☆25 Updated 10 months ago
- Open-source implementation for "Helix: Serving Large Language Models over Heterogeneous GPUs and Network via Max-Flow" ☆18 Updated 2 months ago
- This is the implementation repository of our OSDI'23 paper: SMART: A High-Performance Adaptive Radix Tree for Disaggregated Memory. ☆58 Updated 3 months ago
- The Artifact Evaluation Version of SOSP Paper #19 ☆44 Updated 5 months ago
- [OSDI 2024] Motor: Enabling Multi-Versioning for Distributed Transactions on Disaggregated Memory ☆47 Updated 11 months ago
- TeRM: Extending RDMA-Attached Memory with SSD [FAST'24] ☆40 Updated 3 months ago
- [HotStorage '24] Can ZNS SSDs be Better Storage Devices for Persistent Cache? ☆12 Updated 8 months ago
- Johnny Cache: the End of DRAM Cache Conflicts (in Tiered Main Memory Systems) ☆18 Updated last year
- Code for "Baleen: ML Admission & Prefetching for Flash Caches" (FAST 2024). ☆23 Updated 11 months ago
- Website for Artifact Evaluation at EuroSys, SOSP, OSDI, ATC ☆34 Updated last week
- ☆26 Updated 2 years ago
- ☆11 Updated 10 months ago
- Exploring the Design Space of Page Management for Multi-Tiered Memory Systems (USENIX ATC '21) ☆43 Updated 2 years ago
- Universal Presentation: A Header-only C++ Library to Cout STL containers and more ☆19 Updated last year
- Tiered memory management ☆69 Updated 5 months ago
- Hermit: Low-Latency, High-Throughput, and Transparent Remote Memory via Feedback-Directed Asynchrony ☆33 Updated 8 months ago
- A GPU-accelerated DNN inference serving system that supports instant kernel preemption and biased concurrent execution in GPU scheduling. ☆40 Updated 2 years ago
- ROLEX: A Scalable RDMA-oriented Learned Key-Value Store for Disaggregated Memory Systems ☆71 Updated last year
- ☆53 Updated 4 years ago