☆54Sep 18, 2025Updated 5 months ago
Alternatives and similar repositories for CS854-F24
Users who are interested in CS854-F24 are comparing it to the repositories listed below
- A Hybrid Framework to Build High-performance Adaptive Neural Networks for Kernel Datapath☆28May 15, 2023Updated 2 years ago
- ☆13May 30, 2024Updated last year
- APEX+ is an LLM Serving Simulator☆43Jun 16, 2025Updated 8 months ago
- Deduplication over disaggregated memory for Serverless Computing☆14Mar 21, 2022Updated 3 years ago
- A reading list on popular MLSys topics☆22Mar 20, 2025Updated 11 months ago
- ☆89Dec 11, 2019Updated 6 years ago
- Open-source implementation for "Helix: Serving Large Language Models over Heterogeneous GPUs and Network via Max-Flow"☆78Oct 15, 2025Updated 4 months ago
- Tempo is a system for declarative, efficient, end-to-end compiled dynamic deep learning☆28Oct 21, 2025Updated 4 months ago
- SJTU CS473 Project: Implementation of Deep Closest Point in TensorFlow, and its comparison with other registration methods.☆12Jun 14, 2020Updated 5 years ago
- NEO is an LLM inference engine built to ease the GPU memory crisis through CPU offloading☆85Jun 16, 2025Updated 8 months ago
- ☆18Apr 21, 2024Updated last year
- ☆16Apr 22, 2025Updated 10 months ago
- ☆21Apr 2, 2023Updated 2 years ago
- A single-file educational implementation for understanding vLLM's core concepts and running LLM inference.☆37Updated this week
- [Long Term Support] [SIGCOMM 2023] Lightning: A Reconfigurable Photonic-Electronic SmartNIC for Fast and Energy-Efficient Inference☆21Sep 20, 2024Updated last year
- KFunca: A minimalist, high-performance GPU-based automatic differentiation framework☆29Aug 14, 2025Updated 6 months ago
- ☆146Dec 19, 2025Updated 2 months ago
- This is the implementation repository of our SOSP'24 paper: Aceso: Achieving Efficient Fault Tolerance in Memory-Disaggregated Key-Value …☆23Oct 20, 2024Updated last year
- ☆88Jan 22, 2026Updated last month
- FastDCS is a distributed computing system.☆35May 11, 2018Updated 7 years ago
- ☆23Apr 28, 2024Updated last year
- ☆23Oct 31, 2023Updated 2 years ago
- ⚡ Bring some magic to i.sjtu.edu.cn☆22Jan 3, 2020Updated 6 years ago
- Arbitrary offloads for RDMA NICs☆99Apr 25, 2022Updated 3 years ago
- Artifact for "Apparate: Rethinking Early Exits to Tame Latency-Throughput Tensions in ML Serving" [SOSP '24]☆24Nov 21, 2024Updated last year
- ☆28Sep 17, 2024Updated last year
- ☆41Jun 30, 2025Updated 8 months ago
- OSDI 2023 Welder, a deep learning compiler☆32Nov 24, 2023Updated 2 years ago
- PipeRAG: Fast Retrieval-Augmented Generation via Algorithm-System Co-design (KDD 2025)☆30Jun 14, 2024Updated last year
- Artifacts for our ASPLOS'23 paper dRAID☆29Feb 24, 2023Updated 3 years ago
- ☆30Updated this week
- ☆63Jun 29, 2022Updated 3 years ago
- ☆76Dec 29, 2025Updated 2 months ago
- An Automated Performance Optimization Framework for P4-Programmable SmartNICs☆28Nov 18, 2023Updated 2 years ago
- DeeperGEMM: crazy optimized version☆74May 5, 2025Updated 10 months ago
- ☆131Nov 11, 2024Updated last year
- A tiny yet powerful LLM inference system tailored for research purposes. vLLM-equivalent performance with only 2k lines of code (2% of …☆315Jun 10, 2025Updated 9 months ago
- The prototype for NSDI paper "NetHint: White-Box Networking for Multi-Tenant Data Centers"☆26Feb 2, 2024Updated 2 years ago
- An auxiliary project analyzing the characteristics of KV in DiT attention.☆33Nov 29, 2024Updated last year