☆79Mar 7, 2022Updated 3 years ago
Alternatives and similar repositories for Awesome-Machine-Learning-System-Papers
Users who are interested in Awesome-Machine-Learning-System-Papers are comparing it to the repositories listed below
- A distributed in-memory store for temporal knowledge graphs☆10Mar 20, 2024Updated last year
- Implementation of a Deep Reinforcement Learning agent that is capable of sharing the last-level cache of a multi-core system, between a Lat…☆10Nov 10, 2021Updated 4 years ago
- ☆38Jun 27, 2025Updated 8 months ago
- Asynchronous pipeline parallel optimization☆19Feb 2, 2026Updated last month
- Artifact for OSDI'23: MGG: Accelerating Graph Neural Networks with Fine-grained intra-kernel Communication-Computation Pipelining on Mult…☆41Mar 17, 2024Updated last year
- AI model training on heterogeneous, geo-distributed resources☆38Nov 24, 2025Updated 3 months ago
- A curated list of awesome projects and papers for distributed training or inference☆266Oct 8, 2024Updated last year
- Deferred Continuous Batching in Resource-Efficient Large Language Model Serving (EuroMLSys 2024)☆19May 28, 2024Updated last year
- ☆21Apr 2, 2023Updated 2 years ago
- Prefix-Aware Attention for LLM Decoding☆29Jan 23, 2026Updated last month
- ML Input Data Processing as a Service. This repository contains the source code for Cachew (built on top of TensorFlow).☆40Sep 10, 2024Updated last year
- Repo for OSDI 2023 paper: "Ship your Critical Section Not Your Data: Enabling Transparent Delegation with TCLocks"☆21Nov 6, 2024Updated last year
- ☆631Jan 14, 2026Updated last month
- Herald: Accelerating Neural Recommendation Training with Embedding Scheduling (NSDI 2024)☆23May 9, 2024Updated last year
- [ICDCS 2023] Evaluation and Optimization of Gradient Compression for Distributed Deep Learning☆10Apr 28, 2023Updated 2 years ago
- DISB is a new DNN inference serving benchmark with diverse workloads and models, as well as real-world traces.☆58Aug 21, 2024Updated last year
- SpotServe: Serving Generative Large Language Models on Preemptible Instances☆135Feb 22, 2024Updated 2 years ago
- A high-performance distributed deep learning system targeting large-scale and automated distributed training.☆333Dec 13, 2025Updated 2 months ago
- A unified programming framework for high and portable performance across FPGAs and GPUs☆11Mar 23, 2025Updated 11 months ago
- Papers and their code for AI systems☆351Feb 10, 2026Updated 3 weeks ago
- APRIL: Active Partial Rollouts in Reinforcement Learning to Tame Long-tail Generation. A system-level optimization for scalable LLM tra…☆51Oct 11, 2025Updated 4 months ago
- Artifact for "Apparate: Rethinking Early Exits to Tame Latency-Throughput Tensions in ML Serving" [SOSP '24]☆24Nov 21, 2024Updated last year
- A benchmark suite for evaluating FaaS schedulers.☆23Nov 5, 2022Updated 3 years ago
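- A collection of practical computer-related technical books, simple hands-on tutorials that can get you started in a short time, some technical websites, and some well-written blog posts. Forks are welcome, and you can also contribute edits via Pull Request.☆10Jul 21, 2016Updated 9 years ago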
- Nex Venus Communication Library☆72Nov 17, 2025Updated 3 months ago
- The Next-gen Language & Compiler Powering Efficient Hardware Design☆36Jan 16, 2025Updated last year
- Paper list of federated learning: About system design☆13Apr 13, 2022Updated 3 years ago
- This repository contains multiple implementations of Flash Attention optimized with Triton kernels, showcasing progressive performance im…☆11Jan 19, 2026Updated last month
- ☆19Jun 1, 2025Updated 9 months ago
- Utilities for paper writing.☆12Jan 11, 2026Updated last month
- Tutorials for NVIDIA CUPTI samples☆55Nov 3, 2025Updated 4 months ago
- A reading list on some popular MLSys topics☆22Mar 20, 2025Updated 11 months ago
- Programming system for NIC-accelerated network applications☆29Oct 5, 2018Updated 7 years ago
- Large Language Model (LLM) Systems Paper List☆1,849Updated this week
- DeepGEMM: clean and efficient FP8 GEMM kernels with fine-grained scaling☆21Feb 9, 2026Updated 3 weeks ago
- [AFK] Hardware router in Chisel (THU Network Joint Lab 2020)☆14Oct 8, 2020Updated 5 years ago
- Projects for the training camp's training track☆26Jan 28, 2026Updated last month
- Hi-Speed DNN Training with Espresso: Unleashing the Full Potential of Gradient Compression with Near-Optimal Usage Strategies (EuroSys '2…☆15Sep 21, 2023Updated 2 years ago
- Code released to accompany the ISCA paper: "T4: Compiling Sequential Code for Effective Speculative Parallelization in Hardware"☆28Feb 18, 2022Updated 4 years ago