Hsword/Awesome-Machine-Learning-System-Papers

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/Hsword/Awesome-Machine-Learning-System-Papers)

Hsword / Awesome-Machine-Learning-System-Papers

☆80

Alternatives and similar repositories for Awesome-Machine-Learning-System-Papers

Users that are interested in Awesome-Machine-Learning-System-Papers are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

SJTU-IPADS / wukong-cube
View on GitHub
A distributed in-memory store for temporal knowledge graphs
☆10Mar 20, 2024Updated 2 years ago
nmandilaras / Resource-Allocation-Reinforcement-Learning
View on GitHub
Implementation of a Deep Reinforcement Learning agent that is capable to share the last-level-cache of a multi-core system, between a Lat…
☆10Nov 10, 2021Updated 4 years ago
Shenggan / awesome-distributed-ml
View on GitHub
A curated list of awesome projects and papers for distributed training or inference
☆279Oct 8, 2024Updated last year
eth-easl / cachew
View on GitHub
ML Input Data Processing as a Service. This repository contains the source code for Cachew (built on top of TensorFlow).
☆41Sep 10, 2024Updated last year
Hsword / SpotServe
View on GitHub
SpotServe: Serving Generative Large Language Models on Preemptible Instances
☆135Feb 22, 2024Updated 2 years ago
Managed hosting for WordPress and PHP on Cloudways • Ad
Managed hosting for WordPress, Magento, Laravel, or PHP apps, on multiple cloud providers. Deploy in minutes on Cloudways by DigitalOcean.
Raphael-Hao / Abacus
View on GitHub
☆38Jun 27, 2025Updated last year
AmadeusChan / Awesome-LLM-System-Papers
View on GitHub
☆646Jan 14, 2026Updated 6 months ago
PKU-DAIR / Hetu
View on GitHub
A high-performance distributed deep learning system targeting large-scale and automated distributed training.
☆339Dec 13, 2025Updated 7 months ago
AFDWang / Hetu-Galvatron
View on GitHub
Galvatron is an automatic distributed training system designed for Transformer models, including Large Language Models (LLMs). If you hav…
☆25Oct 22, 2025Updated 9 months ago
XpuOS / xsched
View on GitHub
A scheduling framework for multitasking over diverse XPUs, including GPUs, NPUs, ASICs, and FPGAs
☆175Updated this week
SJTU-IPADS / disb
View on GitHub
DISB is a new DNN inference serving benchmark with diverse workloads and models, as well as real-world traces.
☆58Aug 21, 2024Updated last year
rs3lab / TCLocks
View on GitHub
Repo for OSDI 2023 paper: "Ship your Critical Section Not Your Data: Enabling Transparent Delegation with TCLocks"
☆21Nov 6, 2024Updated last year
AmberLJC / LLMSys-PaperList
View on GitHub
Large Language Model (LLM) Systems Paper List
☆2,204Updated this week
sosson97 / msh
View on GitHub
☆20Jul 11, 2024Updated 2 years ago
Deploy on Railway without the complexity - Free Credits Offer • Ad
Connect your repo and Railway handles the rest with instant previews. Quickly provision container image services, databases, and storage volumes.
YukeWang96 / MGG_OSDI23
View on GitHub
Artifact for OSDI'23: MGG: Accelerating Graph Neural Networks with Fine-grained intra-kernel Communication-Computation Pipelining on Mult…
☆40Mar 17, 2024Updated 2 years ago
HKUST-SING / herald
View on GitHub
Herald: Accelerating Neural Recommendation Training with Embedding Scheduling (NSDI 2024)
☆23May 9, 2024Updated 2 years ago
lambda7xx / awesome-AI-system
View on GitHub
paper and its code for AI System
☆377May 14, 2026Updated 2 months ago
HKUST-SING / srnic-simulation-public
View on GitHub
☆22Apr 2, 2023Updated 3 years ago
All-less / faas-scheduling-benchmark
View on GitHub
A benchmark suite for evaluating FaaS scheduler.
☆23Nov 5, 2022Updated 3 years ago
oliverYoung2001 / UltraAttn
View on GitHub
SC'25 UltraAttn: Efficiently Parallelizing Attention through Hierarchical Context-Tiling
☆16Aug 14, 2025Updated 11 months ago
smartnets / dataloader-benchmarks
View on GitHub
DL Dataloader Benchmarks
☆20Jan 27, 2025Updated last year
redbird-arch / isca2025-chimera-artifact
View on GitHub
Artifact of Chimera
☆18May 6, 2025Updated last year
RLsys-Foundation / APRIL
View on GitHub
APRIL: Active Partial Rollouts in Reinforcement Learning to Tame Long-tail Generation. A system-level optimization for scalable LLM tra…
☆60Oct 11, 2025Updated 9 months ago
Deploy on Railway without the complexity - Free Credits Offer • Ad
Connect your repo and Railway handles the rest with instant previews. Quickly provision container image services, databases, and storage volumes.
microsoft / vattention
View on GitHub
Dynamic Memory Management for Serving LLMs without PagedAttention
☆506Jul 17, 2026Updated last week
pku-liang / popa
View on GitHub
A unified programming framework for high and portable performance across FPGAs and GPUs
☆11Mar 23, 2025Updated last year
IST-DASLab / torch_cgx
View on GitHub
Pytorch distributed backend extension with compression support
☆17Mar 24, 2025Updated last year
Hsword / Hetu
View on GitHub
A high-performance distributed deep learning system targeting large-scale and automated distributed training. If you have any interests, …
☆126Dec 18, 2023Updated 2 years ago
ifromeast / AI_analysis
View on GitHub
analyse problems of AI with Math and Code
☆31Jul 28, 2025Updated last year
lyj20071013 / Triton-FlashAttention
View on GitHub
This repository contains multiple implementations of Flash Attention optimized with Triton kernels, showcasing progressive performance im…
☆13Mar 26, 2026Updated 4 months ago
xinhaoc / ferret
View on GitHub
Autonomous CUDA kernel optimization agent with structured task specs and per-config scoring
☆17Jun 17, 2026Updated last month
SJTU-IPADS / copier
View on GitHub
Copy as an OS Service
☆27Nov 20, 2025Updated 8 months ago
STAR-Laboratory / Accelerating-RecSys-Training
View on GitHub
Accelerating Recommender model training by leveraging popular choices -- VLDB 2022
☆30Sep 15, 2024Updated last year
Proton VPN Special Offer - Get 70% off • Ad
Special partner offer. Trusted by over 100 million users worldwide. Tested, Approved and Recommended by Experts.
dywsjtu / apparate
View on GitHub
Artifact for "Apparate: Rethinking Early Exits to Tame Latency-Throughput Tensions in ML Serving" [SOSP '24]
☆24Nov 21, 2024Updated last year
SJTU-ReArch-Group / Paper-Reading-List
View on GitHub
☆154Updated this week
VIA-Research / AgentBench
View on GitHub
The set of AI agent model implementations, benchmarks, and others used in our paper "The Cost of Dynamic Reasoning: Demystifying AI Agent…
☆44Mar 26, 2026Updated 4 months ago
galeselee / Awesome_LLM_System-PaperList
View on GitHub
Since the emergence of chatGPT in 2022, the acceleration of Large Language Model has become increasingly important. Here is a list of pap…
☆286Mar 6, 2025Updated last year
LLMNativeOS / specfs-ae
View on GitHub
Artifact Evaluation for SpecFS [FAST'26]
☆34Dec 28, 2025Updated 7 months ago
llm-db / FineInfer
View on GitHub
Deferred Continuous Batching in Resource-Efficient Large Language Model Serving (EuroMLSys 2024)
☆19May 28, 2024Updated 2 years ago
LiuXiaoxuanPKU / Cost-Model-papers
View on GitHub
☆13Feb 22, 2023Updated 3 years ago