My personal paper reading notes (covering machine learning systems, AI infrastructure, and other interesting topics).
☆166 · Jan 27, 2026 · Updated last month
Alternatives and similar repositories for awesome-papers
Users who are interested in awesome-papers are comparing it to the repositories listed below.
- 💻 Terminal-Agent with Human-in-the-Loop Learning ☆35 · Jan 16, 2026 · Updated last month
- paper and its code for AI System ☆351 · Feb 10, 2026 · Updated 3 weeks ago
- NEO is an LLM inference engine built to ease the GPU memory crisis by CPU offloading ☆85 · Jun 16, 2025 · Updated 8 months ago
- ☆44 · Jul 4, 2024 · Updated last year
- MetaOpt: Towards efficient heuristic design with quantifiable and confident performance ☆21 · Jan 20, 2026 · Updated last month
- Microsoft Collective Communication Library ☆66 · Nov 23, 2024 · Updated last year
- A simple calculation for LLM MFU. ☆69 · Sep 10, 2025 · Updated 5 months ago
- LLMA = LLM + Arithmetic coder, which uses an LLM to do insane text data compression and reach a very high compression ratio. ☆22 · Nov 24, 2024 · Updated last year
- Large Language Model (LLM) Systems Paper List ☆1,849 · Feb 27, 2026 · Updated last week
- Lucid: A Non-Intrusive, Scalable and Interpretable Scheduler for Deep Learning Training Jobs ☆59 · May 21, 2023 · Updated 2 years ago
- Repository for MLCommons Chakra schema and tools ☆38 · Dec 24, 2023 · Updated 2 years ago
- An artificial matrix generator in C ☆12 · Feb 16, 2023 · Updated 3 years ago
- Cluster Far Mem, framework to execute single job and multi job experiments using fastswap ☆21 · Jan 12, 2024 · Updated 2 years ago
- GPU-scheduler-for-deep-learning ☆210 · Nov 5, 2020 · Updated 5 years ago
- ☆11 · Mar 13, 2023 · Updated 2 years ago
- ☆16 · Jan 14, 2025 · Updated last year
- Towards Hardware and Software Continuous Integration ☆13 · Jun 8, 2020 · Updated 5 years ago
- Nu is a new datacenter system that enables developers to build fungible applications that can use datacenter resources wherever they are. ☆41 · May 14, 2024 · Updated last year
- FPGA-based HyperLogLog Accelerator ☆12 · Jul 13, 2020 · Updated 5 years ago
- Fast OS-level support for GPU checkpoint and restore ☆273 · Sep 28, 2025 · Updated 5 months ago
- Cluster simulator with far memory ☆12 · Apr 28, 2020 · Updated 5 years ago
- Website for CSE 234, Winter 2025 ☆13 · Mar 24, 2025 · Updated 11 months ago
- ☆13 · Jul 25, 2024 · Updated last year
- ☆11 · Sep 20, 2024 · Updated last year
- TACCL: Guiding Collective Algorithm Synthesis using Communication Sketches ☆80 · Jul 25, 2023 · Updated 2 years ago
- [OSDI'24] Serving LLM-based Applications Efficiently with Semantic Variable ☆210 · Sep 21, 2024 · Updated last year
- Microsoft Collective Communication Library ☆386 · Sep 20, 2023 · Updated 2 years ago
- ☆50 · Jun 27, 2019 · Updated 6 years ago
- Learning TileLang with 10 puzzles! ☆151 · Feb 25, 2026 · Updated last week
- Distributed Compiler based on Triton for Parallel Systems ☆1,371 · Feb 13, 2026 · Updated 3 weeks ago
- Model-less Inference Serving ☆94 · Nov 4, 2023 · Updated 2 years ago
- ☆198 · Aug 31, 2019 · Updated 6 years ago
- ☆14 · Feb 14, 2022 · Updated 4 years ago
- ☆11 · Dec 20, 2024 · Updated last year
- MICRO 2024 Evaluation Artifact for FuseMax ☆16 · Aug 26, 2024 · Updated last year
- BiSUNA framework specialized to compile for the Xilinx Alveo U50 ☆13 · Dec 3, 2020 · Updated 5 years ago
- [ICML 2024 Oral] Any-Precision LLM: Low-Cost Deployment of Multiple, Different-Sized LLMs ☆122 · Jul 4, 2025 · Updated 8 months ago
- ☆84 · Dec 2, 2022 · Updated 3 years ago
- Dynamic Memory Management for Serving LLMs without PagedAttention ☆465 · May 30, 2025 · Updated 9 months ago