openmlsys / openmlsys-en
《Machine Learning Systems: Design and Implementation》- English Version
☆37 Updated last year
Alternatives and similar repositories for openmlsys-en
Users that are interested in openmlsys-en are comparing it to the libraries listed below
- A curated list of awesome projects and papers for distributed training or inference ☆265 Updated last year
- ☆452 Updated 3 weeks ago
- Systems for GenAI ☆157 Updated last week
- fmchisel: Efficient Compression and Training Algorithms for Foundation Models ☆83 Updated 3 months ago
- Materials for learning SGLang ☆738 Updated last month
- ☆234 Updated last year
- A curated collection of resources, tutorials, and best practices for learning and mastering NVIDIA CUTLASS ☆249 Updated 9 months ago
- Cataloging released Triton kernels. ☆291 Updated 4 months ago
- ☆179 Updated 2 years ago
- A large-scale simulation framework for LLM inference ☆528 Updated 6 months ago
- ☆56 Updated 5 months ago
- ☆626 Updated 3 weeks ago
- Examples and exercises from the book Programming Massively Parallel Processors - A Hands-on Approach. David B. Kirk and Wen-mei W. Hwu (T… ☆77 Updated 5 years ago
- paper and its code for AI System ☆347 Updated last month
- A tiny yet powerful LLM inference system tailored for researching purpose. vLLM-equivalent performance with only 2k lines of code (2% of … ☆312 Updated 7 months ago
- Learn CUDA with PyTorch ☆193 Updated this week
- An early research stage expert-parallel load balancer for MoE models based on linear programming. ☆495 Updated 2 months ago
- a minimal cache manager for PagedAttention, on top of llama3. ☆135 Updated last year
- A minimal implementation of vllm. ☆66 Updated last year
- ☆222 Updated last year
- Code release for book "Efficient Training in PyTorch" ☆125 Updated 9 months ago
- 🤖FFPA: Extend FlashAttention-2 with Split-D, ~O(1) SRAM complexity for large headdim, 1.8x~3x↑🎉 vs SDPA EA. ☆248 Updated 2 weeks ago
- JAX backend for SGL ☆234 Updated this week
- torchcomms: a modern PyTorch communications API ☆327 Updated this week
- Since the emergence of chatGPT in 2022, the acceleration of Large Language Model has become increasingly important. Here is a list of pap… ☆283 Updated 11 months ago
- Review automated kernel generation in the era of LLMs ☆80 Updated 2 weeks ago
- Course materials for MIT6.5940: TinyML and Efficient Deep Learning Computing ☆68 Updated last year
- ☆286 Updated this week
- The repository has collected a batch of noteworthy MLSys bloggers (Algorithms/Systems) ☆321 Updated last year
- [ICLR2025 Spotlight] MagicPIG: LSH Sampling for Efficient LLM Generation ☆248 Updated last year