mlsys-seo / ooo-backprop
☆21 Updated last year
Related projects:
- ☆101 Updated last year
- Fast and Efficient Model Serving Using Multi-GPUs with Direct-Host-Access (ACM EuroSys '23) ☆51 Updated 5 months ago
- FriendliAI Model Hub ☆88 Updated 2 years ago
- ☆25 Updated 5 years ago
- Study Group of Deep Learning Compiler ☆149 Updated last year
- Welcome to PeriFlow CLI ☁︎ ☆12 Updated last year
- ☆38 Updated this week
- ☆20 Updated this week
- Network Contention-Aware Cluster Scheduling with Reinforcement Learning (IEEE ICPADS 2023) ☆14 Updated 9 months ago
- ☆36 Updated last week
- ☆16 Updated 3 months ago
- Study parallel programming - CUDA, OpenMP, MPI, Pthread ☆54 Updated 2 years ago
- FastFlow is a system that automatically detects CPU bottlenecks in deep learning training pipelines and resolves the bottlenecks with dat… ☆25 Updated last year
- ☆15 Updated 3 years ago
- one-shot-tuner ☆8 Updated last year
- ☆19 Updated last year
- SOTA Learning-augmented Systems ☆32 Updated 2 years ago
- LLMServingSim: A HW/SW Co-Simulation Infrastructure for LLM Inference Serving at Scale ☆32 Updated last month
- NEST Compiler ☆114 Updated 2 months ago
- A performance library for machine learning applications. ☆178 Updated 11 months ago
- A high-throughput and memory-efficient inference and serving engine for LLMs ☆34 Updated this week
- (NeurIPS 2022) Automatically finding good model-parallel strategies, especially for complex models and clusters. ☆33 Updated last year
- PyTorch-UVM on super-large language models. ☆13 Updated 3 years ago
- Experimental deep learning framework written in Rust ☆13 Updated last year
- ☆10 Updated last year
- Research and development for optimizing transformers ☆121 Updated 3 years ago
- ☆48 Updated 3 years ago
- ☆31 Updated last year
- Chimera: Efficiently Training Large-Scale Neural Networks with Bidirectional Pipelines. ☆41 Updated 9 months ago
- A Tool for Automatic Parallelization of Deep Learning Training in Distributed Multi-GPU Environments. ☆130 Updated 2 years ago