ruipeterpan / paper_notes
Personal blog + reading notes on system-ish papers
☆15 · Updated 10 months ago
Related projects:
- Code for "Shockwave: Fair and Efficient Cluster Scheduling for Dynamic Adaptation in Machine Learning" [NSDI '23] ☆38 · Updated last year
- A Cluster-Wide Model Manager to Accelerate DNN Training via Automated Training Warmup ☆31 · Updated last year
- Bamboo is a system for running large pipeline-parallel DNNs affordably, reliably, and efficiently using spot instances. ☆46 · Updated last year
- Primo: Practical Learning-Augmented Systems with Interpretable Models ☆18 · Updated 8 months ago
- ☆48 · Updated 3 years ago
- Artifacts for our ASPLOS'23 paper ElasticFlow ☆51 · Updated 4 months ago
- ☆19 · Updated last year
- SHADE: Enable Fundamental Cacheability for Distributed Deep Learning Training ☆28 · Updated last year
- Analyze network performance in distributed training ☆16 · Updated 3 years ago
- SOTA Learning-augmented Systems ☆32 · Updated 2 years ago
- ☆45 · Updated last year
- Artifacts for our SIGCOMM'22 paper Muri ☆38 · Updated 8 months ago
- ☆13 · Updated 2 years ago
- ☆11 · Updated last year
- DISB is a new DNN inference serving benchmark with diverse workloads and models, as well as real-world traces. ☆46 · Updated last month
- Vector search with bounded performance. ☆33 · Updated 7 months ago
- Cupcake: A Compression Scheduler for Scalable Communication-Efficient Distributed Training (MLSys '23) ☆8 · Updated last year
- ☆16 · Updated 4 months ago
- PipeSwitch: Fast Pipelined Context Switching for Deep Learning Applications ☆124 · Updated 2 years ago
- Reading seminar in Harvard Cloud Networking and Systems Group ☆16 · Updated 2 years ago
- My paper/code reading notes in Chinese ☆44 · Updated 4 months ago
- A GPU-accelerated DNN inference serving system that supports instant kernel preemption and biased concurrent execution in GPU scheduling. ☆39 · Updated 2 years ago
- Stateful LLM Serving ☆25 · Updated last month
- ☆19 · Updated 2 years ago
- Artifacts for our SIGCOMM'23 paper Ditto ☆16 · Updated 11 months ago
- Surrogate-based Hyperparameter Tuning System ☆26 · Updated last year
- Proteus: A High-Throughput Inference-Serving System with Accuracy Scaling ☆8 · Updated 6 months ago
- ☆15 · Updated this week
- ☆41 · Updated 3 years ago