eth-easl / modyn
Modyn is a research platform for training ML models on growing datasets.
☆48 · Updated last month
Alternatives and similar repositories for modyn
Users who are interested in modyn are comparing it to the libraries listed below.
- ☆13 · Updated 8 months ago
- ☆30 · Updated 2 years ago
- ML Input Data Processing as a Service. This repository contains the source code for Cachew (built on top of TensorFlow). ☆39 · Updated 10 months ago
- A resilient distributed training framework. ☆95 · Updated last year
- VQPy: An object-oriented approach to modern video analytics. ☆42 · Updated 8 months ago
- LLM Serving Performance Evaluation Harness. ☆79 · Updated 4 months ago
- ☆94 · Updated 3 years ago
- Multi-Instance-GPU profiling tool. ☆60 · Updated 2 years ago
- Stateful LLM Serving. ☆76 · Updated 4 months ago
- Model-less Inference Serving. ☆88 · Updated last year
- ☆25 · Updated last year
- SpotServe: Serving Generative Large Language Models on Preemptible Instances. ☆123 · Updated last year
- ☆45 · Updated 3 years ago
- [OSDI'24] Serving LLM-based Applications Efficiently with Semantic Variable. ☆168 · Updated 9 months ago
- ☆64 · Updated last year
- A performant, memory-efficient checkpointing library for PyTorch applications, designed with large, complex distributed workloads in mind… ☆158 · Updated 3 weeks ago
- 🏙 Interactive performance profiling and debugging tool for PyTorch neural networks. ☆64 · Updated 5 months ago
- (NeurIPS 2022) Automatically finding good model-parallel strategies, especially for complex models and clusters. ☆40 · Updated 2 years ago
- Resource-adaptive cluster scheduler for deep learning training. ☆445 · Updated 2 years ago
- Measure and optimize the energy consumption of your AI applications! ☆274 · Updated this week
- Artifact for "Apparate: Rethinking Early Exits to Tame Latency-Throughput Tensions in ML Serving" [SOSP '24]. ☆25 · Updated 7 months ago
- A schedule language for large model training. ☆149 · Updated last year
- ☆54 · Updated 4 years ago
- Microsoft Collective Communication Library. ☆64 · Updated 7 months ago
- Simple Distributed Deep Learning on TensorFlow. ☆133 · Updated last month
- ☆43 · Updated last year
- ☆27 · Updated last year
- Efficient Compute-Communication Overlap for Distributed LLM Inference. ☆22 · Updated 2 weeks ago
- ☆251 · Updated 11 months ago
- Machine learning on serverless platform. ☆9 · Updated 3 years ago