eth-easl / modyn
Modyn is a research platform for training ML models on growing datasets.
☆25 Updated this week
Related projects:
- Surrogate-based Hyperparameter Tuning System ☆26 Updated last year
- ☆30 Updated 3 months ago
- Stateful LLM Serving ☆25 Updated last month
- Python package for rematerialization-aware gradient checkpointing ☆22 Updated 10 months ago
- ML Input Data Processing as a Service. This repository contains the source code for Cachew (built on top of TensorFlow). ☆35 Updated last week
- ☆15 Updated 2 months ago
- ☆30 Updated 2 years ago
- PyTorch-Direct code on top of PyTorch-1.8.0nightly (e152ca5) for Large Graph Convolutional Network Training with GPU-Oriented Data Commun… ☆45 Updated last year
- A resilient distributed training framework ☆78 Updated 5 months ago
- ☆19 Updated last year
- How much energy do LLMs consume? ☆40 Updated last week
- ☆22 Updated 3 years ago
- ☆30 Updated this week
- Official repository for the paper DynaPipe: Optimizing Multi-task Training through Dynamic Pipelines ☆13 Updated 9 months ago
- (NeurIPS 2022) Automatically finding good model-parallel strategies, especially for complex models and clusters. ☆33 Updated last year
- ☆47 Updated 3 weeks ago
- Deferred Continuous Batching in Resource-Efficient Large Language Model Serving (EuroMLSys 2024) ☆11 Updated 3 months ago
- AlpaServe: Statistical Multiplexing with Model Parallelism for Deep Learning Serving (OSDI 23) ☆76 Updated last year
- Releasing the spot availability traces used in "Can't Be Late" paper. ☆14 Updated 5 months ago
- SpotServe: Serving Generative Large Language Models on Preemptible Instances ☆92 Updated 6 months ago
- Dorylus: Affordable, Scalable, and Accurate GNN Training ☆77 Updated 3 years ago
- Graphiler is a compiler stack built on top of DGL and TorchScript which compiles GNNs defined using user-defined functions (UDFs) into ef… ☆58 Updated last year
- ☆17 Updated last year
- A ChatGPT (GPT-3.5) & GPT-4 Workload Trace to Optimize LLM Serving Systems ☆110 Updated last month
- LLM Serving Performance Evaluation Harness ☆45 Updated 3 weeks ago
- Official Repo for "LLM-PQ: Serving LLM on Heterogeneous Clusters with Phase-Aware Partition and Adaptive Quantization" ☆25 Updated 6 months ago
- [MLSys 2023] Pre-train and Search: Efficient Embedding Table Sharding with Pre-trained Neural Cost Models ☆16 Updated last year
- ☆35 Updated 2 months ago
- Multi-Instance-GPU profiling tool ☆51 Updated last year
- [OSDI'24] Serving LLM-based Applications Efficiently with Semantic Variable ☆89 Updated last week