eth-easl / cachew

ML Input Data Processing as a Service. This repository contains the source code for Cachew (built on top of TensorFlow).

☆39

Alternatives and similar repositories for cachew

Users who are interested in cachew are comparing it to the repositories listed below.

msr-fiddle / CoorDL
☆24 Updated 2 years ago
msr-fiddle / DS-Analyzer
☆38 Updated 4 years ago
rkhan055 / SHADE
SHADE: Enable Fundamental Cacheability for Distributed Deep Learning Training
☆35 Updated 2 years ago
msr-fiddle / CheckFreq
☆55 Updated 4 years ago
WukLab / preble
Stateful LLM Serving
☆79 Updated 4 months ago
SymbioticLab / Oobleck
A resilient distributed training framework
☆95 Updated last year
danyangz / lightning
Lightning In-Memory Object Store
☆47 Updated 3 years ago
dywsjtu / apparate
Artifact for "Apparate: Rethinking Early Exits to Tame Latency-Throughput Tensions in ML Serving" [SOSP '24]
☆25 Updated 8 months ago
suquark / hoplite
☆45 Updated 3 years ago
thustorage / Medusa
Medusa: Accelerating Serverless LLM Inference with Materialization [ASPLOS'25]
☆28 Updated 2 months ago
suquark / ExoFlow
A universal workflow system for exactly-once DAGs
☆23 Updated 2 years ago
pkusys / Auncel
Vector search with bounded performance.
☆36 Updated last year
cirquit / presto
☆15 Updated 2 years ago
Hsword / SpotServe
SpotServe: Serving Generative Large Language Models on Preemptible Instances
☆125 Updated last year
CGCL-codes / streambox
☆13 Updated last year
uclasystem / dorylus
Dorylus: Affordable, Scalable, and Accurate GNN Training
☆76 Updated 4 years ago
microsoft / SuperScaler
An experimental parallel training platform
☆54 Updated last year
jasperzhong / swift
☆15 Updated 3 years ago
uclasystem / bamboo
Bamboo is a system for running large pipeline-parallel DNNs affordably, reliably, and efficiently using spot instances.
☆50 Updated 2 years ago
James-QiuHaoran / LLM-serving-with-proxy-models
Efficient Interactive LLM Serving with Proxy Model-based Sequence Length Prediction | A tiny BERT model can tell you the verbosity of an …
☆38 Updated last year
Azure / msccl
Microsoft Collective Communication Library
☆63 Updated 8 months ago
saareliad / FTPipe
FTPipe and related pipeline model parallelism research.
☆41 Updated 2 years ago
ds2-lab / SFS
SFS: A Smart OS Scheduler for Serverless Function Workloads (SC'22)
☆13 Updated 2 years ago
alpa-projects / mms
AlpaServe: Statistical Multiplexing with Model Parallelism for Deep Learning Serving (OSDI 23)
☆83 Updated 2 years ago
stanford-mast / INFaaS
Model-less Inference Serving
☆90 Updated last year
SJTU-IPADS / disb
DISB is a new DNN inference serving benchmark with diverse workloads and models, as well as real-world traces.
☆53 Updated 11 months ago
ut-osa / nightcore
Nightcore: Efficient and Scalable Serverless Computing for Latency-Sensitive, Interactive Microservices [ASPLOS '21]
☆105 Updated 3 years ago
uw-mad-dash / shockwave
Artifact for "Shockwave: Fair and Efficient Cluster Scheduling for Dynamic Adaptation in Machine Learning" [NSDI '23]
☆44 Updated 2 years ago
LiuXiaoxuanPKU / Cost-Model-papers
☆13 Updated 2 years ago
nicexlab / GeminiFS
GeminiFS: A Companion File System for GPUs
☆37 Updated 5 months ago