stanford-mast/INFaaS

Model-less Inference Serving

☆94

Alternatives and similar repositories for INFaaS

Users who are interested in INFaaS are comparing it to the repositories listed below.

SymbioticLab / Salus
Fine-grained GPU sharing primitives
☆149 · Jul 28, 2025 · Updated last year
jashwantraj92 / cocktail
☆16 · Aug 15, 2024 · Updated last year
pkusys / ElasticFlow
Artifacts for our ASPLOS'23 paper ElasticFlow
☆56 · May 10, 2024 · Updated 2 years ago
Sys-KU / DeepPlan
[ACM EuroSys 2023] Fast and Efficient Model Serving Using Multi-GPUs with Direct-Host-Access
☆56 · Aug 6, 2025 · Updated 11 months ago
netx-repo / PipeSwitch
PipeSwitch: Fast Pipelined Context Switching for Deep Learning Applications
☆127 · May 9, 2022 · Updated 4 years ago
systems-seminar-uiuc / systems-seminar-uiuc.github.io
Website for Systems Research Seminar at UIUC
☆21 · May 7, 2026 · Updated 2 months ago
SJTU-IPADS / reef
REEF is a GPU-accelerated DNN inference serving system that enables instant kernel preemption and biased concurrent execution in GPU sche…
☆108 · Dec 24, 2022 · Updated 3 years ago
lynnliu030 / berkeley-os-prelim
Berkeley OS Prelim Reading Notes
☆15 · Sep 20, 2023 · Updated 2 years ago
rickypinci / BATCH
BATCH: Adaptive Batching for Efficient Machine Learning Serving on Serverless Platforms
☆11 · Aug 7, 2021 · Updated 4 years ago
Raphael-Hao / Abacus
☆38 · Jun 27, 2025 · Updated last year
stanford-futuredata / gavel
Code for "Heterogeneity-Aware Cluster Scheduling Policies for Deep Learning Workloads", which appeared at OSDI 2020
☆139 · Jul 25, 2024 · Updated 2 years ago
harvard-cns / Harvard-CNS-Seminar
Reading seminar in Harvard Cloud Networking and Systems Group
☆16 · Aug 29, 2022 · Updated 3 years ago
casys-kaist / glet
☆53 · Dec 26, 2024 · Updated last year
alpa-projects / mms
AlpaServe: Statistical Multiplexing with Model Parallelism for Deep Learning Serving (OSDI 23)
☆94 · Jul 14, 2023 · Updated 3 years ago
MegEngine / cutlass-bak
modified cutlass
☆16 · Oct 26, 2020 · Updated 5 years ago
GPUPeople / GPUMemManSurvey
Evaluating different memory managers for dynamic GPU memory
☆26 · Dec 16, 2020 · Updated 5 years ago
edgerun / galileo
🪐 A framework for distributed load testing experiments
☆12 · Mar 18, 2024 · Updated 2 years ago
microsoft / nnfusion
A flexible and efficient deep neural network (DNN) compiler that generates high-performance executable from a DNN model description.
☆1,002 · Sep 19, 2024 · Updated last year
qianl15 / this
Thousand Island Scanner: Scaling Video Analysis on AWS Lambda
☆13 · Oct 25, 2019 · Updated 6 years ago
uwsampl / nexus
☆85 · Feb 5, 2026 · Updated 5 months ago
uclasystem / dorylus
Dorylus: Affordable, Scalable, and Accurate GNN Training
☆76 · May 31, 2021 · Updated 5 years ago
NetSys / kappa
Serverless for all computation
☆42 · Feb 14, 2023 · Updated 3 years ago
flexflow / flexflow-train
Automatically Discovering Fast Parallelization Strategies for Distributed Deep Neural Network Training
☆1,898 · Updated this week
thu-pacman / PET
PET: Optimizing Tensor Programs with Partially Equivalent Transformations and Automated Corrections
☆126 · Jun 23, 2022 · Updated 4 years ago
Azure / AzurePublicDataset
Microsoft Azure Traces
☆1,165 · Jun 3, 2026 · Updated last month
zhuohan123 / terapipe
☆79 · May 4, 2021 · Updated 5 years ago
ganler / ResearchReading
General systems research reading notes (not limited to papers).
☆22 · Mar 17, 2021 · Updated 5 years ago
Froot-NetSys / Arya
Arya: Arbitrary Graph Pattern Mining with Decomposition-based Sampling
☆18 · Sep 27, 2023 · Updated 2 years ago
UMass-LIDS / Proteus
Proteus: A High-Throughput Inference-Serving System with Accuracy Scaling
☆13 · Mar 7, 2024 · Updated 2 years ago
kzhang28 / Optimus
An Efficient Dynamic Resource Scheduler for Deep Learning Clusters
☆41 · Oct 28, 2017 · Updated 8 years ago
snuspl / nimble
Lightweight and Parallel Deep Learning Framework
☆263 · Nov 26, 2022 · Updated 3 years ago
romilbhardwaj / cilantro
Source code for OSDI 2023 paper titled "Cilantro - Performance-Aware Resource Allocation for General Objectives via Online Feedback"
☆41 · Jul 6, 2023 · Updated 3 years ago
casys-kaist / EnvPipe
☆27 · Aug 31, 2023 · Updated 2 years ago
Hsword / SpotServe
SpotServe: Serving Generative Large Language Models on Preemptible Instances
☆135 · Feb 22, 2024 · Updated 2 years ago
brucechin / HardwareTest
Hardware test for CPU, GPU, I/O, and memory bandwidth performance
☆25 · Sep 21, 2018 · Updated 7 years ago
alibaba / GPU-scheduler-for-deep-learning
GPU-scheduler-for-deep-learning
☆213 · Nov 5, 2020 · Updated 5 years ago
in-ATP / ATP
☆87 · Dec 13, 2021 · Updated 4 years ago
jiazhihao / TASO
The Tensor Algebra SuperOptimizer for Deep Learning
☆743 · Jan 26, 2023 · Updated 3 years ago
SJTU-IPADS / ServerlessBench
A benchmark suite for serverless computing
☆234 · Feb 24, 2025 · Updated last year