llm-db/FineInfer

Deferred Continuous Batching in Resource-Efficient Large Language Model Serving (EuroMLSys 2024)

☆19

Alternatives and similar repositories for FineInfer

Users who are interested in FineInfer are comparing it to the libraries listed below.

DS3Lab / CocktailSGD
☆27 · Aug 25, 2023 · Updated 2 years ago
zhangjiong724 / autoassist-exp
Code accompanying the NeurIPS 2019 paper AutoAssist: A Framework to Accelerate Training of Deep Neural Networks.
☆14 · Oct 3, 2022 · Updated 3 years ago
eth-easl / pccheck
☆12 · Apr 23, 2026 · Updated 2 months ago
UbiquitousLearning / FeS
Federated Few-shot Learning for Mobile NLP. Conditionally accepted by MobiCom'23.
☆17 · Aug 18, 2023 · Updated 2 years ago
hiddenlayer2020 / ML-Job-Scheduler-MLFS
☆13 · Dec 18, 2020 · Updated 5 years ago
SJTU-IPADS / cocytus
Cocytus is an efficient and available in-memory K/V-store through hybrid erasure coding and replication
☆31 · Mar 7, 2016 · Updated 10 years ago
Manuscrit / SelectiveBackPropagation
Implementation of Selective Backpropagation from the paper "Accelerating Deep Learning by Focusing on the Biggest Losers"
☆15 · Feb 2, 2020 · Updated 6 years ago
thustorage / deft
Deft: A Scalable Tree Index for Disaggregated Memory
☆22 · Apr 23, 2025 · Updated last year
siasosp23 / artifacts
☆24 · Aug 15, 2023 · Updated 2 years ago
uw-mad-dash / shockwave
Artifact for "Shockwave: Fair and Efficient Cluster Scheduling for Dynamic Adaptation in Machine Learning" [NSDI '23]
☆46 · Nov 24, 2022 · Updated 3 years ago
uw-mad-dash / Accordion
Code for reproducing experiments performed for Accordion
☆13 · Jun 11, 2021 · Updated 5 years ago
google / iopddl
Supplemental materials for the ASPLOS 2025 / EuroSys 2025 Contest on Intra-Operator Parallelism for Distributed Deep Learning
☆25 · May 12, 2025 · Updated last year
UbiquitousLearning / FwdLLM
☆36 · May 28, 2024 · Updated 2 years ago
hku-systems / naspipe
☆14 · Jan 12, 2022 · Updated 4 years ago
lzhangbv / acpsgd
[ICDCS 2023] Evaluation and Optimization of Gradient Compression for Distributed Deep Learning
☆10 · Apr 28, 2023 · Updated 3 years ago
bertmaher / tf32_gemm
Example of binding a TF32 CUTLASS GEMM kernel to PyTorch
☆12 · Jun 7, 2024 · Updated 2 years ago
MayDomine / Seq1F1B
Sequence-level 1F1B schedule for LLMs.
☆19 · Jun 4, 2024 · Updated 2 years ago
James-QiuHaoran / Tools
This repository consists of useful tools or guides for system software development or anything interesting.
☆11 · Feb 27, 2026 · Updated 4 months ago
UbiquitousLearning / FedAdapter
"Efficient Federated Learning for Modern NLP", to appear at MobiCom 2023.
☆35 · Aug 18, 2023 · Updated 2 years ago
tonyzhao-jt / LLM-PQ
Official Repo for "SplitQuant / LLM-PQ: Resource-Efficient LLM Offline Serving on Heterogeneous GPUs via Phase-Aware Model Partition and …
☆39 · Aug 29, 2025 · Updated 10 months ago
SymbioticLab / ModelKeeper
A Cluster-Wide Model Manager to Accelerate DNN Training via Automated Training Warmup
☆36 · Jan 9, 2023 · Updated 3 years ago
alibaba / llm-scheduling-artifact
Artifact of OSDI '24 paper, "Llumnix: Dynamic Scheduling for Large Language Model Serving"
☆64 · Jun 5, 2024 · Updated 2 years ago
RuifMaxx / Paper-List-of-cloud-resource-management
☆15 · Dec 29, 2022 · Updated 3 years ago
wangrunji0408 / rjrouter
[AFK] Hardware router in Chisel (THU Network Joint Lab 2020)
☆14 · Oct 8, 2020 · Updated 5 years ago
LouisYZK / mit6.824-2023
mit-6.824 distributed system labs demo in golang & python
☆11 · Nov 20, 2023 · Updated 2 years ago
BigDataAnalyticsGroup / GENE
Code used for VLDB paper "The next 50 Years in Database Indexing or: The Case for Automatically Generated Index Structures"
☆14 · Mar 31, 2022 · Updated 4 years ago
lwangbm / Metis
Metis: Learning to Schedule Long-Running Applications in Shared Container Clusters at Scale
☆19 · May 27, 2020 · Updated 6 years ago
stanford-futuredata / POP
Code for "Solving Large-Scale Granular Resource Allocation Problems Efficiently with POP", which appeared at SOSP 2021
☆28 · Dec 15, 2021 · Updated 4 years ago
SukerZ / MADDPG-on-PyTorch
A PyTorch reimplementation of multi-agent deep deterministic policy gradient (MADDPG), adapting https://github.com/xuemei-ye/maddpg-mpe to run on my own computer. Because my laptop has no CUDA, the experiment speed…
☆14 · May 10, 2019 · Updated 7 years ago
UofT-EcoSystem / rlscope
RL-Scope: Cross-Stack Profiling for Deep Reinforcement Learning Workloads
☆48 · Apr 7, 2021 · Updated 5 years ago
microsoft / SelfTune
SelfTune is an RL framework that enables systems and service developers to automatically tune various configuration parameters and other …
☆46 · May 31, 2024 · Updated 2 years ago
wangcityboy / web-blog-php
Personal blog, web version, developed in PHP
☆12 · Mar 9, 2017 · Updated 9 years ago
harvard-cns / Harvard-CNS-Seminar
Reading seminar in Harvard Cloud Networking and Systems Group
☆16 · Aug 29, 2022 · Updated 3 years ago
Raphael-Hao / brainstorm
Compiler for Dynamic Neural Networks
☆45 · Nov 13, 2023 · Updated 2 years ago
excelart / auto
☆15 · Oct 26, 2018 · Updated 7 years ago
huaweicloud / trace_generation_rnn
This repository contains code for the paper: Bergsma S., Zeyl T., Senderovich A., and Beck J. C., "Generating Complex, Realistic Cloud Wo…
☆41 · Nov 11, 2021 · Updated 4 years ago
thustorage / TeRM
TeRM: Extending RDMA-Attached Memory with SSD [FAST'24]
☆46 · Oct 21, 2024 · Updated last year
gudiandian / ElasticFlow
☆17 · May 10, 2024 · Updated 2 years ago
MIT-REALM / dcrl
Density Constrained Reinforcement Learning
☆12 · Mar 24, 2023 · Updated 3 years ago