Ying1123/VTC-artifact

Ying1123 / VTC-artifact

☆48

Alternatives and similar repositories for VTC-artifact

Users who are interested in VTC-artifact are comparing it to the repositories listed below.

YaoJiayi / CacheBlend
☆193 · Jul 15, 2025 · Updated 11 months ago
stonet-research / cheops25-IO-characterization-of-LLM-model-kv-cache-offloading-nvme
☆19 · Apr 15, 2025 · Updated last year
casys-kaist / casys-kaist.github.io
☆38 · Jun 4, 2026 · Updated last month
LoongServe / LoongServe
☆134 · Nov 11, 2024 · Updated last year
casys-kaist / EnvPipe
☆27 · Aug 31, 2023 · Updated 2 years ago
AutonomicPerfectionist / PipeInfer
PipeInfer: Accelerating LLM Inference using Asynchronous Pipelined Speculation
☆32 · Nov 16, 2024 · Updated last year
dywsjtu / apparate
Artifact for "Apparate: Rethinking Early Exits to Tame Latency-Throughput Tensions in ML Serving" [SOSP '24]
☆24 · Nov 21, 2024 · Updated last year
Hsword / SpotServe
SpotServe: Serving Generative Large Language Models on Preemptible Instances
☆134 · Feb 22, 2024 · Updated 2 years ago
Jeffwan / serverless-research
Serverless Paper Reading and Discussion
☆38 · Jan 9, 2023 · Updated 3 years ago
eth-easl / deltazip
Compression for Foundation Models
☆36 · Jul 21, 2025 · Updated 11 months ago
UbiquitousLearning / MobileFM
One-size-fits-all model for mobile AI, a novel paradigm for mobile AI in which the OS and hardware co-manage a foundation model that is c…
☆30 · Mar 5, 2024 · Updated 2 years ago
llumnix-project / llumnix-ray
Efficient and easy multi-instance LLM serving
☆560 · Mar 12, 2026 · Updated 3 months ago
phoenix-dataplane / mCCS
Managed collective communication service
☆24 · Sep 2, 2024 · Updated last year
microsoft / sarathi-serve
A low-latency & high-throughput serving engine for LLMs
☆508 · Jan 8, 2026 · Updated 6 months ago
microsoft / vidur
Accurate, large-scale, and extensible simulator for LLM inference Systems
☆639 · Jul 25, 2025 · Updated 11 months ago
astra-sim / astra-network-ns3
☆14 · Mar 15, 2026 · Updated 3 months ago
EngineeringSoftware / time-segmented-evaluation
Code and data for "Impact of Evaluation Methodologies on Code Summarization" in ACL 2022.
☆10 · Sep 6, 2022 · Updated 3 years ago
James-QiuHaoran / LLM-serving-with-proxy-models
Efficient Interactive LLM Serving with Proxy Model-based Sequence Length Prediction | A tiny BERT model can tell you the verbosity of an …
☆52 · Jun 1, 2024 · Updated 2 years ago
LLMServe / DistServe
Disaggregated serving system for Large Language Models (LLMs).
☆824 · Apr 6, 2025 · Updated last year
futurewei-cloud / QuantaDB
☆14 · Apr 1, 2023 · Updated 3 years ago
InternLM / AcmeTrace
☆179 · Mar 12, 2024 · Updated 2 years ago
camsas / qjump-ns2
QJump NS patches and driver scripts
☆13 · Jun 29, 2015 · Updated 11 years ago
Adaxry / Unified_Layer_Skipping
☆15 · Apr 11, 2024 · Updated 2 years ago
hao-ai-lab / MuxServe
☆90 · Oct 17, 2025 · Updated 8 months ago
PluralisResearch / AsyncPP
Asynchronous pipeline parallel optimization
☆22 · Feb 2, 2026 · Updated 5 months ago
ifromeast / AI_analysis
Analyse problems of AI with math and code
☆31 · Jul 28, 2025 · Updated 11 months ago
ShuaiGuo16 / LLM-guided-AutoML
LLM-guided hyperparameter tuning
☆10 · Oct 7, 2023 · Updated 2 years ago
NEO-MLSys25 / NEO
NEO is an LLM inference engine built to ease the GPU memory crisis through CPU offloading
☆99 · Jun 16, 2025 · Updated last year
ByteDance-Seed / StragglerAnalysis
☆55 · Apr 30, 2025 · Updated last year
WukLab / osworld-human
OSWorld-Human: Benchmarking the Efficiency of Computer-Use Agents
☆27 · May 17, 2026 · Updated last month
UChi-JCL / CacheGen
☆162 · Oct 9, 2024 · Updated last year
AIoT-MLSys-Lab / Efficient-Diffusion-Model-Survey
[TMLR 2025] Efficient Diffusion Models: A Survey
☆184 · Dec 8, 2025 · Updated 7 months ago
AISys-01 / vllm-CachedAttention
Code based on vLLM for the paper “Cost-Efficient Large Language Model Serving for Multi-turn Conversations with CachedAttention”.
☆11 · Sep 19, 2024 · Updated last year
Scientific-Computing-Lab / STREAMer
STREAMer: Benchmarking remote volatile and non-volatile memory bandwidth
☆18 · Aug 21, 2023 · Updated 2 years ago
OrderLab / TrainCheck
An Observability Framework for AI Training
☆71 · Jun 30, 2026 · Updated last week
zlwang-cs / OfficeBench
OfficeBench: Benchmarking Language Agents across Multiple Applications for Office Automation
☆39 · Apr 1, 2026 · Updated 3 months ago
luzhixing12345 / klinux
Linux kernel technical documentation
☆16 · Apr 27, 2026 · Updated 2 months ago
Cloud-and-Distributed-Systems / Erms
☆27 · Nov 15, 2024 · Updated last year
xinliulab / gr-rfid
Gen2-UHF-RFID-Reader based on the latest Ubuntu, GNU Radio, UHD and Python
☆17 · Jul 14, 2023 · Updated 2 years ago