LLMServe/dLoRA-artifact

☆32

Alternatives and similar repositories for dLoRA-artifact

Users who are interested in dLoRA-artifact are comparing it to the libraries listed below.

LLMServe / hydraserve
☆20 · May 11, 2026 · Updated 2 months ago
ss7krd / Usher
☆14 · Nov 7, 2024 · Updated last year
casys-kaist / glet
☆53 · Dec 26, 2024 · Updated last year
LMCache / lmcache-agent-trace
Agent application/benchmark/workload traces should be placed here.
☆15 · Apr 13, 2026 · Updated 3 months ago
LoongServe / LoongServe
☆135 · Nov 11, 2024 · Updated last year
dsrhaslab / monarch
Accelerating Deep Learning Training Through Transparent Storage Tiering (CCGrid'22)
☆19 · Dec 13, 2022 · Updated 3 years ago
LLMServe / DistServe
Disaggregated serving system for Large Language Models (LLMs).
☆826 · Apr 6, 2025 · Updated last year
llm-router / DeepSeekRouter
☆16 · Mar 11, 2025 · Updated last year
IBM / LLM-performance-prediction
Predict the performance of LLM inference services
☆23 · Sep 18, 2025 · Updated 10 months ago
DS3Lab / Decentralized_FM_alpha
☆18 · May 4, 2023 · Updated 3 years ago
zhengzangw / Sequence-Scheduling
PyTorch implementation of paper "Response Length Perception and Sequence Scheduling: An LLM-Empowered LLM Inference Pipeline".
☆93 · May 23, 2023 · Updated 3 years ago
ServerlessLLM / ServerlessLLM
Serverless LLM Serving for Everyone.
☆693 · May 4, 2026 · Updated 2 months ago
uwsampl / nexus
☆85 · Feb 5, 2026 · Updated 5 months ago
microsoft / ParrotServe
[OSDI'24] Serving LLM-based Applications Efficiently with Semantic Variable
☆223 · Sep 21, 2024 · Updated last year
aFuerst / faascache-sim
☆18 · Oct 31, 2022 · Updated 3 years ago
Raphael-Hao / brainstorm
Compiler for Dynamic Neural Networks
☆45 · Nov 13, 2023 · Updated 2 years ago
thustorage / deft
Deft: A Scalable Tree Index for Disaggregated Memory
☆22 · Apr 23, 2025 · Updated last year
vdcores / vdcores
Virtual Decoupled Cores: Composable Programming Framework and Runtime for Async GPUs
☆20 · Updated this week
fkokkinos / to_the_point_3d_reconstruction
Code of "To The Point: Correspondence-driven monocular 3D category reconstruction (TTP)" Neurips 2021
☆11 · Jan 24, 2022 · Updated 4 years ago
alibaba / llm-scheduling-artifact
Artifact of OSDI '24 paper, "Llumnix: Dynamic Scheduling for Large Language Model Serving"
☆64 · Jun 5, 2024 · Updated 2 years ago
DBGroup-SUSTech / GPU-Merkle-Patricia-Trie
An optimized Merkle Patricia Trie implementation on GPU, fully compatible with and integrable into Ethereum. The paper is published on VL…
☆14 · Apr 15, 2024 · Updated 2 years ago
xxcclong / GNN-Computing
Artifact for PPoPP20 "Understanding and Bridging the Gaps in Current GNN Performance Optimizations"
☆42 · Nov 16, 2021 · Updated 4 years ago
SNU-ARC / flashneuron
☆41 · Nov 28, 2022 · Updated 3 years ago
chenhongyu2048 / LLM-inference-optimization-paper
Summary of some awesome work for optimizing LLM inference
☆263 · Feb 14, 2026 · Updated 5 months ago
netx-repo / PipeSwitch
PipeSwitch: Fast Pipelined Context Switching for Deep Learning Applications
☆127 · May 9, 2022 · Updated 4 years ago
sjtu-epcc / DVABatch
☆21 · May 13, 2022 · Updated 4 years ago
llumnix-project / llumnix-ray
Efficient and easy multi-instance LLM serving
☆563 · Mar 12, 2026 · Updated 4 months ago
Kaffaljidhmah2 / SpecDec_pp
Repository for the COLM 2025 paper SpecDec++: Boosting Speculative Decoding via Adaptive Candidate Lengths
☆19 · Jul 10, 2025 · Updated last year
Henning1 / dogqc
A query compiler for GPUs that translates relational algebra to Cuda.
☆20 · Jan 2, 2024 · Updated 2 years ago
ruipeterpan / marconi
Artifact for "Marconi: Prefix Caching for the Era of Hybrid LLMs" [MLSys '25 Outstanding Paper Award, Honorable Mention]
☆63 · Mar 5, 2025 · Updated last year
pkusys / Halfmoon
Implementation of the logging layer of our SOSP '23 paper Halfmoon
☆11 · Jul 28, 2023 · Updated 2 years ago
Adaxry / Unified_Layer_Skipping
☆15 · Apr 11, 2024 · Updated 2 years ago
sejoonoh / ATR
Code and data for the ACM CIKM 2024 paper "Adversarial Text Rewriting for Text-aware Recommender Systems"
☆12 · Aug 1, 2024 · Updated last year
Floating-LY / HARMONY1
☆10 · Dec 11, 2024 · Updated last year
SiriusInfTra / Sirius
☆18 · Sep 21, 2025 · Updated 10 months ago
wanatpj / h_blind
Extraction of watermark embedded with E_BLIND method on multiple digital works.
☆13 · Sep 14, 2016 · Updated 9 years ago
dywsjtu / apparate
Artifact for "Apparate: Rethinking Early Exits to Tame Latency-Throughput Tensions in ML Serving" [SOSP '24]
☆24 · Nov 21, 2024 · Updated last year
blitz-serving / trace-replayer
Repo to replay Qwen trace
☆31 · Jan 9, 2026 · Updated 6 months ago
vtu81 / NaiveVQA
A Visual Question Answering model implemented in MindSpore and PyTorch. The model is a reimplementation of the paper *Show, Ask, Attend, …
☆10 · Jul 27, 2021 · Updated 4 years ago