alibaba-edu/qwen-bailian-usagetraces-anon

☆134

Alternatives and similar repositories for qwen-bailian-usagetraces-anon

Users who are interested in qwen-bailian-usagetraces-anon are comparing it to the repositories listed below.

thustorage / Medusa
Medusa: Accelerating Serverless LLM Inference with Materialization [ASPLOS'25]
☆47 · May 13, 2025 · Updated last year
flashserve / PAT
Prefix-Aware Attention for LLM Decoding
☆41 · May 26, 2026 · Updated last month
microsoft / ParrotServe
[OSDI'24] Serving LLM-based Applications Efficiently with Semantic Variable
☆221 · Sep 21, 2024 · Updated last year
SJTU-IPADS / wukong-cube
A distributed in-memory store for temporal knowledge graphs
☆10 · Mar 20, 2024 · Updated 2 years ago
lineagech / GMT
☆12 · Mar 26, 2024 · Updated 2 years ago
DeepLink-org / DLSlime
Composable and Embeddable Communication Runtime for Distributed AI Services
☆102 · Jun 5, 2026 · Updated last month
WukLab / preble
Stateful LLM Serving
☆103 · Mar 11, 2025 · Updated last year
infinigence / FUSCO
High-performance distributed data shuffling (all-to-all) library for MoE training and inference
☆123 · Mar 7, 2026 · Updated 4 months ago
alpa-projects / mms
AlpaServe: Statistical Multiplexing with Model Parallelism for Deep Learning Serving (OSDI 23)
☆94 · Jul 14, 2023 · Updated 2 years ago
rkhan055 / SHADE
SHADE: Enable Fundamental Cacheability for Distributed Deep Learning Training
☆36 · Mar 1, 2023 · Updated 3 years ago
UofT-EcoSystem / BPPSA-open
The (open-source part of) code to reproduce "BPPSA: Scaling Back-propagation by Parallel Scan Algorithm".
☆13 · Jun 7, 2021 · Updated 5 years ago
axio-project / FuseLink
Efficient GPU communication over multiple NICs.
☆29 · Nov 20, 2025 · Updated 7 months ago
aisoft9 / JYCache
DRAM/SSD hybrid caching system
☆15 · Mar 13, 2025 · Updated last year
wu-kan / wuk_cupti_wrapper
a simple API to use CUPTI
☆10 · Aug 19, 2025 · Updated 10 months ago
cds-ruc / SALI
☆17 · Jan 12, 2024 · Updated 2 years ago
LoongServe / LoongServe
☆134 · Nov 11, 2024 · Updated last year
sspec-project / SparseSpec
Accelerating Large-Scale Reasoning Model Inference with Sparse Self-Speculative Decoding
☆114 · Dec 2, 2025 · Updated 7 months ago
nicexlab / GeminiFS
GeminiFS: A Companion File System for GPUs
☆83 · Updated this week
YaoJiayi / CacheBlend
☆194 · Jul 15, 2025 · Updated 11 months ago
lzhangbv / dear_pytorch
[ICDCS 2023] DeAR: Accelerating Distributed Deep Learning with Fine-Grained All-Reduce Pipelining
☆12 · Dec 4, 2023 · Updated 2 years ago
RC4ML / RPCNIC
RPCNIC: A High-Performance and Reconfigurable PCIe-attached RPC Accelerator [HPCA2025]
☆15 · Dec 9, 2024 · Updated last year
All-less / faas-scheduling-benchmark
A benchmark suite for evaluating FaaS scheduler.
☆23 · Nov 5, 2022 · Updated 3 years ago
osayamenja / FlashMoE
Distributed MoE in a Single Kernel [NeurIPS '25]
☆271 · May 5, 2026 · Updated 2 months ago
hao-ai-lab / vllm-ltr
[NeurIPS 2024] Efficient LLM Scheduling by Learning to Rank
☆81 · Nov 4, 2024 · Updated last year
CGCL-codes / gengar
Gengar, a distributed shared hybrid memory pool with RDMA support. Gengar allows applications to access remote DRAM/NVM in a large and gl…
☆24 · May 24, 2022 · Updated 4 years ago
microsoft / vidur
Accurate, large-scale, and extensible simulator for LLM inference Systems
☆639 · Jul 25, 2025 · Updated 11 months ago
infinigence / FlashOverlap
A lightweight design for computation-communication overlap.
☆242 · Jan 20, 2026 · Updated 5 months ago
AmberLJC / LLMSys-PaperList
Large Language Model (LLM) Systems Paper List
☆2,167 · Jun 21, 2026 · Updated 3 weeks ago
Chen-Binghao / PilotFish
PilotFish harvests the free GPU cycles of cloud gaming with deep learning training
☆14 · Jul 2, 2022 · Updated 4 years ago
leesou / Step-into-RISCV
TA's implementation for the project of Computer Architecture and Intelligent Chip Design (23 Spring)
☆10 · May 20, 2023 · Updated 3 years ago
vickiegpt / OS2019-Labs
PA + Labs for Operating Systems 2019 course in NJU taught by JYY.
☆12 · Aug 6, 2019 · Updated 6 years ago
Multi-V-VM / GPUOS
Share your GPU without MIG or MPS
☆50 · Jan 27, 2026 · Updated 5 months ago
thomaschlt / mla.c
Implementation from scratch in C of the Multi-head latent attention used in the Deepseek-v3 technical paper.
☆18 · Jan 15, 2025 · Updated last year
efeslab / Nanoflow
A throughput-oriented high-performance serving framework for LLMs
☆967 · Mar 29, 2026 · Updated 3 months ago
open-neutrino / neutrino
☆263 · Dec 25, 2025 · Updated 6 months ago
TreeAI-Lab / Awesome-KV-Cache-Management
This repository serves as a comprehensive survey of LLM development, featuring numerous research papers along with their corresponding co…
☆336 · Dec 5, 2025 · Updated 7 months ago
InternLM / AcmeTrace
☆179 · Mar 12, 2024 · Updated 2 years ago
HPMLL / BurstGPT
A ChatGPT(GPT-3.5) & GPT-4 Workload Trace to Optimize LLM Serving Systems
☆277 · Jun 30, 2026 · Updated last week
google / rago
☆31 · Jun 22, 2025 · Updated last year