HYIUYOU / UELLM
☆14 Updated 3 months ago
Related projects
Alternatives and complementary repositories for UELLM
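- [2024 new edition] Lab code for the Intelligent Computing Systems (AICS) course taught by Chen Yunji at the University of Chinese Academy of Sciences (UCAS) ☆175 Updated 5 months ago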
- This repository is established to store personal notes and annotated papers during daily research. ☆86 Updated this week
- Disaggregated serving system for Large Language Models (LLMs). ☆350 Updated 2 months ago
- 📰 Must-read papers and blogs on Speculative Decoding ⚡️ ☆451 Updated this week
- InfiniGen: Efficient Generative Inference of Large Language Models with Dynamic KV Cache Management (OSDI'24) ☆76 Updated 4 months ago
- paper and its code for AI System ☆210 Updated 2 months ago
- ☆10 Updated last week
- ☆12 Updated 8 months ago
- A comprehensive guide for beginners in the field of data management and artificial intelligence. ☆86 Updated this week
- Survey Paper List - Efficient LLM and Foundation Models ☆217 Updated last month
- The dataset and baseline code for ASC23 LLM inference optimization challenge. ☆32 Updated 10 months ago
- Curated collection of papers in machine learning systems ☆162 Updated last month
- Analyze the inference of Large Language Models (LLMs). Analyze aspects like computation, storage, transmission, and hardware roofline mod… ☆310 Updated 2 months ago
- Awesome-LLM-KV-Cache: A curated list of 📙Awesome LLM KV Cache Papers with Codes. ☆96 Updated this week
- High performance Transformer implementation in C++. ☆80 Updated 2 months ago
- ☆55 Updated 2 years ago
- Large Language Model (LLM) Systems Paper List ☆638 Updated last week
- Awesome LLM pruning papers all-in-one repository with integrating all useful resources and insights. ☆36 Updated this week
- Accommodating Large Language Model Training over Heterogeneous Environment. ☆12 Updated 2 weeks ago
- A ChatGPT(GPT-3.5) & GPT-4 Workload Trace to Optimize LLM Serving Systems ☆126 Updated 3 weeks ago
- 📰 Must-read papers on KV Cache Compression (constantly updating 🤗). ☆123 Updated this week
- ☆43 Updated last month
- Galvatron is an automatic distributed training system designed for Transformer models, including Large Language Models (LLMs). If you hav… ☆14 Updated 4 months ago
- Galvatron is an automatic distributed training system designed for Transformer models, including Large Language Models (LLMs). ☆36 Updated last week
- ☆82 Updated 6 months ago
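- A good project for campus recruiting (fall and spring hiring) and internships: build, hands-on and from scratch, an LLM inference framework that supports LLama2/3 and Qwen2.5. ☆220 Updated last week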
- LLM serving cluster simulator ☆78 Updated 6 months ago
- A low-latency & high-throughput serving engine for LLMs ☆232 Updated 2 months ago
- ☆51 Updated last month
- A large-scale simulation framework for LLM inference ☆271 Updated last month