Relaxed-System-Lab / HKUST-COMP4901Y-2024spring

Course Material for the UG Course COMP4901Y

☆54

Alternatives and similar repositories for HKUST-COMP4901Y-2024spring:

Users who are interested in HKUST-COMP4901Y-2024spring are comparing it to the repositories listed below

fanlai0990 / CS598
Systems for GenAI
☆99 Updated this week
Relaxed-System-Lab / COMP6211J_Course_HKUST
☆40 Updated 2 months ago
hao-ai-lab / vllm-ltr
[NeurIPS 2024] Efficient LLM Scheduling by Learning to Rank
☆39 Updated 3 months ago
October2001 / Awesome-KV-Cache-Compression
📰 Must-read papers on KV Cache Compression (constantly updating 🤗).
☆296 Updated 2 weeks ago
hongzhangblaze / CS854-F24
☆35 Updated 3 months ago
UChi-JCL / CacheGen
☆87 Updated 4 months ago
Hsword / Awesome-Machine-Learning-System-Papers
☆64 Updated 2 years ago
LiuXiaoxuanPKU / OSD
☆41 Updated 2 months ago
Zefan-Cai / Awesome-LLM-KV-Cache
Awesome-LLM-KV-Cache: A curated list of 📙Awesome LLM KV Cache Papers with Codes.
☆206 Updated 2 months ago
zhengzangw / Sequence-Scheduling
PyTorch implementation of paper "Response Length Perception and Sequence Scheduling: An LLM-Empowered LLM Inference Pipeline".
☆81 Updated last year
openpsi-project / ReaLHF
Super-Efficient RLHF Training of LLMs with Parameter Reallocation
☆217 Updated last month
HPMLL / BurstGPT
A ChatGPT(GPT-3.5) & GPT-4 Workload Trace to Optimize LLM Serving Systems
☆145 Updated 4 months ago
Ying1123 / VTC-artifact
☆20 Updated 8 months ago
zcli-charlie / Awesome-KV-Cache
☆56 Updated 4 months ago
dilab-zju / self-speculative-decoding
Code associated with the paper "Draft & Verify: Lossless Large Language Model Acceleration via Self-Speculative Decoding"
☆159 Updated this week
microsoft / ParrotServe
[OSDI'24] Serving LLM-based Applications Efficiently with Semantic Variable
☆143 Updated 4 months ago
MLSys-Learner-Resources / Awesome-MLSys-Blogger
The repository has collected a batch of noteworthy MLSys bloggers (Algorithms/Systems)
☆175 Updated last month
interestingLSY / swiftLLM
A tiny yet powerful LLM inference system tailored for researching purpose. vLLM-equivalent performance with only 2k lines of code (2% of …
☆136 Updated 7 months ago
Thesys-lab / Helix-ASPLOS25
Open-source implementation for "Helix: Serving Large Language Models over Heterogeneous GPUs and Network via Max-Flow"
☆18 Updated 2 months ago
pku-liang / ArkVale
ArkVale: Efficient Generative LLM Inference with Recallable Key-Value Eviction (NIPS'24)
☆26 Updated 2 months ago
wangqinsi1 / CoreInfer
This is the official Python version of CoreInfer: Accelerating Large Language Model Inference with Semantics-Inspired Adaptive Sparse Act…
☆15 Updated 3 months ago
Guangxuan-Xiao / GSM8K-eval
☆30 Updated last year
YaoJiayi / CacheBlend
☆76 Updated last month
TreeAI-Lab / Awesome-KV-Cache-Management
This repository serves as a comprehensive survey of LLM development, featuring numerous research papers along with their corresponding co…
☆62 Updated this week
galeselee / Awesome_LLM_System-PaperList
Since the emergence of chatGPT in 2022, the acceleration of Large Language Model has become increasingly important. Here is a list of pap…
☆220 Updated last month
LoongServe / LoongServe
☆83 Updated 3 months ago
hemingkx / Spec-Bench
Spec-Bench: A Comprehensive Benchmark and Unified Evaluation Platform for Speculative Decoding (ACL 2024 Findings)
☆222 Updated 3 months ago
henryzhongsc / longctx_bench
Official implementation for Yuan & Liu & Zhong et al., KV Cache Compression, But What Must We Give in Return? A Comprehensive Benchmark o…
☆65 Updated last month