PKU-DAIR / Starter-Guide
A comprehensive guide for beginners in the field of data management and artificial intelligence.
☆86 Updated this week
Related projects
Alternatives and complementary repositories for Starter-Guide
- A curated list of high-quality papers on resource-efficient LLMs 🌱 ☆77 Updated 2 weeks ago
- Course Material for the UG Course COMP4901Y ☆47 Updated 6 months ago
- Awesome-LLM-KV-Cache: A curated list of Awesome LLM KV Cache Papers with Codes. ☆96 Updated this week
- A collection of PhD application guides with Chinese-language reading notes ☆52 Updated last year
- ☆20 Updated last week
- Since the emergence of ChatGPT in 2022, the acceleration of Large Language Models has become increasingly important. Here is a list of pap… ☆168 Updated last week
- Survey Paper List - Efficient LLM and Foundation Models ☆217 Updated last month
- This repository is established to store personal notes and annotated papers during daily research. ☆86 Updated this week
- My learning notes/codes for ML SYS. ☆34 Updated this week
- MagicPIG: LSH Sampling for Efficient LLM Generation ☆45 Updated 2 weeks ago
- Implementations of some methods for LLM KV Cache Sparsity ☆21 Updated 5 months ago
- 📰 Must-read papers on KV Cache Compression (constantly updating 🤗). ☆123 Updated this week
- A PyTorch-like deep learning framework. Just for fun. ☆134 Updated last year
- ☆55 Updated 2 years ago
- InfiniGen: Efficient Generative Inference of Large Language Models with Dynamic KV Cache Management (OSDI'24) ☆76 Updated 4 months ago
- ☆43 Updated last month
- ☆51 Updated last month
- ☆63 Updated last month
- Systems for GenAI ☆67 Updated this week
- ☆94 Updated 9 months ago
- ☆12 Updated 7 months ago
- A ChatGPT(GPT-3.5) & GPT-4 Workload Trace to Optimize LLM Serving Systems ☆126 Updated 3 weeks ago
- High performance Transformer implementation in C++. ☆80 Updated last month
- Galvatron is an automatic distributed training system designed for Transformer models, including Large Language Models (LLMs). ☆36 Updated this week
- [OSDI'24] Serving LLM-based Applications Efficiently with Semantic Variable ☆111 Updated last month
- Papers and their code for AI systems ☆210 Updated 2 months ago
- ☆88 Updated 3 years ago
- [ICML 2024] Serving LLMs on heterogeneous decentralized clusters. ☆15 Updated 6 months ago
- Explore Inter-layer Expert Affinity in MoE Model Inference ☆5 Updated 6 months ago
- ShortcutsBench: A Large-Scale Real-World Benchmark for API-Based Agents ☆74 Updated last month