dingfen / ParallelComputing
☆14Updated 4 years ago
Alternatives and similar repositories for ParallelComputing:
Users who are interested in ParallelComputing are comparing it to the repositories listed below
- InfiniGen: Efficient Generative Inference of Large Language Models with Dynamic KV Cache Management (OSDI'24)☆113Updated 8 months ago
- LLM serving cluster simulator☆93Updated 10 months ago
- This repository stores personal notes and annotated papers from daily research.☆113Updated this week
- ☆100Updated 2 weeks ago
- Artifacts for our ASPLOS'23 paper ElasticFlow☆52Updated 10 months ago
- ☆19Updated last year
- HPC-Lab for High Performance Computing course, 2023 Spring, Tsinghua University. Introduction to High Performance Computing @ THU.☆21Updated last year
- ☆11Updated 8 months ago
- Summary of some awesome work for optimizing LLM inference☆64Updated last week
- [SIGMOD 2025] PQCache: Product Quantization-based KVCache for Long Context LLM Inference☆37Updated last month
- ☆35Updated 4 months ago
- Curated collection of papers in MoE model inference☆106Updated last month
- A ChatGPT(GPT-3.5) & GPT-4 Workload Trace to Optimize LLM Serving Systems☆153Updated 5 months ago
- ArkVale: Efficient Generative LLM Inference with Recallable Key-Value Eviction (NIPS'24)☆31Updated 3 months ago
- Open-source implementation for "Helix: Serving Large Language Models over Heterogeneous GPUs and Network via Max-Flow"☆27Updated 3 months ago
- Homework assignments for Fundamentals of Artificial Intelligence (USTC 2020 spring)☆19Updated 4 years ago
- MAGIS: Memory Optimization via Coordinated Graph Transformation and Scheduling for DNN (ASPLOS'24)☆49Updated 9 months ago
- team dontpanic in ustc-osh-2020☆9Updated 4 years ago
- [ICML 2024] Serving LLMs on heterogeneous decentralized clusters.☆22Updated 10 months ago
- Compiler for Dynamic Neural Networks☆45Updated last year
- ☆15Updated 11 months ago
- ☆88Updated 4 months ago
- Advanced Computer Architecture 2020, taught by Wu Junmin, USTC graduate course☆61Updated last year
- Since the emergence of ChatGPT in 2022, the acceleration of Large Language Models has become increasingly important. Here is a list of pap…☆233Updated 2 weeks ago
- ☆10Updated this week
- AlpaServe: Statistical Multiplexing with Model Parallelism for Deep Learning Serving (OSDI 23)☆82Updated last year
- ☆82Updated 2 months ago
- USTC computer architecture materials☆13Updated 2 years ago
- nnScaler: Compiling DNN models for Parallel Training☆101Updated last month
- Artifact for USENIX ATC'23: TC-GNN: Bridging Sparse GNN Computation and Dense Tensor Cores on GPUs.☆46Updated last year