linkedlist771 / UCAS-MOOC-AutoWatchLinks

☆21

Alternatives and similar repositories for UCAS-MOOC-AutoWatch

Users that are interested in UCAS-MOOC-AutoWatch are comparing it to the libraries listed below

Sorting:

PKUFlyingPig / MIT6.5940_TinyML
Course materials for MIT6.5940: TinyML and Efficient Deep Learning Computing
☆51Updated 6 months ago
ifromeast / AI_analysis
analyse problems of AI with Math and Code
☆18Updated last week
TreeAI-Lab / Awesome-KV-Cache-Management
This repository serves as a comprehensive survey of LLM development, featuring numerous research papers along with their corresponding co…
☆172Updated last week
pinchenjohnny / Distributed_System_Architecture
高级计算机体系结构2020，吴俊敏老师，中科大研究生课程
☆72Updated last year
PKU-SEC-Lab / AdapMoE
Code release for AdapMoE accepted by ICCAD 2024
☆29Updated 3 months ago
pku-liang / ArkVale
ArkVale: Efficient Generative LLM Inference with Recallable Key-Value Eviction (NIPS'24)
☆42Updated 7 months ago
sjtu-zhao-lab / ClusterKV
ClusterKV: Manipulating LLM KV Cache in Semantic Space for Recallable Compression (DAC'25)
☆11Updated last month
HarryWu99 / llm_kvcache_sparsity
Implement some method of LLM KV Cache Sparsity
☆35Updated last year
FFY0 / AdaKV
The Official Implementation of Ada-KV: Optimizing KV Cache Eviction by Adaptive Budget Allocation for Efficient LLM Inference
☆87Updated last month
zyxxmu / cam
Pytorch implementation of our paper accepted by ICML 2024 -- CaM: Cache Merging for Memory-efficient LLMs Inference
☆42Updated last year
hao-ai-lab / vllm-ltr
[NeurIPS 2024] Efficient LLM Scheduling by Learning to Rank
☆51Updated 9 months ago
PKU-SEC-Lab / HybriMoE
[DAC'25] Official implement of "HybriMoE: Hybrid CPU-GPU Scheduling and Cache Management for Efficient MoE Inference"
☆64Updated last month
mdy666 / mdy_triton
☆140Updated last month
GATECH-EIC / Edge-LLM
[DAC 2024] EDGE-LLM: Enabling Efficient Large Language Model Adaptation on Edge Devices via Layerwise Unified Compression and Adaptive La…
☆61Updated last year
doongz / aics
智能计算系统 AI Computing Systems 陈云霁
☆164Updated 2 years ago
d-matrix-ai / keyformer-llm
☆54Updated last year
zcli-charlie / Awesome-KV-Cache
☆75Updated 9 months ago
smart-lty / ParallelSpeculativeDecoding
[ICLR 2025] PEARL: Parallel Speculative Decoding with Adaptive Draft Length
☆102Updated 3 months ago
YaoJiayi / CacheBlend
☆125Updated 3 weeks ago
TonyHu2001s / UCAS-AICS
国科大《智能计算系统》课程实验
☆23Updated last year
interestingLSY / swiftLLM
A tiny yet powerful LLM inference system tailored for researching purpose. vLLM-equivalent performance with only 2k lines of code (2% of …
☆238Updated last month
SuDIS-ZJU / llm-inference-all-in-one
☆13Updated 5 months ago
LiuXiaoxuanPKU / OSD
☆54Updated 8 months ago
jhdjames37 / RISC-V-CPU-simple
清华大学《计算机组成原理》大实验——五级流水线 RISC-V 处理器。「奋战三星期，造台计算机」
☆18Updated 2 years ago
66RING / LongShortTokenDecoding
Long short token decoding speed up 4x for long context LLM. A hundred lines of core code. Open source for learning.
☆8Updated last year
npu-cs / SurviveNWPU-CSManual
A Manual on Surviving in CS of NWPU
☆52Updated last year
Therock90421 / 21-22-Intelligent-Computing-Systems
21-22春智能计算系统实验中国科学院大学
☆33Updated 3 years ago
YJHMITWEB / ExFlow
Explore Inter-layer Expert Affinity in MoE Model Inference
☆12Updated last year
MoE-Inf / awesome-moe-inference
Curated collection of papers in MoE model inference
☆220Updated this week
cat538 / SKVQ
[COLM 2024] SKVQ: Sliding-window Key and Value Cache Quantization for Large Language Models
☆23Updated 10 months ago