a2677331 / Stanford-CS142

Stanford CS142 Web-Applications

☆6

Alternatives and similar repositories for Stanford-CS142

Users who are interested in Stanford-CS142 are comparing it to the repositories listed below.

MoE-Inf / awesome-moe-inference
Curated collection of papers in MoE model inference
☆213 Updated 5 months ago
mosharaf / cse585
Advanced Scalable Systems for X
☆37 Updated 7 months ago
KuangjuX / Paper-reading
My Paper Reading Lists and Notes.
☆20 Updated 6 months ago
PKUFlyingPig / MIT6.5940_TinyML
Course materials for MIT6.5940: TinyML and Efficient Deep Learning Computing
☆51 Updated 6 months ago
PrincetonUniversity / LLMCompass
☆169 Updated last year
aschuh703 / ECE408
☆47 Updated last year
caoshiyi / artifacts
☆15 Updated 7 months ago
guanrenyang / Programming-Massively-Parallel-Processors
Solutions to Programming Massively Parallel Processors
☆47 Updated last year
YaoJiayi / CacheBlend
☆123 Updated last week
hongzhangblaze / CS854-F24
☆42 Updated 8 months ago
fanlai0990 / CS598
Systems for GenAI
☆142 Updated 3 months ago
NamanMakkar / ECE5545-ML-Hardware-Systems
This repo contains the assignments from Cornell Tech's ECE 5545 - Machine Learning Hardware and Systems, offered in Spring 2023
☆32 Updated 2 years ago
snu-comparch / InfiniGen
InfiniGen: Efficient Generative Inference of Large Language Models with Dynamic KV Cache Management (OSDI'24)
☆143 Updated last year
pku-liang / ArkVale
ArkVale: Efficient Generative LLM Inference with Recallable Key-Value Eviction (NIPS'24)
☆42 Updated 7 months ago
BUAA-CI-LAB / BHLA.README
☆15 Updated 11 months ago
AIS-SNU / Smart-Infinity
[HPCA'24] Smart-Infinity: Fast Large Language Model Training using Near-Storage Processing on a Real System
☆46 Updated this week
ovg-project / kvcached
kvcached: Elastic KV cache for dynamic GPU sharing and efficient multi-LLM inference.
☆25 Updated this week
kcxain / dlsys
My solutions to the assignments of CMU 10-714 Deep Learning Systems 2022
☆40 Updated last year
interestingLSY / CUDA-From-Correctness-To-Performance-Code
Code & examples for "CUDA - From Correctness to Performance"
☆102 Updated 9 months ago
SJTU-ReArch-Group / Paper-Reading-List
☆114 Updated 3 weeks ago
lipracer / cuda-rt-hook
☆38 Updated last week
yifanlu0227 / LLaMA2-7B-on-laptop
Lab 5 project of MIT-6.5940, deploying LLaMA2-7B-chat on one's laptop with TinyChatEngine.
☆17 Updated last year
NEO-MLSys25 / NEO
NEO is an LLM inference engine built to ease the GPU memory crisis through CPU offloading
☆47 Updated last month
PKU-SEC-Lab / AdapMoE
Code release for AdapMoE accepted by ICCAD 2024
☆26 Updated 2 months ago
Guangxuan-Xiao / SPMM-CUDA
☆12 Updated 3 years ago
MLSys-Learner-Resources / Awesome-MLSys-Blogger
This repository collects noteworthy MLSys bloggers (Algorithms/Systems)
☆258 Updated 6 months ago
wu-kan / GoPTX
GoPTX: Fine-grained GPU Kernel Fusion by PTX-level Instruction Flow Weaving
☆17 Updated 2 weeks ago
ranggihwang / Pregated_MoE
☆48 Updated last year
PDZZXL / Awesome-LLM-Serving
Large Language Model (LLM) Serving Paper and Resource List
☆24 Updated 2 months ago
thu-nics / UniNDP
GitHub repository for the HPCA 2025 paper "UniNDP: A Unified Compilation and Simulation Tool for Near DRAM Processing Architectures"
☆13 Updated 7 months ago