aschuh703 / ECE408
☆47Updated 9 months ago
Related projects:
- ☆19Updated this week
- Homework solutions for CMU 10-414/714 – Deep Learning Systems: Algorithms and Implementation☆41Updated last year
- Learning materials for Stanford CS149 : Parallel Computing☆159Updated 3 years ago
- ☆81Updated 4 months ago
- Personal Notes for Learning HPC & Parallel Computation [Active Adding New Content]☆56Updated 2 years ago
- A PyTorch-like deep learning framework. Just for fun.☆128Updated 11 months ago
- Solution of Programming Massively Parallel Processors☆29Updated 8 months ago
- Code base and slides for ECE408: Applied Parallel Programming on GPU.☆113Updated 3 years ago
- Learning material for CMU10-714: Deep Learning System☆201Updated 4 months ago
- Xiao's CUDA Optimization Guide [Active Adding New Contents]☆223Updated last year
- DGEMM on KNL, achieving 75% of MKL☆15Updated 2 years ago
- A Vectorized N:M Format for Unleashing the Power of Sparse Tensor Cores☆31Updated 9 months ago
- HPC-Lab for the High Performance Computing course, Spring 2023, Tsinghua University. Introduction to High Performance Computing @ THU.☆18Updated last year
- Codes & examples for "CUDA - From Correctness to Performance"☆45Updated 2 weeks ago
- UC Berkeley CS152 Computer Architecture and Engineering Labs☆20Updated 4 years ago
- IMPACT GPU Algorithms Teaching Labs☆55Updated last year
- Papers and their code for AI systems☆202Updated 3 weeks ago
- ☆9Updated 2 years ago
- ☆134Updated last year
- An Easy-to-understand TensorOp Matmul Tutorial☆265Updated this week
- Hands-On Practical MLIR Tutorial☆278Updated 11 months ago
- All Homeworks for TinyML and Efficient Deep Learning Computing 6.5940 • Fall • 2023 • https://efficientml.ai☆108Updated 9 months ago
- performance engineering☆26Updated 2 months ago
- InfiniGen: Efficient Generative Inference of Large Language Models with Dynamic KV Cache Management (OSDI'24)☆43Updated 2 months ago
- This repository is established to store personal notes and annotated papers during daily research.☆78Updated last week
- A baseline repository of Auto-Parallelism in Training Neural Networks☆138Updated 2 years ago
- High performance Transformer implementation in C++.☆67Updated this week
- Curated collection of papers in machine learning systems☆123Updated last month
- ☆49Updated 2 years ago
- Here are my personal paper reading notes (including cloud computing, resource management, systems, machine learning, deep learning, and o…☆38Updated last month