dcaox / MIT6.5940

Model acceleration / model compression (all labs completed)

☆11

Alternatives and similar repositories for MIT6.5940

Users that are interested in MIT6.5940 are comparing it to the libraries listed below

JackonYang / hands-on-tvm
Hands-on model tuning with TVM, profiled on a Mac M1, an x86 CPU, and a GTX-1080 GPU.
☆49 Updated 2 years ago
ArthurinRUC / cutlass-notes
From Minimal GEMM to Everything
☆101 Updated last month
GetUpEarlier / minit
☆27 Updated last year
iclementine / optimize_softmax
Optimizing softmax in Triton across many cases
☆22 Updated last year
harleyszhang / lite_llama
A lightweight LLaMA-like LLM inference framework based on Triton kernels.
☆171 Updated last month
lzyrapx / LeetGPU
🌈 Solutions to LeetGPU
☆70 Updated last week
yifanlu0227 / LLaMA2-7B-on-laptop
Lab 5 project of MIT-6.5940, deploying LLaMA2-7B-chat on one's laptop with TinyChatEngine.
☆18 Updated 2 years ago
interestingLSY / CUDA-From-Correctness-To-Performance-Code
Code & examples for "CUDA - From Correctness to Performance"
☆121 Updated last year
AdvancedCompiler / AdvancedCompiler
Personal homepage of the Advanced Compiler Lab
☆197 Updated 3 months ago
InfiniTensor / RefactorGraph
A layered, decoupled deep learning inference engine
☆79 Updated 11 months ago
xgqdut2016 / hpc_project
Some HPC projects for learning
☆26 Updated last year
piDack / The-ans-for-Programming-Massively-Parallel-Processor
Answers for Programming Massively Parallel Processors, 2nd edition
☆35 Updated 3 years ago
caiwanxianhust / FasterLLaMA
A LLaMA model inference framework implemented in CUDA C++
☆64 Updated last year
violetDelia / MLIR-Tutorial
☆79 Updated 3 months ago
mrzhuzhe / riven
CPU, memory, compiler, and parallel programming
☆26 Updated last year
zjhellofss / KuiperCourse
Course on Bilibili
☆82 Updated 2 years ago
muyuuuu / CUFX
Learning something after work instead of scrolling on the phone. Series 1: the CUDA computing framework CUFX (Cuda Framework eXtended).
☆16 Updated last year
RussWong / LLM-engineering
☆26 Updated 6 months ago
l1nkr / DL-Compiler-Navigation
Machine Learning Compiler Road Map
☆46 Updated 2 years ago
mit-han-lab / parallel-computing-tutorial
☆177 Updated 2 years ago
zjhellofss / triton_course
☆40 Updated 8 months ago
LDLINGLINGLING / nano_vllm_note
An annotated nano_vllm repository, with MiniCPM4 adaptation and support for registering new models
☆158 Updated 5 months ago
harleyszhang / llm_counts
LLM theoretical performance analysis tools supporting params, FLOPs, memory, and latency analysis.
☆115 Updated 6 months ago
Pegessi / conv2d_direct
☆36 Updated 2 years ago
doongz / mlc-ai
Machine learning compilation (Tianqi Chen)
☆53 Updated 3 years ago
Qwesh157 / conv_op_optimization
This project is about convolution operator optimization on GPU, including GEMM-based (implicit GEMM) convolution.
☆43 Updated 4 months ago
CalvinXKY / BasicCUDA
A tutorial for CUDA & PyTorch
☆227 Updated 2 weeks ago
Syencil / Programming_Massively_Parallel_Processors
Code and notes for 6 major CUDA parallel computing patterns
☆61 Updated 5 years ago
xgqdut2016 / cuda_code
Easy CUDA code
☆95 Updated last year
openmlsys / openmlsys-cuda
Tutorials for writing high-performance GPU operators in AI frameworks.
☆136 Updated 2 years ago