yifanlu0227 / MIT-6.5940
All Homeworks for TinyML and Efficient Deep Learning Computing 6.5940 • Fall • 2023 • https://efficientml.ai
☆132Updated 11 months ago
Related projects
Alternatives and complementary repositories for MIT-6.5940
- List of papers related to neural network quantization in recent AI conferences and journals.☆451Updated last month
- Learning material for CMU10-714: Deep Learning System☆214Updated 5 months ago
- ☆142Updated last year
- Homework solutions for CMU 10-414/714 – Deep Learning Systems: Algorithms and Implementation☆41Updated last year
- An all-in-one repository of awesome LLM pruning papers, integrating useful resources and insights.☆36Updated this week
- Since the emergence of ChatGPT in 2022, the acceleration of Large Language Models has become increasingly important. Here is a list of pap…☆166Updated this week
- List of papers related to Vision Transformers quantization and hardware acceleration in recent AI conferences and journals.☆53Updated 5 months ago
- Lab 5 project of MIT-6.5940, deploying LLaMA2-7B-chat on one's laptop with TinyChatEngine.☆10Updated 11 months ago
- A list of papers, docs, and code about efficient AIGC. This repo aims to provide information for efficient AIGC research, including languag…☆152Updated last week
- Flash Attention tutorial written in Python, Triton, CUDA, and CUTLASS☆194Updated 4 months ago
- Awesome list for LLM pruning.☆159Updated 3 weeks ago
- Code Repository of Evaluating Quantized Large Language Models☆103Updated 2 months ago
- ☆85Updated 3 months ago
- A PyTorch-like deep learning framework. Just for fun.☆134Updated last year
- Papers and their code for AI systems☆210Updated 2 months ago
- Code base and slides for ECE408: Applied Parallel Programming on GPU.☆117Updated 3 years ago
- ☆283Updated 7 months ago
- This repository is established to store personal notes and annotated papers during daily research.☆86Updated this week
- An easy-to-understand TensorOp Matmul tutorial☆287Updated last month
- ☆82Updated 5 months ago
- Learning how CUDA works☆162Updated 2 months ago
- A CUDA tutorial for learning CUDA programming from scratch☆195Updated 4 months ago
- Hands-on model tuning with TVM, profiled on a Mac M1, an x86 CPU, and a GTX-1080 GPU.☆39Updated last year
- ☆47Updated 11 months ago
- 📰 Must-read papers and blogs on Speculative Decoding ⚡️☆447Updated this week
- ☆164Updated 2 months ago
- ☆12Updated 7 months ago
- ☆79Updated 11 months ago
- [ICML 2024 Oral] Any-Precision LLM: Low-Cost Deployment of Multiple, Different-Sized LLMs☆80Updated 2 months ago
- Summary of some awesome work for optimizing LLM inference☆34Updated this week