Unakar / Efficient_AI
This project contains my personal answers to the MIT 6.5940 course assignments, along with my study notes and reflections.
☆14Updated last year
Alternatives and similar repositories for Efficient_AI
Users that are interested in Efficient_AI are comparing it to the libraries listed below
- Course materials for MIT6.5940: TinyML and Efficient Deep Learning Computing☆47Updated 5 months ago
- The repository has collected a batch of noteworthy MLSys bloggers (Algorithms/Systems)☆245Updated 5 months ago
- ☆54Updated 2 months ago
- Federated Learning Multi-Machine Simulator: A Docker-based federated learning framework for simulating multi-machine training☆9Updated last year
- Codes & examples for "CUDA - From Correctness to Performance"☆100Updated 8 months ago
- Summer Training 2023, SAST 9.☆42Updated last year
- A comprehensive guide for beginners in the field of data management and artificial intelligence.☆302Updated 2 months ago
- ☆137Updated last month
- A repository sharing the literature about large language models☆94Updated 3 weeks ago
- Puzzles for learning Triton, play it with minimal environment configuration!☆367Updated 6 months ago
- Wiki for HPC☆114Updated 5 months ago
- [NeurIPS 2024] The official implementation of ZipCache: Accurate and Efficient KV Cache Quantization with Salient Token Identification☆22Updated 2 months ago
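- This repository collects common interview questions and interview experiences from large-model interviews. It gathers a range of large-model interview questions and provides detailed answers and analysis. The repository is maintained by the Shanghai Jiao Tong University Jiaoying (交影) community☆95Updated 10 months ago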
- A sparse attention kernel supporting mix sparse patterns☆238Updated 4 months ago
- Triton Documentation in Chinese Simplified / Triton 中文文档☆71Updated 2 months ago
- This repository serves as a comprehensive survey of LLM development, featuring numerous research papers along with their corresponding co…☆154Updated last week
- my cs notes☆51Updated 8 months ago
- XAttention: Block Sparse Attention with Antidiagonal Scoring☆166Updated this week
- qwen-nsa☆67Updated 2 months ago
- Sharing my research toolchain☆84Updated last year
- A tiny yet powerful LLM inference system tailored for research purposes. vLLM-equivalent performance with only 2k lines of code (2% of …☆224Updated 2 weeks ago
- TonyCrane's slide template for reveal-md☆99Updated 11 months ago
- 📰 Must-read papers on KV Cache Compression (constantly updating 🤗).☆459Updated this week
- Implement Flash Attention using Cute.☆87Updated 6 months ago
- Learning material for CMU10-714: Deep Learning System☆256Updated last year
- Personal Transformer models training library☆22Updated this week
- Official implementation of "Fast-dLLM: Training-free Acceleration of Diffusion LLM by Enabling KV Cache and Parallel Decoding"☆233Updated 2 weeks ago
- My solutions to the assignments of CMU 10-714 Deep Learning Systems 2022☆38Updated last year
- DuoDecoding: Hardware-aware Heterogeneous Speculative Decoding with Dynamic Multi-Sequence Drafting☆15Updated 3 months ago
- CS149 xmake version☆41Updated last year