dlsyscourse / hw1
☆8 Updated 11 months ago
Alternatives and similar repositories for hw1
Users who are interested in hw1 are comparing it to the repositories listed below
- ☆41 Updated last year
- A high-performance distributed deep learning system targeting large-scale and automated distributed training. If you have any interests, … ☆117 Updated last year
- ☆207 Updated 8 months ago
- A simple calculation for LLM MFU. ☆42 Updated 5 months ago
- A minimal implementation of vLLM. ☆51 Updated last year
- Machine Learning Compiler Road Map ☆43 Updated last year
- [USENIX ATC '24] Accelerating the Training of Large Language Models using Efficient Activation Rematerialization and Optimal Hybrid Paral… ☆61 Updated last year
- ☆19 Updated last year
- A practical way of learning Swizzle ☆23 Updated 6 months ago
- GPTQ inference TVM kernel ☆40 Updated last year
- Code base and slides for ECE408: Applied Parallel Programming On GPU. ☆128 Updated 4 years ago
- Flash-LLM: Enabling Cost-Effective and Highly-Efficient Large Generative Model Inference with Unstructured Sparsity ☆216 Updated last year
- ☆96 Updated 11 months ago
- High performance Transformer implementation in C++. ☆129 Updated 6 months ago
- Compare different hardware platforms via the Roofline Model for LLM inference tasks. ☆110 Updated last year
- ATC23 AE ☆46 Updated 2 years ago
- ☆40 Updated 4 years ago
- Allow torch tensor memory to be released and resumed later ☆96 Updated last month
- A lightweight design for computation-communication overlap. ☆155 Updated last month
- ☆92 Updated 4 months ago
- ☆85 Updated 3 years ago
- Simple PyTorch graph capturing. ☆20 Updated 2 years ago
- A baseline repository of Auto-Parallelism in Training Neural Networks ☆144 Updated 3 years ago
- ☆171 Updated 2 years ago
- ☆75 Updated 4 years ago
- LLM theoretical performance analysis tools, supporting params, FLOPs, memory, and latency analysis. ☆101 Updated last month
- Systems for GenAI ☆143 Updated 3 months ago
- kvcached: Elastic KV cache for dynamic GPU sharing and efficient multi-LLM inference. ☆53 Updated this week
- Examples and exercises from the book Programming Massively Parallel Processors - A Hands-on Approach. David B. Kirk and Wen-mei W. Hwu (T… ☆72 Updated 4 years ago
- ☆43 Updated last year