daemyung / practice-triton
Triangles in action! Triton
☆14, updated 7 months ago
Related projects:
- A performance library for machine learning applications (☆178, updated 11 months ago)
- QUICK: Quantization-aware Interleaving and Conflict-free Kernel for efficient LLM inference (☆106, updated 6 months ago)
- (☆101, updated last year)
- Elixir: Train a Large Language Model on a Small GPU Cluster (☆12, updated last year)
- Easy and Efficient Quantization for Transformers (☆172, updated 2 months ago)
- (☆38, updated this week)
- OSLO: Open Source for Large-scale Optimization (☆172, updated last year)
- [ICML 2024 Oral] Any-Precision LLM: Low-Cost Deployment of Multiple, Different-Sized LLMs (☆72, updated last month)
- Collection of kernels written in the Triton language (☆48, updated 2 weeks ago)
- (☆21, updated last year)
- FriendliAI Model Hub (☆88, updated 2 years ago)
- 🔮 LLM GPU Calculator (☆20, updated last year)
- (☆22, updated 8 months ago)
- Boosting 4-bit inference kernels with 2:4 sparsity (☆47, updated 2 weeks ago)
- Official implementation of SLEB: Streamlining LLMs through Redundancy Verification and Elimination of Transformer Blocks (☆28, updated 2 months ago)
- Official implementation of the ICLR 2024 paper AffineQuant (☆16, updated 5 months ago)
- [NeurIPS'23] Speculative Decoding with Big Little Decoder (☆84, updated 7 months ago)
- Fast Matrix Multiplications for Lookup Table-Quantized LLMs (☆156, updated this week)
- BCQ tutorial for transformers (☆15, updated last year)
- Data processing system for polyglot (☆88, updated last year)
- GEAR: An Efficient KV Cache Compression Recipe for Near-Lossless Generative Inference of LLM (☆134, updated 2 months ago)
- Simple implementation of muP, based on Spectral Condition for Feature Learning. The implementation is SGD-only; do not use it with Adam (☆66, updated last month)
- (☆10, updated 5 months ago)
- (☆66, updated 3 months ago)
- The source code of our work "Prepacking: A Simple Method for Fast Prefilling and Increased Throughput in Large Language Models" (☆57, updated 5 months ago)
- Minimal C implementation of speculative decoding based on llama2.c (☆16, updated 2 months ago)
- (☆12, updated 2 months ago)
- (☆11, updated 5 months ago)
- 1-Click is all you need. (☆58, updated 4 months ago)