apd10 / RzLinear
A compressed alternative to matrix multiplication using the state-of-the-art compression technique ROBE-Z
☆9Updated last year
Related projects
Alternatives and complementary repositories for RzLinear
- ☆15Updated 2 years ago
- Personal solutions to the Triton Puzzles☆16Updated 4 months ago
- Memory Optimizations for Deep Learning (ICML 2023)☆60Updated 8 months ago
- Efficient 2:4 sparse training algorithms and implementations☆21Updated 5 months ago
- ☆20Updated last year
- Reference implementation of "Softmax Attention with Constant Cost per Token" (Heinsen, 2024)☆24Updated 5 months ago
- Code for the paper: https://arxiv.org/pdf/2309.06979.pdf☆16Updated 3 months ago
- JAX implementations of RWKV☆19Updated last year
- extensible collectives library in triton☆72Updated last month
- Experiment in using Tangent to autodiff Triton☆72Updated 10 months ago
- Code for the note "NF4 Isn't Information Theoretically Optimal (and that's Good)"☆18Updated last year
- Unit Scaling demo and experimentation code☆16Updated 8 months ago
- ☆12Updated last month
- ☆11Updated last year
- ☆31Updated 10 months ago
- Tritonbench is a collection of PyTorch custom operators with example inputs to measure their performance.☆19Updated this week
- Implementation of Hyena Hierarchy in JAX☆10Updated last year
- ☆33Updated last year
- ☆24Updated last year
- APPy (Annotated Parallelism for Python) enables users to annotate loops and tensor expressions in Python with compiler directives akin to…☆20Updated last week
- sigma-MoE layer☆18Updated 10 months ago
- Demo of the unit_scaling library, showing how a model can be easily adapted to train in FP8.☆35Updated 4 months ago
- ☆22Updated 11 months ago
- ☆45Updated 2 weeks ago
- LayerNorm(SmallInit(Embedding)) in a Transformer to improve convergence☆45Updated 2 years ago
- High-speed GEMV kernels, with up to 2.7x speedup over the PyTorch baseline.☆90Updated 4 months ago
- Prototype routines for GPU quantization written using PyTorch.☆19Updated last week
- FlexAttention w/ FlashAttention3 Support☆27Updated last month
- ☆21Updated last month