ZixuanJiang / pre-rmsnorm-transformer
☆21 Updated last year
Related projects
Alternatives and complementary repositories for pre-rmsnorm-transformer
- Experiment in using Tangent to autodiff Triton ☆72 Updated 10 months ago
- Demo of the unit_scaling library, showing how a model can be easily adapted to train in FP8. ☆35 Updated 4 months ago
- seqax = sequence modeling + JAX ☆133 Updated 4 months ago
- Sparsity support for PyTorch ☆31 Updated this week
- A MAD laboratory to improve AI architecture designs 🧪 ☆95 Updated 6 months ago
- ☆207 Updated 6 months ago
- A Neural Operator-based Integrated Photonic Device Simulation Framework, NeurOLight NeurIPS 2022 ☆34 Updated last year
- Small-scale distributed training of sequential deep learning models, built on NumPy and MPI. ☆107 Updated last year
- ☆74 Updated 11 months ago
- Explorations into the recently proposed Taylor Series Linear Attention ☆90 Updated 3 months ago
- Collection of kernels written in Triton language ☆68 Updated 3 weeks ago
- Butterfly matrix multiplication in PyTorch ☆164 Updated last year
- A fully modular framework for modeling and optimizing analog neural networks ☆17 Updated last month
- Accelerated First Order Parallel Associative Scan ☆164 Updated 3 months ago
- Triton-based implementation of Sparse Mixture of Experts. ☆185 Updated last month
- ☆268 Updated this week
- ML/DL Math and Method notes ☆57 Updated 11 months ago
- ☆24 Updated last year
- ☆57 Updated 2 years ago
- Proof-of-concept of global switching between NumPy/JAX/PyTorch in a library. ☆18 Updated 5 months ago
- CUDA implementation of autoregressive linear attention, with all the latest research findings ☆43 Updated last year
- Fast Matrix Multiplications for Lookup Table-Quantized LLMs ☆187 Updated this week
- PB-LLM: Partially Binarized Large Language Models ☆148 Updated last year
- FlashFFTConv: Efficient Convolutions for Long Sequences with Tensor Cores ☆281 Updated last month
- Some common Hugging Face transformers in maximal update parametrization (µP) ☆76 Updated 2 years ago
- Personal solutions to the Triton Puzzles ☆16 Updated 4 months ago
- Multi-framework implementation of Deep Kernel Shaping and Tailored Activation Transformations, which are methods that modify neural netwo… ☆64 Updated this week
- Cataloging released Triton kernels. ☆138 Updated 2 months ago
- ☆161 Updated last year
- ☆132 Updated last year