abhijangda / fastkron
☆22 Updated 10 months ago
Alternatives and similar repositories for fastkron
Users who are interested in fastkron are comparing it to the libraries listed below.
- Sparsity support for PyTorch ☆38 Updated 9 months ago
- EquiTriton is a project that seeks to implement high-performance kernels for commonly used building blocks in equivariant neural networks… ☆66 Updated 3 weeks ago
- High-Performance SGEMM on CUDA devices ☆115 Updated 11 months ago
- Personal solutions to the Triton Puzzles ☆20 Updated last year
- Parallel framework for training and fine-tuning deep neural networks ☆69 Updated 2 months ago
- C++ and Python libraries for neural networks. ☆18 Updated last month
- A Data-Centric Compiler for Machine Learning ☆85 Updated 3 weeks ago
- This repository contains the official code for Energy Transformer---an efficient Energy-based Transformer variant for graph classificatio… ☆25 Updated last year
- Experiment of using Tangent to autodiff triton ☆81 Updated last year
- Tokamax: A GPU and TPU kernel library. ☆149 Updated 2 weeks ago
- Write a fast kernel and run it on Discord. See how you compare against the best! ☆66 Updated 3 weeks ago
- ☆55 Updated last year
- train with kittens! ☆63 Updated last year
- Material for the SC22 Deep Learning at Scale Tutorial ☆41 Updated 2 years ago
- Einsum-like high-level array sharding API for JAX ☆34 Updated last year
- Collection of kernels written in Triton language ☆174 Updated 9 months ago
- FlashFFTConv: Efficient Convolutions for Long Sequences with Tensor Cores ☆338 Updated last year
- ☆83 Updated 2 years ago
- LLM training in simple, raw C/CUDA ☆109 Updated last year
- A stand-alone implementation of several NumPy dtype extensions used in machine learning. ☆323 Updated this week
- Landscaper is a comprehensive Python framework designed for exploring the loss landscapes of deep learning models. ☆20 Updated last month
- Physics-inspired transformer modules based on mean-field dynamics of vector-spin models in JAX ☆46 Updated 2 years ago
- Automatic differentiation for Triton Kernels ☆29 Updated 4 months ago
- ☆28 Updated 11 months ago
- JAX implementation of the Mistral 7b v0.2 model ☆35 Updated last year
- GeoT: Tensor Centric Library for Graph Neural Network via Efficient Segment Reduction on GPU ☆24 Updated 9 months ago
- ☆62 Updated last year
- Memory Optimizations for Deep Learning (ICML 2023) ☆114 Updated last year
- Proof-of-concept of global switching between numpy/jax/pytorch in a library. ☆18 Updated last year
- SC24 Deep Learning at Scale Tutorial Material ☆33 Updated 11 months ago