borjanG / 2023-transformers-rotf
Code for the paper "A mathematical perspective on Transformers".
Related projects:
- Unofficial but Efficient Implementation of "Mamba: Linear-Time Sequence Modeling with Selective State Spaces" in JAX
- A MAD laboratory to improve AI architecture designs 🧪
- Lightning-like training API for JAX with Flax
- Open source code for EigenGame.
- Swarming algorithms like PSO, Ant Colony, Sakana, and more in PyTorch
- GBRL-based Actor-Critic algorithms implemented in stable-baselines3
- A State-Space Model with Rational Transfer Function Representation.
- Code for the papers "Linear Algebra with Transformers" (TMLR) and "What is my Math Transformer Doing?" (AI for Maths Workshop, NeurIPS 2022)
- A simple library for scaling up JAX programs
- Simple (and cheap!) neural network uncertainty estimation
- A collection of AWESOME things about information geometry
- PyTorch-like dataloaders in JAX.
- Official code for Energy Transformer, an efficient energy-based Transformer variant for graph classification.
- Meta-learning inductive biases in the form of useful conserved quantities.
- Explorations into the proposal from the paper "Grokfast: Accelerated Grokking by Amplifying Slow Gradients" (see the first sketch after this list)
- DoG is SGD's Best Friend: A Parameter-Free Dynamic Step Size Schedule (see the second sketch after this list)
- Wraps PyTorch code in a JIT-compatible way for JAX. Supports automatically defining gradients for reverse-mode AutoDiff.
- Evaluation of neuro-symbolic engines
- Brain-Inspired Modular Training (BIMT), a method for making neural networks more modular and interpretable.
- Flow-matching algorithms in JAX
- Scalable neural net training via automatic normalization in the modular norm.
- Implementation of GateLoop Transformer in PyTorch and JAX
- Repository for code used in the xVal paper
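
The Grokfast entry above rests on a simple idea: treat the exponential moving average (EMA) of past gradients as the "slow" component that drives grokking, and amplify it before each optimizer step. Below is a minimal PyTorch sketch of that gradient filter, assuming illustrative values for the EMA decay `alpha` and amplification `lam`; the repository's actual implementation and defaults may differ.

```python
import torch

def grokfast_ema(model, ema, alpha=0.98, lam=2.0):
    """Amplify the slow gradient component.

    Call between loss.backward() and optimizer.step(); `ema` is a
    dict carried across steps, initially empty. Hyperparameters are
    illustrative, not the repository's settings.
    """
    for name, p in model.named_parameters():
        if p.grad is None:
            continue
        if name not in ema:
            ema[name] = torch.zeros_like(p.grad)
        # EMA of the raw gradient: the low-frequency ("slow") component.
        ema[name].mul_(alpha).add_(p.grad, alpha=1 - alpha)
        # Fold the amplified slow component back into the gradient.
        p.grad.add_(ema[name], alpha=lam)
```

Carry `ema = {}` across training steps so the moving average accumulates between calls.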
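
The DoG entry refers to a parameter-free step size rule: at step t it uses eta_t = rbar_t / sqrt(sum over i <= t of ||g_i||^2), where rbar_t is the largest distance the iterate has moved from its starting point, seeded with a small `r_eps` so the first step is nonzero. A sketch over a single flat parameter tensor, with `r_eps` chosen illustratively rather than taken from the repository:

```python
import torch

@torch.no_grad()
def dog_step(x, x0, state):
    """One DoG update on a flat parameter tensor `x` holding x.grad."""
    g = x.grad
    state["sum_sq"] += g.pow(2).sum().item()                     # running sum of ||g_i||^2
    state["rbar"] = max(state["rbar"], (x - x0).norm().item())   # max ||x_t - x_0|| so far
    eta = state["rbar"] / (state["sum_sq"] ** 0.5 + 1e-12)
    x -= eta * g

# Initial state: a small seed distance stands in for rbar_0 (illustrative choice).
# x0 = x.detach().clone()
# state = {"rbar": 1e-4 * (1 + x0.norm().item()), "sum_sq": 0.0}
```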