borjanG / 2023-transformers-rotf
Codes for the paper "A mathematical perspective on Transformers".
☆38Updated last year
Alternatives and similar repositories for 2023-transformers-rotf
Users that are interested in 2023-transformers-rotf are comparing it to the libraries listed below
- Brain-Inspired Modular Training (BIMT), a method for making neural networks more modular and interpretable.☆173Updated 2 years ago
- Unofficial but Efficient Implementation of "Mamba: Linear-Time Sequence Modeling with Selective State Spaces" in JAX☆88Updated last year
- Code for Discovering Preference Optimization Algorithms with and for Large Language Models☆63Updated last year
- ☆33Updated last year
- JAX implementation of Kolmogorov Arnold Networks (KANs).☆10Updated last year
- A State-Space Model with Rational Transfer Function Representation.☆81Updated last year
- ☆52Updated last year
- microjax: a micro, JAX-like function transformation engine☆32Updated 11 months ago
- ☆58Updated last year
- ☆33Updated 10 months ago
- Repository for code used in the xVal paper☆144Updated last year
- A simple library for scaling up JAX programs☆143Updated 11 months ago
- Codes for the paper "The emergence of clusters in self-attention dynamics".☆17Updated last year
- The Energy Transformer block, in JAX☆58Updated last year
- Official Implementation of the ICML 2023 paper: "Neural Wave Machines: Learning Spatiotemporally Structured Representations with Locally …☆74Updated 2 years ago
- A MAD laboratory to improve AI architecture designs 🧪☆129Updated 9 months ago
- Parallelizing non-linear sequential models over the sequence length☆54Updated 3 months ago
- Einsum-like high-level array sharding API for JAX☆35Updated last year
- Multi-framework implementation of Deep Kernel Shaping and Tailored Activation Transformations, which are methods that modify neural netwo…☆72Updated 3 months ago
- ☆41Updated 3 years ago
- Open source code for EigenGame.☆30Updated 2 years ago
- ☆34Updated last year
- Lightning-like training API for JAX with Flax☆42Updated 10 months ago
- Graph neural networks in JAX.☆67Updated last year
- A collection of AWESOME things about information geometry☆166Updated last year
- Explorations into the recently proposed Taylor Series Linear Attention☆100Updated last year
- Automatic gradient descent☆213Updated 2 years ago
- Meta-learning inductive biases in the form of useful conserved quantities.☆37Updated 2 years ago
- ☆120Updated 3 months ago
- LoRA for arbitrary JAX models and functions☆142Updated last year