borjanG / 2023-transformers-rotf
Codes for the paper "A mathematical perspective on Transformers".
☆39Updated last year
Alternatives and similar repositories for 2023-transformers-rotf
Users who are interested in 2023-transformers-rotf are comparing it to the repositories listed below
- Brain-Inspired Modular Training (BIMT), a method for making neural networks more modular and interpretable.☆173Updated 2 years ago
- Code for Discovering Preference Optimization Algorithms with and for Large Language Models☆63Updated last year
- The Energy Transformer block, in JAX☆58Updated last year
- ☆41Updated 3 years ago
- Unofficial but Efficient Implementation of "Mamba: Linear-Time Sequence Modeling with Selective State Spaces" in JAX☆88Updated last year
- A State-Space Model with Rational Transfer Function Representation.☆82Updated last year
- ☆33Updated last year
- ☆53Updated last year
- Multi-framework implementation of Deep Kernel Shaping and Tailored Activation Transformations, which are methods that modify neural netwo…☆74Updated 4 months ago
- ☆58Updated last year
- Learning Universal Predictors☆80Updated last year
- Omnigrok: Grokking Beyond Algorithmic Data☆62Updated 2 years ago
- unofficial re-implementation of "Grokking: Generalization Beyond Overfitting on Small Algorithmic Datasets"☆79Updated 3 years ago
- Meta-learning inductive biases in the form of useful conserved quantities.☆37Updated 2 years ago
- Code for papers Linear Algebra with Transformers (TMLR) and What is my Math Transformer Doing? (AI for Maths Workshop, Neurips 2022)☆76Updated last year
- Hierarchical Associative Memory User Experience☆104Updated 3 months ago
- ☆166Updated 2 years ago
- Diffusion model derived evolutionary algorithm☆233Updated 4 months ago
- An AI benchmark for creative, human-like problem solving using Sudoku variants☆105Updated 3 months ago
- ☆116Updated this week
- Jax like function transformation engine but micro, microjax☆33Updated last year
- ☆34Updated 11 months ago
- Open source code for EigenGame.☆33Updated 2 years ago
- Neural Networks and the Chomsky Hierarchy☆211Updated last year
- ☆53Updated last year
- A collection of AWESOME things about information geometry☆169Updated last year
- Mamba training library developed by kotoba technologies☆69Updated last year
- Easy Hypernetworks in Pytorch and Jax☆105Updated 2 years ago
- ☆220Updated 10 months ago
- Automatic gradient descent☆215Updated 2 years ago