borjanG / 2023-transformers-rotf
Codes for the paper "A mathematical perspective on Transformers".
☆36 Updated 10 months ago
Alternatives and similar repositories for 2023-transformers-rotf
Users who are interested in 2023-transformers-rotf are comparing it to the libraries listed below.
- Code for Discovering Preference Optimization Algorithms with and for Large Language Models ☆61 Updated 11 months ago
- Open source code for EigenGame. ☆30 Updated 2 years ago
- A State-Space Model with Rational Transfer Function Representation. ☆78 Updated last year
- Unofficial but Efficient Implementation of "Mamba: Linear-Time Sequence Modeling with Selective State Spaces" in JAX ☆83 Updated last year
- Lightning-like training API for JAX with Flax ☆38 Updated 5 months ago
- Unofficial re-implementation of "Grokking: Generalization Beyond Overfitting on Small Algorithmic Datasets" ☆78 Updated 2 years ago
- A micro JAX-like function transformation engine (microjax) ☆32 Updated 6 months ago
- This repository contains the official code for Energy Transformer---an efficient Energy-based Transformer variant for graph classificatio… ☆24 Updated last year
- The Energy Transformer block, in JAX ☆57 Updated last year
- ☆32 Updated 7 months ago
- ☆31 Updated last year
- ☆53 Updated 7 months ago
- Sequence Modeling with Multiresolution Convolutional Memory (ICML 2023) ☆124 Updated last year
- ☆114 Updated this week
- Flow-matching algorithms in JAX ☆90 Updated 9 months ago
- Evaluation of neuro-symbolic engines ☆35 Updated 9 months ago
- Explorations into the proposal from the paper "Grokfast, Accelerated Grokking by Amplifying Slow Gradients" ☆99 Updated 4 months ago
- Memory Mosaics are networks of associative memories working in concert to achieve a prediction task. ☆41 Updated 3 months ago
- An AI benchmark for creative, human-like problem solving using Sudoku variants ☆43 Updated last week
- Brain-Inspired Modular Training (BIMT), a method for making neural networks more modular and interpretable. ☆170 Updated last year
- Official Implementation of the ICML 2023 paper: "Neural Wave Machines: Learning Spatiotemporally Structured Representations with Locally … ☆71 Updated last year
- ☆27 Updated 10 months ago
- Explorations into the recently proposed Taylor Series Linear Attention ☆99 Updated 8 months ago
- JAX implementation of Kolmogorov Arnold Networks (KANs). ☆10 Updated last year
- Official repository for the paper "Neural Differential Equations for Learning to Program Neural Nets Through Continuous Learning Rules" (… ☆22 Updated 2 years ago
- Code for the papers "Linear Algebra with Transformers" (TMLR) and "What is my Math Transformer Doing?" (AI for Maths Workshop, NeurIPS 2022) ☆68 Updated 9 months ago
- Code for "Training-free Graph Neural Networks and the Power of Labels as Features" (TMLR 2024) ☆58 Updated 9 months ago
- ☆34 Updated 5 months ago
- PyTorch-like dataloaders for JAX. ☆82 Updated 2 weeks ago
- Multi-framework implementation of Deep Kernel Shaping and Tailored Activation Transformations, which are methods that modify neural netwo… ☆70 Updated 2 weeks ago