borjanG / 2023-transformers-rotf
Code for the paper "A mathematical perspective on Transformers".
☆36Updated 9 months ago
Alternatives and similar repositories for 2023-transformers-rotf:
Users who are interested in 2023-transformers-rotf are comparing it to the repositories listed below:
- Unofficial but Efficient Implementation of "Mamba: Linear-Time Sequence Modeling with Selective State Spaces" in JAX☆83Updated last year
- ☆52Updated 6 months ago
- This repository contains the official code for Energy Transformer---an efficient Energy-based Transformer variant for graph classificatio…☆24Updated last year
- ☆31Updated last year
- A State-Space Model with Rational Transfer Function Representation.☆78Updated 11 months ago
- ☆30Updated 5 months ago
- Code for Discovering Preference Optimization Algorithms with and for Large Language Models☆61Updated 10 months ago
- Code for the papers "Linear Algebra with Transformers" (TMLR) and "What is my Math Transformer Doing?" (AI for Maths Workshop, NeurIPS 2022)☆67Updated 8 months ago
- Open source code for EigenGame.☆30Updated last year
- ☆49Updated last year
- Official repository for the paper "Approximating Two-Layer Feedforward Networks for Efficient Transformers"☆37Updated last year
- Evaluation of neuro-symbolic engines☆35Updated 8 months ago
- ☆53Updated last year
- Official repository for the paper "Neural Differential Equations for Learning to Program Neural Nets Through Continuous Learning Rules" (…☆22Updated 2 years ago
- microjax: a micro, JAX-like function transformation engine☆30Updated 6 months ago
- Memory Mosaics are networks of associative memories working in concert to achieve a prediction task.☆40Updated 2 months ago
- A MAD laboratory to improve AI architecture designs 🧪☆111Updated 4 months ago
- Unofficial re-implementation of "Grokking: Generalization Beyond Overfitting on Small Algorithmic Datasets"☆78Updated 2 years ago
- σ-GPT: A New Approach to Autoregressive Models☆62Updated 8 months ago
- ☆79Updated last year
- Generative cellular automaton-like learning environments for RL.☆19Updated 2 months ago
- ☆32Updated 6 months ago
- Latent Program Network (from the "Searching Latent Program Spaces" paper)☆81Updated last month
- Wraps PyTorch code in a JIT-compatible way for JAX. Supports automatically defining gradients for reverse-mode AutoDiff.☆51Updated 2 weeks ago
- Implementation of OpenAI's 'Grokking: Generalization Beyond Overfitting on Small Algorithmic Datasets' paper.☆36Updated last year
- Explorations into the proposal from the paper "Grokfast, Accelerated Grokking by Amplifying Slow Gradients"☆98Updated 4 months ago
- Meta-learning inductive biases in the form of useful conserved quantities.☆37Updated 2 years ago
- [ICLR'25] Artificial Kuramoto Oscillatory Neurons☆77Updated 2 months ago
- ☆38Updated 2 years ago
- ☆31Updated 11 months ago