borjanG / 2023-transformers-rotf
Code for the paper "A mathematical perspective on Transformers".
☆39Updated last year
Alternatives and similar repositories for 2023-transformers-rotf
Users who are interested in 2023-transformers-rotf are comparing it to the repositories listed below.
- Code for Discovering Preference Optimization Algorithms with and for Large Language Models☆64Updated last year
- Brain-Inspired Modular Training (BIMT), a method for making neural networks more modular and interpretable.☆173Updated 2 years ago
- Unofficial but Efficient Implementation of "Mamba: Linear-Time Sequence Modeling with Selective State Spaces" in JAX☆89Updated last year
- A State-Space Model with Rational Transfer Function Representation.☆83Updated last year
- The Energy Transformer block, in JAX☆61Updated last year
- Open source code for EigenGame.☆33Updated 2 years ago
- ☆41Updated 3 years ago
- ☆61Updated last year
- Diffusion model derived evolutionary algorithm☆237Updated 5 months ago
- Repository for code used in the xVal paper☆144Updated last year
- Latent Program Network (from the "Searching Latent Program Spaces" paper)☆105Updated last month
- A collection of AWESOME things about information geometry☆171Updated last year
- Meta-Learning for Compositionality (MLC) for modeling human behavior☆145Updated last year
- [ICLR 2025] SDTT: a simple and effective distillation method for discrete diffusion models☆42Updated 2 months ago
- Mamba training library developed by kotoba technologies☆70Updated last year
- Interactive textbook on state-space models☆199Updated last year
- Flexible Inference for Predictive Coding Networks in JAX.☆59Updated this week
- microjax: a micro, JAX-like function transformation engine☆33Updated last year
- Official Implementation of the ICML 2023 paper: "Neural Wave Machines: Learning Spatiotemporally Structured Representations with Locally …☆76Updated 2 years ago
- ☆35Updated 11 months ago
- Code for "Training-free Graph Neural Networks and the Power of Labels as Features" (TMLR 2024)☆57Updated last year
- Automatic gradient descent☆215Updated 2 years ago
- Meta-learning inductive biases in the form of useful conserved quantities.☆38Updated 3 years ago
- ☆119Updated 5 months ago
- ☆33Updated last year
- Code for the "Cultural evolution in populations of Large Language Models" paper☆32Updated last year
- Swarming algorithms like PSO, Ant Colony, Sakana, and more in PyTorch 😊☆134Updated last month
- Sequence Modeling with Multiresolution Convolutional Memory (ICML 2023)☆127Updated 2 years ago
- JAX implementation of Kolmogorov Arnold Networks (KANs).☆10Updated last year
- Because we don't have enough time to read everything☆89Updated last year