borjanG / 2023-transformers-rotf

Code for the paper "A mathematical perspective on Transformers".

☆37

Alternatives and similar repositories for 2023-transformers-rotf

Users who are interested in 2023-transformers-rotf are comparing it to the repositories listed below.

vvvm23 / mamba-jax
Unofficial but Efficient Implementation of "Mamba: Linear-Time Sequence Modeling with Selective State Spaces" in JAX
☆84 Updated last year
KindXiaoming / BIMT
Brain-Inspired Modular Training (BIMT), a method for making neural networks more modular and interpretable.
☆172 Updated 2 years ago
bhoov / energy-transformer-jax
The Energy Transformer block, in JAX
☆59 Updated last year
ruke1ire / RTF
A State-Space Model with Rational Transfer Function Representation.
☆79 Updated last year
luchris429 / DiscoPOP
Code for Discovering Preference Optimization Algorithms with and for Large Language Models
☆63 Updated last year
AllanYangZhou / universal_neural_functional
☆51 Updated last year
davisyoshida / lorax
LoRA for arbitrary JAX models and functions
☆140 Updated last year
Lemon-cmd / energy-transformer-graph
This repository contains the official code for Energy Transformer---an efficient Energy-based Transformer variant for graph classificatio…
☆24 Updated last year
nocotan / awesome-information-geometry
A collection of AWESOME things about information geometry
☆164 Updated last year
athms / mad-lab
A MAD laboratory to improve AI architecture designs 🧪
☆123 Updated 7 months ago
Sea-Snell / grokking
Unofficial re-implementation of "Grokking: Generalization Beyond Overfitting on Small Algorithmic Datasets"
☆77 Updated 3 years ago
enajx / HyperNCA
☆39 Updated 3 years ago
nikhilvyas / SOAP
☆200 Updated 7 months ago
google-deepmind / eigengame
Open source code for EigenGame.
☆30 Updated 2 years ago
dvruette / barrel-rec-pytorch
☆53 Updated last year
shikaiqiu / compute-better-spent
☆53 Updated 9 months ago
modula-systems / modula
🧱 Modula software package
☆207 Updated 3 months ago
lucidrains / GAF-microbatch-pytorch
Implementation of Gradient Agreement Filtering, from Chaubard et al. of Stanford, but for single machine microbatches, in Pytorch
☆25 Updated 5 months ago
SakanaAI / Sudoku-Bench
An AI benchmark for creative, human-like problem solving using Sudoku variants
☆77 Updated 2 months ago
google-deepmind / spectral_ssm
☆32 Updated last year
lindermanlab / elk
Scalable and Stable Parallelization of Nonlinear RNNs
☆17 Updated 5 months ago
kvfrans / splus
☆111 Updated last month
AndPotap / einsum-search
☆32 Updated 9 months ago
KindXiaoming / Omnigrok
Omnigrok: Grokking Beyond Algorithmic Data
☆58 Updated 2 years ago
brendenlake / MLC
Meta-Learning for Compositionality (MLC) for modeling human behavior
☆142 Updated last year
iliao2345 / CompressARC
☆167 Updated 3 months ago
lucidrains / taylor-series-linear-attention
Explorations into the recently proposed Taylor Series Linear Attention
☆99 Updated 11 months ago
young-geng / scalax
A simple library for scaling up JAX programs
☆139 Updated 8 months ago
hadivafaii / IterativeVAE
Brain-like variational inference
☆55 Updated 2 months ago
clement-bonnet / lpn
Latent Program Network (from the "Searching Latent Program Spaces" paper)
☆91 Updated 4 months ago