Lemon-cmd / energy-transformer-graph
This repository contains the official code for Energy Transformer, an efficient energy-based Transformer variant for graph classification.
☆25 · Updated last year
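Energy Transformer defines a global energy function over tokens (an attention energy plus a modern-Hopfield memory term) and computes token updates by gradient descent on that energy, rather than by a feed-forward pass. A minimal NumPy sketch of the Hopfield half of this idea follows; the shapes, step size, and function names are illustrative assumptions, not the repository's implementation:

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def hopfield_energy(x, W, beta=1.0):
    # Modern-Hopfield-style energy: -(1/beta) * logsumexp(beta * W @ x)
    z = beta * W @ x
    return -(z.max() + np.log(np.exp(z - z.max()).sum())) / beta

def energy_descent_step(x, W, alpha=0.05, beta=1.0):
    # Analytic gradient of the energy above: -W.T @ softmax(beta * W @ x);
    # stepping against it pulls the token toward stored patterns.
    grad = -W.T @ softmax(beta * W @ x)
    return x - alpha * grad

rng = np.random.default_rng(0)
W = rng.normal(size=(8, 4))   # 8 stored patterns of dimension 4 (toy sizes)
x = rng.normal(size=4)        # a single token state

e0 = hopfield_energy(x, W)
for _ in range(50):           # token update = gradient descent on the energy
    x = energy_descent_step(x, W)
e1 = hopfield_energy(x, W)    # e1 < e0: energy decreases along the trajectory
```

The same pattern extends to the full model: the attention energy is added to the Hopfield term, and all tokens descend the combined energy jointly.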
Alternatives and similar repositories for energy-transformer-graph
Users interested in energy-transformer-graph are comparing it to the libraries listed below.
- The Energy Transformer block, in JAX ☆59 · Updated last year
- Source code for the paper "Positional Attention: Expressivity and Learnability of Algorithmic Computation" ☆14 · Updated 3 months ago
- Scalable and Stable Parallelization of Nonlinear RNNs ☆20 · Updated this week
- Parallelizing non-linear sequential models over the sequence length ☆54 · Updated 2 months ago
- A State-Space Model with Rational Transfer Function Representation ☆79 · Updated last year
- ☆32 · Updated 10 months ago
- ☆56 · Updated 10 months ago
- A MAD laboratory to improve AI architecture designs 🧪 ☆125 · Updated 8 months ago
- Repository for code used in the xVal paper ☆142 · Updated last year
- Code repository for Trajectory Flow Matching ☆78 · Updated 9 months ago
- Unofficial but efficient implementation of "Mamba: Linear-Time Sequence Modeling with Selective State Spaces" in JAX ☆87 · Updated last year
- Official JAX implementation of MD4 Masked Diffusion Models ☆123 · Updated 6 months ago
- Lightning-like training API for JAX with Flax ☆42 · Updated 8 months ago
- 📄 Small Batch Size Training for Language Models ☆43 · Updated last week
- Code for GFlowNet-EM, a novel algorithm for fitting latent variable models with compositional latents and an intractable true posterior ☆42 · Updated last year
- [ICLR'25] Artificial Kuramoto Oscillatory Neurons ☆97 · Updated 2 weeks ago
- Official implementation of Stochastic Taylor Derivative Estimator (STDE), NeurIPS 2024 ☆117 · Updated 9 months ago
- Physics-inspired transformer modules based on mean-field dynamics of vector-spin models in JAX ☆41 · Updated last year
- Code for "Theoretical Foundations of Deep Selective State-Space Models" (NeurIPS 2024) ☆15 · Updated 7 months ago
- ☆207 · Updated 8 months ago
- Code for our paper "Generative Flow Networks for Discrete Probabilistic Modeling" ☆84 · Updated 2 years ago
- Explorations into whether a transformer with RL can direct a genetic algorithm to converge faster ☆70 · Updated 3 months ago
- Code for the paper "Function-Space Learning Rates" ☆23 · Updated 2 months ago
- DoG is SGD's Best Friend: A Parameter-Free Dynamic Step Size Schedule ☆63 · Updated 2 years ago
- Implementation of GateLoop Transformer in PyTorch and JAX ☆90 · Updated last year
- nanoGPT-like codebase for LLM training ☆102 · Updated 3 months ago
- PyTorch implementation of a simple way to enable (Stochastic) Frame Averaging for any network ☆50 · Updated last year
- ☆30 · Updated 5 months ago
- Deep Networks Grok All the Time and Here is Why ☆37 · Updated last year
- The simplest, fastest repository for training/finetuning medium-sized GPTs ☆153 · Updated 2 months ago