alexiglad / EBT
PyTorch Code for Energy-Based Transformers paper -- generalizable reasoning and scalable learning
☆224Updated this week
Alternatives and similar repositories for EBT
Users that are interested in EBT are comparing it to the libraries listed below
- ☆66Updated last year
- $100K or 100 Days: Trade-offs when Pre-Training with Academic Resources☆140Updated last month
- This repo contains the code for the paper "Intuitive physics understanding emerges from self-supervised pretraining on natural videos"☆169Updated 4 months ago
- Official Jax Implementation of MD4 Masked Diffusion Models☆112Updated 4 months ago
- [ICLR'25] Artificial Kuramoto Oscillatory Neurons☆95Updated last month
- H-Net: Hierarchical Network with Dynamic Chunking☆115Updated this week
- My take on Flow Matching☆66Updated 6 months ago
- Explorations into whether a transformer with RL can direct a genetic algorithm to converge faster☆70Updated last month
- Code accompanying the paper "Generalized Interpolating Discrete Diffusion"☆91Updated last month
- Efficient World Models with Context-Aware Tokenization. ICML 2024☆105Updated 9 months ago
- Quick implementation of nGPT, learning entirely on the hypersphere, from NvidiaAI☆287Updated last month
- The official repository for HyperZ⋅Z⋅W Operator Connects Slow-Fast Networks for Full Context Interaction.☆38Updated 3 months ago
- ICLR 2025 - official implementation for "I-Con: A Unifying Framework for Representation Learning"☆101Updated 3 weeks ago
- Efficiently discovering algorithms via LLMs with evolutionary search and reinforcement learning.☆103Updated 2 months ago
- A general framework for inference-time scaling and steering of diffusion models with arbitrary rewards.☆167Updated 2 weeks ago
- ☆179Updated 7 months ago
- Implementation of a multimodal diffusion transformer in Pytorch☆102Updated last year
- Mixture-of-Transformers: A Sparse and Scalable Architecture for Multi-Modal Foundation Models. TMLR 2025.☆81Updated 2 months ago
- σ-GPT: A New Approach to Autoregressive Models☆65Updated 11 months ago
- Implementation of a framework for Genie2 in Pytorch☆149Updated 6 months ago
- [ICLR 2025] Official PyTorch Implementation of Gated Delta Networks: Improving Mamba2 with Delta Rule☆185Updated 3 months ago
- Implementation of the new SOTA for model based RL, from the paper "Improving Transformer World Models for Data-Efficient RL", in Pytorch☆127Updated 2 months ago
- Esoteric Language Models☆87Updated 3 weeks ago
- Normalized Transformer (nGPT)☆184Updated 7 months ago
- Official implementation of the paper: "ZClip: Adaptive Spike Mitigation for LLM Pre-Training".☆128Updated 2 weeks ago
- Pytorch implementation of Evolutionary Policy Optimization, from Wang et al. of the Robotics Institute at Carnegie Mellon University☆97Updated last week
- Explorations into the recently proposed Taylor Series Linear Attention☆99Updated 10 months ago
- Remasking Discrete Diffusion Models with Inference-Time Scaling☆34Updated 4 months ago
- ☆164Updated 3 months ago
- Official implementation of Phi-Mamba. A MOHAWK-distilled model (Transformers to SSMs: Distilling Quadratic Knowledge to Subquadratic Mode…☆110Updated 10 months ago