alexiglad / EBT
PyTorch Code for Energy-Based Transformers paper -- generalizable reasoning and scalable learning
☆563Updated last month
Alternatives and similar repositories for EBT
Users that are interested in EBT are comparing it to the libraries listed below
- This repo contains the code for the paper "Intuitive physics understanding emerges from self-supervised pretraining on natural videos"☆203Updated 9 months ago
- ☆730Updated this week
- H-Net: Hierarchical Network with Dynamic Chunking☆793Updated 3 weeks ago
- RLP: Reinforcement as a Pretraining Objective☆210Updated 2 months ago
- Mixture-of-Transformers: A Sparse and Scalable Architecture for Multi-Modal Foundation Models. TMLR 2025.☆129Updated 3 months ago
- ☆78Updated last year
- Quick implementation of nGPT, learning entirely on the hypersphere, from NvidiaAI☆294Updated 6 months ago
- Mixture-of-Recursions: Learning Dynamic Recursive Depths for Adaptive Token-Level Computation (NeurIPS 2025)☆523Updated 2 months ago
- ☆161Updated 3 months ago
- Normalized Transformer (nGPT)☆193Updated last year
- [ICLR2025 Spotlight🔥] Official Implementation of TokenFormer: Rethinking Transformer Scaling with Tokenized Model Parameters☆580Updated 10 months ago
- ☆642Updated 8 months ago
- ☆303Updated 7 months ago
- Build your own visual reasoning model☆415Updated 3 weeks ago
- [ICLR'25] Artificial Kuramoto Oscillatory Neurons☆106Updated last month
- ☆139Updated 2 months ago
- ☆201Updated 3 months ago
- The official github repo for "Diffusion Language Models are Super Data Learners".☆208Updated last month
- Pretraining and inference code for a large-scale depth-recurrent language model☆852Updated last month
- A minimal implementation of DeepMind's Genie world model☆1,061Updated 3 weeks ago
- A Reproduction of GDM's Nested Learning Paper☆407Updated last week
- Code to train and evaluate Neural Attention Memory Models to obtain universally-applicable memory systems for transformers.☆330Updated last year
- Public repository for "The Surprising Effectiveness of Test-Time Training for Abstract Reasoning"☆340Updated last month
- ☆203Updated last year
- ⏰ AI conference deadline countdowns☆291Updated last week
- Getting crystal-like representations with harmonic loss☆192Updated 8 months ago
- [NeurIPS 2024] Simple and Effective Masked Diffusion Language Model☆580Updated 2 months ago
- dLLM: Simple Diffusion Language Modeling☆1,261Updated last week
- Automating the Search for Artificial Life with Foundation Models!☆445Updated last month
- This repo contains the source code for the paper "Evolution Strategies at Scale: LLM Fine-Tuning Beyond Reinforcement Learning"☆272Updated 2 weeks ago