PyTorch Code for Energy-Based Transformers paper -- generalizable reasoning and scalable learning
☆633 · Apr 21, 2026 · Updated last month
Alternatives and similar repositories for EBT
Users that are interested in EBT are comparing it to the libraries listed below.
- ☆93 · Jun 14, 2024 · Updated 2 years ago
- minimal Energy-based transformer ☆44 · Dec 11, 2025 · Updated 6 months ago
- Official implementation of Categorical Flow Maps on text. ☆59 · Feb 16, 2026 · Updated 4 months ago
- Official repository for the paper "Flow Equivariant Recurrent Neural Networks" ☆35 · Jul 2, 2025 · Updated 11 months ago
- ☆15 · Jul 9, 2024 · Updated last year
- [NeurIPS 2025 Oral] Official Code for Exploring Diffusion Transformer Designs via Grafting ☆74 · Jan 9, 2026 · Updated 5 months ago
- Efficient World Models with Context-Aware Tokenization. ICML 2024 ☆128 · Sep 22, 2024 · Updated last year
- This repository contains the official code for Energy Transformer---an efficient Energy-based Transformer variant for graph classificatio… ☆27 · Jan 28, 2024 · Updated 2 years ago
- [ICLR'25] Artificial Kuramoto Oscillatory Neurons ☆125 · Oct 22, 2025 · Updated 7 months ago
- ☆13 · Jul 16, 2024 · Updated last year
- ☆69 · Oct 29, 2025 · Updated 7 months ago
- A pure and fast NumPy implementation of Mamba with cache support. ☆18 · Jun 16, 2024 · Updated 2 years ago
- [NeurIPS 2024] Simple and Effective Masked Diffusion Language Model ☆696 · Sep 29, 2025 · Updated 8 months ago
- Code for the Fractured Entangled Representation Hypothesis position paper! ☆227 · Nov 6, 2025 · Updated 7 months ago
- ☆12 · Aug 26, 2025 · Updated 9 months ago
- Implementation of SOAR ☆52 · Sep 17, 2025 · Updated 9 months ago
- [ICLR 2025] Implementation of "FACTS: A Factored State-Space Framework For World Modelling" ☆30 · Jun 2, 2025 · Updated last year
- Massively-Parallel Natural Extension of Reference Frame ☆34 · Jan 18, 2023 · Updated 3 years ago
- Work in progress. ☆80 · Nov 25, 2025 · Updated 6 months ago
- Continuous Thought Machines, because thought takes time and reasoning is a process. ☆1,950 · Dec 29, 2025 · Updated 5 months ago
- ☆17 · Aug 1, 2025 · Updated 10 months ago
- Gym-like environment to interact with manipulator robots ☆42 · Mar 7, 2025 · Updated last year
- ICML 2022: Learning Iterative Reasoning through Energy Minimization ☆48 · Feb 27, 2023 · Updated 3 years ago
- Code for BLT research paper ☆2,046 · Nov 3, 2025 · Updated 7 months ago
- ☆222 · Jan 5, 2026 · Updated 5 months ago
- Latent Program Network (from the "Searching Latent Program Spaces" paper) ☆113 · Nov 25, 2025 · Updated 6 months ago
- The original Shared Recurrent Memory Transformer implementation ☆36 · Jul 11, 2025 · Updated 11 months ago
- Data Synthesis for Deep Research Based on Semi-Structured Data ☆213 · Dec 18, 2025 · Updated 6 months ago
- Python wrapper for lean-gym ☆13 · Apr 5, 2023 · Updated 3 years ago
- Benchmarking Intelligence Efficiency of LM Inference ☆58 · Updated this week
- Code for "Language Models Can Learn from Verbal Feedback Without Scalar Rewards" ☆65 · Jan 5, 2026 · Updated 5 months ago
- Pretraining and inference code for a large-scale depth-recurrent language model ☆894 · Dec 29, 2025 · Updated 5 months ago
- DreamSmooth: Improving Model-Based RL with Reward Smoothing (ICLR 2024) ☆12 · May 6, 2024 · Updated 2 years ago
- Learning to Modulate pre-trained Models in RL (Decision Transformer, LoRA, Fine-tuning) ☆61 · Oct 6, 2024 · Updated last year
- [ICLR 2025 Oral] Block Diffusion: Interpolating Between Autoregressive and Diffusion Language Models ☆1,011 · Jul 10, 2025 · Updated 11 months ago
- PyTorch code and models for V-JEPA self-supervised learning from video. ☆3,960 · Feb 27, 2025 · Updated last year
- PoE-World: Compositional World Modeling with Products of Programmatic Experts ☆50 · Feb 5, 2026 · Updated 4 months ago
- ☆18 · Mar 9, 2023 · Updated 3 years ago
- Official Code for Paper "Think While You Generate: Discrete Diffusion with Planned Denoising" [ICLR 2025] ☆85 · Apr 24, 2025 · Updated last year