khalil-research / ViTARC
☆19 Updated last month
Alternatives and similar repositories for ViTARC:
Users who are interested in ViTARC are comparing it to the repositories listed below.
- ☆59 Updated 10 months ago
- Official PyTorch Implementation of the Longhorn Deep State Space Model ☆50 Updated 4 months ago
- Efficient World Models with Context-Aware Tokenization. ICML 2024 ☆97 Updated 6 months ago
- Latent Program Network (from the "Searching Latent Program Spaces" paper) ☆80 Updated last month
- ☆26 Updated 10 months ago
- Stick-breaking attention ☆50 Updated last month
- Bootstrapping ARC ☆110 Updated 4 months ago
- ☆48 Updated last year
- ☆77 Updated 8 months ago
- ☆31 Updated 5 months ago
- Official repository of paper "RNNs Are Not Transformers (Yet): The Key Bottleneck on In-context Retrieval" ☆26 Updated last year
- ☆134 Updated last week
- GoldFinch and other hybrid transformer components ☆45 Updated 8 months ago
- [ICLR 2025] Code for the paper "Beyond Autoregression: Discrete Diffusion for Complex Reasoning and Planning" ☆47 Updated 2 months ago
- ☆26 Updated 8 months ago
- Mixture of A Million Experts ☆43 Updated 8 months ago
- Intelligent Go-Explore: Standing on the Shoulders of Giant Foundation Models ☆54 Updated last month
- ARLC, a probabilistic abductive reasoner for solving Raven's progressive matrices. ☆18 Updated last month
- Learning from preferences is a common paradigm for fine-tuning language models. Yet, many algorithmic design decisions come into play. Ou… ☆28 Updated 11 months ago
- ☆25 Updated last year
- ☆53 Updated last year
- ☆31 Updated 3 months ago
- ☆51 Updated 10 months ago
- ☆49 Updated last year
- ☆53 Updated 9 months ago
- A Gymnasium-based Environment of the Abstraction and Reasoning Corpus (ARC) ☆64 Updated 7 months ago
- ☆22 Updated 6 months ago
- Official repository for the paper "Approximating Two-Layer Feedforward Networks for Efficient Transformers" ☆36 Updated last year
- This repo is based on https://github.com/jiaweizzhao/GaLore ☆26 Updated 7 months ago
- The official repository for SkyLadder: Better and Faster Pretraining via Context Window Scheduling ☆29 Updated 3 weeks ago