glassroom / heinsen_routing
Reference implementation of "An Algorithm for Routing Vectors in Sequences" (Heinsen, 2022) and "An Algorithm for Routing Capsules in All Domains" (Heinsen, 2019), for composing deep neural networks.
☆169 · Updated 2 years ago
Alternatives and similar repositories for heinsen_routing:
Users who are interested in heinsen_routing are comparing it to the libraries listed below:
- Official repository for the paper "A Modern Self-Referential Weight Matrix That Learns to Modify Itself" (ICML 2022 & NeurIPS 2021 Deep R… ☆170 · Updated last year
- Implementation of Memorizing Transformers (ICLR 2022), attention net augmented with indexing and retrieval of memories using approximate … ☆632 · Updated last year
- Language Modeling with the H3 State Space Model ☆520 · Updated last year
- ☆143 · Updated 2 years ago
- Code implementing "Efficient Parallelization of a Ubiquitous Sequential Computation" (Heinsen, 2023) ☆92 · Updated 5 months ago
- Automatic gradient descent ☆207 · Updated last year
- Amos optimizer with JEstimator lib. ☆82 · Updated 11 months ago
- An interactive exploration of Transformer programming. ☆263 · Updated last year
- A repository for log-time feedforward networks ☆222 · Updated last year
- Use context-free grammars with an LLM ☆168 · Updated last year
- An interpreter for RASP as described in the ICML 2021 paper "Thinking Like Transformers" ☆308 · Updated 7 months ago
- A case study of efficient training of large language models using commodity hardware. ☆69 · Updated 2 years ago
- ☆256 · Updated 2 years ago
- Experiments around a simple idea for inducing multiple hierarchical predictive models within a GPT ☆212 · Updated 8 months ago
- Python library which enables complex compositions of language models such as scratchpads, chain of thought, tool use, selection-inference… ☆207 · Updated 3 months ago
- Minimalistic, extremely fast, and hackable researcher's toolbench for GPT models in 307 lines of code. Reaches <3.8 validation loss on wi… ☆345 · Updated 9 months ago
- Experiments for efforts to train a new and improved t5 ☆77 · Updated last year
- a small code base for training large models ☆294 · Updated last week
- ☆252 · Updated last year
- TART: A plug-and-play Transformer module for task-agnostic reasoning ☆196 · Updated last year
- A tool to analyze and debug neural networks in pytorch. Use a GUI to traverse the computation graph and view the data from many different… ☆287 · Updated 5 months ago
- RWKV model implementation ☆37 · Updated last year
- [NAACL 2025] Official Implementation of "HMT: Hierarchical Memory Transformer for Long Context Language Processing" ☆69 · Updated 3 months ago
- Text generator prompting with Boolean operators ☆180 · Updated 2 years ago
- Designing bridge trusses with Pytorch autograd ☆61 · Updated last year
- Official repository of Pretraining Without Attention (BiGS). BiGS is the first model to achieve BERT-level transfer learning on the GLUE … ☆116 · Updated last year
- Swarm training framework using Haiku + JAX + Ray for layer parallel transformer language models on unreliable, heterogeneous nodes ☆238 · Updated last year
- Convolutions for Sequence Modeling ☆883 · Updated 10 months ago
- ☆166 · Updated last year
- GPT, but made only out of MLPs ☆88 · Updated 3 years ago