glassroom / heinsen_routing
Reference implementation of "An Algorithm for Routing Vectors in Sequences" (Heinsen, 2022) and "An Algorithm for Routing Capsules in All Domains" (Heinsen, 2019), for composing deep neural networks.
☆169Updated last year
Alternatives and similar repositories for heinsen_routing:
Users who are interested in heinsen_routing are comparing it to the libraries listed below.
- Official repository for the paper "A Modern Self-Referential Weight Matrix That Learns to Modify Itself" (ICML 2022 & NeurIPS 2021 Deep R…☆170Updated last year
- Code implementing "Efficient Parallelization of a Ubiquitous Sequential Computation" (Heinsen, 2023)☆92Updated 3 months ago
- A library for incremental loading of large PyTorch checkpoints☆56Updated 2 years ago
- An interpreter for RASP as described in the ICML 2021 paper "Thinking Like Transformers"☆304Updated 6 months ago
- A repository for log-time feedforward networks☆220Updated 11 months ago
- ☆143Updated last year
- ☆253Updated last year
- Amos optimizer with JEstimator lib.☆81Updated 10 months ago
- Swarm training framework using Haiku + JAX + Ray for layer parallel transformer language models on unreliable, heterogeneous nodes☆236Updated last year
- Implementation of Block Recurrent Transformer - Pytorch☆218Updated 7 months ago
- Implementation of Memorizing Transformers (ICLR 2022), attention net augmented with indexing and retrieval of memories using approximate …☆632Updated last year
- Text generator prompting with Boolean operators☆180Updated 2 years ago
- Experiments around a simple idea for inducing multiple hierarchical predictive models within a GPT☆209Updated 7 months ago
- Language Modeling with the H3 State Space Model☆516Updated last year
- Neural Search☆327Updated 9 months ago
- Fast Text Classification with Compressors dictionary☆151Updated last year
- Revealing example of self-attention, the building block of transformer AI models☆130Updated last year
- Automatic gradient descent☆207Updated last year
- An interactive exploration of Transformer programming.☆261Updated last year
- A Detailed Introduction to My Favorite Statistical Measure, Hoeffding's D☆97Updated last year
- ☆126Updated last year
- Controlled Text Generation via Language Model Arithmetic☆216Updated 6 months ago
- [NAACL 2025] Official Implementation of "HMT: Hierarchical Memory Transformer for Long Context Language Processing"☆67Updated last month
- An alternative to convolution in neural networks☆254Updated 11 months ago
- [NeurIPS 2023] Learning Transformer Programs☆159Updated 10 months ago
- Neural Networks and the Chomsky Hierarchy☆204Updated 11 months ago
- Experiments for efforts to train a new and improved t5☆77Updated 11 months ago
- The GeoV model is a large language model designed by Georges Harik and uses Rotary Positional Embeddings with Relative distances (RoPER).…☆121Updated last year
- An implementation of the Nyströmformer, using the Nyström method to approximate standard self-attention☆57Updated 2 years ago
- ☆253Updated 2 years ago