glassroom / heinsen_routing
Reference implementation of "An Algorithm for Routing Vectors in Sequences" (Heinsen, 2022) and "An Algorithm for Routing Capsules in All Domains" (Heinsen, 2019), for composing deep neural networks.
☆171 Updated 2 years ago
Alternatives and similar repositories for heinsen_routing
Users interested in heinsen_routing are comparing it to the libraries listed below.
- Official repository for the paper "A Modern Self-Referential Weight Matrix That Learns to Modify Itself" (ICML 2022 & NeurIPS 2021 Deep R… ☆173 Updated 6 months ago
- Code implementing "Efficient Parallelization of a Ubiquitous Sequential Computation" (Heinsen, 2023) ☆98 Updated last year
- ☆255 Updated 2 years ago
- A case study of efficient training of large language models using commodity hardware. ☆68 Updated 3 years ago
- Swarm training framework using Haiku + JAX + Ray for layer parallel transformer language models on unreliable, heterogeneous nodes ☆242 Updated 2 years ago
- An interpreter for RASP as described in the ICML 2021 paper "Thinking Like Transformers" ☆323 Updated last year
- Implementation of Memorizing Transformers (ICLR 2022), attention net augmented with indexing and retrieval of memories using approximate … ☆638 Updated 2 years ago
- ☆144 Updated 2 years ago
- Official Repository of Pretraining Without Attention (BiGS), BiGS is the first model to achieve BERT-level transfer learning on the GLUE … ☆116 Updated last year
- A repository for log-time feedforward networks ☆224 Updated last year
- ☆40 Updated 3 years ago
- Fast Text Classification with Compressors dictionary ☆150 Updated 2 years ago
- Automatic gradient descent ☆216 Updated 2 years ago
- Amos optimizer with JEstimator lib. ☆82 Updated last year
- Implements the Tsetlin Machine, Coalesced Tsetlin Machine, Convolutional Tsetlin Machine, Regression Tsetlin Machine, and Weighted Tsetli… ☆165 Updated 4 months ago
- Language Modeling with the H3 State Space Model ☆521 Updated 2 years ago
- A library for incremental loading of large PyTorch checkpoints ☆56 Updated 2 years ago
- LayerNorm(SmallInit(Embedding)) in a Transformer to improve convergence ☆61 Updated 3 years ago
- A pure NumPy implementation of Mamba. ☆222 Updated last year
- An interactive exploration of Transformer programming. ☆271 Updated 2 years ago
- Simple scheduler for running jobs on GPUs ☆183 Updated 4 years ago
- My explorations into editing the knowledge and memories of an attention network ☆35 Updated 3 years ago
- Implementation of the specific Transformer architecture from PaLM - Scaling Language Modeling with Pathways - in Jax (Equinox framework) ☆189 Updated 3 years ago
- git extension for {collaborative, communal, continual} model development ☆217 Updated last year
- Experiments for efforts to train a new and improved t5 ☆76 Updated last year
- A tool to analyze and debug neural networks in pytorch. Use a GUI to traverse the computation graph and view the data from many different… ☆298 Updated last year
- ☆20 Updated 4 years ago
- [NAACL 2025] Official Implementation of "HMT: Hierarchical Memory Transformer for Long Context Language Processing" ☆80 Updated 2 weeks ago
- Library that contains implementations of machine learning components in the hyperbolic space ☆144 Updated last year
- Experiments around a simple idea for inducing multiple hierarchical predictive model within a GPT ☆224 Updated last year