glassroom / heinsen_routing
Reference implementation of "An Algorithm for Routing Vectors in Sequences" (Heinsen, 2022) and "An Algorithm for Routing Capsules in All Domains" (Heinsen, 2019), for composing deep neural networks.
☆169 · Updated 2 years ago
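For readers unfamiliar with the topic, the sketch below is a conceptual illustration only, not the heinsen_routing API: it shows what "routing" a variable number of input vectors to a fixed number of output vectors can look like, using a generic routing-by-agreement loop. The function name, iteration count, and agreement rule are illustrative assumptions, not the algorithms from the papers.

```python
# Conceptual sketch only -- NOT the heinsen_routing API.
# A variable number of input vectors are softly assigned to a fixed number of
# output vectors, and the assignments are refined by an agreement rule.
import torch

def toy_routing(x: torch.Tensor, n_out: int, n_iters: int = 3) -> torch.Tensor:
    """Route n_inp input vectors of size d to n_out output vectors of size d."""
    n_inp, d = x.shape
    votes = x.unsqueeze(1).expand(n_inp, n_out, d)   # each input "votes" for every output
    logits = torch.zeros(n_inp, n_out)               # soft assignment logits, refined below
    for _ in range(n_iters):
        probs = logits.softmax(dim=1)                                     # (n_inp, n_out)
        weighted = (probs.unsqueeze(-1) * votes).sum(dim=0)               # (n_out, d) weighted vote sums
        out = weighted / probs.sum(dim=0).clamp_min(1e-6).unsqueeze(-1)   # mean vote per output
        logits = logits + (votes * out.unsqueeze(0)).sum(dim=-1)          # raise logits where votes agree
    return out

# Usage: route 7 input vectors of size 16 to 4 output vectors of size 16.
y = toy_routing(torch.randn(7, 16), n_out=4)
assert y.shape == (4, 16)
```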
Alternatives and similar repositories for heinsen_routing
Users interested in heinsen_routing are comparing it to the libraries listed below.
- Official repository for the paper "A Modern Self-Referential Weight Matrix That Learns to Modify Itself" (ICML 2022 & NeurIPS 2021 Deep R… ☆170 · Updated last year
- Revealing example of self-attention, the building block of transformer AI models ☆130 · Updated 2 years ago
- Code implementing "Efficient Parallelization of a Ubiquitous Sequential Computation" (Heinsen, 2023) ☆94 · Updated 5 months ago
- ☆252 · Updated last year
- Text generator prompting with Boolean operators ☆180 · Updated 2 years ago
- ☆143 · Updated 2 years ago
- An interpreter for RASP as described in the ICML 2021 paper "Thinking Like Transformers" ☆310 · Updated 8 months ago
- Implementation of Memorizing Transformers (ICLR 2022), attention net augmented with indexing and retrieval of memories using approximate … ☆632 · Updated last year
- Fast Text Classification with Compressors dictionary ☆149 · Updated last year
- A case study of efficient training of large language models using commodity hardware. ☆69 · Updated 2 years ago
- An interactive exploration of Transformer programming. ☆264 · Updated last year
- A BERT that you can train on a (gaming) laptop. ☆207 · Updated last year
- Swarm training framework using Haiku + JAX + Ray for layer parallel transformer language models on unreliable, heterogeneous nodes ☆239 · Updated 2 years ago
- Another attempt at a long-context / efficient transformer by me ☆38 · Updated 3 years ago
- Language Modeling with the H3 State Space Model ☆518 · Updated last year
- Named tensors with first-class dimensions for PyTorch ☆331 · Updated last year
- Official repository of Pretraining Without Attention (BiGS), the first model to achieve BERT-level transfer learning on the GLUE … ☆112 · Updated last year
- ☆376 · Updated last year
- Automatic gradient descent ☆207 · Updated last year
- A repository for log-time feedforward networks ☆220 · Updated last year
- Small deep learning library written from scratch in Python, using NumPy/CuPy. ☆123 · Updated 2 years ago
- Neural Search ☆331 · Updated last year
- Python library which enables complex compositions of language models such as scratchpads, chain of thought, tool use, selection-inference… ☆208 · Updated 4 months ago
- TART: A plug-and-play Transformer module for task-agnostic reasoning ☆196 · Updated last year
- Absolute minimalistic implementation of a GPT-like transformer using only numpy (<650 lines). ☆251 · Updated last year
- A library for incremental loading of large PyTorch checkpoints ☆56 · Updated 2 years ago
- A playground to make it easy to try crazy things ☆33 · Updated last month
- A pure NumPy implementation of Mamba. ☆223 · Updated 10 months ago
- Bayesian Optimization as a Coverage Tool for Evaluating LLMs. Accurate evaluation (benchmarking) that's 10 times faster with just a few l… ☆284 · Updated last week
- The GeoV model is a large language model designed by Georges Harik and uses Rotary Positional Embeddings with Relative distances (RoPER).… ☆121 · Updated 2 years ago