hyperevolnet / Terminator
The official repository for HyperZ⋅Z⋅W Operator Connects Slow-Fast Networks for Full Context Interaction.
☆36 · Updated 3 weeks ago
Alternatives and similar repositories for Terminator:
Users who are interested in Terminator are comparing it to the libraries listed below.
- More dimensions = More fun ☆22 · Updated 8 months ago
- Implementation of MambaFormer in Pytorch ++ Zeta from the paper: "Can Mamba Learn How to Learn? A Comparative Study on In-Context Learnin… ☆20 · Updated last week
- Minimal Implementation of Visual Autoregressive Modelling (VAR) ☆30 · Updated last month
- HGRN2: Gated Linear RNNs with State Expansion ☆54 · Updated 8 months ago
- [NeurIPS 2024] Official implementation of the paper "MambaLRP: Explaining Selective State Space Sequence Models". ☆38 · Updated 5 months ago
- NeuMeta transforms neural networks by allowing a single model to adapt on the fly to different sizes, generating the right weights when n… ☆42 · Updated 5 months ago
- Implementation of Agent Attention in Pytorch ☆89 · Updated 9 months ago
- Official code for the ICML 2024 paper "The Entropy Enigma: Success and Failure of Entropy Minimization" ☆50 · Updated 10 months ago
- A single repo with all scripts and utils to train / fine-tune the Mamba model with or without FIM ☆54 · Updated last year
- Explorations into improving ViTArc with Slot Attention ☆40 · Updated 6 months ago
- [ICLR 2025] Official Code Release for Explaining Modern Gated-Linear RNNs via a Unified Implicit Attention Formulation ☆42 · Updated last month
- ☆78 · Updated 8 months ago
- Explorations into the recently proposed Taylor Series Linear Attention ☆97 · Updated 8 months ago
- Trying out the Mamba architecture on small examples (cifar-10, shakespeare char level etc.) ☆45 · Updated last year
- ☆40 · Updated 2 months ago
- Official PyTorch Implementation of "The Hidden Attention of Mamba Models" ☆218 · Updated 10 months ago
- Implementation of a modular, high-performance, and simplistic mamba for high-speed applications ☆34 · Updated 5 months ago
- Implementation of Infini-Transformer in Pytorch ☆110 · Updated 3 months ago
- Official implementation of ECCV24 paper: POA ☆24 · Updated 8 months ago
- Official PyTorch Implementation for Task Vectors are Cross-Modal ☆22 · Updated 4 months ago
- ☆51 · Updated 10 months ago
- σ-GPT: A New Approach to Autoregressive Models ☆62 · Updated 8 months ago
- Official implementation of MAIA, A Multimodal Automated Interpretability Agent ☆78 · Updated last month
- Official implementation of Phi-Mamba. A MOHAWK-distilled model (Transformers to SSMs: Distilling Quadratic Knowledge to Subquadratic Mode… ☆103 · Updated 7 months ago
- Implementation of Mind Evolution, Evolving Deeper LLM Thinking, from Deepmind ☆48 · Updated 2 months ago
- Code for Principal Masked Autoencoders ☆27 · Updated 3 weeks ago
- Official repository for the paper "SwitchHead: Accelerating Transformers with Mixture-of-Experts Attention" ☆97 · Updated 6 months ago
- Official Code Repository for the paper "Continuous Diffusion Model for Language Modeling". ☆25 · Updated last month
- Exploring an idea where one forgets about efficiency and carries out attention across each edge of the nodes (tokens) ☆50 · Updated last month
- Official implementation of "Hydra: Bidirectional State Space Models Through Generalized Matrix Mixers" ☆128 · Updated 2 months ago