hyperevolnet / Terminator
The official repository for "HyperZ⋅Z⋅W Operator Connects Slow-Fast Networks for Full Context Interaction".
☆36 · Updated last month
Alternatives and similar repositories for Terminator
Users interested in Terminator are comparing it to the libraries listed below.
- HGRN2: Gated Linear RNNs with State Expansion · ☆54 · Updated 8 months ago
- NeuMeta transforms neural networks by allowing a single model to adapt on the fly to different sizes, generating the right weights when n… · ☆42 · Updated 6 months ago
- Explorations into the recently proposed Taylor Series Linear Attention · ☆99 · Updated 8 months ago
- ☆81 · Updated last year
- Official PyTorch Implementation of "The Hidden Attention of Mamba Models" · ☆222 · Updated 11 months ago
- Minimal Implementation of Visual Autoregressive Modelling (VAR) · ☆33 · Updated last month
- [ICLR 2025] Official PyTorch implementation of "Forgetting Transformer: Softmax Attention with a Forget Gate" · ☆99 · Updated this week
- Explorations into improving ViTArc with Slot Attention · ☆41 · Updated 6 months ago
- Explorations into the proposal from the paper "Grokfast, Accelerated Grokking by Amplifying Slow Gradients" · ☆99 · Updated 4 months ago
- Pytorch implementation of the PEER block from the paper, Mixture of A Million Experts, by Xu Owen He at Deepmind · ☆124 · Updated 8 months ago
- The Gaussian Histogram Loss (HL-Gauss) proposed by Imani et al. with a few convenient wrappers for regression, in Pytorch · ☆59 · Updated 2 weeks ago
- σ-GPT: A New Approach to Autoregressive Models · ☆64 · Updated 9 months ago
- Official code for the ICML 2024 paper "The Entropy Enigma: Success and Failure of Entropy Minimization" · ☆51 · Updated 11 months ago
- Here we will test various linear attention designs. · ☆60 · Updated last year
- ☆53 · Updated 7 months ago
- ☆40 · Updated 3 months ago
- More dimensions = More fun · ☆22 · Updated 9 months ago
- Implementation of Mind Evolution, Evolving Deeper LLM Thinking, from Deepmind · ☆50 · Updated 3 months ago
- Focused on fast experimentation and simplicity · ☆72 · Updated 4 months ago
- Implementation of MambaFormer in Pytorch ++ Zeta from the paper: "Can Mamba Learn How to Learn? A Comparative Study on In-Context Learnin… · ☆21 · Updated 3 weeks ago
- Official PyTorch Implementation for Paper "No More Adam: Learning Rate Scaling at Initialization is All You Need" · ☆51 · Updated 3 months ago
- ☆78 · Updated 8 months ago
- A single repo with all scripts and utils to train / fine-tune the Mamba model with or without FIM · ☆54 · Updated last year
- Official implementation of MAIA, A Multimodal Automated Interpretability Agent · ☆80 · Updated 2 months ago
- Inference Speed Benchmark for Learning to (Learn at Test Time): RNNs with Expressive Hidden States · ☆66 · Updated 10 months ago
- Official implementation of Phi-Mamba. A MOHAWK-distilled model (Transformers to SSMs: Distilling Quadratic Knowledge to Subquadratic Mode… · ☆106 · Updated 8 months ago
- ☆51 · Updated 11 months ago
- Official code for the paper "Attention as a Hypernetwork" · ☆33 · Updated 10 months ago
- [NeurIPS 2024] Official implementation of the paper "MambaLRP: Explaining Selective State Space Sequence Models". · ☆38 · Updated 6 months ago
- Repository for the paper: "TiC-CLIP: Continual Training of CLIP Models". · ☆102 · Updated 11 months ago