The official repository for our paper "The Dual Form of Neural Networks Revisited: Connecting Test Time Predictions to Training Patterns via Spotlights of Attention".
☆16 · Jun 11, 2025 · Updated 9 months ago
Alternatives and similar repositories for linear_layer_as_attention
Users who are interested in linear_layer_as_attention are comparing it to the libraries listed below.
- ☆14 · Nov 20, 2022 · Updated 3 years ago
- ☆12 · Nov 15, 2022 · Updated 3 years ago
- The official repository for our paper "The Neural Data Router: Adaptive Control Flow in Transformers Improves Systematic Generalization". ☆34 · Jun 11, 2025 · Updated 9 months ago
- Code for the paper "Refining Language Model with Compositional Explanation" (NeurIPS 2021) ☆11 · Oct 25, 2021 · Updated 4 years ago
- Spectral Attention Autoregressive Model (SAAM) ☆16 · Oct 27, 2022 · Updated 3 years ago
- ☆18 · Dec 12, 2025 · Updated 3 months ago
- This is an official pytorch implementation of 'Group-wise Inhibition based Feature Regularization for Robust Classification' (ICCV 2021 a… ☆10 · Dec 10, 2022 · Updated 3 years ago
- Official implementation of SIGIR 2022 Paper "Task-Oriented Dialogue System as Natural Language Generation". ☆14 · Apr 6, 2022 · Updated 3 years ago
- [CVPR 2024] Tune-An-Ellipse: CLIP Has Potential to Find What You Want ☆14 · Jan 5, 2025 · Updated last year
- Enable RNNLM lattice rescoring with Pytorch [kaldi] ☆12 · Jun 5, 2020 · Updated 5 years ago
- Official implementation of EMNLP 2021 Paper "Rethinking Zero-shot Neural Machine Translation: From a Perspective of Latent Variables" ☆12 · May 15, 2023 · Updated 2 years ago
- ☆15 · Jun 17, 2019 · Updated 6 years ago
- Reimplementation of facebook's DinoV2 in JAX. Inference (with pretrained weights) only; training is unsupported. ☆12 · Jun 25, 2024 · Updated last year
- Version control for my thesis conducted during the 10th semester in Electrical & Computer Engineering at Aristotle University of Thessalo… ☆11 · Jun 26, 2018 · Updated 7 years ago
- Implementation of a Hierarchical Mamba as described in the paper: "Hierarchical State Space Models for Continuous Sequence-to-Sequence Mo… ☆15 · Nov 11, 2024 · Updated last year
- ☆10 · Feb 12, 2020 · Updated 6 years ago
- [ICLR 2022] Pretraining Text Encoders with Adversarial Mixture of Training Signal Generators ☆26 · Jul 26, 2023 · Updated 2 years ago
- ☆10 · Mar 24, 2023 · Updated 3 years ago
- Official PyTorch implementation of CD-MOE ☆12 · Mar 18, 2026 · Updated last week
- Data pre-processing and training code on Open-X-Embodiment with pytorch ☆11 · Jan 20, 2025 · Updated last year
- ☆10 · Aug 26, 2022 · Updated 3 years ago
- A Tool for Intersecting Context-Free Grammars ☆10 · Dec 19, 2017 · Updated 8 years ago
- An implementation of the Prism layer (https://arxiv.org/abs/2011.04823) ☆12 · Nov 13, 2020 · Updated 5 years ago
- Look for arXiv papers in a Zotero library and find available DOIs of published versions. ☆14 · Apr 11, 2024 · Updated last year
- ☆12 · Oct 7, 2024 · Updated last year
- Deep Networks Grok All the Time and Here is Why ☆38 · May 18, 2024 · Updated last year
- ☆19 · Sep 2, 2025 · Updated 6 months ago
- Fairness-Aware Representation Learning by Suppressing Attribute-Class Associations ☆13 · Mar 19, 2026 · Updated last week
- ☆12 · Jul 6, 2022 · Updated 3 years ago
- Reference implementation of models from Nyonic Model Factory ☆12 · May 13, 2024 · Updated last year
- The Lean Theorem Proving Environment ☆15 · May 7, 2023 · Updated 2 years ago
- ☆34 · Aug 5, 2023 · Updated 2 years ago
- Implementation of "PaLM2-VAdapter:" from the multi-modal model paper: "PaLM2-VAdapter: Progressively Aligned Language Model Makes a Stron… ☆17 · Nov 11, 2024 · Updated last year
- Clustered Compositional Embeddings ☆11 · Oct 25, 2023 · Updated 2 years ago
- Residual vector quantization for KV cache compression in large language model ☆12 · Oct 22, 2024 · Updated last year
- A comprehensive collection of multilingual datasets and large language models, meticulously curated for evaluating and enhancing the perf… ☆19 · May 23, 2024 · Updated last year
- Plugin for QGIS 2.0 that calculates the complexity of polygon features. ☆14 · Oct 23, 2015 · Updated 10 years ago
- HiCRISP Full Code, containing VirtualHome, pybullet simulator and Real AGV platform. ☆15 · Apr 8, 2024 · Updated last year
- An npm package to perform a series of probability calculations with Markov Chains and Hidden Markov Models. ☆15 · May 8, 2019 · Updated 6 years ago