Code publication for the paper "Normalized Attention Without Probability Cage"
☆17 · Nov 9, 2021 · Updated 4 years ago
Alternatives and similar repositories for normalized-attention
Users that are interested in normalized-attention are comparing it to the libraries listed below.
- Suite of 500 procedurally-generated NLP tasks to study language model adaptability ☆21 · Jul 16, 2022 · Updated 3 years ago
- A GPT, made only of MLPs, in Jax ☆59 · Jun 23, 2021 · Updated 4 years ago
- 🖼️📊 ☆11 · Jun 9, 2020 · Updated 5 years ago
- MXNet implementation of CapsNet ☆29 · Nov 29, 2017 · Updated 8 years ago
- Code for the paper "Query-Key Normalization for Transformers" ☆53 · Mar 6, 2021 · Updated 5 years ago
- The official repository for our paper "The Neural Data Router: Adaptive Control Flow in Transformers Improves Systematic Generalization". ☆34 · Jun 11, 2025 · Updated 11 months ago
- A JAX nn library ☆21 · Sep 9, 2025 · Updated 8 months ago
- An implementation of Deep Generalized Canonical Correlation Analysis (DGCCA or Deep GCCA) with pytorch. ☆51 · Jun 11, 2020 · Updated 5 years ago
- MXNet/Gluon implementation of L-GM-Loss ☆11 · Oct 17, 2018 · Updated 7 years ago
- Implementation of Token Shift GPT - An autoregressive model that solely relies on shifting the sequence space for mixing ☆49 · Jan 27, 2022 · Updated 4 years ago
- Distributed SDDMM Kernel ☆12 · Jul 8, 2022 · Updated 3 years ago
- [EMNLP 2022] Language Model Pre-Training with Sparse Latent Typing ☆14 · Feb 10, 2023 · Updated 3 years ago
- Provides LiveReload.js compatible server as Boot task ☆11 · Nov 28, 2017 · Updated 8 years ago
- Maximal Mutual Information (MMI) Tagger ☆26 · Jun 6, 2019 · Updated 6 years ago
- ☆21 · Mar 14, 2021 · Updated 5 years ago
- Usable implementation of Emerging Symbol Binding Network (ESBN), in Pytorch ☆25 · Jan 6, 2021 · Updated 5 years ago
- sigma-MoE layer ☆21 · Jan 5, 2024 · Updated 2 years ago
- Estimating Q(s,s') with Deep Deterministic Dynamics Gradients ☆32 · Feb 21, 2020 · Updated 6 years ago
- The Codebase for Causal Distillation for Language Models (NAACL '22) ☆26 · May 1, 2022 · Updated 4 years ago
- We got a stew going! ☆27 · Oct 3, 2023 · Updated 2 years ago
- An implementation of 2021 paper by Geoffrey Hinton: "How to represent part-whole hierarchies in a neural network" in Pytorch. ☆57 · Mar 29, 2021 · Updated 5 years ago
- ☆42 · May 18, 2020 · Updated 6 years ago
- ☆11 · Jun 21, 2022 · Updated 3 years ago
- Immutant adapter for Luminus ☆10 · Sep 12, 2020 · Updated 5 years ago
- Graph-based and Transition-based dependency parsers based on BiLSTMs ☆30 · Jan 4, 2019 · Updated 7 years ago
- Code for the 2019 TACL Paper "Trick Me If You Can: Human-in-the-loop Generation of Adversarial Question Answering Examples" ☆36 · Jul 3, 2019 · Updated 6 years ago
- Implements the SM3-II adaptive optimization algorithm for PyTorch. ☆33 · Sep 3, 2024 · Updated last year
- Fine-Tuning Pre-trained Transformers into Decaying Fast Weights ☆19 · Oct 9, 2022 · Updated 3 years ago
- A conda-smithy repository for jaxlib. ☆17 · Apr 23, 2026 · Updated last month
- [ACL 2025] Outlier-Safe Pre-Training for Robust 4-Bit Quantization of Large Language Models ☆39 · Nov 4, 2025 · Updated 6 months ago
- Nonequispaced FFTs on GPUs (based on NFFT: http://www.nfft.org) ☆11 · Apr 30, 2018 · Updated 8 years ago
- Non-invasive wearable circadian rhythm telemonitoring sensors ☆18 · Apr 16, 2024 · Updated 2 years ago
- JUnit XML output for Kaocha ☆15 · Oct 2, 2025 · Updated 7 months ago
- ☆14 · Jun 26, 2019 · Updated 6 years ago
- Code and analyses related to the ExaLearn drug design efforts ☆11 · Sep 30, 2020 · Updated 5 years ago
- opentqa is an open framework for textbook question answering, which includes xtqa, mcan, cmr, mfb, mutan. ☆11 · Mar 27, 2021 · Updated 5 years ago
- JAX implementation of Graph Attention Networks ☆13 · Jan 29, 2022 · Updated 4 years ago
- Source code of paper "BP-Transformer: Modelling Long-Range Context via Binary Partitioning" ☆127 · Apr 5, 2021 · Updated 5 years ago
- 🤖 Implementation of Self Normalizing Networks (SNN) in PyTorch. ☆13 · Jun 19, 2017 · Updated 8 years ago