opallab / positional_attention
Source code for the paper "Positional Attention: Expressivity and Learnability of Algorithmic Computation"
☆14 · Updated last week
Alternatives and similar repositories for positional_attention
Users interested in positional_attention are comparing it to the repositories listed below.
- 🧮 Algebraic Positional Encodings. ☆13 · Updated 4 months ago
- PyTorch implementation for "Long Horizon Temperature Scaling", ICML 2023 ☆20 · Updated 2 years ago
- Your favourite classical machine learning algos on the GPU/TPU ☆20 · Updated 5 months ago
- Minimum Description Length probing for neural network representations ☆19 · Updated 4 months ago
- ☆23 · Updated last week
- An annotated implementation of the Hyena Hierarchy paper ☆33 · Updated 2 years ago
- ☆31 · Updated 7 months ago
- ☆32 · Updated 8 months ago
- ☆32 · Updated last year
- Code for GFlowNet-EM, a novel algorithm for fitting latent variable models with compositional latents and an intractable true posterior. ☆40 · Updated last year
- ☆11 · Updated 3 months ago
- ☆53 · Updated 8 months ago
- Engineering the state of RNN language models (Mamba, RWKV, etc.) ☆32 · Updated last year
- JAX/Flax implementation of the Hyena Hierarchy ☆34 · Updated 2 years ago
- Code for "Accelerating Training with Neuron Interaction and Nowcasting Networks" [to appear at ICLR 2025] ☆19 · Updated last week
- A simple example of VAEs with KANs ☆12 · Updated last year
- Efficient scaling laws and collaborative pretraining. ☆16 · Updated 4 months ago
- ☆11 · Updated last year
- Implementation of MambaFormer in PyTorch ++ Zeta from the paper: "Can Mamba Learn How to Learn? A Comparative Study on In-Context Learnin…" ☆21 · Updated last week
- Official Code Repository for the paper "Key-value memory in the brain" ☆26 · Updated 3 months ago
- ☆21 · Updated 8 months ago
- Why Do We Need Weight Decay in Modern Deep Learning? [NeurIPS 2024] ☆66 · Updated 8 months ago
- ☆31 · Updated last year
- A scalable implementation of diffusion and flow matching with XGBoost models, applied to calorimeter data. ☆18 · Updated 7 months ago
- Code for "Theoretical Foundations of Deep Selective State-Space Models" (NeurIPS 2024) ☆12 · Updated 4 months ago
- Official repository for the paper "Approximating Two-Layer Feedforward Networks for Efficient Transformers" ☆37 · Updated last year
- The Energy Transformer block, in JAX ☆56 · Updated last year
- Implementation of Spectral State Space Models ☆16 · Updated last year
- Official repository of the paper "RNNs Are Not Transformers (Yet): The Key Bottleneck on In-context Retrieval" ☆27 · Updated last year
- Implementation for robust ViT and scaled attention ☆19 · Updated 2 months ago