brendel-group / objects-compositional-generalization
Official code for the paper "Provable Compositional Generalization for Object-Centric Learning" (ICLR 2024, oral)
☆15 · Updated last year
Alternatives and similar repositories for objects-compositional-generalization
Users who are interested in objects-compositional-generalization are comparing it to the repositories listed below.
- Code for GFlowNet-EM, a novel algorithm for fitting latent variable models with compositional latents and an intractable true posterior. ☆42 · Updated last year
- Code Release for "Broken Neural Scaling Laws" (BNSL) paper ☆59 · Updated 2 years ago
- [Preprint] AdaVAE: Exploring Adaptive GPT-2s in VAEs for Language Modeling PyTorch Implementation ☆37 · Updated 2 years ago
- Sequence Modeling with Multiresolution Convolutional Memory (ICML 2023) ☆127 · Updated 2 years ago
- Yet another random morning idea to be quickly tried and architecture shared if it works; to allow the transformer to pause for any amount… ☆53 · Updated 2 years ago
- Revisiting Efficient Training Algorithms For Transformer-based Language Models (NeurIPS 2023) ☆81 · Updated 2 years ago
- Sparse and discrete interpretability tool for neural networks ☆64 · Updated last year
- [NeurIPS 2023 spotlight] Official implementation of HGRN in our NeurIPS 2023 paper - Hierarchically Gated Recurrent Neural Network for Se… ☆66 · Updated last year
- Why Do We Need Weight Decay in Modern Deep Learning? [NeurIPS 2024] ☆70 · Updated last year
- Replicating and dissecting the git-re-basin project in one-click-replication Colabs ☆37 · Updated 3 years ago
- Sequence Modeling with Structured State Spaces ☆67 · Updated 3 years ago
- Implementation of Gated State Spaces, from the paper "Long Range Language Modeling via Gated State Spaces", in Pytorch ☆102 · Updated 2 years ago
- Pytorch Datasets for Easy-To-Hard ☆29 · Updated last year
- Latent Diffusion Language Models ☆70 · Updated 2 years ago
- Latest Weight Averaging (NeurIPS HITY 2022) ☆32 · Updated 2 years ago
- Curse-of-memory phenomenon of RNNs in sequence modelling ☆19 · Updated 9 months ago
- Scalable and Stable Parallelization of Nonlinear RNNs ☆28 · Updated 3 months ago
- The accompanying code for "Simplifying and Understanding State Space Models with Diagonal Linear RNNs" (Ankit Gupta, Harsh Mehta, Jonatha… ☆23 · Updated 3 years ago
- Memory Mosaics are networks of associative memories working in concert to achieve a prediction task. ☆57 · Updated last year
- Personal implementation of ASIF by Antonio Norelli ☆26 · Updated last year
- [NeurIPS 2022] Your Transformer May Not be as Powerful as You Expect (official implementation) ☆34 · Updated 2 years ago
- Code accompanying the paper "LaProp: a Better Way to Combine Momentum with Adaptive Gradient" ☆29 · Updated 5 years ago
- LayerNorm(SmallInit(Embedding)) in a Transformer to improve convergence ☆61 · Updated 3 years ago
- Official code for "Accelerating Feedforward Computation via Parallel Nonlinear Equation Solving", ICML 2021 ☆29 · Updated 4 years ago