lucidrains / MaMMUT-pytorch
Implementation of MaMMUT, a simple vision-encoder text-decoder architecture for multimodal tasks from Google, in Pytorch
☆104 · Oct 10, 2023 · Updated 2 years ago
Alternatives and similar repositories for MaMMUT-pytorch
Users who are interested in MaMMUT-pytorch are comparing it to the libraries listed below:
- A Transformer made of Rotation-equivariant Attention using Vector Neurons ☆101 · Aug 1, 2023 · Updated 2 years ago
- [ICML 2022] Official implementation of "Score-Guided Intermediate Layer Optimization: Fast Langevin Mixing for Inverse Problems". ☆12 · Jul 19, 2022 · Updated 3 years ago
- Standalone Product Key Memory module in Pytorch - for augmenting Transformer models ☆87 · Nov 1, 2025 · Updated 3 months ago
- IntLLaMA: A fast and light quantization solution for LLaMA ☆18 · Jul 21, 2023 · Updated 2 years ago
- Implementation of Metaformer, but in an autoregressive manner ☆26 · Jun 21, 2022 · Updated 3 years ago
- Implementation of LogAvgExp for Pytorch ☆37 · Apr 10, 2025 · Updated 10 months ago
- Implementation of 🌻 Mirasol, SOTA Multimodal Autoregressive model out of Google Deepmind, in Pytorch ☆91 · Dec 22, 2023 · Updated 2 years ago
- Implementation and explorations into Blackbox Gradient Sensing (BGS), an evolutionary strategies approach proposed in a Google Deepmind p… ☆20 · Jul 20, 2025 · Updated 6 months ago
- Experiments around a simple idea for inducing multiple hierarchical predictive model within a GPT ☆224 · Aug 20, 2024 · Updated last year
- Implementation of an Attention layer where each head can attend to more than just one token, using coordinate descent to pick topk ☆47 · Jul 16, 2023 · Updated 2 years ago
- Implementation of GateLoop Transformer in Pytorch and Jax ☆92 · Jun 18, 2024 · Updated last year
- An attempt to merge ESBN with Transformers, to endow Transformers with the ability to emergently bind symbols ☆16 · Aug 3, 2021 · Updated 4 years ago
- Some personal experiments around routing tokens to different autoregressive attention, akin to mixture-of-experts ☆123 · Oct 17, 2024 · Updated last year
- Implementation of Hourglass Transformer, in Pytorch, from Google and OpenAI ☆98 · Dec 31, 2021 · Updated 4 years ago
- Implementation of MEGABYTE, Predicting Million-byte Sequences with Multiscale Transformers, in Pytorch ☆655 · Dec 27, 2024 · Updated last year
- Implementation of the proposed LVMAE, from the paper, Extending Video Masked Autoencoders to 128 frames, in Pytorch ☆55 · Nov 25, 2024 · Updated last year
- Implementation of MedSegDiff in Pytorch - SOTA medical segmentation using DDPM and filtering of features in fourier space ☆237 · Dec 3, 2023 · Updated 2 years ago
- Implementation of Recurrent Memory Transformer, Neurips 2022 paper, in Pytorch ☆422 · Jan 6, 2025 · Updated last year
- Implementation of Recurrent Interface Network (RIN), for highly efficient generation of images and video without cascading networks, in P… ☆207 · Feb 14, 2024 · Updated 2 years ago
- Research code of Cycle Generative Adversarial Networks for Complementary Item Recommendations. ☆21 · Mar 9, 2023 · Updated 2 years ago
- Fast Inference in Denoising Diffusion Models via MMD Finetuning ☆18 · Dec 4, 2023 · Updated 2 years ago
- Implementation of Dreamcraft3D, 3D content generation in Pytorch ☆81 · Oct 29, 2023 · Updated 2 years ago
- Implementation of Flash Attention in Jax ☆225 · Mar 1, 2024 · Updated last year
- Implementation of Soft MoE, proposed by Brain's Vision team, in Pytorch ☆344 · Apr 2, 2025 · Updated 10 months ago
- Graph Flow: Cross-layer Graph Flow Distillation for Dual Efficient Medical Image Segmentation ☆18 · Dec 29, 2022 · Updated 3 years ago
- Implementation of Multistream Transformers in Pytorch ☆54 · Jul 31, 2021 · Updated 4 years ago
- Implementation of CoCa, Contrastive Captioners are Image-Text Foundation Models, in Pytorch ☆1,200 · Dec 12, 2023 · Updated 2 years ago
- Pytorch implementation of our paper accepted by ECCV 2022 -- ARM: Any-Time Super-Resolution Method (https://arxiv.org/abs/2203.10812) ☆82 · Sep 28, 2022 · Updated 3 years ago
- A practical implementation of GradNorm, Gradient Normalization for Adaptive Loss Balancing, in Pytorch ☆126 · Aug 25, 2025 · Updated 5 months ago
- My attempts at applying Soundstream design on learned tokenization of text and then applying hierarchical attention to text generation ☆90 · Oct 11, 2024 · Updated last year
- Thispersondoesnotexist went down, so this time, while building it back up, I am going to open source all of it. ☆91 · Aug 26, 2023 · Updated 2 years ago
- Implementation of Discrete Key / Value Bottleneck, in Pytorch ☆88 · Jul 9, 2023 · Updated 2 years ago
- [WACV 2025] Exploiting VLM Localizability and Semantics for Open Vocabulary Action Detection ☆16 · Mar 23, 2025 · Updated 10 months ago
- Directed masked autoencoders ☆14 · Feb 5, 2026 · Updated last week
- Implementation of Mega, the Single-head Attention with Multi-headed EMA architecture that currently holds SOTA on Long Range Arena ☆207 · Aug 26, 2023 · Updated 2 years ago
- Implementation of the proposed Spline-Based Transformer from Disney Research ☆105 · Nov 9, 2024 · Updated last year
- Implementation of Zorro, Masked Multimodal Transformer, in Pytorch ☆98 · Oct 20, 2023 · Updated 2 years ago
- Implementation of MetNet-3, SOTA neural weather model out of Google Deepmind, in Pytorch ☆237 · Nov 16, 2023 · Updated 2 years ago
- Implementation of Block Recurrent Transformer - Pytorch ☆224 · Aug 20, 2024 · Updated last year