lucidrains / mirasol-pytorch
Implementation of 🌻 Mirasol, SOTA Multimodal Autoregressive model out of Google Deepmind, in Pytorch
⭐91 · Dec 22, 2023 · Updated 2 years ago
Alternatives and similar repositories for mirasol-pytorch
Users who are interested in mirasol-pytorch are comparing it to the libraries listed below.
- Implementation of an Attention layer where each head can attend to more than just one token, using coordinate descent to pick topk · ⭐47 · Jul 16, 2023 · Updated 2 years ago
- Implementation of the Kalman Filtering Attention proposed in "Kalman Filtering Attention for User Behavior Modeling in CTR Prediction" · ⭐59 · Oct 22, 2023 · Updated 2 years ago
- Some personal experiments around routing tokens to different autoregressive attention, akin to mixture-of-experts · ⭐123 · Oct 17, 2024 · Updated last year
- Yet another random morning idea to be quickly tried and architecture shared if it works; to allow the transformer to pause for any amount… · ⭐53 · Oct 22, 2023 · Updated 2 years ago
- Implementation of MaMMUT, a simple vision-encoder text-decoder architecture for multimodal tasks from Google, in Pytorch · ⭐104 · Oct 10, 2023 · Updated 2 years ago
- Experiments around a simple idea for inducing multiple hierarchical predictive models within a GPT · ⭐224 · Aug 20, 2024 · Updated last year
- An attempt to merge ESBN with Transformers, to endow Transformers with the ability to emergently bind symbols · ⭐16 · Aug 3, 2021 · Updated 4 years ago
- Implementation of Cross Transformer for spatially-aware few-shot transfer, in Pytorch · ⭐54 · Mar 30, 2021 · Updated 4 years ago
- Implementation of the transformer proposed in "Building Blocks for a Complex-Valued Transformer Architecture" · ⭐88 · Oct 13, 2023 · Updated 2 years ago
- Implementation of Nvidia's NeuralPlexer, for end-to-end differentiable design of functional small-molecules and ligand-binding proteins, … · ⭐52 · Nov 20, 2023 · Updated 2 years ago
- Implementation of MetNet-3, SOTA neural weather model out of Google Deepmind, in Pytorch · ⭐237 · Nov 16, 2023 · Updated 2 years ago
- Local Attention - Flax module for Jax · ⭐22 · May 26, 2021 · Updated 4 years ago
- Implementation of the proposed Adam-atan2 from Google Deepmind in Pytorch · ⭐135 · Oct 15, 2025 · Updated 4 months ago
- Utilities for PyTorch distributed · ⭐25 · Feb 27, 2025 · Updated 11 months ago
- Implementation of the algorithm detailed in paper "Evolutionary design of molecules based on deep learning and a genetic algorithm" · ⭐24 · Dec 15, 2023 · Updated 2 years ago
- Explorations into the proposal from the paper "Grokfast, Accelerated Grokking by Amplifying Slow Gradients" · ⭐103 · Dec 22, 2024 · Updated last year
- Reference implementation of models from Nyonic Model Factory · ⭐12 · May 13, 2024 · Updated last year
- Computing with sed: a compiler from python to sed · ⭐11 · May 24, 2019 · Updated 6 years ago
- 2020 Xiamen International Bank Digital Innovation Finance Cup Modeling Competition: excellence award solution · ⭐11 · Feb 2, 2021 · Updated 5 years ago
- [ICML2023] Instant Soup: Cheap Pruning Ensembles in A Single Pass Can Draw Lottery Tickets from Large Models. Ajay Jaiswal, Shiwei Liu, Ti… · ⭐11 · Nov 28, 2023 · Updated 2 years ago
- DatasetResearch: Benchmarking Agent Systems for Demand-Driven Dataset Discovery · ⭐20 · Sep 24, 2025 · Updated 4 months ago
- Implementation of SelfExtend from the paper "LLM Maybe LongLM: Self-Extend LLM Context Window Without Tuning" from Pytorch and Zeta · ⭐13 · Nov 11, 2024 · Updated last year
- Python package to download and use the SSB datasets · ⭐11 · Aug 3, 2023 · Updated 2 years ago
- Reference implementation for Reward-Augmented Decoding: Efficient Controlled Text Generation With a Unidirectional Reward Model · ⭐45 · Oct 1, 2025 · Updated 4 months ago
- Fine-tune copilot based on your codebase · ⭐12 · Mar 26, 2024 · Updated last year
- Implementation of TiTok, proposed by Bytedance in "An Image is Worth 32 Tokens for Reconstruction and Generation" · ⭐182 · Jun 20, 2024 · Updated last year
- Exploring finetuning public checkpoints on filter 8K sequences on Pile · ⭐115 · Mar 22, 2023 · Updated 2 years ago
- ⭐13 · May 12, 2025 · Updated 9 months ago
- ICME2022 Special Session “Beyond Accuracy: Responsible, Responsive, and Robust Multimedia Retrieval” · ⭐12 · Jun 3, 2024 · Updated last year
- Original code base for On Pretraining Data Diversity for Self-Supervised Learning · ⭐14 · Dec 30, 2024 · Updated last year
- Implementation of the LDP module block in PyTorch and Zeta from the paper: "MobileVLM: A Fast, Strong and Open Vision Language Assistant … · ⭐15 · Mar 11, 2024 · Updated last year
- Implementation of the Belief State Encoder / Decoder in the new breakthrough robotics paper from ETH Zürich · ⭐84 · Apr 23, 2025 · Updated 9 months ago
- A vast array of Multi-Modal Embodied Robotic Foundation Models! · ⭐28 · Mar 18, 2024 · Updated last year
- Implementation of the Triangle Multiplicative module, used in Alphafold2 as an efficient way to mix rows or columns of a 2d feature map, … · ⭐39 · Aug 3, 2021 · Updated 4 years ago
- Implementation of Token Shift GPT - An autoregressive model that solely relies on shifting the sequence space for mixing · ⭐49 · Jan 27, 2022 · Updated 4 years ago
- ⭐14 · Feb 9, 2025 · Updated last year
- HALO: Hadamard-Assisted Low-Precision Optimization and Training method for finetuning LLMs. The official implementation of https://arx… · ⭐29 · Feb 17, 2025 · Updated last year
- ⭐16 · Nov 23, 2023 · Updated 2 years ago
- ANE accelerated embedding models! · ⭐20 · Dec 11, 2024 · Updated last year