lucidrains / AMIE-pytorch
Implementation of the general framework for AMIE, from the paper "Towards Conversational Diagnostic AI", out of Google Deepmind
☆68 Updated last year
Alternatives and similar repositories for AMIE-pytorch
Users that are interested in AMIE-pytorch are comparing it to the libraries listed below
- Implementation of 🌻 Mirasol, SOTA Multimodal Autoregressive model out of Google Deepmind, in Pytorch ☆88 Updated last year
- Implementation of Infini-Transformer in Pytorch ☆113 Updated 9 months ago
- Implementation of the Llama architecture with RLHF + Q-learning ☆167 Updated 8 months ago
- Some personal experiments around routing tokens to different autoregressive attention, akin to mixture-of-experts ☆119 Updated last year
- A repository to house some personal attempts to beat some state-of-the-art for medical datasets ☆100 Updated last year
- PyTorch implementation of Soft MoE by Google Brain in "From Sparse to Soft Mixtures of Experts" (https://arxiv.org/pdf/2308.00951.pdf) ☆78 Updated 2 years ago
- Official implementation of MAIA, A Multimodal Automated Interpretability Agent ☆92 Updated this week
- Implementation of Gradient Agreement Filtering, from Chaubard et al. of Stanford, but for single machine microbatches, in Pytorch ☆25 Updated 9 months ago
- ☆81 Updated last year
- Yet another random morning idea to be quickly tried and architecture shared if it works; to allow the transformer to pause for any amount… ☆52 Updated 2 years ago
- Utilities for Training Very Large Models ☆58 Updated last year
- Repository for the paper: "TiC-CLIP: Continual Training of CLIP Models" ICLR 2024 ☆106 Updated last year
- Code and pretrained models for the paper: "MatMamba: A Matryoshka State Space Model" ☆61 Updated 11 months ago
- ☆69 Updated last year
- Code for the paper: "No Zero-Shot Without Exponential Data: Pretraining Concept Frequency Determines Multimodal Model Performance" [NeurI… ☆90 Updated last year
- Implementation of MaMMUT, a simple vision-encoder text-decoder architecture for multimodal tasks from Google, in Pytorch ☆102 Updated 2 years ago
- ☆80 Updated last year
- Explorations into the recently proposed Taylor Series Linear Attention ☆99 Updated last year
- Implementation of Zorro, Masked Multimodal Transformer, in Pytorch ☆96 Updated 2 years ago
- Collection of autoregressive model implementation ☆86 Updated 6 months ago
- Implementation of CALM from the paper "LLM Augmented LLMs: Expanding Capabilities through Composition", out of Google Deepmind ☆177 Updated last year
- Triton Implementation of HyperAttention Algorithm ☆48 Updated last year
- ☆43 Updated last year
- HGRN2: Gated Linear RNNs with State Expansion ☆55 Updated last year
- Official repository for the paper "SwitchHead: Accelerating Transformers with Mixture-of-Experts Attention" ☆99 Updated last year
- ☆55 Updated last year
- Official code for the ICML 2024 paper "The Entropy Enigma: Success and Failure of Entropy Minimization" ☆53 Updated last year
- Explorations into adversarial losses on top of autoregressive loss for language modeling ☆38 Updated 8 months ago
- ICLR 2025 - official implementation for "I-Con: A Unifying Framework for Representation Learning" ☆116 Updated 4 months ago
- Code for NeurIPS 2024 Spotlight: "Scaling Laws and Compute-Optimal Training Beyond Fixed Training Durations" ☆84 Updated 11 months ago