lucidrains / AMIE-pytorch

Implementation of the general framework for AMIE, from the paper "Towards Conversational Diagnostic AI", out of Google DeepMind

☆66

Alternatives and similar repositories for AMIE-pytorch

Users who are interested in AMIE-pytorch are comparing it to the libraries listed below.

lucidrains / mirasol-pytorch
Implementation of 🌻 Mirasol, SOTA Multimodal Autoregressive model out of Google DeepMind, in Pytorch
☆89 Updated last year
lucidrains / infini-transformer-pytorch
Implementation of Infini-Transformer in Pytorch
☆110 Updated 7 months ago
lucidrains / mixture-of-attention
Some personal experiments around routing tokens to different autoregressive attention, akin to mixture-of-experts
☆120 Updated 9 months ago
lucidrains / llama-qrlhf
Implementation of the Llama architecture with RLHF + Q-learning
☆166 Updated 6 months ago
fkodom / soft-mixture-of-experts
PyTorch implementation of Soft MoE by Google Brain in "From Sparse to Soft Mixtures of Experts" (https://arxiv.org/pdf/2308.00951.pdf)
☆75 Updated last year
lucidrains / medical-ai-experiments
A repository to house some personal attempts to beat some state-of-the-art for medical datasets
☆99 Updated last year
lucidrains / zorro-pytorch
Implementation of Zorro, Masked Multimodal Transformer, in Pytorch
☆97 Updated last year
eth-easl / fmengine
Utilities for Training Very Large Models
☆58 Updated 10 months ago
lucidrains / taylor-series-linear-attention
Explorations into the recently proposed Taylor Series Linear Attention
☆100 Updated 11 months ago
lucidrains / GAF-microbatch-pytorch
Implementation of Gradient Agreement Filtering, from Chaubard et al. of Stanford, but for single machine microbatches, in Pytorch
☆25 Updated 6 months ago
epfml / DenseFormer
☆81 Updated last year
bethgelab / frequency_determines_performance
Code for the paper: "No Zero-Shot Without Exponential Data: Pretraining Concept Frequency Determines Multimodal Model Performance" [NeurI…
☆90 Updated last year
naver-ai / model-stock
Model Stock: All we need is just a few fine-tuned models
☆119 Updated 10 months ago
taufeeque9 / codebook-features
Sparse and discrete interpretability tool for neural networks
☆63 Updated last year
lucidrains / MaMMUT-pytorch
Implementation of MaMMUT, a simple vision-encoder text-decoder architecture for multimodal tasks from Google, in Pytorch
☆103 Updated last year
lucidrains / pause-transformer
Yet another random morning idea to be quickly tried and architecture shared if it works; to allow the transformer to pause for any amount…
☆54 Updated last year
ml-jku / EVA
One Initialization to Rule them All: Fine-tuning via Explained Variance Adaptation
☆41 Updated 9 months ago
oripress / EntropyEnigma
Official code for the ICML 2024 paper "The Entropy Enigma: Success and Failure of Entropy Minimization"
☆53 Updated last year
multimodal-interpretability / maia
Official implementation of MAIA, A Multimodal Automated Interpretability Agent
☆83 Updated last month
OpenNLPLab / HGRN2
HGRN2: Gated Linear RNNs with State Expansion
☆55 Updated 11 months ago
lucidrains / tableformer-pytorch
Implementation of TableFormer, Robust Transformer Modeling for Table-Text Encoding, in Pytorch
☆39 Updated 3 years ago
bfshi / TOAST
Official code for "TOAST: Transfer Learning via Attention Steering"
☆189 Updated last year
KindXiaoming / physics_of_skill_learning
We study toy models of skill learning.
☆29 Updated 6 months ago
gregorbachmann / scaling_mlps
☆51 Updated last year
GenRobo / MatMamba
Code and pretrained models for the paper: "MatMamba: A Matryoshka State Space Model"
☆60 Updated 8 months ago
apple / ml-rpm-bench
☆41 Updated last year
lucidrains / simple-hierarchical-transformer
Experiments around a simple idea for inducing multiple hierarchical predictive models within a GPT
☆215 Updated 11 months ago
ekinakyurek / google-research
Google Research
☆46 Updated 2 years ago
lucidrains / quartic-transformer
Exploring an idea where one forgets about efficiency and carries out attention across each edge of the nodes (tokens)
☆52 Updated 4 months ago
tum-ai / number-token-loss
A regression-like loss to improve numerical reasoning in language models
☆24 Updated 2 weeks ago