lucidrains / zorro-pytorchLinks

Implementation of Zorro, Masked Multimodal Transformer, in Pytorch

☆96

Alternatives and similar repositories for zorro-pytorch

Users that are interested in zorro-pytorch are comparing it to the libraries listed below

Sorting:

lucidrains / mirasol-pytorch
Implementation of 🌻 Mirasol, SOTA Multimodal Autoregressive model out of Google Deepmind, in Pytorch
☆88Updated last year
lucidrains / MaMMUT-pytorch
Implementation of MaMMUT, a simple vision-encoder text-decoder architecture for multimodal tasks from Google, in Pytorch
☆102Updated 2 years ago
lucidrains / mixture-of-attention
Some personal experiments around routing tokens to different autoregressive attention, akin to mixture-of-experts
☆119Updated last year
fkodom / soft-mixture-of-experts
PyTorch implementation of Soft MoE by Google Brain in "From Sparse to Soft Mixtures of Experts" (https://arxiv.org/pdf/2308.00951.pdf)
☆78Updated 2 years ago
lucidrains / compositional-attention-pytorch
Implementation of "compositional attention" from MILA, a multi-head attention variant that is reframed as a two-step attention process wi…
☆51Updated 3 years ago
pliang279 / HighMMT
[TMLR 2022] High-Modality Multimodal Transformer
☆117Updated 11 months ago
lucidrains / discrete-key-value-bottleneck-pytorch
Implementation of Discrete Key / Value Bottleneck, in Pytorch
☆88Updated 2 years ago
lucidrains / infini-transformer-pytorch
Implementation of Infini-Transformer in Pytorch
☆113Updated 9 months ago
haoliuhl / language-quantized-autoencoders
Language Quantized AutoEncoders
☆110Updated 2 years ago
lucidrains / gated-state-spaces-pytorch
Implementation of Gated State Spaces, from the paper "Long Range Language Modeling via Gated State Spaces", in Pytorch
☆101Updated 2 years ago
lucidrains / self-reasoning-tokens-pytorch
Exploration into the proposed "Self Reasoning Tokens" by Felipe Bonetto
☆57Updated last year
lucidrains / agent-attention-pytorch
Implementation of Agent Attention in Pytorch
☆91Updated last year
lucidrains / AMIE-pytorch
Implementation of the general framework for AMIE, from the paper "Towards Conversational Diagnostic AI", out of Google Deepmind
☆68Updated last year
lucidrains / gradnorm-pytorch
A practical implementation of GradNorm, Gradient Normalization for Adaptive Loss Balancing, in Pytorch
☆110Updated last month
microsoft / ResiDual
ResiDual: Transformer with Dual Residual Connections, https://arxiv.org/abs/2304.14802
☆96Updated 2 years ago
lucidrains / kalman-filtering-attention
Implementation of the Kalman Filtering Attention proposed in "Kalman Filtering Attention for User Behavior Modeling in CTR Prediction"
☆59Updated 2 years ago
lucidrains / gateloop-transformer
Implementation of GateLoop Transformer in Pytorch and Jax
☆90Updated last year
pliang279 / FactorCL
[NeurIPS 2023] Factorized Contrastive Learning: Going Beyond Multi-view Redundancy
☆71Updated last year
lucidrains / block-recurrent-transformer-pytorch
Implementation of Block Recurrent Transformer - Pytorch
☆221Updated last year
jiaweizzhao / ZerO-initialization
☆75Updated 2 years ago
lucidrains / bidirectional-cross-attention
A simple cross attention that updates both the source and target in one step
☆182Updated 2 months ago
alextamkin / dabs
A Domain-Agnostic Benchmark for Self-Supervised Learning
☆106Updated 2 years ago
lucidrains / Mega-pytorch
Implementation of Mega, the Single-head Attention with Multi-headed EMA architecture that currently holds SOTA on Long Range Arena
☆206Updated 2 years ago
lucidrains / perceiver-ar-pytorch
Implementation of Perceiver AR, Deepmind's new long-context attention network based on Perceiver architecture, in Pytorch
☆91Updated 2 years ago
lucidrains / medical-ai-experiments
A repository to house some personal attempts to beat some state-of-the-art for medical datasets
☆100Updated last year
lucidrains / pause-transformer
Yet another random morning idea to be quickly tried and architecture shared if it works; to allow the transformer to pause for any amount…
☆52Updated 2 years ago
lucidrains / taylor-series-linear-attention
Explorations into the recently proposed Taylor Series Linear Attention
☆99Updated last year
lucidrains / soft-moe-pytorch
Implementation of Soft MoE, proposed by Brain's Vision team, in Pytorch
☆331Updated 6 months ago
lucidrains / CoLT5-attention
Implementation of the conditionally routed attention in the CoLT5 architecture, in Pytorch
☆230Updated last year
lucidrains / coordinate-descent-attention
Implementation of an Attention layer where each head can attend to more than just one token, using coordinate descent to pick topk
☆45Updated 2 years ago