apsdehal / flava-tutorials
Tutorials for the FLAVA model (https://arxiv.org/abs/2112.04482)
☆12 · Updated 3 years ago
Alternatives and similar repositories for flava-tutorials
Users who are interested in flava-tutorials are comparing it to the libraries listed below:
- This repository provides a comprehensive collection of research papers focused on multimodal representation learning, all of which have b… ☆82 · Updated 6 months ago
- Exploring multimodal fusion-type transformer models for visual question answering (on the DAQUAR dataset) ☆37 · Updated 3 years ago
- Implementation of TableFormer, Robust Transformer Modeling for Table-Text Encoding, in PyTorch ☆39 · Updated 3 years ago
- Repository for the Multilingual-VQA task, created during the HuggingFace JAX/Flax community week. ☆34 · Updated 4 years ago
- ☆98 · Updated last year
- ☆40 · Updated last year
- Implementation of 🌻 Mirasol, SOTA Multimodal Autoregressive model out of Google DeepMind, in PyTorch ☆91 · Updated 2 years ago
- ☆66 · Updated 3 years ago
- ☆101 · Updated 3 years ago
- Implementation of the DeepMind Flamingo vision-language model, based on Hugging Face language models and ready for training ☆168 · Updated 2 years ago
- Generate text captions for images from their embeddings. ☆117 · Updated 2 years ago
- ☆133 · Updated 2 years ago
- ☆64 · Updated 4 years ago
- A collection of multimodal datasets and visual features for VQA and captioning in PyTorch. Just run "pip install multimodal" ☆83 · Updated 3 years ago
- Implementation of Zorro, Masked Multimodal Transformer, in PyTorch ☆98 · Updated 2 years ago
- [TMLR 2022] High-Modality Multimodal Transformer ☆117 · Updated last year
- Code for the DDP tutorial ☆32 · Updated 3 years ago
- ☆30 · Updated 2 years ago
- Implementation code of the work "Exploiting Multiple Sequence Lengths in Fast End to End Training for Image Captioning" ☆94 · Updated last year
- Implementation of Memformer, a Memory-augmented Transformer, in PyTorch ☆126 · Updated 5 years ago
- Official Implementation of "Geometric Multimodal Contrastive Representation Learning" (https://arxiv.org/abs/2202.03390) ☆28 · Updated last year
- PyTorch implementation of FNet: Mixing Tokens with Fourier transforms ☆28 · Updated 4 years ago
- PyTorch implementation of Soft MoE by Google Brain in "From Sparse to Soft Mixtures of Experts" (https://arxiv.org/pdf/2308.00951.pdf) ☆81 · Updated 2 years ago
- Implementation of transformer-based architectures in PyTorch. ☆55 · Updated 5 years ago
- Video descriptions of research papers relating to foundation models and scaling ☆30 · Updated 2 years ago
- Easiest way of fine-tuning HuggingFace video classification models ☆147 · Updated 2 years ago
- This repository collects various multi-modal transformer architectures, including image transformer, video transformer, image-languag… ☆233 · Updated 3 years ago
- Implementation of the general framework for AMIE, from the paper "Towards Conversational Diagnostic AI", out of Google DeepMind ☆72 · Updated last year
- ☆60 · Updated 3 years ago
- SimVLM: Simple Visual Language Model Pretraining with Weak Supervision ☆36 · Updated 3 years ago