apsdehal / flava-tutorials
Tutorials for the FLAVA model (https://arxiv.org/abs/2112.04482)
☆12 Updated 3 years ago
Alternatives and similar repositories for flava-tutorials
Users who are interested in flava-tutorials are comparing it to the repositories listed below:
- Exploring multimodal fusion-type transformer models for visual question answering (on DAQUAR dataset) ☆37 Updated 3 years ago
- ☆134 Updated last year
- Implementation of TableFormer, Robust Transformer Modeling for Table-Text Encoding, in Pytorch ☆39 Updated 3 years ago
- ModelSoups for Tensorflow2 and Torch ☆50 Updated 3 years ago
- This repository provides a comprehensive collection of research papers focused on multimodal representation learning, all of which have b… ☆81 Updated 4 months ago
- ☆100 Updated 3 years ago
- https://slds-lmu.github.io/seminar_multimodal_dl/ ☆171 Updated 2 years ago
- Easiest way of fine-tuning HuggingFace video classification models ☆145 Updated 2 years ago
- ☆32 Updated 2 years ago
- PyTorch implementation of Soft MoE by Google Brain in "From Sparse to Soft Mixtures of Experts" (https://arxiv.org/pdf/2308.00951.pdf) ☆78 Updated 2 years ago
- Implementation of Zorro, Masked Multimodal Transformer, in Pytorch ☆96 Updated last year
- Generate text captions for images from their embeddings. ☆115 Updated 2 years ago
- TensorFlow implementation of Barlow Twins (https://arxiv.org/abs/2103.03230). ☆41 Updated 4 years ago
- Repo from the "Learning with limited labeled data" seminar @ Uni of Tuebingen. A collection of notes, notebooks and slideshows to underst… ☆17 Updated 2 years ago
- A modular PyTorch library for vision transformer models ☆163 Updated last year
- PyTorch implementation of FNet: Mixing Tokens with Fourier transforms ☆28 Updated 4 years ago
- The repository collects many various multi-modal transformer architectures, including image transformer, video transformer, image-languag… ☆229 Updated 3 years ago
- Fine-tuning OpenAI CLIP Model for Image Search on medical images ☆76 Updated 3 years ago
- Repository for Multilingual-VQA task created during HuggingFace JAX/Flax community week. ☆34 Updated 4 years ago
- Implementation of 🌻 Mirasol, SOTA Multimodal Autoregressive model out of Google Deepmind, in Pytorch ☆88 Updated last year
- Medium Articles Notebooks and Media Files ☆16 Updated last year
- Course repository for the Spring 2023 COMP664 course "Deep Learning" at UNC ☆15 Updated 2 years ago
- Code for any videos ☆29 Updated last year
- [TMLR 2022] High-Modality Multimodal Transformer ☆117 Updated 11 months ago
- Implementation of CaiT models in TensorFlow and ImageNet-1k checkpoints. Includes code for inference and fine-tuning. ☆12 Updated 2 years ago
- Conference schedule, top papers, and analysis of the data for NeurIPS 2023! ☆119 Updated last year
- Video descriptions of research papers relating to foundation models and scaling ☆30 Updated 2 years ago
- Implementation of MaMMUT, a simple vision-encoder text-decoder architecture for multimodal tasks from Google, in Pytorch ☆102 Updated 2 years ago
- ☆47 Updated 4 years ago
- Implementation of Swin Transformers in TensorFlow along with converted pre-trained models, code for off-the-shelf classification and fine… ☆60 Updated 3 years ago