shan18 / Perceiver-Resampler-XAttn-Captioning
Generating Captions via Perceiver-Resampler Cross-Attention Networks
☆17 Updated 2 years ago
Alternatives and similar repositories for Perceiver-Resampler-XAttn-Captioning
Users who are interested in Perceiver-Resampler-XAttn-Captioning are comparing it to the libraries listed below
- ☆22 Updated 7 months ago
- ☆65 Updated last year
- ☆51 Updated last year
- Code release for "Improved baselines for vision-language pre-training" ☆60 Updated last year
- Minimal sharded dataset loaders, decoders, and utils for multi-modal document, image, and text datasets. ☆158 Updated last year
- Command-line tool for downloading and extending the RedCaps dataset. ☆48 Updated last year
- codebase for the SIMAT dataset and evaluation ☆38 Updated 3 years ago
- Official Repository of Pretraining Without Attention (BiGS), BiGS is the first model to achieve BERT-level transfer learning on the GLUE … ☆114 Updated last year
- ☆104 Updated last year
- M4 experiment logbook ☆58 Updated last year
- Code and weights for the paper "Cluster and Predict Latents Patches for Improved Masked Image Modeling" ☆115 Updated 4 months ago
- Implementation of MaMMUT, a simple vision-encoder text-decoder architecture for multimodal tasks from Google, in Pytorch ☆103 Updated last year
- Implementation of 🌻 Mirasol, SOTA Multimodal Autoregressive model out of Google Deepmind, in Pytorch ☆89 Updated last year
- Official code and data for NeurIPS 2023 paper "ImageNet-Hard: The Hardest Images Remaining from a Study of the Power of Zoom and Spatial … ☆39 Updated last year
- Utilities for Training Very Large Models ☆58 Updated 10 months ago
- Code for the paper: "No Zero-Shot Without Exponential Data: Pretraining Concept Frequency Determines Multimodal Model Performance" [NeurI… ☆90 Updated last year
- An open source implementation of CLIP. ☆32 Updated 2 years ago
- ☆83 Updated last year
- ☆182 Updated 10 months ago
- LL3M: Large Language and Multi-Modal Model in Jax ☆72 Updated last year
- Un-*** 50 billions multimodality dataset ☆23 Updated 2 years ago
- Implementation of some personal helper functions for Einops, my most favorite tensor manipulation library ❤️ ☆55 Updated 2 years ago
- gpu tester detects broken and slow gpus in a cluster ☆70 Updated 2 years ago
- JAX implementation ViT-VQGAN ☆83 Updated 2 years ago
- ☆208 Updated 2 years ago
- ☆51 Updated last year
- Patching open-vocabulary models by interpolating weights ☆91 Updated last year
- These papers will provide unique insightful concepts that will broaden your perspective on neural networks and deep learning ☆48 Updated last year
- understanding model mistakes with human annotations ☆106 Updated 2 years ago
- CUDA implementation of autoregressive linear attention, with all the latest research findings ☆44 Updated 2 years ago