shan18 / Perceiver-Resampler-XAttn-Captioning
Generating Captions via Perceiver-Resampler Cross-Attention Networks
☆17 · Updated 3 years ago
Alternatives and similar repositories for Perceiver-Resampler-XAttn-Captioning
Users interested in Perceiver-Resampler-XAttn-Captioning are comparing it to the libraries listed below.
- ☆65 · Updated 2 years ago
- ☆23 · Updated last year
- M4 experiment logbook ☆58 · Updated 2 years ago
- ☆103 · Updated 2 years ago
- ☆34 · Updated last year
- ☆211 · Updated 3 years ago
- Code release for "Improved baselines for vision-language pre-training" ☆62 · Updated last year
- ☆56 · Updated 2 years ago
- ☆64 · Updated last year
- Replicating and dissecting the git-re-basin project in one-click-replication Colabs ☆37 · Updated 3 years ago
- Minimal sharded dataset loaders, decoders, and utils for multi-modal document, image, and text datasets. ☆160 · Updated last year
- ☆52 · Updated last year
- Implementation of 🌻 Mirasol, SOTA Multimodal Autoregressive model out of Google Deepmind, in Pytorch ☆91 · Updated 2 years ago
- Official code for "TOAST: Transfer Learning via Attention Steering" ☆188 · Updated 2 years ago
- Un-*** 50-billion multimodal dataset ☆23 · Updated 3 years ago
- Code and weights for the paper "Cluster and Predict Latent Patches for Improved Masked Image Modeling" ☆129 · Updated 2 weeks ago
- Command-line tool for downloading and extending the RedCaps dataset. ☆50 · Updated 2 years ago
- ☆191 · Updated last year
- Official repository of Pretraining Without Attention (BiGS); BiGS is the first model to achieve BERT-level transfer learning on the GLUE … ☆116 · Updated last year
- Official code and data for NeurIPS 2023 paper "ImageNet-Hard: The Hardest Images Remaining from a Study of the Power of Zoom and Spatial … ☆40 · Updated 2 years ago
- Switch EMA: A Free Lunch for Better Flatness and Sharpness ☆28 · Updated last year
- Official PyTorch Implementation for Paper "No More Adam: Learning Rate Scaling at Initialization is All You Need" ☆55 · Updated last year
- Implementation of fused cosine similarity attention in the same style as Flash Attention ☆220 · Updated 2 years ago
- Experimental CUDA kernel framework unifying typed dimensions, NVRTC JIT specialization, and ML-guided tuning. ☆46 · Updated last week
- Utilities for Training Very Large Models ☆58 · Updated last year
- ☆53 · Updated 2 years ago
- Code for the paper: "No Zero-Shot Without Exponential Data: Pretraining Concept Frequency Determines Multimodal Model Performance" [NeurI… ☆94 · Updated last year
- WIP ☆93 · Updated last year
- ☆16 · Updated last year
- Implementation of some personal helper functions for Einops, my most favorite tensor manipulation library ❤️ ☆57 · Updated 3 years ago