albanie / foundation-models
Video descriptions of research papers relating to foundation models and scaling
☆30Updated 2 years ago
Alternatives and similar repositories for foundation-models:
Users who are interested in foundation-models are comparing it to the libraries listed below:
- ViLMA: A Zero-Shot Benchmark for Linguistic and Temporal Grounding in Video-Language Models (ICLR 2024, Official Implementation)☆16Updated last year
- ☆64Updated last year
- Code for the paper titled "CiT: Curation in Training for Effective Vision-Language Data".☆78Updated 2 years ago
- ☆45Updated 3 months ago
- ☆25Updated 6 months ago
- ☆32Updated last year
- Implementation of MaMMUT, a simple vision-encoder text-decoder architecture for multimodal tasks from Google, in Pytorch☆100Updated last year
- ☆22Updated 3 months ago
- The official repository for our paper "The Dual Form of Neural Networks Revisited: Connecting Test Time Predictions to Training Patterns …☆16Updated last year
- ☆29Updated 2 years ago
- Code release for "Improved baselines for vision-language pre-training"☆60Updated 11 months ago
- This is a PyTorch implementation of the paper "ViP: A Differentially Private Foundation Model for Computer Vision"☆36Updated last year
- Code for experiments for "ConvNet vs Transformer, Supervised vs CLIP: Beyond ImageNet Accuracy"☆101Updated 7 months ago
- PyTorch code for "Perceiver-VL: Efficient Vision-and-Language Modeling with Iterative Latent Attention" (WACV 2023)☆33Updated 2 years ago
- Patching open-vocabulary models by interpolating weights☆91Updated last year
- M4 experiment logbook☆57Updated last year
- Official implementation of the paper The Hidden Language of Diffusion Models☆72Updated last year
- [ECCV 2024] Official Release of SILC: Improving vision language pretraining with self-distillation☆42Updated 6 months ago
- Command-line tool for downloading and extending the RedCaps dataset.☆46Updated last year
- https://arxiv.org/abs/2209.15162☆49Updated 2 years ago
- PyTorch implementation of Soft MoE by Google Brain in "From Sparse to Soft Mixtures of Experts" (https://arxiv.org/pdf/2308.00951.pdf)☆71Updated last year
- Implementation of Bitune: Bidirectional Instruction-Tuning☆19Updated 10 months ago
- Implementation of 🌻 Mirasol, SOTA Multimodal Autoregressive model out of Google Deepmind, in Pytorch☆88Updated last year
- [CVPR 2023] HierVL: Learning Hierarchical Video-Language Embeddings☆46Updated last year
- We introduce a new approach, Token Reduction using CLIP Metric (TRIM), aimed at improving the efficiency of MLLMs without sacrificing their…☆12Updated 4 months ago
- [ICCV23] Official implementation of eP-ALM: Efficient Perceptual Augmentation of Language Models.☆27Updated last year
- ☆29Updated 2 years ago
- Easily run PyTorch on multiple GPUs & machines☆45Updated last month
- Official implementation and dataset for the NAACL 2024 paper "ComCLIP: Training-Free Compositional Image and Text Matching"☆35Updated 8 months ago
- LL3M: Large Language and Multi-Modal Model in Jax☆72Updated last year