Implementation of the DeepMind Flamingo vision-language model, based on Hugging Face language models and ready for training.
★171 · Apr 27, 2023 · Updated 3 years ago
Alternatives and similar repositories for flamingo-mini
Users who are interested in flamingo-mini are comparing it to the libraries listed below.
- Implementation of 🦩 Flamingo, state-of-the-art few-shot visual question answering attention net out of DeepMind, in PyTorch ★1,267 · Oct 18, 2022 · Updated 3 years ago
- An open-source framework for training large multimodal models. ★4,106 · Aug 31, 2024 · Updated last year
- ★11 · Nov 21, 2024 · Updated last year
- MultimodalC4 is a multimodal extension of c4 that interleaves millions of images with text. ★953 · Mar 19, 2025 · Updated last year
- Using pretrained encoder and language models to generate captions from multimedia inputs. ★100 · Mar 11, 2023 · Updated 3 years ago
- SimVLM: Simple Visual Language Model Pretraining with Weak Supervision ★36 · Nov 7, 2022 · Updated 3 years ago
- Code and models for the ICML 2023 paper "Grounding Language Models to Images for Multimodal Inputs and Outputs". ★485 · Oct 30, 2023 · Updated 2 years ago
- ★12 · Mar 14, 2023 · Updated 3 years ago
- SVIT: Scaling up Visual Instruction Tuning ★168 · Jun 20, 2024 · Updated last year
- ★12 · Feb 11, 2026 · Updated 4 months ago
- DataComp: In search of the next generation of multimodal datasets ★782 · Apr 28, 2025 · Updated last year
- Implementation of LaTr: Layout-aware transformer for scene-text VQA, a novel multimodal architecture for Scene Text Visual Question Answer… ★56 · Oct 30, 2024 · Updated last year
- Official implementation of SEED-LLaMA (ICLR 2024). ★641 · Sep 21, 2024 · Updated last year
- A reimplementation of KOSMOS-1 from "Language Is Not All You Need: Aligning Perception with Language Models" ★27 · Mar 3, 2023 · Updated 3 years ago
- 🦦 Otter, a multi-modal model based on OpenFlamingo (open-sourced version of DeepMind's Flamingo), trained on MIMIC-IT and showcasing imp… ★3,409 · Mar 5, 2024 · Updated 2 years ago
- ★134 · Dec 22, 2023 · Updated 2 years ago
- Chain of Images for Intuitively Reasoning ★10 · Nov 29, 2023 · Updated 2 years ago
- ★25 · Jun 5, 2023 · Updated 3 years ago
- This code provides a PyTorch implementation for OTTER (Optimal Transport distillation for Efficient zero-shot Recognition), as described … ★71 · Dec 20, 2021 · Updated 4 years ago
- Official code for the ICLR 2023 paper Compositional Prompt Tuning with Motion Cues for Open-vocabulary Video Relation Detection ★43 · Jun 4, 2024 · Updated 2 years ago
- Deep Learning for Video Retrieval by Natural Language ★11 · Oct 20, 2019 · Updated 6 years ago
- [ICLR 2024] Fine-tuning LLaMA to follow Instructions within 1 Hour and 1.2M Parameters ★5,921 · Mar 14, 2024 · Updated 2 years ago
- GRiT: A Generative Region-to-text Transformer for Object Understanding (ECCV 2024) ★341 · Jan 8, 2024 · Updated 2 years ago
- ★203 · May 10, 2023 · Updated 3 years ago
- GIT: A Generative Image-to-text Transformer for Vision and Language