togethercomputer / Dragonfly
☆65 Updated 3 months ago
Related projects:
- E5-V: Universal Embeddings with Multimodal Large Language Models ☆148 Updated 2 months ago
- Code for PHATGOOSE introduced in "Learning to Route Among Specialized Experts for Zero-Shot Generalization" ☆76 Updated 6 months ago
- Official implementation of "Gemini in Reasoning: Unveiling Commonsense in Multimodal Large Language Models" ☆35 Updated 8 months ago
- Official implementation of MAIA, A Multimodal Automated Interpretability Agent ☆56 Updated last month
- ☆62 Updated 5 months ago
- ☆55 Updated 3 months ago
- (WACV 2025) Vision-language conversation in 10 languages including English, Chinese, French, Spanish, Russian, Japanese, Arabic, Hindi, B… ☆77 Updated last week
- ☆50 Updated 2 months ago
- ☆36 Updated last month
- Minimal sharded dataset loaders, decoders, and utils for multi-modal document, image, and text datasets. ☆150 Updated 5 months ago
- Multimodal language model benchmark, featuring challenging examples ☆144 Updated last month
- LongLLaVA: Scaling Multi-modal LLMs to 1000 Images Efficiently via Hybrid Architecture ☆115 Updated 2 weeks ago
- ☆99 Updated 3 weeks ago
- ☆84 Updated 8 months ago
- ☆116 Updated 3 months ago
- ☆38 Updated 4 months ago
- Official repository for the paper "SwitchHead: Accelerating Transformers with Mixture-of-Experts Attention" ☆87 Updated 8 months ago
- Matryoshka Multimodal Models ☆67 Updated 3 weeks ago
- [ACL 2024 Findings & ICLR 2024 WS] An Evaluator VLM that is open-source, offers reproducible evaluation, and inexpensive to use. Specific… ☆51 Updated last week
- Official implementation for the paper "Detecting and Mitigating Contextual Hallucinations in Large Language Models Using Only Attention M… ☆95 Updated last month
- Simple replication of [ColBERT-v1](https://arxiv.org/abs/2004.12832). ☆73 Updated 6 months ago
- LL3M: Large Language and Multi-Modal Model in Jax ☆62 Updated 4 months ago
- M4 experiment logbook ☆56 Updated last year
- ☆68 Updated last month
- Anchored Preference Optimization and Contrastive Revisions: Addressing Underspecification in Alignment ☆39 Updated 3 weeks ago
- Code for Visual Sketchpad: Sketching as a Visual Chain of Thought for Multimodal Language Models ☆93 Updated last month
- This is the official repository of our paper "What If We Recaption Billions of Web Images with LLaMA-3?" ☆115 Updated 3 months ago
- Implementation of the general framework for AMIE, from the paper "Towards Conversational Diagnostic AI", out of Google DeepMind ☆51 Updated this week
- ☆23 Updated last month
- Model Stock: All we need is just a few fine-tuned models ☆75 Updated 5 months ago