togethercomputer / Dragonfly
☆75Updated 6 months ago
Alternatives and similar repositories for Dragonfly:
Users that are interested in Dragonfly are comparing it to the libraries listed below
- ☆63Updated 6 months ago
- ☆57Updated 9 months ago
- [WACV 2025] Official implementation of "Online-LoRA: Task-free Online Continual Learning via Low Rank Adaptation" by Xiwen Wei, Guihong L…☆35Updated 5 months ago
- Code for Paper: Harnessing Webpage Uis For Text Rich Visual Understanding☆50Updated 4 months ago
- (WACV 2025 - Oral) Vision-language conversation in 10 languages including English, Chinese, French, Spanish, Russian, Japanese, Arabic, H…☆84Updated 2 months ago
- Maya: An Instruction Finetuned Multilingual Multimodal Model using Aya☆107Updated last month
- Multimodal language model benchmark, featuring challenging examples☆163Updated 3 months ago
- ☆46Updated last week
- Official implementation of "Gemini in Reasoning: Unveiling Commonsense in Multimodal Large Language Models"☆36Updated last year
- ☆118Updated 7 months ago
- ☆68Updated 9 months ago
- This project is a collection of fine-tuning scripts to help researchers fine-tune Qwen 2 VL on HuggingFace datasets.☆65Updated 6 months ago
- Pytorch implementation of HyperLLaVA: Dynamic Visual and Language Expert Tuning for Multimodal Large Language Models☆28Updated last year
- Code for PHATGOOSE introduced in "Learning to Route Among Specialized Experts for Zero-Shot Generalization"☆82Updated last year
- ☆67Updated 8 months ago
- ☆64Updated last year
- ☆41Updated 8 months ago
- Anchored Preference Optimization and Contrastive Revisions: Addressing Underspecification in Alignment☆55Updated 7 months ago
- ☆47Updated 11 months ago
- ☆13Updated 4 months ago
- This is the official repository of our paper "What If We Recaption Billions of Web Images with LLaMA-3 ?"☆129Updated 10 months ago
- Evaluate the performance of computer vision models and prompts for zero-shot models (Grounding DINO, CLIP, BLIP, DINOv2, ImageBind, model…☆35Updated last year
- Code and data for "StructLM: Towards Building Generalist Models for Structured Knowledge Grounding" (COLM 2024)☆76Updated 5 months ago
- Code for "C3PO: Critical-Layer, Core-Expert, Collaborative Pathway Optimization for Test-Time Expert Re-Mixing"☆12Updated last week
- Fine-tuning OpenAI CLIP Model for Image Search on medical images☆76Updated 3 years ago
- NeurIPS 2023 - Cappy: Outperforming and Boosting Large Multi-Task LMs with a Small Scorer☆42Updated last year
- Code and pretrained models for the paper: "MatMamba: A Matryoshka State Space Model"☆59Updated 4 months ago
- A minimal implementation of LLaVA-style VLM with interleaved image & text & video processing ability.☆91Updated 4 months ago
- How Good is Google Bard's Visual Understanding? An Empirical Study on Open Challenges☆30Updated last year
- ☆45Updated 3 months ago