Sparkles: Unlocking Chats Across Multiple Images for Multimodal Instruction-Following Models
☆45Jun 14, 2024Updated 2 years ago
Alternatives and similar repositories for Sparkles
Users who are interested in Sparkles are comparing it to the libraries listed below.
- ☆21Oct 10, 2023Updated 2 years ago
- Official repo for StableLLAVA☆95Dec 22, 2023Updated 2 years ago
- [ICML2022] "Identity-Disentangled Adversarial Augmentation for Self-Supervised Learning"☆10Jul 24, 2022Updated 3 years ago
- The released data for paper "Measuring and Improving Chain-of-Thought Reasoning in Vision-Language Models".☆34Sep 16, 2023Updated 2 years ago
- Findings of ACL 2021☆24May 8, 2021Updated 5 years ago
- Code associated with the paper: "Few-Shot Self-Rationalization with Natural Language Prompts"☆12Apr 27, 2022Updated 4 years ago
- Pipeline to scrape prompt + image url pairs from LAION `share-dalle-3` discord channel☆11Oct 10, 2023Updated 2 years ago
- The official GitHub page for ''What Makes for Good Visual Instructions? Synthesizing Complex Visual Reasoning Instructions for Visual Ins…☆19Nov 10, 2023Updated 2 years ago
- Code, data, models for the Sherlock corpus☆62Nov 11, 2022Updated 3 years ago
- Momentum Decoding: Open-ended Text Generation as Graph Exploration☆19Jan 27, 2023Updated 3 years ago
- ☆19Dec 6, 2023Updated 2 years ago
- 🦩 Official repository of paper "Visual Instruction Tuning with Polite Flamingo" (AAAI-24 Oral)☆65Dec 9, 2023Updated 2 years ago
- The official code for paper "EasyGen: Easing Multimodal Generation with a Bidirectional Conditional Diffusion Model and LLMs"☆73Nov 21, 2024Updated last year
- [ECCV 2024] BenchLMM: Benchmarking Cross-style Visual Capability of Large Multimodal Models☆86Aug 19, 2024Updated last year
- Source code for the paper "Prefix Language Models are Unified Modal Learners"☆45Apr 30, 2023Updated 3 years ago
- (CVPR2024) A benchmark for evaluating Multimodal LLMs using multiple-choice questions.☆364Jan 14, 2025Updated last year
- Code and data for EMNLP 2023 paper "Grounding Visual Illusions in Language: Do Vision-Language Models Perceive Illusions Like Humans?"☆15Jan 25, 2024Updated 2 years ago
- This repository is the official implementation of Unleashing the Power of Contrastive Self-Supervised Visual Models via Contrast-Regulari…☆21Dec 17, 2022Updated 3 years ago
- Benchmarking and Analyzing Generative Data for Visual Recognition☆26Jul 25, 2023Updated 2 years ago
- ☆37Oct 7, 2023Updated 2 years ago
- ☆354May 25, 2024Updated 2 years ago
- The official GitHub page for ''Evaluating Object Hallucination in Large Vision-Language Models''☆262Aug 21, 2025Updated 9 months ago
- [PACT'24] GraNNDis. A fast and unified distributed graph neural network (GNN) training framework for both full-batch (full-graph) and min…☆10Aug 13, 2024Updated last year
- Code for paper "Point and Ask: Incorporating Pointing into Visual Question Answering"☆19Oct 4, 2022Updated 3 years ago
- [CVPR'24] HallusionBench: You See What You Think? Or You Think What You See? An Image-Context Reasoning Benchmark Challenging for GPT-4V(…☆338Oct 14, 2025Updated 8 months ago
- MMICL, a state-of-the-art VLM with the in context learning ability from ICL, PKU☆360Dec 18, 2023Updated 2 years ago
- Official code repo of PIN: Positional Insert Unlocks Object Localisation Abilities in VLMs☆26Jan 14, 2025Updated last year
- AutoHallusion Codebase (EMNLP 2024)☆23Dec 6, 2024Updated last year
- This repository contains the official code for our paper: Thinking Before Looking: Improving Multimodal LLM Reasoning via Mitigating Visu…☆25Nov 15, 2024Updated last year
- [NeurIPS2023] Official implementation of the paper "Large Language Models are Visual Reasoning Coordinators"☆106Nov 9, 2023Updated 2 years ago
- Implementation of AlphaZero in PyTorch.☆10Apr 19, 2019Updated 7 years ago
- [ICLR'24] Mitigating Hallucination in Large Multi-Modal Models via Robust Instruction Tuning☆297Mar 13, 2024Updated 2 years ago
- Better Transition-Based AMR Parsing with a Refined Search Space (authors' DyNet implementation for the EMNLP18 paper)☆10Jun 13, 2019Updated 7 years ago
- [NeurIPS 2023 Datasets and Benchmarks Track] LAMM: Multi-Modal Large Language Models and Applications as AI Agents☆317Apr 16, 2024Updated 2 years ago
- ☆54Oct 17, 2023Updated 2 years ago
- A Comprehensive Benchmark for Robust Multi-image Understanding☆21Sep 4, 2024Updated last year
- [MIDL 2023] Official Implementation of "Making Your First Choice: To Address Cold Start Problem in Vision Active Learning"☆37Aug 3, 2023Updated 2 years ago
- A benchmark for evaluating the capabilities of large vision-language models (LVLMs)☆46Nov 17, 2023Updated 2 years ago
- ☆30Jan 3, 2023Updated 3 years ago