[NeurIPS2023] Exploring Diverse In-Context Configurations for Image Captioning
☆44Nov 26, 2024Updated last year
Alternatives and similar repositories for ExploreCfg
Users that are interested in ExploreCfg are comparing it to the libraries listed below.
- Open source implementation of the paper "MM-Vid: Advancing Video Understanding with GPT-4V(ision)".☆40Jan 4, 2026Updated 4 months ago
- [CVPR2025] Number it: Temporal Grounding Videos like Flipping Manga☆146Jan 19, 2026Updated 3 months ago
- [CVPR 2024] How to Configure Good In-Context Sequence for Visual Question Answering☆21May 28, 2025Updated 11 months ago
- An in-context learning research testbed☆19Mar 16, 2025Updated last year
- [NIPS 25'] Evaluation code of paper "KRIS-Bench: Benchmarking Next-Level Intelligent Image Editing Models"☆45Oct 19, 2025Updated 6 months ago
- [NeurIPS 2024] Visual Perception by Large Language Model’s Weights☆56Mar 31, 2025Updated last year
- Training Vision Transformers for Semi-Supervised Semantic Segmentation☆16Nov 3, 2025Updated 6 months ago
- ☆20Sep 19, 2023Updated 2 years ago
- [ICLR'26] Easier Painting Than Thinking: Can Text-to-Image Models Set the Stage, but Not Direct the Play?☆52Mar 9, 2026Updated 2 months ago
- DualGNN: Dual Graph Neural Network for Micro-video Recommendation☆17Apr 8, 2026Updated last month
- Implementation of "Interleaved Latent Visual Reasoning with Selective Perceptual Modeling".☆50Apr 8, 2026Updated last month
- A curated list of papers, datasets and resources pertaining to zero-shot object detection.☆29Mar 15, 2023Updated 3 years ago
- Personalized Image Generation with Large Multimodal Models☆15May 13, 2025Updated 11 months ago
- [ICML 2025] Official code of "DAMA: Data- and Model-aware Alignment of Multi-modal LLMs"☆16May 24, 2025Updated 11 months ago
- [ICLR2024] (EvALign-ICL Benchmark) Beyond Task Performance: Evaluating and Reducing the Flaws of Large Multimodal Models with In-Context …☆22Mar 1, 2024Updated 2 years ago
- [ICCV 2025] HQ-CLIP: Leveraging Large Vision-Language Models to Create High-Quality Image-Text Datasets☆66Aug 6, 2025Updated 9 months ago
- A LLM model for space understanding☆25Sep 12, 2025Updated 7 months ago
- PyTorch implementation of Semi-Supervised Learning with Scarce Annotations https://arxiv.org/pdf/1905.08845.pdf☆13Jan 6, 2020Updated 6 years ago
- We introduce Reasoning via Video, a new paradigm that uses maze-solving video generation to probe multimodal reasoning; our VR-Bench show…☆60Feb 4, 2026Updated 3 months ago
- ☆48Apr 5, 2020Updated 6 years ago
- [ICCV 2025] Official Implementation of "Shot-by-Shot: Film-Grammar-Aware Training-Free Audio Description Generation". Junyu Xie, Tengda H…☆22Jul 26, 2025Updated 9 months ago
- ☆13Feb 25, 2025Updated last year
- ☆17Feb 23, 2025Updated last year
- iLLaVA: An Image is Worth Fewer Than 1/3 Input Tokens in Large Multimodal Models (ICLR2026)☆22Mar 29, 2026Updated last month
- The implementation of Learning Instance and Task-Aware Dynamic Kernels for Few Shot Learning☆13Apr 14, 2024Updated 2 years ago
- ☆12Jul 4, 2024Updated last year
- RO-ViT CVPR 2023 "Region-Aware Pretraining for Open-Vocabulary Object Detection with Vision Transformers"☆17Aug 24, 2023Updated 2 years ago
- ☆13May 13, 2025Updated 11 months ago
- The code of IJCAI2022 paper, Declaration-based Prompt Tuning for Visual Question Answering☆20May 10, 2022Updated 3 years ago
- ☆27Feb 2, 2024Updated 2 years ago
- Official Repository of Personalized Visual Instruct Tuning☆34Mar 6, 2025Updated last year
- ☆16Sep 12, 2023Updated 2 years ago
- 【ICCV 2023】Towards Instance-adaptive Inference for Federated Learning☆12Mar 31, 2025Updated last year
- WeChat Mini Program learning demo☆15Sep 14, 2018Updated 7 years ago
- ☆12Apr 25, 2025Updated last year
- Reproduced the DFT method without using Verl. https://arxiv.org/abs/2508.05629☆23Oct 14, 2025Updated 6 months ago
- ☆61Mar 23, 2022Updated 4 years ago
- Extend OpenRLHF to support LMM RL training for reproduction of DeepSeek-R1 on multimodal tasks.☆844May 14, 2025Updated 11 months ago
- ☆43May 30, 2025Updated 11 months ago