Official Repo for the paper: VCR: Visual Caption Restoration. Check arxiv.org/pdf/2406.06462 for details.
☆32Feb 26, 2025Updated last year
Alternatives and similar repositories for VCR
Users that are interested in VCR are comparing it to the libraries listed below
Sorting:
- MAP: Low-compute Model Merging with Amortized Pareto Fronts via Quadratic Approximation☆13Sep 2, 2024Updated last year
- ☆17Feb 22, 2024Updated 2 years ago
- ☆11Jan 3, 2024Updated 2 years ago
- [SCIS 2024] The official implementation of the paper "MMInstruct: A High-Quality Multi-Modal Instruction Tuning Dataset with Extensive Di…☆62Nov 7, 2024Updated last year
- [AAAI 2025] Does VLM Classification Benefit from LLM Description Semantics?☆25Aug 5, 2025Updated 7 months ago
- ☆17Jun 12, 2024Updated last year
- ☆19Sep 16, 2025Updated 5 months ago
- Official Repository of MMLONGBENCH-DOC: Benchmarking Long-context Document Understanding with Visualizations☆126Sep 28, 2025Updated 5 months ago
- ☆21Apr 2, 2025Updated 11 months ago
- ☆17Oct 22, 2024Updated last year
- ☆47Nov 8, 2024Updated last year
- ☆45May 21, 2024Updated last year
- The code repository for "Wings: Learning Multimodal LLMs without Text-only Forgetting" [NeurIPS 2024]☆26Dec 28, 2024Updated last year
- ☆19Jan 10, 2025Updated last year
- ☆48Sep 5, 2024Updated last year
- Generate interleaved text and image content in a structured format you can directly pass to downstream APIs.☆29Oct 18, 2024Updated last year
- [NeurIPS 2024] Calibrated Self-Rewarding Vision Language Models☆86Oct 26, 2025Updated 4 months ago
- The official implementation of the paper "MMFuser: Multimodal Multi-Layer Feature Fuser for Fine-Grained Vision-Language Understanding". …☆64Nov 5, 2024Updated last year
- code for "Strengthening Multimodal Large Language Model with Bootstrapped Preference Optimization"☆61Aug 23, 2024Updated last year
- The implementation for CIKM 2024: Towards Completeness-Oriented Tool Retrieval for Large Language Models.☆24Nov 6, 2024Updated last year
- This repo contains evaluation code for the paper "AV-Odyssey: Can Your Multimodal LLMs Really Understand Audio-Visual Information?"☆31Dec 23, 2024Updated last year
- ☆30May 22, 2024Updated last year
- ☆37Jan 25, 2024Updated 2 years ago
- TAT-DQA: Towards Complex Document Understanding By Discrete Reasoning☆23Sep 17, 2024Updated last year
- [ICLR 2025] Mathematical Visual Instruction Tuning for Multi-modal Large Language Models☆153Dec 5, 2024Updated last year
- Flash Attention in 300-500 lines of CUDA/C++☆36Aug 22, 2025Updated 6 months ago
- Cue-CoT: Chain-of-thought Prompting for Responding to In-depth Dialogue Questions with LLMs [EMNLP 2023 Findings]☆24Nov 18, 2023Updated 2 years ago
- Contrast-guided Feature Adjustment Module for Visual Information Extraction☆30May 23, 2023Updated 2 years ago
- [ICML 2024] | MMT-Bench: A Comprehensive Multimodal Benchmark for Evaluating Large Vision-Language Models Towards Multitask AGI☆117Jul 18, 2024Updated last year
- Parse LaTeX math expressions☆30Oct 22, 2024Updated last year
- [NeurIPS 2024] CharXiv: Charting Gaps in Realistic Chart Understanding in Multimodal LLMs☆141Apr 22, 2025Updated 10 months ago
- This repo contains code and data for ICLR 2025 paper MIA-Bench: Towards Better Instruction Following Evaluation of Multimodal LLMs☆37Mar 9, 2025Updated last year
- The MEDS Decentralized Extensible Validation (MEDS-DEV) Benchmark: Establishing Reproducibility and Comparability in ML for Health☆36Nov 20, 2025Updated 3 months ago
- Codebase for fine-tuning Llama2 70B to generate math test questions and answers.☆11Aug 30, 2024Updated last year
- [CVPR 2024] Data and benchmark code for the EgoExoLearn dataset☆80Aug 26, 2025Updated 6 months ago
- [ICLR 2024 Oral] Improving Convergence and Generalization Using Parameter Symmetries☆31May 29, 2024Updated last year
- exploring whether LLMs perform case-based or rule-based reasoning☆30Mar 2, 2024Updated 2 years ago
- [NeurIPS 2024] This repo contains evaluation code for the paper "Are We on the Right Way for Evaluating Large Vision-Language Models"☆203Sep 26, 2024Updated last year
- ☆35Nov 25, 2025Updated 3 months ago