Official Repo for the paper: VCR: Visual Caption Restoration. Check arxiv.org/pdf/2406.06462 for details.
☆32Feb 26, 2025Updated last year
Alternatives and similar repositories for VCR
Users that are interested in VCR are comparing it to the libraries listed below.
- ☆17Feb 22, 2024Updated 2 years ago
- The code repository for "Wings: Learning Multimodal LLMs without Text-only Forgetting" [NeurIPS 2024]☆27Dec 28, 2024Updated last year
- ☆13May 9, 2023Updated 2 years ago
- [SCIS 2024] The official implementation of the paper "MMInstruct: A High-Quality Multi-Modal Instruction Tuning Dataset with Extensive Di…☆62Nov 7, 2024Updated last year
- Experimenting with kernel density estimation and (soft) histograms using tensorflow data flow graphs.☆10Dec 28, 2017Updated 8 years ago
- ☆10Jun 28, 2023Updated 2 years ago
- [CVPR 2024] LoSh: Long-Short Text Joint Prediction Network for Referring Video Object Segmentation☆13Jun 17, 2024Updated last year
- Official repo for From Intention to Execution: Probing the Generalization Boundaries of Vision-Language-Action Models☆31Nov 2, 2025Updated 5 months ago
- ☆17Oct 22, 2024Updated last year
- [ACL 2025] Uncovering the Impact of Chain-of-Thought Reasoning for Direct Preference Optimization: Lessons from Text-to-SQL☆15Oct 9, 2025Updated 6 months ago
- QALD-9-Plus Dataset for Knowledge Graph Question Answering☆29Jun 5, 2024Updated last year
- ☆11Feb 14, 2022Updated 4 years ago
- Official repository of MMDU dataset☆107Sep 29, 2024Updated last year
- Open-source evaluation toolkit of large vision-language models (LVLMs), support ~100 VLMs, 30+ benchmarks☆15Feb 17, 2025Updated last year
- Official Repository of MMLONGBENCH-DOC: Benchmarking Long-context Document Understanding with Visualizations☆134Sep 28, 2025Updated 6 months ago
- [NeurIPS 2024] CharXiv: Charting Gaps in Realistic Chart Understanding in Multimodal LLMs☆145Apr 22, 2025Updated 11 months ago
- [ICML 2024] | MMT-Bench: A Comprehensive Multimodal Benchmark for Evaluating Large Vision-Language Models Towards Multitask AGI☆118Apr 6, 2026Updated last week
- CatMAE☆14Dec 13, 2023Updated 2 years ago
- Code for Paper: Harnessing Webpage UIs for Text-Rich Visual Understanding☆53Dec 12, 2024Updated last year
- ☆47Nov 8, 2024Updated last year
- PaCE: Parsimonious Concept Engineering for Large Language Models (NeurIPS 2024)☆43Jan 18, 2026Updated 3 months ago
- [NeurIPS 2024] Efficiency for Free: Ideal Data Are Transportable Representations☆19Jan 19, 2025Updated last year
- DELLA-Merging: Reducing Interference in Model Merging through Magnitude-Based Sampling☆36Jul 12, 2024Updated last year
- [ICLR 2025] Mathematical Visual Instruction Tuning for Multi-modal Large Language Models☆153Dec 5, 2024Updated last year
- ☆48Sep 5, 2024Updated last year
- Unofficial Paddle implementation of "Swin Transformer V2: Scaling Up Capacity and Resolution"☆33Nov 28, 2021Updated 4 years ago
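- Solves the problem of company or school computers being automatically disconnected from the network after a period of inactivity and requiring web-based login verification. Implemented in Python 3, it monitors the computer's network connection in real time and, when a disconnection is detected, launches Google Chrome to complete the web login automatically. As long as the computer stays on and the program keeps running, the computer never goes offline. Used with TeamViewer, it enables unattended…☆22Feb 15, 2019Updated 7 years ago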
- ☆18Jun 12, 2024Updated last year
- The code implementation of "M2DF: Multi-grained Multi-curriculum Denoising Framework for Multimodal Aspect-based Sentiment Analysis"☆17Dec 8, 2023Updated 2 years ago
- My GSoC Project Proposal, Presentation and other related stuff for Wikimedia Commons App☆12Aug 23, 2018Updated 7 years ago
- ☆21Apr 2, 2025Updated last year
- [NAACL 2025] Representing Rule-based Chatbots with Transformers☆23Feb 9, 2025Updated last year
- [NeurIPS 2024] Official implementation of "ClavaDDPM:Multi-relational Data Synthesis with Cluster-guided Diffusion Models"☆20Oct 27, 2024Updated last year
- ☆10Nov 28, 2023Updated 2 years ago
- [LREC-Coling 2024] PECC: Problem Extraction and Coding Challenges☆14May 30, 2024Updated last year
- This is repository of our SIGIR'19 paper Triple-to-Text: Converting RDF Triples into High-Quality Natural Languages via Optimizing an Inv…☆14Sep 10, 2021Updated 4 years ago
- An open source implementation of CLIP (With TULIP Support)☆164May 14, 2025Updated 11 months ago
- ☆19Sep 16, 2025Updated 7 months ago
- The repository of EMNLP 2023 "A Frustratingly Easy Plug-and-Play Detection-and-Reasoning Module for Chinese Spelling Check"☆20Nov 17, 2023Updated 2 years ago