[ICCV 2025] A Benchmark for Multi-Step Reasoning in Long Narrative Videos
☆24Aug 8, 2025Updated 6 months ago
Alternatives and similar repositories for VRBench
Users who are interested in VRBench are comparing it to the repositories listed below.
- Champion Solutions repository for Perception Test challenges in ICCV2023 workshop.☆14Oct 18, 2023Updated 2 years ago
- Video-Holmes: Can MLLM Think Like Holmes for Complex Video Reasoning?☆87Jul 13, 2025Updated 7 months ago
- Doodling our way to AGI ✏️ 🖼️ 🧠☆121May 29, 2025Updated 9 months ago
- ☆21May 19, 2025Updated 9 months ago
- Building a multi-agent RAG system with advanced RAG methods☆12Jan 12, 2025Updated last year
- [CVPR 2026] Official repo for "VideoSSR: Video Self-Supervised Reinforcement Learning"☆32Nov 11, 2025Updated 3 months ago
- A simple exam generator and grader written in Python with OpenCV☆14Jan 14, 2026Updated last month
- ☆12Jan 4, 2026Updated last month
- Common template for PyTorch projects. Easy to extend and modify for new projects.☆13Dec 13, 2022Updated 3 years ago
- ☆28Jan 5, 2026Updated last month
- VisualToolAgent (VisTA): A Reinforcement Learning Framework for Visual Tool Selection☆25May 31, 2025Updated 8 months ago
- Surrogate Modeling of the Aerodynamic Performance for Transonic Regime☆13Feb 12, 2024Updated 2 years ago
- mouse pet-ct image segmentation☆12Feb 19, 2023Updated 3 years ago
- RACE is a multi-dimensional benchmark for code generation that focuses on Readability, mAintainability, Correctness, and Efficiency.☆12Oct 12, 2024Updated last year
- Generate a 3D BIM Model from 2D CAD Drawings☆12Nov 23, 2022Updated 3 years ago
- [ICML2023] Long-Term Rhythmic Video Soundtracker☆62Jul 28, 2025Updated 7 months ago
- A browser based CadQuery server☆12Feb 18, 2025Updated last year
- Evaluating Visual Fidelity of Image Descriptions☆11Aug 15, 2019Updated 6 years ago
- Perceptual Position-aware Shapelet Network, accepted by ECML PKDD 2022☆13Jun 27, 2022Updated 3 years ago
- ☆15Dec 20, 2024Updated last year
- ☆37Nov 14, 2025Updated 3 months ago
- More reliable Video Understanding Evaluation☆14Sep 23, 2025Updated 5 months ago
- A comprehensive collection of data quality resources, tools, papers, and projects across various data types including traditional data, L…☆25Aug 29, 2025Updated 6 months ago
- [AAAI 2021] "ROSITA: Refined BERT cOmpreSsion with InTegrAted techniques", Yuanxin Liu, Zheng Lin, Fengcheng Yuan☆14Oct 18, 2022Updated 3 years ago
- Karras et al. (2022) diffusion models for PyTorch☆17Oct 5, 2023Updated 2 years ago
- Python multi-process web crawler with file/SQL storage☆10Mar 7, 2022Updated 3 years ago
- This is the official repository of Daily-Omni: Towards Audio-Visual Reasoning with Temporal Alignment across Modalities☆38Updated this week
- Commonly used OCR datasets☆14Nov 15, 2021Updated 4 years ago
- ☆13Feb 5, 2025Updated last year
- A survey on MM-LLMs for long video understanding: From Seconds to Hours: Reviewing MultiModal Large Language Models on Comprehensive Long…☆18Sep 12, 2025Updated 5 months ago
- Learning to Copy for Automatic Post-Editing (EMNLP 2019)☆11May 6, 2021Updated 4 years ago
- Code for "Theoretical Foundations of Deep Selective State-Space Models" (NeurIPS 2024)☆15Jan 7, 2025Updated last year
- [ICCV 23] A Simple Vision Transformer for Weakly Semi-supervised 3D Object Detection☆12Apr 12, 2024Updated last year
- RLHF for Video Diffusion Models☆23Jul 30, 2025Updated 7 months ago
- ☆14Jul 6, 2025Updated 7 months ago
- A dataset for multimodal machine translation☆13Dec 6, 2021Updated 4 years ago
- Fathom-DeepResearch: Unlocking Long Horizon Information Retrieval And Synthesis For SLMs☆54Oct 7, 2025Updated 4 months ago
- Initially a fork of the GitHub repository for the paper "Informer" accepted by AAAI 2021. Heavily modified since then.☆15Apr 7, 2023Updated 2 years ago
- ☆19Aug 8, 2024Updated last year