Xiuyuan-Chen / AutoEval-Video
☆37 · Jan 25, 2024 · Updated 2 years ago
Alternatives and similar repositories for AutoEval-Video
Users who are interested in AutoEval-Video are comparing it to the repositories listed below.
- ☆17 · Feb 22, 2024 · Updated last year
- VideoHallucer, The first comprehensive benchmark for hallucination detection in large video-language models (LVLMs) ☆42 · Dec 16, 2025 · Updated last month
- [TMLR 2024] Official implementation of "Sight Beyond Text: Multi-Modal Training Enhances LLMs in Truthfulness and Ethics" ☆20 · Sep 15, 2023 · Updated 2 years ago
- ☆28 · Nov 10, 2025 · Updated 3 months ago
- [NeurIPS 2024] A Novel Rank-Based Metric for Evaluating Large Language Models ☆57 · May 28, 2025 · Updated 8 months ago
- [ICCV 2025] LVBench: An Extreme Long Video Understanding Benchmark ☆136 · Jul 9, 2025 · Updated 7 months ago
- ☆32 · Apr 18, 2021 · Updated 4 years ago
- Official Repo for the paper: VCR: Visual Caption Restoration. Check arxiv.org/pdf/2406.06462 for details. ☆32 · Feb 26, 2025 · Updated 11 months ago
- Code for our Paper "All in an Aggregated Image for In-Image Learning" ☆29 · Apr 9, 2024 · Updated last year
- Bayes-Adaptive RL for LLM Reasoning ☆45 · May 28, 2025 · Updated 8 months ago
- Repository related to Cranfield's AAI MSCs GDP ☆11 · Apr 8, 2023 · Updated 2 years ago
- Official code for ICLR 2024 paper "Do Generated Data Always Help Contrastive Learning?" ☆31 · Apr 4, 2024 · Updated last year
- The official GitHub page for ''Evaluating Object Hallucination in Large Vision-Language Models'' ☆247 · Aug 21, 2025 · Updated 5 months ago
- RAMPA: Robotic Augmented Reality for Machine Programming by Demonstration https://arxiv.org/abs/2410.13412 ☆16 · Oct 6, 2025 · Updated 4 months ago
- ☆10 · Updated this week
- ☆12 · Jan 11, 2026 · Updated last month
- ☆12 · Jun 26, 2024 · Updated last year
- ☆10 · Nov 15, 2023 · Updated 2 years ago
- P1AC: Revisiting Absolute Pose From a Single Affine Correspondence ☆11 · Mar 19, 2024 · Updated last year
- [CVPR2024] Learning from Synthetic Human Group Activities ☆14 · Feb 24, 2025 · Updated 11 months ago
- A framework for few-shot evaluation of autoregressive language models. ☆12 · Jul 14, 2025 · Updated 7 months ago
- DOMAINEVAL is an auto-constructed benchmark for multi-domain code generation that consists of 2k+ subjects (i.e., description, reference … ☆14 · Dec 12, 2024 · Updated last year
- LED : Light Enhanced Depth Estimation at Night ☆13 · Dec 9, 2025 · Updated 2 months ago
- A Swedish Natural Language Understanding Benchmark ☆11 · Dec 12, 2025 · Updated 2 months ago
- Official Repository of MMLONGBENCH-DOC: Benchmarking Long-context Document Understanding with Visualizations ☆121 · Sep 28, 2025 · Updated 4 months ago
- [EMNLP'23] The official GitHub page for ''Evaluating Object Hallucination in Large Vision-Language Models'' ☆106 · Aug 21, 2025 · Updated 5 months ago
- benchmarks for evaluating MT models ☆11 · Jun 26, 2024 · Updated last year
- ☆13 · Jan 13, 2025 · Updated last year
- Shaping Language Models with Cognitive Insights ☆15 · Feb 29, 2024 · Updated last year
- ☆10 · Nov 21, 2023 · Updated 2 years ago
- YOLO for Uniform Directed Object detection ☆13 · Mar 28, 2024 · Updated last year
- ☆12 · Nov 5, 2024 · Updated last year
- ☆10 · Aug 24, 2023 · Updated 2 years ago
- Align, a general text alignment function ☆15 · Dec 7, 2023 · Updated 2 years ago
- This is IABAC Project. The project's business rationale entails utilizing the dataset's provided features to forecast employee performanc… ☆11 · Dec 28, 2024 · Updated last year
- ☆12 · Oct 24, 2023 · Updated 2 years ago
- ☆10 · May 12, 2018 · Updated 7 years ago
- ☆11 · Aug 12, 2024 · Updated last year
- ☆10 · Aug 29, 2023 · Updated 2 years ago