CLUEbenchmark / SuperCLUE-Video
A Chinese-native, multi-level evaluation benchmark for text-to-video generation
☆17 Updated 4 months ago
Related projects
Alternatives and complementary repositories for SuperCLUE-Video
- ☆66 Updated last year
- Chinese CLIP models with SOTA performance. ☆48 Updated last year
- ☆32 Updated 2 years ago
- ☆77 Updated 6 months ago
- ☆55 Updated 9 months ago
- A simple MLLM that surpasses QwenVL-Max using only open-source data with a 14B LLM. ☆36 Updated 2 months ago
- ☆27 Updated 6 months ago
- WuDaoMM: a data project. ☆66 Updated 2 years ago
- ☆17 Updated last year
- ☆46 Updated 2 months ago
- An open-source multimodal large language model based on baichuan-7b. ☆72 Updated 11 months ago
- Research Code for Multimodal-Cognition Team in Ant Group ☆123 Updated 4 months ago
- SuperCLUE-Math6: exploring a new generation of Chinese-native multi-turn, multi-step mathematical reasoning datasets. ☆46 Updated 9 months ago
- The world's first large-scale multi-modal short-video encyclopedia, where the primitive units are items, aspects, and short videos. ☆60 Updated 11 months ago
- ☆22 Updated 3 months ago
- Repo for Benchmarking Multimodal Retrieval Augmented Generation with Dynamic VQA Dataset and Self-adaptive Planning Agent ☆56 Updated this week
- ☆59 Updated last year
- Exploring Efficient Fine-Grained Perception of Multimodal Large Language Models ☆51 Updated 3 weeks ago
- Building a VLM model starts from the basic module. ☆10 Updated 7 months ago
- ☆30 Updated 6 months ago
- ☆68 Updated last week
- ☆13 Updated 3 months ago
- TaiSu (太素): a large-scale Chinese multimodal dataset (a Chinese vision-language pre-training dataset at the hundred-million scale) ☆175 Updated last year
- Touchstone: Evaluating Vision-Language Models by Language Models ☆78 Updated 10 months ago
- ☆15 Updated 8 months ago
- ☆19 Updated 2 years ago
- The official repo for “TextCoT: Zoom In for Enhanced Multimodal Text-Rich Image Understanding”. ☆35 Updated last month
- A light proxy solution for HuggingFace hub. ☆44 Updated last year
- Precision Search through Multi-Style Inputs ☆54 Updated 3 months ago