☆60Feb 5, 2026Updated 3 weeks ago
Alternatives and similar repositories for MIRBench
Users who are interested in MIRBench are comparing it to the repositories listed below
- ☆55Feb 2, 2026Updated last month
- ☆57Feb 12, 2026Updated 2 weeks ago
- Official repository of the paper "Exploring What Why and How: A Multifaceted Benchmark for Causation Understanding of Video Anomaly"☆83Dec 25, 2024Updated last year
- Self-use code examples for remote management of the vsphere platform using the pyvmomi library☆66Jan 7, 2025Updated last year
- Cloud API-based English speaking practice application☆61Dec 29, 2024Updated last year
- AntiRec is a cross-platform app that uses advanced audio processing to subtly alter microphone input, preventing ASR recognition while ke…☆186Aug 18, 2025Updated 6 months ago
- [CVPR 2024] Official repository of the paper "Uncovering What, Why and How: A Comprehensive Benchmark for Causation Understanding of Vid…☆88Dec 23, 2025Updated 2 months ago
- Simple and efficient -- a novel unsupervised community detection with the fusion of modularity and network structure☆104Dec 26, 2024Updated last year
- Official implementation of MIA-DPO☆70Jan 23, 2025Updated last year
- Open foundation models, such as LLama2, ChatGLM, etc.☆119Sep 18, 2024Updated last year
- StreamingBench: Assessing the Gap for MLLMs to Achieve Streaming Video Understanding☆149May 16, 2025Updated 9 months ago
- Official implementation of "Holmes-VAD: Towards Unbiased and Explainable Video Anomaly Detection via Multi-modal LLM"☆149Mar 22, 2025Updated 11 months ago
- [Neurips2024] Source code for xRAG: Extreme Context Compression for Retrieval-augmented Generation with One Token☆172Jul 4, 2024Updated last year
- Video Chain of Thought, Codes for ICML 2024 paper: "Video-of-Thought: Step-by-Step Video Reasoning from Perception to Cognition"☆180Feb 25, 2025Updated last year
- PG-Video-LLaVA: Pixel Grounding in Large Multimodal Video Models☆262Aug 5, 2025Updated 6 months ago
- The code for "TokenPacker: Efficient Visual Projector for Multimodal LLM", IJCV2025☆276May 26, 2025Updated 9 months ago
- Awesome papers & datasets specifically focused on long-term videos.☆355Oct 9, 2025Updated 4 months ago
- [CVPR 2024 Highlight] OPERA: Alleviating Hallucination in Multi-Modal Large Language Models via Over-Trust Penalty and Retrospection-Allo…☆396Aug 24, 2024Updated last year
- The code of our paper "InfLLM: Unveiling the Intrinsic Capacity of LLMs for Understanding Extremely Long Sequences with Training-Free Mem…☆395Apr 20, 2024Updated last year
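- DeepAudit: an AI hacker team for everyone that puts vulnerability discovery within reach. China's first open-source multi-agent system for code vulnerability discovery. One-click deployment and operation for beginners, autonomous collaborative auditing plus automated sandbox PoC verification. Supports private deployment with Ollama and one-click report generation. Supports API relay services. Makes security no longer expensive and auditing no longer complicated.☆4,847Updated this week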
- R1-onevision, a visual language model capable of deep CoT reasoning.☆576Apr 13, 2025Updated 10 months ago
- MindSpore + 🤗Huggingface: Run any Transformers/Diffusers model on MindSpore with seamless compatibility and acceleration.☆910Updated this week
- 📖 A curated list of resources dedicated to hallucination of multimodal large language models (MLLM).☆983Sep 27, 2025Updated 5 months ago
- This repository provides a valuable reference for researchers in the field of multimodality. Please start your exploratory travel in RL-bas…☆1,360Updated this week
- [IJCV] Show-1: Marrying Pixel and Latent Diffusion Models for Text-to-Video Generation☆1,151Sep 13, 2025Updated 5 months ago
- A fork to add multimodal model training to open-r1☆1,493Feb 8, 2025Updated last year
- [ACL 2024 🔥] Video-ChatGPT is a video conversation model capable of generating meaningful conversation about videos. It combines the cap…☆1,492Aug 5, 2025Updated 6 months ago
- Awesome-LLM-3D: a curated list of resources on multi-modal large language models in the 3D world☆2,120Feb 3, 2026Updated last month
- [ICCV 2025] LLaVA-CoT, a visual language model capable of spontaneous, systematic reasoning☆2,130Dec 12, 2025Updated 2 months ago
- A list of awesome papers and resources of recommender system on large language model (LLM).☆2,215Mar 17, 2025Updated 11 months ago
- A collection of AWESOME things about Graph-Related LLMs.☆2,405Nov 5, 2025Updated 3 months ago
- 🔥🔥🔥 [IEEE TCSVT] Latest Papers, Codes and Datasets on Vid-LLMs.☆3,087Dec 20, 2025Updated 2 months ago
- Official repo for VGen: a holistic video generation ecosystem for video generation building on diffusion models☆3,154Jan 10, 2025Updated last year
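- A hands-on, step-by-step guide to Huggingface Transformers; course videos are updated in sync on Bilibili and YouTube☆3,795Jul 15, 2024Updated last year
- Learning internal network penetration testing from scratch☆3,010Apr 8, 2016Updated 9 years ago
- Chinese NLP solutions (large models, data, models, training, inference)☆3,779Aug 5, 2025Updated 6 months ago
- text2vec, text to vector. A text vector representation tool that converts text into vector matrices; implements text representation and text similarity models including Word2Vec, RankBM25, Sentence-BERT, and CoSENT, ready to use out of the box.☆4,940Feb 14, 2026Updated 2 weeks ago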
- Chinese version of CLIP which achieves Chinese cross-modal retrieval and representation generation.☆5,803Aug 29, 2025Updated 6 months ago
- Large models from China☆6,388Nov 30, 2024Updated last year