om-ai-lab / OmModel
A collection of strong multimodal models for building multimodal AGI agents
☆38Updated 4 months ago
Related projects
Alternatives and complementary repositories for OmModel
- ☆74Updated 8 months ago
- Repo for Benchmarking Multimodal Retrieval Augmented Generation with Dynamic VQA Dataset and Self-adaptive Planning Agent☆70Updated last week
- A Framework for Decoupling and Assessing the Capabilities of VLMs☆38Updated 4 months ago
- ☆23Updated 3 months ago
- Representing Rule-based Chatbots with Transformers☆18Updated 4 months ago
- Official Code for GPT4Video: A Unified Multimodal Large Language Model for Instruction-Followed Understanding and Safety-Aware Generation☆134Updated 3 weeks ago
- Enable Next-sentence Prediction for Large Language Models with Faster Speed, Higher Accuracy and Longer Context☆17Updated 3 months ago
- This is the official repo of "QuickLLaMA: Query-aware Inference Acceleration for Large Language Models"☆39Updated 4 months ago
- Multimodal Open-O1 (MO1) is designed to enhance the accuracy of inference models by utilizing a novel prompt-based approach. This tool wo…☆26Updated 2 months ago
- Video dataset dedicated to portrait-mode video recognition.☆38Updated 7 months ago
- XVERSE-MoE-A36B: A multilingual large language model developed by XVERSE Technology Inc.☆36Updated 2 months ago
- Official implementation of MIA-DPO☆41Updated 3 weeks ago
- ☆17Updated last year
- ☆69Updated 6 months ago
- This repo contains evaluation code for the paper "MileBench: Benchmarking MLLMs in Long Context"☆26Updated 4 months ago
- ☆35Updated 3 months ago
- Official repository of MMDU dataset☆75Updated last month
- Source code for MMEvalPro, a more trustworthy and efficient benchmark for evaluating LMMs☆22Updated last month
- [NeurIPS 2024] Needle In A Multimodal Haystack (MM-NIAH): A comprehensive benchmark designed to systematically evaluate the capability of…☆102Updated last month
- ☆30Updated this week
- Precision Search through Multi-Style Inputs☆54Updated 4 months ago
- A suite of multimodal language models that are powerful and efficient☆16Updated 2 months ago
- Empirical Study Towards Building An Effective Multi-Modal Large Language Model☆23Updated last year
- ☆30Updated 6 months ago
- ICML'2024 | MMT-Bench: A Comprehensive Multimodal Benchmark for Evaluating Large Vision-Language Models Towards Multitask AGI☆95Updated 4 months ago
- [EMNLP 2023] TESTA: Temporal-Spatial Token Aggregation for Long-form Video-Language Understanding☆47Updated 10 months ago
- PyTorch implementation of HyperLLaVA: Dynamic Visual and Language Expert Tuning for Multimodal Large Language Models☆28Updated 8 months ago
- MMIU: Multimodal Multi-image Understanding for Evaluating Large Vision-Language Models☆55Updated 2 months ago
- VoCo-LLaMA: This repo is the official implementation of "VoCo-LLaMA: Towards Vision Compression with Large Language Models".☆83Updated 4 months ago