RhapsodyAILab / MiniCPM-V-Embedding
☆23 Updated 3 months ago
Related projects
Alternatives and complementary repositories for MiniCPM-V-Embedding
- ☆74 Updated 8 months ago
- The huggingface implementation of Fine-grained Late-interaction Multi-modal Retriever. ☆69 Updated 2 months ago
- A Framework for Decoupling and Assessing the Capabilities of VLMs ☆38 Updated 4 months ago
- Repo for Benchmarking Multimodal Retrieval Augmented Generation with Dynamic VQA Dataset and Self-adaptive Planning Agent ☆66 Updated this week
- MTVQA: Benchmarking Multilingual Text-Centric Visual Question Answering. A comprehensive evaluation of multimodal large model multilingua… ☆45 Updated last month
- This repo contains the code and data for "VLM2Vec: Training Vision-Language Models for Massive Multimodal Embedding Tasks" ☆77 Updated this week
- Code for Math-LLaVA: Bootstrapping Mathematical Reasoning for Multimodal Large Language Models ☆67 Updated 4 months ago
- Exploring Efficient Fine-Grained Perception of Multimodal Large Language Models ☆53 Updated 3 weeks ago
- ☆35 Updated 2 months ago
- ☆30 Updated 6 months ago
- [NeurIPS 2024] CharXiv: Charting Gaps in Realistic Chart Understanding in Multimodal LLMs ☆75 Updated last month
- A Simple MLLM Surpassed QwenVL-Max with OpenSource Data Only in 14B LLM. ☆36 Updated 2 months ago
- Official Repository of MMLONGBENCH-DOC: Benchmarking Long-context Document Understanding with Visualizations ☆57 Updated 4 months ago
- ☆17 Updated last year
- [NeurIPS 2024] Needle In A Multimodal Haystack (MM-NIAH): A comprehensive benchmark designed to systematically evaluate the capability of… ☆102 Updated last month
- imagetokenizer is a Python package that helps you encode visuals and generate visual token IDs from a codebook; it supports both image and video… ☆29 Updated 5 months ago
- An Experiment on Dynamic NTK Scaling RoPE ☆61 Updated 11 months ago
- ☆46 Updated 2 months ago
- code for Scaling Laws of RoPE-based Extrapolation ☆70 Updated last year
- Empirical Study Towards Building An Effective Multi-Modal Large Language Model ☆23 Updated last year
- Fantastic Data Engineering for Large Language Models ☆51 Updated 3 months ago
- HelloBench: Evaluating Long Text Generation Capabilities of Large Language Models ☆31 Updated this week
- Baichuan-Omni: Towards Capable Open-source Omni-modal LLM 🌊 ☆236 Updated 3 weeks ago
- Video dataset dedicated to portrait-mode video recognition. ☆38 Updated 7 months ago
- FuseAI Project ☆76 Updated 3 months ago
- Enable Next-sentence Prediction for Large Language Models with Faster Speed, Higher Accuracy and Longer Context ☆17 Updated 3 months ago
- code for paper 《RankingGPT: Empowering Large Language Models in Text Ranking with Progressive Enhancement》 ☆29 Updated 10 months ago
- Datasets and Evaluation Scripts for CompHRDoc ☆27 Updated 7 months ago
- ☆40 Updated 5 months ago
- Leveraging passage embeddings for efficient listwise reranking with large language models. ☆33 Updated last month