inFaaa / Evolver
[COLING 2025 🔥] Evolver: Chain-of-Evolution Prompting to Boost Large Multimodal Models for Hateful Meme Detection
☆15 · Updated 8 months ago
Alternatives and similar repositories for Evolver
Users that are interested in Evolver are comparing it to the libraries listed below
- [EMNLP 2024 Findings 🔥] Official implementation of "LOOK-M: Look-Once Optimization in KV Cache for Efficient Multimodal Long-Context In… · ☆100 · Updated 10 months ago
- A Self-Training Framework for Vision-Language Reasoning · ☆84 · Updated 8 months ago
- [EMNLP 2024] mDPO: Conditional Preference Optimization for Multimodal Large Language Models. · ☆81 · Updated 10 months ago
- V1: Toward Multimodal Reasoning by Designing Auxiliary Task · ☆36 · Updated 5 months ago
- NoisyRollout: Reinforcing Visual Reasoning with Data Augmentation · ☆88 · Updated last month
- ☆82 · Updated last year
- Less is More: Mitigating Multimodal Hallucination from an EOS Decision Perspective (ACL 2024) · ☆55 · Updated 10 months ago
- ☆55 · Updated 4 months ago
- Code release for VTW (AAAI 2025 Oral) · ☆49 · Updated 2 months ago
- Official Code and data for ACL 2024 finding, "An Empirical Study on Parameter-Efficient Fine-Tuning for MultiModal Large Language Models" · ☆23 · Updated 10 months ago
- ☆26 · Updated 3 months ago
- This repository will continuously update the latest papers, technical reports, benchmarks about multimodal reasoning! · ☆51 · Updated 6 months ago
- This repo contains evaluation code for the paper "MileBench: Benchmarking MLLMs in Long Context" · ☆35 · Updated last year
- ✨✨ The Curse of Multi-Modalities (CMM): Evaluating Hallucinations of Large Multimodal Models across Language, Visual, and Audio · ☆48 · Updated 2 months ago
- MME-CoT: Benchmarking Chain-of-Thought in LMMs for Reasoning Quality, Robustness, and Efficiency · ☆129 · Updated last month
- Doodling our way to AGI ✏️ 🖼️ 🧠 · ☆103 · Updated 3 months ago
- More Thinking, Less Seeing? Assessing Amplified Hallucination in Multimodal Reasoning Models · ☆56 · Updated 3 months ago
- [ACM MM 2025] TimeChat-online: 80% Visual Tokens are Naturally Redundant in Streaming Videos · ☆79 · Updated last week
- [ICML 2025] Official implementation of paper 'Look Twice Before You Answer: Memory-Space Visual Retracing for Hallucination Mitigation in… · ☆153 · Updated 2 weeks ago
- [arXiv2505] Think Silently, Think Fast: Dynamic Latent Compression of LLM Reasoning Chains · ☆50 · Updated last month
- [ICLR 2025] The official pytorch implement of "Dynamic-LLaVA: Efficient Multimodal Large Language Models via Dynamic Vision-language Cont… · ☆55 · Updated this week
- ☆19 · Updated 4 months ago
- ☆46 · Updated 5 months ago
- [ICML 2025] M-STAR (Multimodal Self-Evolving TrAining for Reasoning) Project. Diving into Self-Evolving Training for Multimodal Reasoning · ☆69 · Updated 2 months ago
- Code for "The Devil behind the mask: An emergent safety vulnerability of Diffusion LLMs" · ☆64 · Updated this week
- Co-Reinforcement Learning for Unified Multimodal Understanding and Generation · ☆26 · Updated 2 months ago
- ☆140 · Updated 7 months ago
- Github repository for "Bring Reason to Vision: Understanding Perception and Reasoning through Model Merging" (ICML 2025) · ☆74 · Updated 3 months ago
- ☆101 · Updated 2 months ago
- VoCoT: Unleashing Visually Grounded Multi-Step Reasoning in Large Multi-Modal Models · ☆73 · Updated last year