inFaaa / Evolver
[COLING 2025 🔥] Evolver: Chain-of-Evolution Prompting to Boost Large Multimodal Models for Hateful Meme Detection
☆16 · Updated 10 months ago
Alternatives and similar repositories for Evolver
Users who are interested in Evolver are comparing it to the repositories listed below.
- [EMNLP 2024 Findings 🔥] Official implementation of "LOOK-M: Look-Once Optimization in KV Cache for Efficient Multimodal Long-Context In… ☆104 · Updated last year
- Official code and data for the ACL 2024 Findings paper "An Empirical Study on Parameter-Efficient Fine-Tuning for MultiModal Large Language Models" ☆23 · Updated last year
- [EMNLP 2024] mDPO: Conditional Preference Optimization for Multimodal Large Language Models. ☆83 · Updated last year
- A Self-Training Framework for Vision-Language Reasoning ☆87 · Updated 10 months ago
- ☆29 · Updated 5 months ago
- [NeurIPS 2025] NoisyRollout: Reinforcing Visual Reasoning with Data Augmentation ☆98 · Updated 2 months ago
- ☆22 · Updated 6 months ago
- CoT-Valve: Length-Compressible Chain-of-Thought Tuning ☆87 · Updated 9 months ago
- [NeurIPS 2025] More Thinking, Less Seeing? Assessing Amplified Hallucination in Multimodal Reasoning Models ☆69 · Updated 6 months ago
- V1: Toward Multimodal Reasoning by Designing Auxiliary Task ☆36 · Updated 7 months ago
- ✨✨ The Curse of Multi-Modalities (CMM): Evaluating Hallucinations of Large Multimodal Models across Language, Visual, and Audio ☆50 · Updated 4 months ago
- Official repository of the video reasoning benchmark MMR-V. Can Your MLLMs "Think with Video"? ☆36 · Updated 5 months ago
- Agentic MLLMs ☆87 · Updated last month
- ACL 2025: SoftCoT: Soft Chain-of-Thought for Efficient Reasoning with LLMs, and preprint: SoftCoT++: Test-Time Scaling with Soft Chain-of… ☆64 · Updated 6 months ago
- [ICML 2025] M-STAR (Multimodal Self-Evolving TrAining for Reasoning) Project. Diving into Self-Evolving Training for Multimodal Reasoning ☆69 · Updated 4 months ago
- ☆61 · Updated 6 months ago
- VideoNIAH: A Flexible Synthetic Method for Benchmarking Video MLLMs ☆51 · Updated 8 months ago
- Code release for VTW (AAAI 2025 Oral) ☆64 · Updated 3 weeks ago
- ☆123 · Updated last week
- Less is More: Mitigating Multimodal Hallucination from an EOS Decision Perspective (ACL 2024) ☆55 · Updated last year
- Official codebase for the paper Latent Visual Reasoning ☆42 · Updated last month
- GitHub repository for "Bring Reason to Vision: Understanding Perception and Reasoning through Model Merging" (ICML 2025) ☆82 · Updated 2 months ago
- [ICLR 2025] The official PyTorch implementation of "Dynamic-LLaVA: Efficient Multimodal Large Language Models via Dynamic Vision-language Cont… ☆62 · Updated 2 months ago
- [ACM MM 2025] TimeChat-online: 80% Visual Tokens are Naturally Redundant in Streaming Videos ☆94 · Updated 2 months ago
- This repository will continuously update the latest papers, technical reports, and benchmarks about multimodal reasoning! ☆52 · Updated 8 months ago
- 🔥 An open-source survey of the latest video reasoning tasks, paradigms, and benchmarks. ☆28 · Updated this week
- [NeurIPS 2024] Calibrated Self-Rewarding Vision Language Models ☆81 · Updated last month
- MoCLE (First MLLM with MoE for instruction customization and generalization!) (https://arxiv.org/abs/2312.12379) ☆44 · Updated 4 months ago
- The official repository for the paper "ThinkMorph: Emergent Properties in Multimodal Interleaved Chain-of-Thought Reasoning" ☆111 · Updated last week
- Official repo for "AlignGPT: Multi-modal Large Language Models with Adaptive Alignment Capability" ☆34 · Updated last year