patrick-tssn / Awesome-Multimodal-Memory
Reading List of Memory Augmented Multimodal Research, including multimodal context modeling, memory in vision and robotics, and external memory/knowledge augmented MLLM.
☆39Updated 10 months ago
Alternatives and similar repositories for Awesome-Multimodal-Memory
Users who are interested in Awesome-Multimodal-Memory are comparing it to the repositories listed below
- MMSearch-R1 is an end-to-end RL framework that enables LMMs to perform on-demand, multi-turn search with real-world multimodal search too…☆254Updated 2 weeks ago
- This is a survey of research on AI scientists, AI researchers, AI engineers, and a series of AI-driven research studies☆77Updated 2 months ago
- [ICCV 2025] The official repository for "2.5 Years in Class: A Multimodal Textbook for Vision-Language Pretraining"☆164Updated 4 months ago
- Awesome Reasoning in MLLMs: Papers and Projects about learning to reason with MLLMs, including Chain-of-Thought (CoT), OpenAI o1, and Dee…☆55Updated 4 months ago
- OpenThinkIMG is an end-to-end open-source framework that empowers LVLMs to think with images.☆263Updated last month
- The development and future prospects of multimodal reasoning models.☆436Updated this week
- IKEA: Reinforced Internal-External Knowledge Synergistic Reasoning for Efficient Adaptive Search Agent☆60Updated 2 months ago
- Tool-Star: Empowering LLM-brained Multi-Tool Reasoner via Reinforcement Learning☆201Updated last week
- Collect every awesome work about r1!☆395Updated 2 months ago
- Scaling Preference Data Curation via Human-AI Synergy☆80Updated 2 weeks ago
- Efficient Agent Training for Computer Use☆114Updated last month
- Awesome-Large-Search-Models is a collection of papers and resources (Methods, Datasets and other resources) about awesome agentic search …☆110Updated 3 weeks ago
- RM-R1: Unleashing the Reasoning Potential of Reward Models☆113Updated 3 weeks ago
- Towards Large Multimodal Models as Visual Foundation Agents☆221Updated 2 months ago
- Computer Agent Arena: Test & compare AI agents in real desktop apps & web environments. Code/data coming soon!☆45Updated 3 months ago
- Repo for "VRAG-RL: Empower Vision-Perception-Based RAG for Visually Rich Information Understanding via Iterative Reasoning with Reinforce…☆280Updated 2 weeks ago
- Official implementation for "ScoreFlow: Mastering LLM Agent Workflows via Score-based Preference Optimization"☆79Updated last month
- MMR1: Advancing the Frontiers of Multimodal Reasoning☆162Updated 4 months ago
- [ICML 2025] Satori: Reinforcement Learning with Chain-of-Action-Thought Enhances LLM Reasoning via Autoregressive Search☆103Updated last month
- [ACL 2025] Code and data for OS-Genesis: Automating GUI Agent Trajectory Construction via Reverse Task Synthesis☆147Updated this week
- ScholarCopilot: Training Large Language Models for Academic Writing with Accurate Citations [COLM 2025]☆218Updated last week
- The All-in-one Judge Models introduced by Opencompass☆96Updated 4 months ago
- [ICLR'25 Oral] UGround: Universal GUI Visual Grounding for GUI Agents☆262Updated last month
- Codes for Visual Sketchpad: Sketching as a Visual Chain of Thought for Multimodal Language Models☆244Updated 8 months ago
- A Large-Scale, Challenging, Decontaminated, and Verifiable Mathematical Dataset for Advancing Reasoning☆228Updated last month
- [NeurIPS 2024] A task generation and model evaluation system for multimodal language models.☆71Updated 7 months ago