rickyang1114 / multimodal-deepresearcher
Multimodal Deepresearcher: Generating Text-Chart Interleaved Reports From Scratch with Agentic Framework
☆25 Updated 3 months ago
Alternatives and similar repositories for multimodal-deepresearcher
Users who are interested in multimodal-deepresearcher are comparing it to the repositories listed below
- [ACL2025 Findings] Benchmarking Multihop Multimodal Internet Agents ☆47 Updated 8 months ago
- ☆28 Updated 3 weeks ago
- [ACM MM25] LongWriter-V: Enabling Ultra-Long and High-Fidelity Generation in Vision-Language Models ☆20 Updated 7 months ago
- Official implementation of Self-Taught Agentic Long Context Understanding (ACL 2025). ☆10 Updated last month
- Official Implementation of Flash-Searcher: Fast and Effective Web Agents via DAG-Based Parallel Execution ☆42 Updated last week
- An End-to-End Model with Adaptive Filtering for Retrieval-Augmented Generation ☆15 Updated last year
- This repo contains code for the paper "Both Text and Images Leaked! A Systematic Analysis of Data Contamination in Multimodal LLM" ☆16 Updated 3 weeks ago
- ☆32 Updated 3 months ago
- Codes for our paper "AgentMonitor: A Plug-and-Play Framework for Predictive and Secure Multi-Agent Systems" ☆12 Updated 10 months ago
- 🚀 LLM-I: Transform LLMs into natural interleaved multimodal creators! ✨ Tool-use framework supporting image search, generation, code ex… ☆33 Updated 2 weeks ago
- [NeurIPS 2025] Code for Let LLMs Break Free from Overthinking via Self-Braking Tuning. https://arxiv.org/abs/2505.14604 ☆50 Updated last month
- ☆16 Updated last year
- The official repository of "R-4B: Incentivizing General-Purpose Auto-Thinking Capability in MLLMs via Bi-Mode Integration" ☆120 Updated 2 months ago
- Ruler: A Model-Agnostic Method to Control Generated Length for Large Language Models ☆39 Updated last year
- Official repository of Graph RAG-Tool Fusion and ToolLinkOS dataset. ☆21 Updated 8 months ago
- ☆50 Updated 5 months ago
- [NAACL 2025] Representing Rule-based Chatbots with Transformers ☆22 Updated 9 months ago
- A unified framework for controllable caption generation across images, videos, and audio. Supports multi-modal inputs and customizable ca… ☆52 Updated 3 months ago
- Pytorch implementation of HyperLLaVA: Dynamic Visual and Language Expert Tuning for Multimodal Large Language Models ☆27 Updated last year
- ☆23 Updated 5 months ago
- Resa: Transparent Reasoning Models via SAEs ☆44 Updated last month
- Fathom-DeepResearch: Unlocking Long Horizon Information Retrieval And Synthesis For SLMs ☆48 Updated last month
- Code implementation for DyG-RAG: Dynamic Graph Retrieval-Augmented Generation with Event-Centric Reasoning. ☆30 Updated 2 months ago
- DELT: Data Efficacy for Language Model Training ☆42 Updated 2 months ago
- 😊 TPTT: Transforming Pretrained Transformers into Titans ☆29 Updated 3 weeks ago
- [NeurIPS 2024] | An Efficient Recipe for Long Context Extension via Middle-Focused Positional Encoding ☆19 Updated last year
- XVERSE-MoE-A36B: A multilingual large language model developed by XVERSE Technology Inc. ☆38 Updated last year
- Quick Long Video Understanding ☆68 Updated last week
- CODA: Coordinating the Cerebrum and Cerebellum for a Dual-Brain Computer Use Agent with Decoupled Reinforcement Learning ☆31 Updated 2 months ago
- Code for paper: Long cOntext aliGnment via efficient preference Optimization ☆23 Updated 3 weeks ago