Y-Research-SBU / PosterGen
Official Repository for PosterGen
☆170 Updated this week
Alternatives and similar repositories for PosterGen
Users who are interested in PosterGen are comparing it to the repositories listed below.
- ☆147 Updated last week
- Repo for "VRAG-RL: Empower Vision-Perception-Based RAG for Visually Rich Information Understanding via Iterative Reasoning with Reinforce… ☆377 Updated last week
- The development and future prospects of large multimodal reasoning models. ☆518 Updated 2 months ago
- MMSearch-R1 is an end-to-end RL framework that enables LMMs to perform on-demand, multi-turn search with real-world multimodal search too… ☆335 Updated last month
- ScholarCopilot: Training Large Language Models for Academic Writing with Accurate Citations [COLM 2025] ☆236 Updated 3 months ago
- [ACL 2025] Code and data for OS-Genesis: Automating GUI Agent Trajectory Construction via Reverse Task Synthesis ☆163 Updated last week
- Scalable and extensible reinforcement learning for LM agents. ☆84 Updated last week
- MiroMind-M1 is a fully open-source series of reasoning language models built on Qwen-2.5, focused on advancing mathematical reasoning. ☆236 Updated 2 months ago
- GUI-Actor: Coordinate-Free Visual Grounding for GUI Agents ☆345 Updated 2 months ago
- Reading List of Memory Augmented Multimodal Research, including multimodal context modeling, memory in vision and robotics, and external … ☆46 Updated last year
- Official implementation for "ScoreFlow: Mastering LLM Agent Workflows via Score-based Preference Optimization" ☆87 Updated 4 months ago
- OpenThinkIMG is an end-to-end open-source framework that empowers LVLMs to think with images. ☆315 Updated 4 months ago
- MiroThinker is open-source agentic models trained for deep research and complex tool use scenarios. ☆467 Updated last week
- Codes for Visual Sketchpad: Sketching as a Visual Chain of Thought for Multimodal Language Models ☆263 Updated 2 months ago
- [NeurIPS 2025] A multimodal agent that can interact with its own PC in a multimodal manner. ☆33 Updated this week
- 🔧 Tool-Star: Empowering LLM-brained Multi-Tool Reasoner via Reinforcement Learning ☆270 Updated this week
- Towards a Unified View of Large Language Model Post-Training ☆163 Updated last month
- [NeurIPS 2025] Thinkless: LLM Learns When to Think ☆233 Updated 3 weeks ago
- Visual Planning: Let's Think Only with Images ☆278 Updated 5 months ago
- 🚀 ReVisual-R1 is a 7B open-source multimodal language model that follows a three-stage curriculum—cold-start pre-training, multimodal rei… ☆185 Updated last week
- [NeurIPS 2024] Official Implementation for Optimus-1: Hybrid Multimodal Memory Empowered Agents Excel in Long-Horizon Tasks ☆85 Updated 4 months ago
- Official implementation of X-Master, a general-purpose tool-augmented reasoning agent. ☆280 Updated this week
- ☆31 Updated 3 months ago
- Codebase for paper ToolVQA: A Dataset for Multi-step Reasoning VQA with External Tools ☆17 Updated last month
- (ICLR'25) A Comprehensive Framework for Developing and Evaluating Multimodal Role-Playing Agents ☆86 Updated 8 months ago
- Chain-of-Agents: End-to-End Agent Foundation Models via Multi-Agent Distillation and Agentic RL. ☆459 Updated last month
- 🔥🔥🔥 ICLR 2025 Oral. Automating Agentic Workflow Generation. ☆292 Updated 2 months ago
- ZO2 (Zeroth-Order Offloading): Full Parameter Fine-Tuning 175B LLMs with 18GB GPU Memory [COLM2025] ☆190 Updated 3 months ago
- 📖 This is a repository for organizing papers, codes and other resources related to Visual Reinforcement Learning. ☆299 Updated last week
- ☆561 Updated this week