Y-Research-SBU / PosterGen
Official Code for PosterGen
☆139 Updated last week
Alternatives and similar repositories for PosterGen
Users who are interested in PosterGen are comparing it to the repositories listed below.
- [ACL 2025] Code and data for OS-Genesis: Automating GUI Agent Trajectory Construction via Reverse Task Synthesis ☆159 Updated 3 weeks ago
- ScholarCopilot: Training Large Language Models for Academic Writing with Accurate Citations [COLM 2025] ☆232 Updated 2 months ago
- ☆130 Updated 3 weeks ago
- Repo for "VRAG-RL: Empower Vision-Perception-Based RAG for Visually Rich Information Understanding via Iterative Reasoning with Reinforce… ☆341 Updated 2 months ago
- Reading List of Memory Augmented Multimodal Research, including multimodal context modeling, memory in vision and robotics, and external … ☆45 Updated last year
- The development and future prospects of multimodal reasoning models. ☆497 Updated last month
- Official implementation for "ScoreFlow: Mastering LLM Agent Workflows via Score-based Preference Optimization" ☆84 Updated 4 months ago
- (ICLR'25) A Comprehensive Framework for Developing and Evaluating Multimodal Role-Playing Agents ☆84 Updated 7 months ago
- GUI-Actor: Coordinate-Free Visual Grounding for GUI Agents ☆341 Updated last month
- MMSearch-R1 is an end-to-end RL framework that enables LMMs to perform on-demand, multi-turn search with real-world multimodal search too… ☆322 Updated last month
- [NeurIPS 2024] Official Implementation for Optimus-1: Hybrid Multimodal Memory Empowered Agents Excel in Long-Horizon Tasks ☆84 Updated 3 months ago
- 🔧Tool-Star: Empowering LLM-brained Multi-Tool Reasoner via Reinforcement Learning ☆258 Updated 2 weeks ago
- 🔥🔥🔥 ICLR 2025 Oral. Automating Agentic Workflow Generation. ☆267 Updated 2 months ago
- 🚀ReVisual-R1 is a 7B open-source multimodal language model that follows a three-stage curriculum—cold-start pre-training, multimodal rei… ☆179 Updated 2 months ago
- ☆82 Updated 5 months ago
- ☆42 Updated 10 months ago
- OpenThinkIMG is an end-to-end open-source framework that empowers LVLMs to think with images. ☆306 Updated 3 months ago
- Collect every awesome work about r1! ☆416 Updated 4 months ago
- 💡 VideoMind: A Chain-of-LoRA Agent for Long Video Reasoning ☆257 Updated this week
- ☆30 Updated 2 months ago
- [NeurIPS 2025] Thinkless: LLM Learns When to Think ☆228 Updated 3 months ago
- PC Agent: While You Sleep, AI Works - A Cognitive Journey into Digital World ☆283 Updated 4 months ago
- This is a survey of research on AI scientists, AI researchers, AI engineers, and a series of AI-driven research studies ☆125 Updated last month
- VeriGUI: Verifiable Long-Chain GUI Dataset ☆81 Updated last month
- ☆491 Updated 3 weeks ago
- Visual Planning: Let's Think Only with Images ☆270 Updated 4 months ago
- Scalable and extensible reinforcement learning for LM agents. ☆71 Updated this week
- Deep Research Agent CognitiveKernel-Pro from Tencent AI Lab. Paper: https://arxiv.org/pdf/2508.00414 ☆348 Updated 3 weeks ago
- Towards a Unified View of Large Language Model Post-Training ☆134 Updated 2 weeks ago
- MiroMind-M1 is a fully open-source series of reasoning language models built on Qwen-2.5, focused on advancing mathematical reasoning. ☆233 Updated last month