multimodal-art-projection / P2P
P2P: Automated Paper-to-Poster Generation and Fine-Grained Benchmark
☆24Updated last week
Alternatives and similar repositories for P2P
Users who are interested in P2P are comparing it to the libraries listed below.
- Repo for "VRAG-RL: Empower Vision-Perception-Based RAG for Visually Rich Information Understanding via Iterative Reasoning with Reinforce…☆156Updated this week
- MLLM @ Game☆14Updated 3 weeks ago
- [ACL 2024] ChartAssistant is a chart-based vision-language model for universal chart comprehension and reasoning.☆118Updated 9 months ago
- The official repository for the Scientific Paper Idea Proposer (SciPIP)☆62Updated 3 months ago
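- The simplest reproduction of R1 results on small models, explaining the most important essence of O1-like models and DeepSeek R1. Think is all your need. Backed by experiments: for strong reasoning ability, the "think" process content is the core of AGI/ASI.☆45Updated 3 months ago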
- An Easy-to-use Hallucination Detection Framework for LLMs.☆59Updated last year
- ☆59Updated last week
- This is the code repo for our paper "Benchmarking Retrieval-Augmented Generation in Multi-Modal Contexts".☆32Updated 2 months ago
- [ACL 2025] An official PyTorch implementation of the paper: Condor: Enhance LLM Alignment with Knowledge-Driven Data Synthesis and Refinement☆27Updated last week
- Code and Data for Our NeurIPS 2024 paper "AMOR: A Recipe for Building Adaptable Modular Knowledge Agents Through Process Feedback"☆32Updated 7 months ago
- A repo showcasing the use of MCTS with LLMs to solve GSM8K problems☆82Updated 2 months ago
- [ACL'2024 Findings] GAOKAO-MM: A Chinese Human-Level Benchmark for Multimodal Models Evaluation☆60Updated last year
- ☆42Updated 3 months ago
- Search, organize, discover anything!☆48Updated last year
- Qwen DianJin: LLMs for the Financial Industry by Alibaba Cloud☆106Updated 2 weeks ago
- Unleashing the Power of Cognitive Dynamics on Large Language Models☆61Updated 8 months ago
- A collection of LLM code written by hand from scratch☆11Updated 2 months ago
- Official implementation for "ScoreFlow: Mastering LLM Agent Workflows via Score-based Preference Optimization"☆76Updated 2 weeks ago
- Official PyTorch Implementation of MLLM Is a Strong Reranker: Advancing Multimodal Retrieval-augmented Generation via Knowledge-enhanced …☆78Updated 6 months ago
- ☆81Updated last year
- [IJCAI 2024] CMMU: A Benchmark for Chinese Multi-modal Multi-type Question Understanding and Reasoning☆24Updated last year
- IKEA: Reinforced Internal-External Knowledge Synergistic Reasoning for Efficient Adaptive Search Agent☆57Updated 3 weeks ago
- ☆142Updated 11 months ago
- Official codebase for "GenPRM: Scaling Test-Time Compute of Process Reward Models via Generative Reasoning".☆73Updated this week
- something for paper agent☆11Updated 5 months ago
- ☆47Updated 11 months ago
- SELF-GUIDE: Better Task-Specific Instruction Following via Self-Synthetic Finetuning. COLM 2024 Accepted Paper☆32Updated last year
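- A multimodal large model implemented from scratch and named Reyes (睿视; R: 睿, eyes: 眼). Reyes has 8B parameters, uses InternViT-300M-448px-V2_5 as its vision encoder and Qwen2.5-7B-Instruct on the language-model side, and also uses a two-layer MLP projection layer to connect…☆13Updated 3 months ago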
- [ACL 2025] Code and data for OS-Genesis: Automating GUI Agent Trajectory Construction via Reverse Task Synthesis☆134Updated this week
- MPB (Miner-PDF-Benchmark) is an end-to-end PDF document comprehension evaluation suite designed for large-scale model data scenarios.☆23Updated 5 months ago