Mountchicken / Resophy
🎯 Read research papers faster with AI. Resophy is an HTML-based AI paper reader with:
- 🤖 AI Translation & Analysis – instantly understand structure, contributions, and results
- 📰 Daily arXiv Recommendations – discover relevant papers with less noise
- 🛠️ Vibe Coding Oriented – agent-friendly and easy to customize
☆161 · Updated last month
Alternatives and similar repositories for Resophy
Users interested in Resophy are comparing it to the repositories listed below.
- [ICCV 2025] A Token-level Text Image Foundation Model for Document Understanding – ☆130 · Updated 5 months ago
- The official repository of the dots.vlm1 instruct models proposed by rednote-hilab. – ☆284 · Updated 4 months ago
- ☆74 · Updated 8 months ago
- Rex-Thinker: Grounded Object Referring via Chain-of-Thought Reasoning – ☆142 · Updated 7 months ago
- [NeurIPS 2025] Official code implementation of Perception R1: Pioneering Perception Policy with Reinforcement Learning – ☆285 · Updated 6 months ago
- Vision Search Assistant: Empower Vision-Language Models as Multimodal Search Engines – ☆130 · Updated last year
- This project aims to collect and collate various datasets for multimodal large model training, including but not limited to pre-training … – ☆68 · Updated 9 months ago
- ☆85 · Updated 5 months ago
- Build a daily academic subscription pipeline! Get daily arXiv papers and corresponding ChatGPT summaries with pre-defined keywords. It is… – ☆46 · Updated 2 years ago
- 🎮 UniPixel: Unified Object Referring and Segmentation for Pixel-Level Visual Reasoning (NeurIPS 2025) – ☆228 · Updated last month
- [ACL 2025 Findings] Migician: Revealing the Magic of Free-Form Multi-Image Grounding in Multimodal Large Language Models – ☆90 · Updated 8 months ago
- Multimodal Open-O1 (MO1) is designed to enhance the accuracy of inference models by utilizing a novel prompt-based approach. This tool wo… – ☆29 · Updated last year
- Vision Manus: Your versatile Visual AI assistant – ☆318 · Updated this week
- [arXiv 25] Aesthetics is Cheap, Show me the Text: An Empirical Evaluation of State-of-the-Art Generative Models for OCR – ☆248 · Updated 5 months ago
- Official repo of the Griffon series, including v1 (ECCV 2024), v2 (ICCV 2025), G, and R, plus the RL tool Vision-R1. – ☆249 · Updated 5 months ago
- Code for the paper: Reinforced Vision Perception with Tools – ☆69 · Updated 4 months ago
- Exploring Efficient Fine-Grained Perception of Multimodal Large Language Models – ☆65 · Updated last year
- Official code implementation of Slow Perception: Let's Perceive Geometric Figures Step-by-step – ☆159 · Updated 6 months ago
- MMSearch-R1 is an end-to-end RL framework that enables LMMs to perform on-demand, multi-turn search with real-world multimodal search too… – ☆392 · Updated 5 months ago
- Code for ChatRex: Taming Multimodal LLM for Joint Perception and Understanding – ☆210 · Updated 3 months ago
- New generation of CLIP with fine-grained discrimination capability, ICML 2025 – ☆545 · Updated 3 months ago
- ZO2 (Zeroth-Order Offloading): Full-Parameter Fine-Tuning of 175B LLMs with 18GB GPU Memory [COLM 2025] – ☆200 · Updated 6 months ago
- (ICLR 2026) An official implementation of "CapRL: Stimulating Dense Image Caption Capabilities via Reinforcement Learning" – ☆186 · Updated this week
- Margin-based Vision Transformer – ☆64 · Updated 2 months ago
- Pixel-Level Reasoning Model trained with RL [NeurIPS 2025] – ☆273 · Updated 3 months ago
- DELT: Data Efficacy for Language Model Training – ☆43 · Updated 2 weeks ago
- ☆43 · Updated 6 months ago
- [ICLR 2025] Draw-and-Understand: Leveraging Visual Prompts to Enable MLLMs to Comprehend What You Want – ☆93 · Updated 2 months ago
- ☆34 · Updated 11 months ago
- Awesome-RAG-Vision: a curated list of advanced retrieval-augmented generation (RAG) for Computer Vision – ☆316 · Updated 2 weeks ago