InternLM / CapRL
An official implementation of "CapRL: Stimulating Dense Image Caption Capabilities via Reinforcement Learning"
☆172 Updated 2 weeks ago
Alternatives and similar repositories for CapRL
Users who are interested in CapRL are comparing it to the repositories listed below
- LongLLaVA: Scaling Multi-modal LLMs to 1000 Images Efficiently via Hybrid Architecture ☆212 Updated last year
- Pixel-Level Reasoning Model trained with RL [NeurIPS25] ☆260 Updated 2 months ago
- ✨✨Beyond LLaVA-HD: Diving into High-Resolution Large Multimodal Models ☆164 Updated last year
- The SAIL-VL2 series model developed by the BytedanceDouyinContent Group ☆76 Updated 3 months ago
- Official Implementation of LaViDa: A Large Diffusion Language Model for Multimodal Understanding ☆187 Updated 3 weeks ago
- A Simple Framework of Small-scale LMMs for Video Understanding ☆108 Updated 7 months ago
- Official Code for "ARM-Thinker: Reinforcing Multimodal Generative Reward Models with Agentic Tool Use and Visual Reasoning" ☆73 Updated last month
- [CVPR 2025] Dispider: Enabling Video LLMs with Active Real-Time Interaction via Disentangled Perception, Decision, and Reaction