mbzuai-oryx / Video-LLaVA

PG-Video-LLaVA: Pixel Grounding in Large Multimodal Video Models
235Updated 8 months ago

Related projects: