FoundationVision / Groma

[ECCV2024] Grounded Multimodal Large Language Model with Localized Visual Tokenization
564 · Updated 5 months ago

Related projects

Alternatives and complementary repositories for Groma