FoundationVision / Groma

[ECCV2024] Grounded Multimodal Large Language Model with Localized Visual Tokenization
541 · Updated 8 months ago

Alternatives and similar repositories for Groma:

Users who are interested in Groma are comparing it to the libraries listed below.